
Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#384 closed enhancement (fixed)

cuda csc

Reported by: Antoine Martin
Owned by: SmO
Priority: major
Milestone: 0.11
Component: core
Version:
Keywords: csc
Cc:

Description (last modified by Antoine Martin)

Pointers:

See also:

Note: on Fedora 18, you need this fix

Attachments (3)

csc_nvcuda-v7.patch (46.3 KB) - added by Antoine Martin 6 years ago.
work in progress pycuda code: done rgb to yuv
csc_nvcuda-v8.patch (50.8 KB) - added by Antoine Martin 6 years ago.
works both ways now
csc_nvcuda-withcustomkernel.patch (33.4 KB) - added by Antoine Martin 6 years ago.
use code similar to #370 (custom kernels) instead of the useless nvidia npp


Change History (17)

comment:1 Changed 6 years ago by Antoine Martin

Description: modified (diff)

comment:2 Changed 6 years ago by ahuillet

As of r3943 this is integrated in Xpra. Needed:

  • setup.py uses hardcoded paths on my distro
  • the video pipeline module rarely picks up nvcuda, if at all: it does not seem to be called as often as it should
  • performance seems OK
  • no crash on my GeForce 9800GT, further tests required
  • disabled by default, obviously

comment:3 Changed 6 years ago by Antoine Martin

r3932 added hardcoded paths; r3961 replaces them with a call to pkgconfig. Just place your machine- or distro-specific paths in the pkgconfig file instead. Here is mine for CUDA 5.5 on x86_64 with an install prefix of /opt/cuda (symlinked from /opt/cuda-5.5):

prefix=/opt
exec_prefix=${prefix}
libdir=/opt/cuda/lib64
includedir=/opt/cuda/include

Name: cuda
Description: CUDA
Version: 1.0
Requires: 
Conflicts:
Libs: -L${libdir} -L/usr/lib64/nvidia/ -lcuda -lnppc -lnppi -lnpps -lcudart
Cflags: -I${includedir}
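
For readers wiring this into a build script, here is a minimal sketch of how a setup.py could turn the pkg-config output into extension build arguments. It is not the code from r3961: the function names are hypothetical, and it assumes the file above is saved as cuda.pc in a directory listed in PKG_CONFIG_PATH.

```python
import shlex
import subprocess


def parse_pkgconfig_flags(output):
    """Sort pkg-config flags into the keyword arguments an Extension accepts."""
    kw = {"include_dirs": [], "library_dirs": [], "libraries": [], "extra_args": []}
    for token in shlex.split(output):
        if token.startswith("-I"):
            kw["include_dirs"].append(token[2:])
        elif token.startswith("-L"):
            kw["library_dirs"].append(token[2:])
        elif token.startswith("-l"):
            kw["libraries"].append(token[2:])
        else:
            # anything else is kept as-is so the caller can decide what to do
            kw["extra_args"].append(token)
    return kw


def pkgconfig(package):
    """Run pkg-config for one package (hypothetical helper, needs pkg-config installed)."""
    output = subprocess.check_output(["pkg-config", "--cflags", "--libs", package])
    return parse_pkgconfig_flags(output.decode("utf8"))
```

With the file above, pkg-config expands ${libdir} and ${includedir} before printing, so parse_pkgconfig_flags would receive plain -I, -L and -l flags and pkgconfig("cuda") would return them already sorted by kind.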

Looks to me like this is another module that should probably use an existing glue layer (PyCUDA). The NPP functions can be called as documented in NPP and OpenCV webcam app in Python.

Last edited 6 years ago by Antoine Martin (previous) (diff)

Changed 6 years ago by Antoine Martin

Attachment: csc_nvcuda-v7.patch added

work in progress pycuda code: done rgb to yuv

Changed 6 years ago by Antoine Martin

Attachment: csc_nvcuda-v8.patch added

works both ways now

comment:4 Changed 6 years ago by Antoine Martin

The pycuda version replaces this code in r4269. I am adding benchmark data to wiki/CSC.

Changed 6 years ago by Antoine Martin

Attachment: csc_nvcuda-withcustomkernel.patch added

use code similar to #370 (custom kernels) instead of the useless nvidia npp

comment:5 Changed 6 years ago by Antoine Martin

Note: this is broken at present, but it can be made to work using code similar to that used in #370 - see patch above.
The common cuda/kernel bits should be moved to a cuda support module.

comment:6 Changed 6 years ago by Antoine Martin

r4429 uses pycuda and custom kernels.

This isn't very fast and I'm not sure it even converts properly (mostly untested), but at least it runs and can be fixed, unlike the NPP version.
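
One way to check whether the kernel converts properly is to compare its output against a slow CPU reference. The sketch below is such a reference in pure Python. It assumes BT.601 studio-range integer coefficients, which may not be what the actual kernel uses, and the function name is hypothetical.

```python
def rgbx_to_yuv444p(pixels, width, height):
    """CPU reference: packed RGBX bytes to three full-resolution planes (Y, U, V).

    Assumption: BT.601 studio-range integer coefficients.
    """
    npixels = width * height
    if len(pixels) != npixels * 4:
        raise ValueError("expected %i bytes, got %i" % (npixels * 4, len(pixels)))
    y_plane = bytearray(npixels)
    u_plane = bytearray(npixels)
    v_plane = bytearray(npixels)
    for i in range(npixels):
        # the fourth byte of each pixel (X) is padding and is ignored
        r, g, b = pixels[4 * i], pixels[4 * i + 1], pixels[4 * i + 2]
        y_plane[i] = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16
        u_plane[i] = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128
        v_plane[i] = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128
    return y_plane, u_plane, v_plane
```

This is far too slow for a 2560x1600 frame, but a few small test images (solid colours, gradients) should be enough to spot swapped channels or wrong plane offsets in the GPU output.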

It is much slower than the opencl version (#422) - see wiki/CSC:

init_cuda(0) compiling kernel RGB_to_YUV444P
convert_image(<class 'xpra.codecs.image_wrapper.ImageWrapper'>\
    (RGBX:(0, 0, 2560, 1600, 32):PACKED)) planes=0, pixels=<type 'bytearray'>, size=16384000
allocation and upload took 6.9ms
RGB_to_YUV444P took 14.9ms
read back took 9.6ms, total time: 31.8
convert_image(<class 'xpra.codecs.image_wrapper.ImageWrapper'>\
    (RGBX:(0, 0, 2560, 1600, 32):PACKED)) planes=0, pixels=<type 'bytearray'>, size=16384000
allocation and upload took 5.9ms
RGB_to_YUV444P took 14.7ms
read back took 11.1ms, total time: 32.2

RGBX    to YUV444P at  2560x1600        : 22 MPixels/s
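
To read the per-call timings in the log above, this small sketch splits each call into its stages. The helper name is hypothetical and the numbers are copied from the log, nothing is measured here.

```python
def stage_shares(upload_ms, kernel_ms, readback_ms, total_ms):
    """Return each stage's fraction of the total time for one convert_image call."""
    stages = {"upload": upload_ms, "kernel": kernel_ms, "readback": readback_ms}
    shares = {name: ms / total_ms for name, ms in stages.items()}
    # whatever the three logged stages do not account for
    shares["other"] = (total_ms - sum(stages.values())) / total_ms
    return shares


# sanity check: a 2560x1600 RGBX frame is 4 bytes per pixel
frame_bytes = 2560 * 1600 * 4

first = stage_shares(6.9, 14.9, 9.6, 31.8)
second = stage_shares(5.9, 14.7, 11.1, 32.2)
transfer_first = first["upload"] + first["readback"]
transfer_second = second["upload"] + second["readback"]
```

In both calls the upload and read back together take slightly more time than the kernel itself, so a faster kernel alone would remove less than half of the per-call time.
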
Last edited 6 years ago by Antoine Martin (previous) (diff)

comment:7 Changed 6 years ago by Antoine Martin

For TLS issues (if any), see ticket:422#comment:12.

comment:8 Changed 6 years ago by Antoine Martin

Updated pkgconfig file (we no longer need npp - if someone needs it, it should probably go in its own pkgconfig file):

prefix=/opt
exec_prefix=${prefix}
libdir=/opt/cuda/lib64
includedir=/opt/cuda/include

Name: cuda
Description: CUDA
Version: 1.0
Requires: 
Conflicts:
Libs: -L${libdir} -L/usr/lib64/nvidia/ -lcuda -lcudart
Cflags: -I${includedir}

comment:9 Changed 6 years ago by Antoine Martin

  • r4726 implements kernel pre-compilation and caching
  • r4695 adds device info to "xpra info"
  • r4736 ensures we don't access the device without the context
  • r4737 + r4738: misc fixes
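
As an illustration of the pre-compilation and caching idea, here is a minimal sketch. It is not the code from r4726: the class and its cache key are assumptions, and the compile function is injected so the example runs without a GPU (with pycuda it could be something like pycuda.compiler.SourceModule).

```python
import hashlib


class KernelCache:
    """Compile each kernel source once per device and reuse the result."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # hypothetical hook: source string -> compiled object
        self._cache = {}

    def get(self, device_id, name, source):
        # the key includes a hash of the source so edited kernels are recompiled
        digest = hashlib.sha1(source.encode("utf8")).hexdigest()
        key = (device_id, name, digest)
        if key not in self._cache:
            self._cache[key] = self._compile(source)
        return self._cache[key]

    def precompile(self, device_id, kernels):
        """Compile every kernel in a {name: source} dict ahead of first use."""
        for name, source in kernels.items():
            self.get(device_id, name, source)
```

Calling precompile() at startup moves the compilation cost out of the first convert_image call. Later get() calls with the same device, name and source return the cached object.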

comment:10 Changed 6 years ago by Antoine Martin

Owner: changed from ahuillet to SmO

Minor fix in r4910.

Just like #422: looks good to me and the documentation is here: wiki/CSC

Any feedback or updated performance data?

Does this work if you suspend-resume the PC? (see ticket:422#comment:18)

Last edited 6 years ago by Antoine Martin (previous) (diff)

comment:11 Changed 6 years ago by Smo

Resolution: fixed
Status: new → closed

Works well for me, closing the ticket for now.

comment:12 Changed 6 years ago by Antoine Martin

This module never worked properly (I get far too much corruption and random junk) and has now been removed in r6114.

I should probably try to get some fame and post an exploit showing how to use CUDA to read the browser's currently rendered page data as pixels (works with Google Chrome!).

comment:13 Changed 6 years ago by alas

Is it feasible to wait until there's a contest, or something, with prizes to go with the posting? Might as well get a little fortune as well, no?

comment:14 Changed 6 years ago by Antoine Martin

Not a bad idea. Sadly, nvidia (see http://www.nvidia.com/object/product-security.html) is one of the few companies that do not have a bug bounty program.

If I have time, I'll send a quick post to the new full disclosure list instead.
