Xpra: Ticket #595: latest nvidia drivers break YUV444 encoding

It is not yet clear which driver versions work and which do not (the breakage seems to coincide with the driver changes that also changed the list of license keys, see NVENC developer key?), so r6741 allows us to turn YUV444P off via the XPRA_NVENC_YUV444P=0 env var.
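
For illustration, a minimal sketch of how such an env var switch could be read. Only the variable name XPRA_NVENC_YUV444P and the value 0 come from this ticket; the helper name, the default of "enabled" and the parsing rules are assumptions, not the actual r6741 code.

```python
import os

def yuv444p_enabled(environ=None, default=True):
    """Return False only when XPRA_NVENC_YUV444P is explicitly set to "0".

    Hypothetical helper: the real parsing in xpra may differ.
    """
    env = os.environ if environ is None else environ
    value = env.get("XPRA_NVENC_YUV444P")
    if value is None:
        # variable not set: fall back to the (assumed) default
        return default
    return value.strip() != "0"
```

Usage would then be something like: XPRA_NVENC_YUV444P=0 xpra start ...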

The YUV444P mode is completely undocumented, and nvidia developers even claimed that it could not be used (it can, or at least it could...), so this is not going to be fun.



Sun, 15 Jun 2014 13:34:48 GMT - Antoine Martin: owner, status changed

r6781 added nvidia driver version logging to make it easier to instantly see which version is loaded. Maybe we can also use it to give warnings about which versions are known to break and / or disable YUV444 when we know it isn't going to work.

More tests with Fedora 20, kernel 3.14.7-200.fc20.x86_64 and a GTX 760 OC:


Mon, 16 Jun 2014 04:45:59 GMT - Antoine Martin: status changed; resolution set

r6811 disables YUV444 when we detect a buggy nvidia driver version. Can also be re-enabled via env var if needed.
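
A rough sketch of what version-based blacklisting can look like, assuming the driver version is parsed from the text of /proc/driver/nvidia/version. The function names, the regular expression, the "force" override and the conservative handling of unknown versions are all assumptions; the single blacklist entry is illustrative only (331.79 is reported further down in this ticket as having no YUV444). This is not the r6811 code.

```python
import re

# Illustrative only: not the real blacklist from the xpra source.
EXAMPLE_BLACKLIST = {(331, 79)}

def parse_driver_version(text):
    """Extract the driver version as a tuple of ints, or None if not found.

    Assumes the text contains "Kernel Module  <version>", as in
    /proc/driver/nvidia/version.
    """
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    if not match:
        return None
    return tuple(int(part) for part in match.group(1).split("."))

def yuv444_allowed(version, blacklist=EXAMPLE_BLACKLIST, force=False):
    """Decide whether YUV444 should be attempted for this driver version."""
    if force:
        # explicit override, e.g. driven by an env var
        return True
    if version is None:
        # unknown driver version: stay on the safe side (assumption)
        return False
    return version not in blacklist
```

Comparing parsed tuples rather than raw strings avoids mismatches caused by whitespace or trailing build information in the version line.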


Tue, 17 Jun 2014 16:19:22 GMT - Antoine Martin: status changed; resolution deleted

Looks like other versions break... re-opening.


Tue, 17 Jun 2014 16:21:11 GMT - Antoine Martin: owner, status changed

r6831 blacklists some more versions and disables YUV444 mode.

Please check:

We may need to blacklist some more versions.


Thu, 19 Jun 2014 20:53:06 GMT - Smo:

Confirmed

331.79 works with the new key.
340.17 fails completely with the same error.

Will test with 331.79 for now


Thu, 17 Jul 2014 20:08:36 GMT - Antoine Martin:

smo: 331.79 is not as good as older versions: no YUV444...


Tue, 29 Jul 2014 10:56:40 GMT - Antoine Martin: priority changed

(raising: blocker for release)


Mon, 04 Aug 2014 22:14:03 GMT - Smo:

Testing some new drivers from http://www.nvidia.ca/object/unix.html

Trying 340.24 produces this output:

2014-08-04 18:12:57,188 nvenc: found nvidia kernel module version 340.24
2014-08-04 18:12:57,206 CUDA initialization (this may take a few seconds)
2014-08-04 18:12:59,396 CUDA 6.0.0 / PyCUDA 2013.1.1, found 2 device(s):
2014-08-04 18:12:59,564   + GeForce GTX 750 Ti @ 0000:83:00.0 (memory: 98% free, compute: 5.0)
2014-08-04 18:12:59,689   + GeForce GTX 650 @ 0000:09:00.0 (memory: 97% free, compute: 3.0)
2014-08-04 18:13:00,111 pulseaudio server started with pid 5087
2014-08-04 18:13:00,120 started child 'xterm -fg white -bg black' with pid 5089
2014-08-04 18:13:00,148 xpra server version 0.14.0 (r7043)
2014-08-04 18:13:00,148 running with pid 5049
2014-08-04 18:13:00,368 xpra is ready.
2014-08-04 18:13:00,546 failed to get preset config for default (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / B2DFB705-4EBD-4C49-9B5F-24A777D3E587): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,546 failed to get preset config for low-latency (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / 49DF21C5-6DFA-4FEB-9787-6ACC9EFFB726): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,546 failed to get preset config for hp (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / 60E4C59F-E846-4484-A56D-CD45BE9FDDF6): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,547 failed to get preset config for hq (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / 34DBA71D-A77B-4B8F-9C3E-B6D5DA24C012): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,547 failed to get preset config for bd (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / 82E3E450-BDBB-4E40-989C-82A90DF9EF32): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,547 failed to get preset config for low-latency-hq (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / C5F733B9-EA97-4CF9-BEC2-BF78A74FD105): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,547 failed to get preset config for low-latency-hp (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / 67082A44-4BAD-48FA-98EA-93056D150A58): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,547 failed to get preset config for None (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / 7ADD423D-D035-4F6F-AEA5-50885658643C): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,547 nvenc: found some unknown presets: 7ADD423D-D035-4F6F-AEA5-50885658643C
2014-08-04 18:13:00,548 failed to get preset config for hp (6BC82762-4E63-4CA4-AA85-1E50F321F6BF / 60E4C59F-E846-4484-A56D-CD45BE9FDDF6): This indicates that an invalid struct version was used by the client.
2014-08-04 18:13:00,583 Warning: nvenc video encoder failed: could not find preset hp
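
To compare logs like the one above across driver versions, a small sketch that collects the presets that failed to load. The regular expression is written against the message format shown in this log only; the function name is hypothetical and nothing here is part of xpra.

```python
import re

# Matches: failed to get preset config for <name> (<guid> / <preset guid>)
PRESET_FAILURE = re.compile(
    r"failed to get preset config for (?P<preset>\S+) "
    r"\([0-9A-Fa-f-]{36} / (?P<preset_guid>[0-9A-Fa-f-]{36})\)"
)

def failed_presets(log_lines):
    """Map each failing preset name to its preset GUID (first occurrence wins)."""
    failures = {}
    for line in log_lines:
        match = PRESET_FAILURE.search(line)
        if match:
            failures.setdefault(match.group("preset"), match.group("preset_guid"))
    return failures
```

Run against the output above, every preset appears in the result, which points at the driver / SDK combination as a whole rather than at one specific preset.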

Wed, 06 Aug 2014 19:08:31 GMT - Smo:

Tried today with new beta driver 343.13 on CentOS 7

The messages are identical to those in comment:8


Sun, 17 Aug 2014 14:19:55 GMT - Antoine Martin: status changed; resolution set

I believe the version blacklisting works.

If we find newer versions of the driver that need to be added to the blacklist, add them here.


Tue, 19 Aug 2014 03:32:12 GMT - Antoine Martin: milestone changed

Was done for milestone 0.14


Thu, 21 Aug 2014 03:54:59 GMT - Antoine Martin:

Worth noting that the CUDA 6.5 SDK is not compatible with the 331.79 drivers (6.0 is fine).

You get:

UserWarning: Failed to import the CUDA driver interface, with an error message  \
indicating that the version of your CUDA header does not match the version of your CUDA driver

Mon, 13 Oct 2014 04:56:38 GMT - Antoine Martin:

My findings so far:

Before I put this on the wiki somewhere, I want to record my testing with all the combinations of CUDA, kernel, drivers, SDK... compatibility issues:


For testing, use the test_nvenc3 and test_nvenc4 scripts.

Note: pycuda must be rebuilt against the version of CUDA being tested, which takes time.

As of r7944, we can build both nvenc modules (v3 and v4) at the same time with:

./setup.py --with-nvenc3 --with-nvenc4

Both should already be auto-detected, since you need the pkgconfig file for each SDK anyway. To choose which CUDA SDK to build against, use:

./setup.py --with-cuda=6.5

You need at least CUDA 6.5 to build the nvenc4 module; nvenc3 works with 5.0 onwards.
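
The CUDA requirements above can be written down as a quick check. The minimum versions are the ones stated in this comment; the function name and the table are hypothetical and not part of setup.py.

```python
# Minimum CUDA SDK version per module, as stated in this ticket.
MIN_CUDA = {"nvenc3": (5, 0), "nvenc4": (6, 5)}

def buildable_nvenc_modules(cuda_version):
    """Return the sorted names of the nvenc modules that this CUDA version can build.

    cuda_version is a dotted string such as "6.5" or "6.0.0".
    """
    parts = [int(p) for p in cuda_version.split(".")]
    # compare on (major, minor) only, padding a bare major version with 0
    major_minor = tuple((parts + [0, 0])[:2])
    return sorted(name for name, minimum in MIN_CUDA.items()
                  if major_minor >= minimum)
```

For example, CUDA 6.0 gives only nvenc3, whereas 6.5 gives both modules.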


Testing with a GTX 750 Ti.


Testing with a GTX 760 OC:


Will edit this ticket more after reboots and code updates.


Sat, 23 Jan 2021 05:00:23 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/595