Bug tracker and wiki

Opened 5 weeks ago

Closed 27 hours ago

Last modified 26 hours ago

#1550 closed defect (fixed)

NVENC7 active but not used?

Reported by: DocMAX
Owned by: DocMAX
Priority: major
Milestone: 2.1
Component: encodings
Version: trunk
Keywords: nvenc
Cc:

Description (last modified by Antoine Martin)

I followed the instructions on the website to get NVENC7 working. (NVENC8 does not seem to be supported yet.)
NVENC appears to be up and running, but hardware acceleration does not seem to be used: the server CPU is at 100% and the client framerate is about 10fps.

Maybe I need license keys? But would I get a notice about this when xpra starts?

Versions Client + Server:

xpra v2.1 (svn)
Linux game 4.11.4-1-ARCH #1 SMP PREEMPT Fri Jun 9 07:46:48 CEST 2017 x86_64 GNU/Linux
Geforce GTX 970

Cmdline Server:

xpra start :100 --auth=none --daemon=no --video-encoders=nvenc

Cmdline Client:

xpra attach ssh:game:100

Diags Server:

video.encoding.video-encoder.ffmpeg=disabled
video.encoding.video-encoder.nvenc=active
video.encoding.video-encoder.vpx=disabled
video.encoding.video-encoder.x264=disabled
video.encoding.video-encoder.x265=disabled
2017-06-15 17:50:23,872 NVidia driver version 381.22
2017-06-15 17:50:23,872 NVENC license keys:
2017-06-15 17:50:23,880 * version common: 0 key(s)
2017-06-15 17:50:23,880 * version 7: 0 key(s)
Jun 15 17:54:51 game xpra[14208]: X.Org X Server 1.19.3
Jun 15 17:54:51 game xpra[14208]: Release Date: 2017-03-15
Jun 15 17:54:51 game xpra[14208]: X Protocol Version 11, Revision 0
Jun 15 17:54:51 game xpra[14208]: Build Operating System: Linux 4.9.11-1-ARCH x86_64
Jun 15 17:54:51 game xpra[14208]: Current Operating System: Linux game 4.11.4-1-ARCH #1 SMP PREEMPT Fri Jun 9 07:46:48 CEST 2017 x86_64
Jun 15 17:54:51 game xpra[14208]: Kernel command line: initrd=\kernel\arch\initramfs-linux.img root=LABEL=arch rw intel_iommu=on loglevel=3 modprobe.blacklist=nouveau
Jun 15 17:54:51 game xpra[14208]: Build Date: 07 April 2017  05:42:48PM
Jun 15 17:54:51 game xpra[14208]:  
Jun 15 17:54:51 game xpra[14208]: Current version of pixman: 0.34.0
Jun 15 17:54:51 game xpra[14208]:         Before reporting problems, check http://wiki.x.org
Jun 15 17:54:51 game xpra[14208]:         to make sure that you have the latest version.
Jun 15 17:54:51 game xpra[14208]: Markers: (--) probed, (**) from config file, (==) default setting,
Jun 15 17:54:51 game xpra[14208]:         (++) from command line, (!!) notice, (II) informational,
Jun 15 17:54:51 game xpra[14208]:         (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
Jun 15 17:54:51 game xpra[14208]: (++) Log file: "/var/run/user/1000/xpra/Xorg.:100.log", Time: Thu Jun 15 17:54:51 2017
Jun 15 17:54:51 game xpra[14208]: (++) Using config file: "/etc/xpra/xorg.conf"
Jun 15 17:54:51 game xpra[14208]: (==) Using system config directory "/usr/share/X11/xorg.conf.d"
Jun 15 17:54:51 game xpra[14208]: Warning: some of the sockets are in an unknown state:
Jun 15 17:54:51 game xpra[14208]:  /run/user/1000/xpra/game-100
Jun 15 17:54:51 game xpra[14208]:  /var/run/xpra/game-100
Jun 15 17:54:51 game xpra[14208]:  please wait as we allow the socket probing to timeout
Jun 15 17:54:57 game xpra[14208]: created unix domain socket: /run/user/1000/xpra/game-100
Jun 15 17:54:57 game xpra[14208]: created unix domain socket: /var/run/xpra/game-100
Jun 15 17:54:58 game xpra[14208]: Warning: webcam forwarding is disabled
Jun 15 17:54:58 game xpra[14208]:  the virtual video directory '/sys/devices/virtual/video4linux' was not found
Jun 15 17:54:58 game xpra[14208]:  make sure that the 'v4l2loopback' kernel module is installed and loaded
Jun 15 17:54:58 game xpra[14208]: found 0 virtual video devices for webcam forwarding
Jun 15 17:54:58 game xpra[14208]: pulseaudio server started with pid 14284
Jun 15 17:54:58 game xpra[14208]: GStreamer version 1.12.0 for Python 2.7.13 64-bit
Jun 15 17:54:58 game xpra[14208]: D-Bus notification forwarding is available
Jun 15 17:54:58 game xpra[14208]: xpra X11 version 2.1 64-bit
Jun 15 17:54:58 game xpra[14208]:  uid=1000 (docmax), gid=100 (users)
Jun 15 17:54:58 game xpra[14208]:  running with pid 14208 on Linux
Jun 15 17:54:58 game xpra[14208]:  connected to X11 display :100 with 24 bit colors
Jun 15 17:54:58 game xpra[14208]: xpra is ready.
Jun 15 17:54:59 game xpra[14208]: printer forwarding enabled using postscript and pdf
Jun 15 17:54:59 game xpra[14208]: 11.8GB of system memory

Change History (17)

comment:1 Changed 5 weeks ago by DocMAX

Resolution: invalid
Status: new → closed

Looking at the -d loader output, it looks like the pycuda module is missing, because on Arch it doesn't work with cuda-7.5. Will investigate further.

comment:2 Changed 5 weeks ago by DocMAX

NVENC is now initialized successfully.
But my client is still way too slow!
What's the problem here?

comment:3 Changed 5 weeks ago by DocMAX

Checked the log again... Do I really need a license key? Or is it a bug? How do I know?

2017-06-16 01:21:32,987 pycuda_info
2017-06-16 01:21:32,987 CUDA initialization (this may take a few seconds)
2017-06-16 01:21:33,093 CUDA 8.0.0 / PyCUDA 2017.1, found 1 device:
2017-06-16 01:21:33,093   + GeForce GTX 970 @ 0000:01:00.0 (memory: 93% free, compute: 5.2)
2017-06-16 01:21:33,122 * version                         : 2017.1
2017-06-16 01:21:33,122   - text                          : 2017.1
2017-06-16 01:21:33,122 cuda_info
2017-06-16 01:21:33,122 * driver
2017-06-16 01:21:33,122   - driver_version                : 8000
2017-06-16 01:21:33,122   - version                       : 8.0.0
2017-06-16 01:21:33,122 preferences:
2017-06-16 01:21:33,122 * blacklist                       : GTX 10

2017-06-16 01:17:36,174 init_cuda failed
Traceback (most recent call last):
  File "xpra/codecs/nvenc7/encoder.pyx", line 1492, in xpra.codecs.nvenc7.encoder.Encoder.init_context (xpra/codecs/nvenc7/encoder.c:12593)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1657, in xpra.codecs.nvenc7.encoder.Encoder.init_nvenc (xpra/codecs/nvenc7/encoder.c:17379)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1672, in xpra.codecs.nvenc7.encoder.Encoder.init_encoder (xpra/codecs/nvenc7/encoder.c:17676)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1335, in xpra.codecs.nvenc7.encoder.raiseNVENC (xpra/codecs/nvenc7/encoder.c:9505)
NVENCException: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-06-16 01:17:36,174 encoder nvenc(BGRA/BGRX/H264 - low-latency-hq - 1920x1080) failed: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-06-16 01:17:36,174 error during NVENC encoder test: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-06-16 01:17:36,174  a license key may be required

comment:4 Changed 5 weeks ago by Antoine Martin

Component: android → encodings
Description: modified

Looks to me like this bug: ticket:1260#comment:16, we blacklisted GTX 10x0 cards - looks like this now affects other cards. Using an older / newer driver version may help.

(or maybe you do need a license key - unlikely, I think you should always get 2 contexts on consumer cards)

comment:5 Changed 5 weeks ago by Antoine Martin

Oh, and btw, xpra v2.1 (svn) is not a full version number, always include the full version with the exact svn revision.

comment:6 in reply to:  5 Changed 5 weeks ago by DocMAX

Well, it's the output of xpra --version. I don't know where else I can find the build number.

Last edited 5 weeks ago by DocMAX

comment:7 Changed 5 weeks ago by Antoine Martin

Well, it's the output of xpra --version. I don't know where else I can find the build number.

When building from an svn checkout, the svn revision should be included automatically (it is captured from the output of svnversion when the code is built).
When using packages, the svn revision should already be included, and in any case it is part of the package filename. See wiki/ReportingBugs.

Last edited 5 weeks ago by Antoine Martin

comment:8 Changed 5 weeks ago by Antoine Martin

Resolution: invalid
Status: closed → reopened

NVENC SDK 8 support added in #1552, but this makes no difference as the new SDK version adds almost nothing.

Will test using my GTX 970 when I get back.

comment:9 Changed 4 weeks ago by DocMAX

Any updates?
I'm still stuck with:

error during NVENC encoder test: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.

comment:10 Changed 4 weeks ago by DocMAX

oh, and i'm at r16116

Last edited 6 days ago by Antoine Martin

comment:11 Changed 4 days ago by Aynur Shakirov

xpra 16362: the problem still exists, tested with an Nvidia Quadro M2000 (driver version 381.22) and NVENC v8. The H265 codec initializes successfully, but H264 initialization fails.

2017-07-16 14:44:45,201 get_preset(H264) speed=100, quality=50, lossless=False, pixel_format=BGRX, options={160: [('hq', '34DBA71D-A77B-4B8F-9C3E-B6
2017-07-16 14:44:45,201 using preset 'low-latency-hq' for speed=100, quality=50, lossless=0, pixel_format=BGRX
2017-07-16 14:44:45,201 init_params(H264) using preset=low-latency-hq
2017-07-16 14:44:45,201 9 input format types:
2017-07-16 14:44:45,201 * 0x1
2017-07-16 14:44:45,201  + 0x1 : NV12_PL
2017-07-16 14:44:45,201 * 0x10
2017-07-16 14:44:45,201  + 0x10 : YV12_PL
2017-07-16 14:44:45,201 * 0x100
2017-07-16 14:44:45,201  + 0x100 : IYUV_PL
2017-07-16 14:44:45,201 * 0x1000
2017-07-16 14:44:45,202  + 0x1000 : YUV444_PL
2017-07-16 14:44:45,202 * 0x1000000
2017-07-16 14:44:45,202  + 0x1000000 : ARGB
2017-07-16 14:44:45,202 * 0x10000000
2017-07-16 14:44:45,202  + 0x10000000 : ABGR
2017-07-16 14:44:45,202 * 0x4000000
2017-07-16 14:44:45,202  + 0x4000000 : AYUV
2017-07-16 14:44:45,202 * 0x2000000
2017-07-16 14:44:45,202  + 0x2000000 : ARGB10
2017-07-16 14:44:45,202 * 0x20000000
2017-07-16 14:44:45,202  + 0x20000000 : ABGR10
2017-07-16 14:44:45,208 init_cuda failed
Traceback (most recent call last):
  File "xpra/codecs/nvenc/encoder.pyx", line 1510, in xpra.codecs.nvenc.encoder.Encoder.init_context (xpra/codecs/nvenc/encoder.c:12595)
  File "xpra/codecs/nvenc/encoder.pyx", line 1675, in xpra.codecs.nvenc.encoder.Encoder.init_nvenc (xpra/codecs/nvenc/encoder.c:17376)
  File "xpra/codecs/nvenc/encoder.pyx", line 1690, in xpra.codecs.nvenc.encoder.Encoder.init_encoder (xpra/codecs/nvenc/encoder.c:17673)
  File "xpra/codecs/nvenc/encoder.pyx", line 1353, in xpra.codecs.nvenc.encoder.raiseNVENC (xpra/codecs/nvenc/encoder.c:9507)
NVENCException: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-07-16 14:44:45,209 encoder nvenc(BGRA/BGRX/H264 - low-latency-hq - 1920x1080) failed: initializing encoder - returned 8: This indicates that on
2017-07-16 14:44:45,209 error during NVENC encoder test: initializing encoder - returned 8: This indicates that one or more of the parameter passed
2017-07-16 14:44:45,209  a license key may be required

comment:12 Changed 3 days ago by Antoine Martin

I still get complete system lockups with my GTX 1070.
It's been broken for months, time to get on it.
Other NVENC tickets we should close for 2.1: #1519, #1347, #1317.


Looking at the cuda example that comes with nvenc 8, this is how they do it now in pseudo-code:

  • InitCuda
    • cuDeviceGet
    • cuCtxCreate
    • cuModuleLoadDataEx
    • cuModuleGetFunction "InterleaveUV"
    • cuCtxPopCurrent
  • AllocateIOBuffers:
    • cuMemAlloc * 2
    • cuMemAllocHost * 3 (input buffers)
    • then for each encode buffer:
      • cuMemAllocPitch
      • NvEncRegisterResource
      • NvEncCreateBitstreamBuffer
  • ReleaseIOBuffers:
    • cuMemFree
    • cuMemFreeHost
    • for each encode buffer:
      • NvEncUnregisterResource
      • NvEncDestroyBitstreamBuffer
  • FlushEncoder:
    • NvEncFlushEncoderQueue
    • NvEncUnmapInputResource
    • wait for any pending buffers using ProcessOutput
  • Deinitialize
    • NvEncDestroyEncoder
  • ConvertYUVToNV12:
    • cuMemcpyHtoD
    • cuLaunchKernel
  • EncodeMain:
    • InitCuda
    • GetPresetGUID
    • AllocateIOBuffers
    • for each frame:
      • load the frame's YUV pixel data from file
      • ConvertYUVToNV12
      • NvEncMapInputResource
      • NvEncEncodeFrame
    • FlushEncoder
    • Deinitialize
  • ProcessOutput:
    • nvEncLockBitstream
    • nvEncUnlockBitstream

The only major differences that I can see:

  • they use multiple input buffers and if there isn't one available, they wait for one to become free via ProcessOutput + NvEncUnmapInputResource
  • maybe somehow we're not using a low-latency preset and so we block waiting for a frame that never comes since we feed them one at a time
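
The buffer handling in the first point can be modelled without a GPU. What follows is only a minimal Python sketch, not xpra or SDK code: simulate_encode and buffer_count are made-up names, the recorded strings are just the call names from the pseudo-code above, and the flush ordering is simplified. It assumes a fixed pool of input buffers and, when none is free, waits for the oldest pending frame before reusing its buffer.

```python
from collections import deque


def simulate_encode(num_frames, buffer_count=1):
    """Return the ordered call names for encoding num_frames frames.

    Pure simulation: no CUDA or NVENC calls are made.
    """
    if buffer_count < 1:
        raise ValueError("at least one input buffer is required")
    calls = ["AllocateIOBuffers"]
    free = deque(range(buffer_count))  # input buffers ready for a new frame
    pending = deque()                  # submitted buffers, oldest first

    def process_output():
        # Fetch the oldest encoded bitstream and hand back its input buffer.
        buf = pending.popleft()
        calls.extend(["nvEncLockBitstream", "nvEncUnlockBitstream"])
        return buf

    for _ in range(num_frames):
        if not free:
            # No buffer available: wait for the oldest frame, then unmap it.
            buf = process_output()
            calls.append("NvEncUnmapInputResource")
            free.append(buf)
        buf = free.popleft()
        calls.extend(["ConvertYUVToNV12", "NvEncMapInputResource", "NvEncEncodeFrame"])
        pending.append(buf)

    # Flush: collect the output of every frame still in flight.
    calls.append("NvEncFlushEncoderQueue")
    while pending:
        process_output()
        calls.append("NvEncUnmapInputResource")
    return calls


if __name__ == "__main__":
    for name in simulate_encode(num_frames=3, buffer_count=1):
        print(name)
```

With buffer_count=1 every frame after the first has to wait for the previous frame's output before it can be submitted; with a larger pool the waits move to the flush at the end.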

The low latency example sets these options:

    encodeConfig.endFrameIdx = INT_MAX;
    encodeConfig.bitrate = 5000000;
    encodeConfig.rcMode = NV_ENC_PARAMS_RC_2_PASS_QUALITY;
    encodeConfig.gopLength = NVENC_INFINITE_GOPLENGTH;
    encodeConfig.deviceType = 0;
    encodeConfig.codec = NV_ENC_H264;
    encodeConfig.fps = 30;
    encodeConfig.qp = 28;
    encodeConfig.i_quant_factor = DEFAULT_I_QFACTOR;
    encodeConfig.b_quant_factor = DEFAULT_B_QFACTOR;  
    encodeConfig.i_quant_offset = DEFAULT_I_QOFFSET;
    encodeConfig.b_quant_offset = DEFAULT_B_QOFFSET; 
    encodeConfig.presetGUID = NV_ENC_PRESET_LOW_LATENCY_HQ_GUID;
    encodeConfig.pictureStruct = NV_ENC_PIC_STRUCT_FRAME;
    encodeConfig.numB = 0;
    m_stCreateEncodeParams.encodeGUID = inputCodecGUID;
    m_stCreateEncodeParams.presetGUID = pEncCfg->presetGUID;
    m_stCreateEncodeParams.encodeWidth = pEncCfg->width;
    m_stCreateEncodeParams.encodeHeight = pEncCfg->height;

    m_stCreateEncodeParams.darWidth = pEncCfg->width;
    m_stCreateEncodeParams.darHeight = pEncCfg->height;
    m_stCreateEncodeParams.frameRateNum = pEncCfg->fps;
    m_stCreateEncodeParams.frameRateDen = 1;
    m_stCreateEncodeParams.enableEncodeAsync = 0;

    m_stCreateEncodeParams.enablePTD = 1;
    m_stCreateEncodeParams.reportSliceOffsets = 0;
    m_stCreateEncodeParams.enableSubFrameWrite = 0;
    m_stCreateEncodeParams.encodeConfig = &m_stEncodeConfig;
    m_stCreateEncodeParams.maxEncodeWidth = m_uMaxWidth;
    m_stCreateEncodeParams.maxEncodeHeight = m_uMaxHeight;
    m_stEncodeConfig.gopLength = pEncCfg->gopLength;
    m_stEncodeConfig.frameIntervalP = pEncCfg->numB + 1;
        m_stEncodeConfig.frameFieldMode = NV_ENC_PARAMS_FRAME_FIELD_MODE_FRAME;
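
The frameIntervalP assignment above ties the GOP structure to numB: with numB = 0 the P-frame interval is 1, so there are no B-frames to wait for. As a rough illustration only, and assuming the usual display-order reading of that field (an I-frame followed by groups of numB B-frames and one P-frame), this small Python helper (gop_pattern is a made-up name) shows the resulting frame pattern:

```python
def gop_pattern(num_b, frames=7):
    """Return (frameIntervalP, display-order frame types) for num_b B-frames."""
    if num_b < 0:
        raise ValueError("num_b must be >= 0")
    frame_interval_p = num_b + 1  # same relation as in the sample code
    out = ["I"]
    while len(out) < frames:
        # each group is num_b B-frames followed by one P-frame
        out.extend(["B"] * num_b + ["P"])
    return frame_interval_p, "".join(out[:frames])


if __name__ == "__main__":
    print(gop_pattern(0))  # low-latency case: no B-frames
    print(gop_pattern(2))
```

gop_pattern(0) returns (1, "IPPPPPP") and gop_pattern(2) returns (3, "IBBPBBP").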

For YUV444:

            m_stEncodeConfig.encodeCodecConfig.hevcConfig.chromaFormatIDC = 3;
            // OR:
            m_stEncodeConfig.encodeCodecConfig.h264Config.chromaFormatIDC = 3;

For 10 bit input:

            m_stEncodeConfig.encodeCodecConfig.h264Config.chromaFormatIDC = 3;

etc..

Modifying the CUDA example to print all method calls, I see:

$ ./NvEncoderCudaInterop -i /opt/Shared/Xpra-Build-Libs/nvenc_4.0.0_sdk/Samples/YUV/1080p/PixelBlur-1920x1080.yuv  -o test -size 1920 1080 -numB 0
Encoding input           : "/opt/Shared/Xpra-Build-Libs/nvenc_4.0.0_sdk/Samples/YUV/1080p/PixelBlur-1920x1080.yuv"
         output          : "test"
         codec           : "H264"
         size            : 1920x1080
         bitrate         : 5000000 bits/sec
         vbvMaxBitrate   : 0 bits/sec
         vbvSize         : 0 bits
         fps             : 30 frames/sec
         rcMode          : CONSTQP
         goplength       : INFINITE GOP 
         B frames        : 0 
         QP              : 28 
         preset          : DEFAULT

BufferCount : 1 
AsyncMode   : 0 
AllocateIOBuffers
loadframe
ConvertYUVToNV12
NvEncMapInputResource
NvEncEncodeFrame
loadframe
no encode buffer, calling ProcessOutput
ProcessOutput: nvEncLockBitstream
ProcessOutput: nvEncUnlockBitstream
NvEncUnmapInputResource
ConvertYUVToNV12
NvEncMapInputResource
NvEncEncodeFrame
loadframe
no encode buffer, calling ProcessOutput
ProcessOutput: nvEncLockBitstream
ProcessOutput: nvEncUnlockBitstream
NvEncUnmapInputResource
ConvertYUVToNV12
NvEncMapInputResource
NvEncEncodeFrame
loadframe

etc...

loadframe
ProcessOutput: nvEncLockBitstream
ProcessOutput: nvEncUnlockBitstream
Encoded 116 frames in 293.04ms
Avergage Encode Time :   2.53ms
Last edited 3 days ago by Antoine Martin

comment:13 Changed 3 days ago by Antoine Martin

Well, this has taken many hours and something like 20 to 30 full system lockups followed by reboots and lots of swearing.
In the end, the bug is clearly an underflow in the NVIDIA API, just like the one we saw when HEVC support was added (which also cost me hours of wasted time back then): ticket:1046#comment:6.

Fixes in:

  • r16394: minor, load license key file matching API version
  • r16395: default to 30 fps
  • r16396: raise minimum codec size to 128x128, other minor updates
  • r16397: remove pascal cards from the blacklist

Most of this should be backported.
We still get this error for some codec / settings combinations:

xpra.codecs.nvenc.encoder.NVENCException: initializing encoder - returned 8:
This indicates that one or more of the parameter passed to the API call is invalid.

But at least now I stand a chance of being able to fix it.

Last edited 3 days ago by Antoine Martin

comment:14 Changed 2 days ago by Antoine Martin

Owner: changed from Antoine Martin to DocMAX
Status: reopened → new

Now for the updates and proper fixes:

@DocMAX: please close if this works for you.

comment:15 Changed 36 hours ago by Antoine Martin

More fixes (backporting this mess is not going to be easy - might just go for the easy option: just disable most of nvenc and recommend the newer version):

  • r16405: selftest would fail! (doh)
  • r16406: disabling YUV420P would cause errors
  • r16409: changes in quality could cause visual corruption (wrongly used the old CUDA kernel with the new pixel format..)

comment:16 Changed 27 hours ago by Antoine Martin

Resolution: fixed
Status: new → closed

Fixed and tested, see ticket:1519#comment:3

comment:17 Changed 26 hours ago by Antoine Martin

Tedious backporting to older branches done and tested in r16416.
We don't support HEVC in older branches (easier), use 2.1 or later if you need this.
