
Opened 3 years ago

Closed 2 years ago

Last modified 15 months ago

#832 closed task (fixed)

libvpx 1.4 support: add YUV444, etc

Reported by: Antoine Martin
Owned by: Antoine Martin
Priority: critical
Milestone: 0.15
Component: encodings
Version: trunk
Keywords:
Cc:

Description

See https://groups.google.com/a/webmproject.org/forum/#!topic/codec-devel/2zYWenmdUM8.

Of particular interest to us:

The downloads seem to have moved (again) here: http://downloads.webmproject.org/releases/webm/index.html

Attachments (9)

vp9-yuv444p.patch (18.4 KB) - added by Antoine Martin 3 years ago.
large work in progress patch: we need to support different colorspaces per encoding, which is an API change..
ticket832-Encodings.txt (414 bytes) - added by alas 2 years ago.
vp9 test-Encodings win32 client v. fedora 20 server 0.15.4
ticket832-Network.txt (1.2 KB) - added by alas 2 years ago.
vp9 tests network win32 0.15.4 client v. fedora 20 0.15.4 server
ticket832-OpenGL.txt (5.3 KB) - added by alas 2 years ago.
vp9 tests opengl win32 0.15.4 client v. fedora 20 0.15.4 server
ticket832-Server_Info.txt (138.6 KB) - added by alas 2 years ago.
vp9 tests server-info win32 0.15.4 client v. fedora 20 0.15.4 server
ticket832-System.txt (11.3 KB) - added by alas 2 years ago.
vp9 tests system win32 0.15.4 client v. fedora 20 0.15.4 server
ticket832_server-d-encoding-log-output.txt (47.7 KB) - added by alas 2 years ago.
server -d encoding output, working vp9 on fedora 21 server, distortions of lossless
fedora21-comment-15_Server-Info.txt (139.9 KB) - added by alas 2 years ago.
fedora 21 server, comment 15, server_info
vp9-x264-x265-encoding-speed-1024x752.png (81.3 KB) - added by Antoine Martin 2 years ago.
vp9 x264 x265 encoding speed


Change History (31)

comment:1 Changed 3 years ago by Antoine Martin

Status: new → assigned

Building on OSX with:

./configure   --enable-vp8 --enable-vp9 \
    --enable-realtime-only --enable-runtime-cpu-detect \
    --prefix=${JHBUILD_PREFIX} --target=x86-darwin9-gcc

Gives errors to do with sse2 / ssse3 assembler optimizations (similar to this one: disabled sse2 link errors):

    [LD] test_libvpx
Undefined symbols:
  "_vp9_sub_pixel_variance4x4_ssse3", referenced from:
      _vp9_sub_pixel_variance4x4_ssse3$non_lazy_ptr in variance_test.cc.o
...
  "_vp9_sub_pixel_variance8x4_sse2", referenced from:
      _vp9_sub_pixel_variance8x4_sse2$non_lazy_ptr in variance_test.cc.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
make[1]: *** [test_libvpx] Error 1

Removing vp9 support, which isn't very useful anyway, still gives errors on vp8:

    [LD] test_libvpx
Undefined symbols:
  "_vp8_denoiser_filter_uv_c", referenced from:
      (anonymous namespace)::VP8DenoiserTest_BitexactCheck_Test::TestBody()in vp8_denoiser_sse2_test.cc.o
  "_vp8_denoiser_filter_uv_sse2", referenced from:
      (anonymous namespace)::VP8DenoiserTest_BitexactCheck_Test::TestBody()in vp8_denoiser_sse2_test.cc.o
  "_vp8_regular_quantize_b_sse4_1", referenced from:
      _vp8_regular_quantize_b_sse4_1$non_lazy_ptr in quantize_test.cc.o
  "_vpx_internal_error", referenced from:
      VP8_TestBitIO_Test::TestBody()      in vp8_boolcoder_test.cc.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
make[1]: *** [test_libvpx] Error 1
make: *** [.DEFAULT] Error 2

And switching from yasm to nasm gives new errors:

[AS] vp8/common/x86/subpixel_mmx.asm.o
vp8/common/x86/subpixel_mmx.asm:233: error: beroset-p-637-invalid effective address
vp8/common/x86/subpixel_mmx.asm:389: error: beroset-p-637-invalid effective address
vp8/common/x86/subpixel_mmx.asm:544: error: beroset-p-637-invalid effective address
make[1]: *** [vp8/common/x86/subpixel_mmx.asm.o] Error 1
make: *** [.DEFAULT] Error 2

I think my 10.5.x build system is just getting too old for this.


Built on win32 (see ticket:440#comment:4) using:

cd libvpx-v1.3.0
./configure \
    --enable-vp8 --enable-vp9 \
    --enable-realtime-only --enable-runtime-cpu-detect \
    --target=x86-win32-vs9
make

Then build using visual studio: open the solution, switch to a "release" build and build the solution.
Then copy the Win32\Release\*lib to C:\vpx-1.4\lib\Win32\.
I got the rest of the files into C:\vpx-1.4\ (include and bin) by doing a mingw gcc build first, but this doesn't generate the lib files, whereas visual studio does.

r8913 will build against 1.4 if present and fall back to 1.3

comment:2 Changed 3 years ago by Antoine Martin

I have built beta libvpx-xpra-1.4 packages for centos and fedora, which are drop-in replacements for the 1.3 packages.

From reading the docs, it seems that:

  • VP9E_SET_TILE_COLUMNS enables multithreaded encoding and decoding
  • rc_min_quantizer = rc_max_quantizer = 0 enables lossless mode
  • g_profile = 1 enables YUV444? How do we know if it is available? (AFAIK, it is not in 1.3..)

Changed 3 years ago by Antoine Martin

Attachment: vp9-yuv444p.patch added

large work in progress patch: we need to support different colorspaces per encoding, which is an API change..

comment:3 Changed 3 years ago by Antoine Martin

Despite the fact that the vp9 + YUV444P test passes (it only encodes a single frame), we get errors:

  • with ffmpeg decoding (avcodec):
    dec_avcodec.Decoder({'decoder_height': 24614, 'encoding': 'vp9', 'colorspace': 'YUV444P', \
        'actual_colorspace': 'YUV444P', 'height': 300, 'decoder_width': 38, 'width': 300, \
        'version': (56, 1, 100), 'formats': ['YUV420P', 'YUV422P', 'YUV444P'], \
        'frames': 0L, 'type': 'avcodec', 'buffers': 0})\
        .decompress_image(<type 'str'>:1065, {'speed': 33, 'frame': 1, 'quality': 99, \
            'csc': 'YUV444P', 'encoding': 'vp9'}) \
            avcodec_decode_video2 failure: Invalid data found when processing input
    
  • with the vpx native decoder:
    paint_with_video_decoder: wid=2, vp9 decompression error on 250 bytes of picture data \
        for 300x300 pixels using vpx.Decoder(vp9), \
        options={'speed': 33, 'frame': 2, 'quality': 99, 'csc': 'YUV444P', 'encoding': 'vp9'}
    error during vpx_codec_decode: Unspecified internal error
    

Only shows that the tests need to be improved..

comment:4 Changed 3 years ago by Antoine Martin

Owner: changed from Antoine Martin to alas
Priority: major → critical
Status: assigned → new

In the end, it was a red herring: increasing the number of frames we test (done in r8941) does not trigger the problem, so the problem was not with the vpx encoder + decoder combination. I eventually stumbled upon those weird messages:

2015-04-07 18:00:24,127 paint_with_video_decoder: colorspace changed from speed to YUV444P
2015-04-07 18:01:54,048 paint_with_video_decoder: colorspace changed from log.py to YUV444P

This should have been more obvious, r8942 should make it so. (and when this happens, we should probably tell the server to restart the video encoder)

Moving on from fish to ducks: if it quacks like a memleak / memcorruption, that's because it is one: fixed in r8940.
VP9 + YUV444 now works properly. Except... What really threw me off course is that there was another decoding bug in avcodec, with the exact same symptoms! (and avcodec is the default video decoder used - though this can be changed with the video-decoders config switch) It seems that ffmpeg's avcodec cannot decode vp9 + yuv444p using the same code path that we use for vp8, h264 and h265. So this is now disabled in r8943 (+backport in r8945).

But, it was worth the effort, because I also re-RTFM, and found a few things we needed to do to get good performance out of vp9:

  • r8947 - using tile columns did not make any difference whatsoever, so the code is there but disabled by default
  • r8948 + r8949: tuning CPUUSED and disabling VP9E_SET_FRAME_PERIODIC_BOOST makes a massive difference, we also make sure we never use VPX_DL_BEST_QUALITY with vp9.
  • r8950 also adds support for lossless mode when setting quality=100 - (it could be worth comparing with NVENC's lossless mode, x264 does not have one..)

vp9 is now actually quite usable!
On a low-end core i5, I am getting just under 40fps at 1080p when I select speed=100, with auto-tuning it is closer to 20fps. (and maybe we can improve the heuristics for vp9 so we get >20fps using the default auto settings)
Related note: I think there may be a bug with the speed controls from the tray menu. So it is best to use the command line when testing.

In the future, we may also want to set the "tune-content" to "screen" when not dealing with video regions:

/*!\brief VP9 encoder content type */
typedef enum {
  VP9E_CONTENT_DEFAULT,
  VP9E_CONTENT_SCREEN,
  VP9E_CONTENT_INVALID
} vp9e_tune_content;

Note for testing: you will need libvpx 1.4.0 or later as I have included an ABI version test to enable the YUV444P feature.
There are beta RPM packages available.

@afarr: this one is definitely worth testing (raising to critical), x264 does not seem to be making progress anymore and h265 (at least in software: #445) is not quite ready yet (though the performance problems I was having may have similar solutions - more RTFM for me).
If that works well enough for you, we should consider removing vp9 from the hidden list (see r7207).

Last edited 3 years ago by Antoine Martin (previous) (diff)

comment:5 Changed 3 years ago by Antoine Martin

  • r8960 fixes builds against libvpx 1.3 (as needed on OSX..)
  • r8996 fixes builds against libvpx older than 1.3 (but still newer than 1.0)
Last edited 3 years ago by Antoine Martin (previous) (diff)

comment:6 Changed 2 years ago by Antoine Martin

Note: as part of #840, I've tried libvpx 1.4 on OSX 10.9.5 + xcode 6.2, still fails.

comment:7 Changed 2 years ago by Antoine Martin

Note: in trunk (0.16) we now allow the user to select vp9 from the tray (r9275 + r9276)

comment:8 Changed 2 years ago by Antoine Martin

r9593 removed the scary vp9 warnings in v0.15.x, since it also works fine in that branch

r9789 enables vp9 decoding through avcodec as long as the version of libav is recent enough and it passes the self tests

Note: we still don't enable it when connecting to v0.14.x server, because it is still listed there as a "problematic encoding".

Last edited 2 years ago by Antoine Martin (previous) (diff)

comment:9 Changed 2 years ago by alas

Tested with osx 10.6.8 0.15.4 r10055 client (no opengl) and with win 8.1 0.15.4 client (yes opengl) against a fedora 20 0.15.4 r10133.

Launching with: xpra attach tcp:10.0.32.53:1201 --encodings=vp9,rgb,webp --quality=100 --speed=100 -d encodings (or the local OS equivalents)... the vp9 performance was abysmal. A couple of frames every couple of seconds in each case.

The vpx is indicated as v1.4.0 in the Encodings output of the bug tool... and checking the server it indicates libvpx-xpra.x86_64 installed of 1.4.0-1.fc20, which seems to be indicated above as being what's required.

Using your clients, but our fedora 20 repo... so I'll also attach the bug tool outputs that seem potentially relevant - maybe I'm just missing something.

Last edited 2 years ago by Antoine Martin (previous) (diff)

Changed 2 years ago by alas

Attachment: ticket832-Encodings.txt added

vp9 test-Encodings win32 client v. fedora 20 server 0.15.4

Changed 2 years ago by alas

Attachment: ticket832-Network.txt added

vp9 tests network win32 0.15.4 client v. fedora 20 0.15.4 server

Changed 2 years ago by alas

Attachment: ticket832-OpenGL.txt added

vp9 tests opengl win32 0.15.4 client v. fedora 20 0.15.4 server

Changed 2 years ago by alas

Attachment: ticket832-Server_Info.txt added

vp9 tests server-info win32 0.15.4 client v. fedora 20 0.15.4 server

Changed 2 years ago by alas

Attachment: ticket832-System.txt added

vp9 tests system win32 0.15.4 client v. fedora 20 0.15.4 server

comment:10 Changed 2 years ago by Antoine Martin

Easy to reproduce with the command line given: this only happened when using the command line to set a fixed speed (the quality was irrelevant here).

r10197 fixes this, backported to v0.15.x in r10199.

Since I have bug report data... looking at your server info:

  • encoding.dec_webp.version : (0, 3, 1) - is way too old, known to be buggy - do not use. Maybe I should change the build file to just blow up when it finds outdated versions like this.
  • encoding.vpx.version : v1.3.0 - also too old, you should be on 1.4 to test this ticket (YUV444 and lossless mode)
  • encoding.x264.version : 142 - upgrading this one would not hurt, though there are no known major bugs with 142
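
The libvpx check from the list above can be automated. This is a minimal sketch, assuming the server info is available as text lines in the `key : value` form shown in this ticket. The function names and the `MIN_VPX` constant are hypothetical and not part of xpra.

```python
import re

# Assumption taken from this ticket: libvpx 1.4 is the minimum
# needed to test YUV444 and lossless mode.
MIN_VPX = (1, 4)

def parse_version(text):
    """Turn a string such as 'v1.3.0' into a tuple of ints, e.g. (1, 3, 0)."""
    return tuple(int(n) for n in re.findall(r"\d+", text))

def vpx_version_from_info(lines):
    """Return the libvpx version tuple found in server-info lines, or None."""
    for line in lines:
        key, sep, value = line.partition(":")
        if sep and key.strip() == "encoding.vpx.version":
            return parse_version(value)
    return None

def vpx_new_enough(lines, minimum=MIN_VPX):
    """True only if a libvpx version was found and it is at least `minimum`."""
    version = vpx_version_from_info(lines)
    return version is not None and version >= minimum

info = [
    "encoding.dec_webp.version : (0, 3, 1)",
    "encoding.vpx.version : v1.3.0",
    "encoding.x264.version : 142",
]
print(vpx_new_enough(info))
```

With the three values quoted above, the check reports that libvpx is too old for this ticket.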

comment:11 Changed 2 years ago by alas

Odd, when I check the libvpx version on that server, it says it's at 1.4.0:

[jimador@zapopan ~]$ rpm -qa libvpx-xpra
libvpx-xpra-1.4.0-1.fc20.x86_64

Though, if I look for plain libvpx, I do find a 1.3.0:

[jimador@zapopan ~]$ rpm -qa libvpx
libvpx-1.3.0-4.fc20.x86_64

I suppose there's a flag we must've missed to point at libvpx-xpra instead of libvpx?

If you could make a fedora 21 rpm I could re-test with that (smo's out for a bit, and it'd be best if we not rely on my build skills).

comment:12 Changed 2 years ago by Antoine Martin

> Odd, when I check the libvpx version on that server, it says it's at 1.4.0:


That's because you're building against the system libvpx 1.3 and not the xpra libvpx which is 1.4. You need something like:

LDFLAGS=-Wl,-rpath=/usr/lib64/xpra \
PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/lib64/xpra/pkgconfig \
  sudo -E  ./setup.py install
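
When the build is driven from a script rather than a shell, the same two variables can be prepared in Python. This is only a sketch: `xpra_build_env` is a hypothetical helper, and the `/usr/lib64/xpra` location is taken from the command above.

```python
import os

def xpra_build_env(base_env=None, libdir="/usr/lib64/xpra"):
    """Return a copy of the environment with LDFLAGS and PKG_CONFIG_PATH set
    the same way as in the shell command above."""
    env = dict(os.environ if base_env is None else base_env)
    # Embed the xpra library directory as a runtime search path.
    env["LDFLAGS"] = "-Wl,-rpath=" + libdir
    # Append the xpra pkgconfig directory, avoiding a leading ':' when
    # PKG_CONFIG_PATH was previously unset or empty.
    pkgconfig_dir = libdir + "/pkgconfig"
    existing = env.get("PKG_CONFIG_PATH", "")
    env["PKG_CONFIG_PATH"] = existing + ":" + pkgconfig_dir if existing else pkgconfig_dir
    return env

env = xpra_build_env({"PATH": "/usr/bin"})
print(env["LDFLAGS"])
print(env["PKG_CONFIG_PATH"])
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.run` when invoking `./setup.py install`.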

comment:13 Changed 2 years ago by alas

Status: new → assigned

Interesting... I spun up a new fedora 21 vm and followed the instructions to set up your repo from http://winswitch.org/downloads/rpm-repository.html?dist_select=FedoraBeta (substituting yum install xpra for the last step).

Installed fedora 21 0.16.0 r10216 but, when I was running it as a server and used the bug tool to compare the same encoding versions I'm seeing:

  • encoding.enc_webp.version : (0, 4, 3) - which I presume is more up to date.
  • encoding.x264.version : 146 - which I also presume is more up to date.

But -

  • encoding.vpx.version : v1.3.0 - which seems to also be too old to test this ticket.

And- when I run rpm -qa libvpx-xpra I get no response. I assume just installing a libvpx-xpra from your dists wouldn't be sufficient to convince the xpra build to use the newer library.

I'll try with the above build flags when I get the chance on my own build environment though, and see what happens.

comment:14 Changed 2 years ago by Antoine Martin

> And- when I run rpm -qa libvpx-xpra I get no response

That's the problem.

> I assume just installing a libvpx-xpra from your dists wouldn't be sufficient to convince the xpra build to use the newer library.

It should be sufficient.

When building from source, you will also need libvpx-xpra-devel which is the development headers package.

The reason for this is that until now the standard distribution's libvpx version 1.3 was sufficient for our needs (it is only missing VP9 which we did not use), so we didn't require libvpx-xpra (which is available in the repos).
r10218 adds libvpx-xpra as a dependency for Fedora up to version 23 (23 doesn't need our package, it has libvpx 1.4 in the default repos), I am making new beta packages for Fedora right now - if this works ok, the change should be backported (no rush though).

Last edited 2 years ago by Antoine Martin (previous) (diff)

comment:15 Changed 2 years ago by alas

Testing with osx 0.15.4 r10055 client against a fedora 21 0.15.4 r10209 server launched with --encodings=rgb,webp,vp9, the video and audio were pretty good (it felt like the sync was a little further off than with h264, but I don't have all that sensitive an ear/eye, so it is hard to say).

Double checking the server_info from the bug tool to confirm libraries on this thing are sufficiently up to date, I do notice all of these:

client.encoding.csc_atoms        : True
client.encoding.csc_modes        : ('YUV422P', 'BGRX', 'GBRP', 'RGB', 'YUV420P', 'BGRA', 'ARGB', 'XRGB', 'YUV444P')
client.encoding.cython.version   : (0, 3, '0', '22')
client.encoding.dec_webp.version : (0, 4, 3)
client.encoding.default          : vp9
client.encoding.delta_buckets    : 5
client.encoding.enc_webp.version : (0, 4, 3)
client.encoding.full_csc_modes   : {'h264': ('ARGB', 'BGRA', 'BGRX', 'GBRP', 'RGB', 'XRGB', 'YUV420P', 'YUV422P', 'YUV444P'), 'h265': ('BGRX', 'GBRP', 'RGB', 'XRGB', 'YUV420P', 'YUV422P', 'YUV444P'), 'vp9': ('YUV420P',), 'vp8': ('YUV420P',)}

... which seem ok (I'll attach the whole system_info.txt for you to glance at, if anything sounds odd).
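
A quick way to check which colorspaces are advertised for a given encoding is to parse the `full_csc_modes` line from the bug tool output. This is a sketch that assumes the value is a valid Python literal, as it is in the output above. The helper names are hypothetical.

```python
import ast

def parse_info_value(line):
    """Split a 'key : value' bug-tool line and evaluate the Python literal value."""
    key, _, value = line.partition(":")
    return key.strip(), ast.literal_eval(value.strip())

def supports_csc(full_csc_modes, encoding, csc):
    """True if `csc` is listed among the colorspaces advertised for `encoding`."""
    return csc in full_csc_modes.get(encoding, ())

line = ("client.encoding.full_csc_modes   : {'h264': ('ARGB', 'BGRA', 'BGRX', 'GBRP', "
        "'RGB', 'XRGB', 'YUV420P', 'YUV422P', 'YUV444P'), 'h265': ('BGRX', 'GBRP', 'RGB', "
        "'XRGB', 'YUV420P', 'YUV422P', 'YUV444P'), 'vp9': ('YUV420P',), 'vp8': ('YUV420P',)}")

key, modes = parse_info_value(line)
print(supports_csc(modes, "vp9", "YUV444P"))
print(supports_csc(modes, "h264", "YUV444P"))
```

For the output quoted above, only YUV420P is listed for vp9, while h264 lists YUV444P as well.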

I got a swscale warning client side, which I usually ignore because I've never seen it impact anything with the h264 encodings ([swscaler @ 0x7f496c018e00] Warning: data is not aligned! This can lead to a speedloss) ... but shortly after I started noticing the sorts of distortions of picture that I reported in #902, despite the fact I wasn't using h264 to get the frameloss error. (I wasn't able to get a screenshot, the distortion would occasionally persist for a good while, but always resolved when I moved the mouse.)

Scrolling seemed to produce the distortion (including what seemed like subregion "tearing" of video regions, as well as distortion of lossless text areas making text blurry... as well as what seemed like a color distortion turning expected white space to a reddish hue). Resizing sometimes triggered the distortion as well. I was able to capture a couple of seconds of logs server-side with -d encoding (my best guess of relevant flag), which I'll also attach (I turned on the debugging and the firefox window was distorted, so I immediately turned the debugging off).

The line that looked particularly suspicious to me was: using PIL fallback for webp: enc_webp=<module 'xpra.codecs.webp.encode' from '/usr/lib64/python2.7/site-packages/xpra/codecs/webp/encode.so'>, stride=1440, pixel format=RGB, but perhaps that's expected.

Note, once I triggered the distortion in a firefox window (which happened pretty quickly), I was able to trigger it rather easily even with an xterm.

Changed 2 years ago by alas

Attachment: ticket832_server-d-encoding-log-output.txt added

server -d encoding output, working vp9 on fedora 21 server, distortions of lossless

Changed 2 years ago by alas

Attachment: fedora21-comment-15_Server-Info.txt added

fedora 21 server, comment 15, server_info

comment:16 Changed 2 years ago by Antoine Martin

See also: #948, #905, #962.

Last edited 2 years ago by Antoine Martin (previous) (diff)

Changed 2 years ago by Antoine Martin

Attachment: vp9-x264-x265-encoding-speed-1024x752.png added

vp9 x264 x265 encoding speed

comment:17 Changed 2 years ago by Antoine Martin

As per this excellent presentation from ffmpeg developer Ronald S. Bultje: https://blogs.gnome.org/rbultje/2015/09/28/vp9-encodingdecoding-performance-vs-hevch-264/ (youtube video: VideoLAN Dev Days 2015: VP9 encoding/decoding performance vs. H.264/HEVC)

Here is the key graph for us from this presentation:
[Image: vp9 x264 x265 encoding speed, see attachment vp9-x264-x265-encoding-speed-1024x752.png]

And the key finding: In practice, that means that if your CPU usage target for x264 is anything faster than veryslow, you basically want to keep using x264, since at that same CPU usage target, x265 will give worse quality for the same bitrate than x264. The story for libvpx is slightly better than for x265, but it’s clear that these next-gen codecs have a lot of work left in this area.

Which is pretty much what I had found and why:

  • we still default to h264
  • we use ffmpeg for decoding ahead of libvpx
  • as of r10763, we don't bother with the slowest vpx settings at all

vp9 isn't doing too badly, but it is just not fast enough to give us 30fps at 1080p.

Caveats:

  • vp9 does better at high res (4k)
  • there are cases where the bandwidth savings outweigh the number of fps
  • as cpus get faster and the code is improved (more gains to be made on vp9 and x265), the balance will continue to tip in their favour
  • threading: newer codecs take better advantage of multiple core / threads (though apparently less so for libvpx - it is claimed)
Last edited 2 years ago by Antoine Martin (previous) (diff)

comment:18 Changed 2 years ago by Antoine Martin

This should have been caught earlier: r10818 fixes the quantizer calculations (backported in r10819) which inverted the quality scale (0 was giving high quality, 100 low quality.. except for vp9 with 100 which would still give lossless!). Not as important as the speed setting, but still - bad!
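
To illustrate the direction of the scale: a lower quantizer means higher quality, so the mapping from quality to quantizer has to be decreasing. The sketch below is not the code from r10818; the function name is hypothetical and the 0..63 quantizer range is an assumption about libvpx.

```python
def quality_to_quantizer(quality, max_q=63):
    """Map quality (0..100) to a quantizer (max_q..0).

    Higher quality must give a LOWER quantizer. Using `quality` directly
    as the scale would invert it, which is the kind of bug described above.
    """
    quality = max(0, min(100, int(quality)))
    return (100 - quality) * max_q // 100

for q in (0, 50, 100):
    print(q, quality_to_quantizer(q))
```

With quality=100 the result is 0, which matches the lossless setting noted in comment:2 (rc_min_quantizer = rc_max_quantizer = 0).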

comment:19 Changed 2 years ago by alas

Owner: changed from alas to Antoine Martin
Status: assigned → new

Ok, testing with 0.16.0 r10983 windows client against 0.16.0 r11031 fedora 21 server - I can definitely confirm that --quality=1 is significantly worse than --quality=100 with vp9, as well as mp3.

I do find that h264 is a little better, even on a 4K monitor, when I'm using a window/video that's "small-ish" (less than half the monitor with --desktop-scaling=1.4) ... with both --speed=auto and --speed=90 - but with either of those settings vp9 does actually perform better with fullscreen on the 4K monitor.

Oddly, I am noticing that with --speed=100 the mp3 seems to perform better with fullscreen on the 4K.

With --quality=100 neither mp3 nor vp9 does very well with fullscreen - tearing and dropped frames/ jumpiness abound.

Overall though, to the extent that my eye can judge such things I'd say it is behaving as the charts lead you to expect - vp9 behaving pretty well and edging out mp3's performance at larger sizes.

With --speed=auto I was also getting about 20 fps (xpra info | grep fps=), and with --speed=90 it bumped up to about 25 fps... though, interestingly, testing with chromium I turned on the browser's fps output and it seemed to be under the impression that it was playing at 50-60 fps (blue digits on a black background made it hard to tell for sure what it thought)... so I guess we're still losing some in the encoding?

Do we need to do any more testing on this?

comment:20 Changed 2 years ago by Antoine Martin

Resolution: fixed
Status: new → closed

> with either of those settings vp9 does actually perform better with fullscreen on the 4K monitor
> ...
> vp9 behaving pretty well and edging out h264's performance at larger sizes.

We should take the size into account when choosing a codec: #1014

> neither mp3 nor vp9 does very well with fullscreen

I assume you mean h264 and vp9 here?

> Oddly, I am noticing that with --speed=100 the h264 seems to perform better with fullscreen on the 4K.

If by better you mean "more frames", that is expected: only x264 can encode fast enough to deal with huge amounts of pixels.
(see comment:17)

> With --quality=100 neither mp3 nor vp9 does very well with fullscreen - tearing and dropped frames/ jumpiness abound.

That's expected: quality=100 means lossless mode, which will take a lot of bandwidth and cpu.

> though, interestingly, testing with chromium I turned on the browser's fps output and it seemed to be under the impression that it was playing at 50-60 fps..

It is, we're just not sending as many screen updates as that because we cannot keep up.

Closing at last, we'll probably revisit some of this in #1014.
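
As a rough illustration of what a size-aware choice could look like, here is a sketch based only on the observations in this ticket (vp9 ahead for large areas, h264 ahead for small windows and at speed=100). The threshold and the function name are hypothetical, and this is not what #1014 implements.

```python
# Hypothetical threshold: about half the pixels of a 4K (3840x2160) screen.
LARGE_AREA = 3840 * 2160 // 2

def pick_video_encoding(width, height, speed=None, available=("h264", "vp9")):
    """Pick a video encoding from the window size and the requested speed."""
    if "vp9" not in available or "h264" not in available:
        # Nothing to choose between: use whatever is available.
        return available[0]
    if speed is not None and speed >= 100:
        # Only x264 encodes fast enough when maximum speed is requested.
        return "h264"
    if width * height >= LARGE_AREA:
        # vp9 was observed to do better at large sizes.
        return "vp9"
    return "h264"

print(pick_video_encoding(1920, 1080))
print(pick_video_encoding(3840, 2160))
print(pick_video_encoding(3840, 2160, speed=100))
```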

comment:21 Changed 2 years ago by Antoine Martin

Notes:

  • the 0.16 builds will be using libvpx 1.5 as of r11204
  • we give a higher "speed" score to "vp9" if we detect libvpx >= 1.5 since this release made some significant performance improvements

comment:22 Changed 15 months ago by Antoine Martin

vp9 is actually reasonably fast and efficient, as long as we tune the speed parameter within a very narrow range (a value between 4 and 8 rather than 0 and 8 as before, the whole API range is -8 to 8): done in r13175.
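
The narrower range can be pictured as a linear mapping from the 0..100 speed setting onto 4..8. This is a sketch under that assumption, with a hypothetical function name; it is not the code from r13175.

```python
def speed_to_cpu_used(speed, low=4, high=8):
    """Map an xpra-style speed (0..100) onto the narrow cpu-used range low..high."""
    speed = max(0, min(100, int(speed)))
    return low + (high - low) * speed // 100

for s in (0, 50, 100):
    print(s, speed_to_cpu_used(s))
```

Passing `low=0` gives the wider 0 to 8 mapping that was used before.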
