Xpra: Ticket #1637: Scroll paint bug in HTML5 Client

I'm running a trunk r16825 Fedora 25 server and connecting using Chrome 60

I started my server with:

xpra start :11 --bind-tcp=0.0.0.0:2200 --start-child=firefox --start-new-commands=yes --html=on --systemd-run=no --start-via-proxy=no --no-daemon

While scrolling with the scroll improved HTML5 client I get a paint corruption as mentioned in #1426. It's kind of hard to describe, but it's similar to the scrolling corruptions we were seeing when the scroll encoding was first being developed. I'll attach a screenshot to demonstrate what I'm seeing. Reproducing is quite simple - open up Firefox and navigate to a long Wikipedia page or something else that's simple and long enough to scroll on. When scrolling you'll see double and overlapping paints that usually go away after a half second when you're done scrolling as it refreshes.

Scroll logs when I managed to get the scroll corruption to stick (I can also add more logs with more debug prints if needed):

2017-09-11 13:55:52,951 encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 1248, 797), {'scroll': True}) window-dimensions=(1248, 797)
2017-09-11 13:55:52,951  will send scroll data=OrderedDict([(-64, OrderedDict([(137, 308), (485, 25), (550, 247)])), (0, OrderedDict([(0, 73)]))]), non-scroll=OrderedDict([(381, 40), (446, 40), (733, 64)])
2017-09-11 13:55:52,969 non-scroll encoding using png (quality=94, speed=46) took 17ms for 3 rectangles
2017-09-11 13:55:52,969 scroll encoding total time: 18ms
2017-09-11 13:55:52,984 best scroll guess took 1ms, matches 73% of 797 lines: -58
2017-09-11 13:55:52,984 encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 1248, 797), {'scroll': True}) window-dimensions=(1248, 797)
2017-09-11 13:55:52,984  will send scroll data=OrderedDict([(-58, OrderedDict([(131, 315), (486, 19), (545, 252)])), (0, OrderedDict([(0, 73)]))]), non-scroll=OrderedDict([(388, 40), (447, 40), (739, 58)])
2017-09-11 13:55:52,996 non-scroll encoding using png (quality=94, speed=46) took 11ms for 3 rectangles
2017-09-11 13:55:52,996 scroll encoding total time: 12ms
2017-09-11 13:55:53,017 best scroll guess took 1ms, matches 74% of 797 lines: -53
2017-09-11 13:55:53,017 encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 1248, 797), {'scroll': True}) window-dimensions=(1248, 797)
2017-09-11 13:55:53,018  will send scroll data=OrderedDict([(-53, OrderedDict([(126, 321), (487, 14), (541, 256)])), (0, OrderedDict([(0, 73)])), (-29, OrderedDict([(777, 6)]))]), non-scroll=OrderedDict([(394, 40), (448, 40), (744, 4), (754, 43)])
2017-09-11 13:55:53,031 non-scroll encoding using png (quality=94, speed=46) took 13ms for 4 rectangles
2017-09-11 13:55:53,031 scroll encoding total time: 13ms
2017-09-11 13:55:53,050 best scroll guess took 1ms, matches 75% of 797 lines: -43
2017-09-11 13:55:53,051 encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 1248, 797), {'scroll': True}) window-dimensions=(1248, 797)
2017-09-11 13:55:53,051  will send scroll data=OrderedDict([(-43, OrderedDict([(116, 332), (531, 266)])), (0, OrderedDict([(0, 73), (772, 7)])), (-18, OrderedDict([(772, 7)]))]), non-scroll=OrderedDict([(405, 83), (761, 11), (779, 18)])
2017-09-11 13:55:53,063 non-scroll encoding using png (quality=94, speed=46) took 11ms for 3 rectangles
2017-09-11 13:55:53,063 scroll encoding total time: 12ms
2017-09-11 13:55:53,085 best scroll guess took 2ms, matches 77% of 797 lines: -35
2017-09-11 13:55:53,085 encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 1248, 797), {'scroll': True}) window-dimensions=(1248, 797)
2017-09-11 13:55:53,085  will send scroll data=OrderedDict([(-35, OrderedDict([(108, 340), (524, 273)])), (0, OrderedDict([(0, 73), (765, 9)])), (-34, OrderedDict([(796, 1)])), (-33, OrderedDict([(796, 1)])), (-32, OrderedDict([(796, 1)])), (14, OrderedDict([(775, 7)])), (15, OrderedDict([(781, 1)]))]), non-scroll=OrderedDict([(413, 76), (774, 15)])
2017-09-11 13:55:53,097 non-scroll encoding using png (quality=94, speed=46) took 11ms for 2 rectangles
2017-09-11 13:55:53,097 scroll encoding total time: 12ms
2017-09-11 13:55:53,116 best scroll guess took 1ms, matches 81% of 797 lines: -22
2017-09-11 13:55:53,116 encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 1248, 797), {'scroll': True}) window-dimensions=(1248, 797)
2017-09-11 13:55:53,117  will send scroll data=OrderedDict([(-22, OrderedDict([(95, 354), (475, 11), (512, 285)])), (0, OrderedDict([(0, 73)])), (-21, OrderedDict([(796, 1)])), (-20, OrderedDict([(796, 1)])), (-19, OrderedDict([(796, 1)])), (-18, OrderedDict([(796, 1)])), (-17, OrderedDict([(796, 1)])), (-16, OrderedDict([(796, 1)]))]), non-scroll=OrderedDict([(427, 26), (464, 26), (781, 16)])
2017-09-11 13:55:53,126 non-scroll encoding using png (quality=94, speed=46) took 8ms for 3 rectangles
2017-09-11 13:55:53,126 scroll encoding total time: 9ms
2017-09-11 13:55:53,150 best scroll guess took 2ms, matches 86% of 797 lines: -9
2017-09-11 13:55:53,150 encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 1248, 797), {'scroll': True}) window-dimensions=(1248, 797)
2017-09-11 13:55:53,151  will send scroll data=OrderedDict([(-9, OrderedDict([(82, 368), (462, 25), (499, 298)])), (0, OrderedDict([(0, 73)]))]), non-scroll=OrderedDict([(441, 12), (478, 12), (788, 9)])
2017-09-11 13:55:53,157 non-scroll encoding using png (quality=94, speed=46) took 6ms for 3 rectangles
2017-09-11 13:55:53,157 scroll encoding total time: 7ms
2017-09-11 13:55:53,261 new image size: 569x318 (was 1248x797), clearing reference checksums
2017-09-11 13:55:53,262 best scroll guess took 0ms, matches 0% of 318 lines: 0


Mon, 11 Sep 2017 21:21:45 GMT - J. Max Mena: attachment set


Sun, 17 Sep 2017 06:31:34 GMT - Antoine Martin: status changed

Minor HTML5 paint improvement in r16890.

It does look exactly like the scroll paint bugs that were fixed in the python client. Debugging this is going to be tricky, it may help to:


Thu, 05 Oct 2017 05:44:10 GMT - Antoine Martin:

Testing with r17103 and starting the server with:

XPRA_SCROLL_MIN_PERCENT=0 XPRA_AUTO_REFRESH=0 xpra-start --start=xterm

Scroll encoding triggers on almost every frame, and the lack of auto-refresh makes it easier to spot.

It does look like the browser is getting confused when we do multiple overlapping scroll paints. I have a fix which is also more correct and potentially faster to boot: we switch to double-buffering (similar to opengl) and swap the buffers when flush=0. Then we can use the previous buffer as source for the scroll paints. Edit: maybe the fix only works with rgb? (jpeg and png still have problems?)


Thu, 05 Oct 2017 05:50:19 GMT - Antoine Martin: attachment set

most simple fix (not efficient)


Thu, 05 Oct 2017 10:25:32 GMT - Antoine Martin: owner, status changed

This looks fixed in r17111. (big change...) This fix requires non-scroll region updates to be sent using "rgb" rather than "jpeg" or "png". I suspect that this is because rgb is processed synchronously whereas the other 2 are painted via a callback. But I'm not sure why this would make any difference: we have added code specifically for re-ordering screen updates when processed asynchronously (see may_paint_now). Because of that, r17112 raises the threshold for scrolling detection to 65%. (as "rgb" uses too much bandwidth otherwise).

These changes are not suitable for backporting, so I may just disable scroll encoding in older versions of the html5 client.

@maxmylyn: does that fix things for you? (please don't close, just re-assign to me - at least some of those changes should be backported, I'm just not sure how much yet)


Mon, 09 Oct 2017 22:42:47 GMT - J. Max Mena: owner changed

Upped my server to r17135:

Yes that has definitely fixed it for me.

Passing back to you


Mon, 16 Oct 2017 11:23:52 GMT - Antoine Martin: attachment set

patch to debug paint / draw functions and make the whole process synchronous


Mon, 16 Oct 2017 11:39:47 GMT - Antoine Martin: owner changed

With the patch above, everything looked fine most of the time. But occasionally some paint events would be processed in between the time we swap the buffers and when we paint the latest buffer on screen, despite the fact that the patch was meant to block all draw functions until after the screen update.

The reason for that is embarrassingly dumb: performance.now() returns milliseconds already, so we were measuring microseconds... and some safety timeouts would trigger early! (in particular the may_paint_now timeout: after 2ms instead of 2 seconds)

@maxmylyn: can you break it again? (see comment:2 for making it easier to spot when it breaks)


Fri, 20 Oct 2017 23:03:00 GMT - J. Max Mena: status changed; resolution set

Upped server to r17215 and I cannot break it after pretty extensive testing.

Closing.


Sat, 23 Jan 2021 05:29:45 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1637