Xpra: Ticket #1232: detecting motion vectors / handling scrolling better

Scrolling can make us use a video region for the area being scrolled, we should be able to supply the motion vectors to the encoder: we know that the whole region moves as one unit so this should save a lot of CPU and bandwidth.

Plan:

add optional codec API to pass motion vectors to a test video codec (either x264 or vpx)
verify that these motion vectors are used: smaller output? metadata? save-to-file feature from r12144 then playing it back with a player that show motion vectors (ie: ffdshow does)
compare with and the current search algorithms (try EXA, HEX, etc..) and find their limitations - unit test it all
write simple motion detection for video region (exact match only and vertical only, for now) - unit test it, see Search Algorithms for Block-Matching in Motion Estimation
pass the motion vector to the encoder (and turn off deblocking?)
provide a DBUS api for passing those motion vectors from the app

Alternatively, maybe we could also use the motion vector internally: send the delta (using a fast stream compressor like lz4) + motion: still send the newly exposed region separately (different encoding) We may want to force an I frame for the next video frame since the delta will be bigger.

Also worth thinking about: distinguish video from scrolling. Video has high framerate and lasts longer, we can then:

use a different motion search algorithm: use ESA (aka exhaustive search) for video, for scrolling, we should stick with HEX, DIA or UHM (see X264 - Motion Estimation Method- Comparison)
disable "subme" (subpixel motion) for scrolling
disable partitions?
only use a single reference frame for scrolling (much faster)

There are probably some limitations on motion vectors range, etc

Sun, 19 Jun 2016 10:44:40 GMT - Antoine Martin: attachment set

attachment set to detect-scrolling.patch

unoptimized code to detect scrolling

Sun, 19 Jun 2016 10:53:48 GMT - Antoine Martin: status, summary changed

status changed from new to assigned
summary changed from motion vectors for video regions to detecting motion vectors / handling scrolling better

The good news first: with the patch above, I can detect scrolling in the majority of cases where the distance is small enough, ie: with Firefox:

best scroll guess, matches 91% of 464 lines: 10
best scroll guess, matches 89% of 464 lines: 13
best scroll guess, matches 88% of 464 lines: 15
best scroll guess, matches 95% of 464 lines: 6
best scroll guess, matches 93% of 464 lines: 8
best scroll guess, matches 89% of 464 lines: 13
best scroll guess, matches 95% of 464 lines: 6
best scroll guess, matches 90% of 464 lines: 12
best scroll guess, matches 95% of 464 lines: 5
best scroll guess, matches 96% of 464 lines: 5
best scroll guess, matches 97% of 464 lines: 3
best scroll guess, matches 95% of 464 lines: 7
best scroll guess, matches 97% of 464 lines: 3
best scroll guess, matches 98% of 464 lines: 0
best scroll guess, matches 90% of 464 lines: -14
best scroll guess, matches 73% of 464 lines: -42
best scroll guess, matches 64% of 464 lines: -52
best scroll guess, matches 74% of 464 lines: -32
best scroll guess, matches 73% of 464 lines: -32
best scroll guess, matches 52% of 464 lines: -60
best scroll guess, matches 77% of 464 lines: -28
best scroll guess, matches 64% of 464 lines: -45
best scroll guess, matches 85% of 464 lines: -17
best scroll guess, matches 85% of 464 lines: -18

The bad news is that x264 does not support supplying motion vectors: Manually feeding x264 with my own motion data?: Short answer: No you can't feed in your motion estimation data to x264. NVENC does, but it's very poorly documented (as usual) and it is per block... and blocks are small.

So I think we'll just have to handle scrolling without using video, unlike the current test patch which hooks in the video path. But we still want to only fire it when there is a good chance it will be useful (as we have to scan the whole picture, and then compare all the checksums)

Mon, 20 Jun 2016 08:31:51 GMT - Antoine Martin: attachment set

attachment set to detect-scrolling-v3.patch

make it fast with cython zero-copy and CRC64 instead of sha1sum

Tue, 21 Jun 2016 15:39:50 GMT - Antoine Martin:

Some preparatory work in r12877 and r12878, an updated patch is attached.

What works:

scrolling is detected, even multiple regions scrolling in different directions by different amounts each
the client can handle "scroll" paint messages and update the screen accordingly

Still todo:

prevent non-scroll updates whilst scrolling, they mess things up
probably want to reset the video encoder after scrolling, or at least force an I frame
handle sending more than one scroll group - not very hard
move scrolling outside video path, so we can scroll the window before applying the video mask (rounding of coordinates to even numbers)
handle refresh updates: we need to tell the refresh logic to update its list of damaged rectangles - hard! (maybe just refresh everything instead.. which is costly)
any issues with desktop scaling?
CRC64 is not fast enough: 20ms for scanning one 1080p image, assembly to the rescue? (pylzma is fast but does not work for memoryviews..), maybe use a different algo on 32-bit
support for scrolling in the pixmap backing? meh
support for scrolling in the html5 client?
sync paints: we should flush after all the updates are received
non-scroll packets bypass too much of the send accounting logic
also handle transparency
opengl: don't keep temporary fbo around for nothing, use the same code to tackle #478
maybe split the crc and pixel functions from cystats into its own module, may be imported from ximage?
would be nice to merge non-scroll regions separated by small gaps (less than N, N<=2? depends on image width?), as sending those using separate packets is a bit of a waste of CPU and bandwidth
make the client code handle any scrolling by default (both horizontal and vertical) - even if the server side may take much longer, same packet format will be easier!
interpolated smooth scrolling? if we know we're not getting more than 50 fps (batch delay 20ms) and our monitor can do 150fps, then we can do smoother scrolling by scrolling 3 times in smaller increments

Wed, 22 Jun 2016 02:39:39 GMT - Antoine Martin: attachment set

attachment set to scrolling-pixmap.patch

failed attempt at implementing scrolling in the pixmap backing - just crashes randomly..

Wed, 22 Jun 2016 13:16:54 GMT - Antoine Martin:

More preparatory work in r12881 + r12885 + r12886 + r12887, and tests in r12880 + r12882 + r12883 + r12884

Wed, 22 Jun 2016 16:10:32 GMT - Antoine Martin:

Testing notes for my own benefit since I may have to get back to other things:

start the server with:

XPRA_VIDEO_SKIP_EDGE=1 XPRA_B_FRAMES=0 XPRA_ENCODING_STRICT_MODE=1 \
xpra start :100 --start=xterm

client:

xpra attach --speaker=off  --remote-logging=no --no-mmap --opengl=yes --desktop-scaling=no

launch test app or firefox and make sure to hide the xterm (avoid a GL paint bug with more than one window shown):
```
vglrun python ./tests/xpra/clients/test_gl_vscrollup.py 1 10000 50
```

Doing a focus change causes the whole window to be "repainted", without any actual pixel changes, the server correctly sees it:

encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 630, 480), '{0: 480, 1: 478, 2: 476, 3: .. 470, -4: 472, -3: 474, -2: 476}', [], [], {})
 will send scroll packets for yscroll={0: 480, 1: 478, 2: 476...}
 480 lines have not changed: '[0, 1, 2, 3, 4, 5, 6, 7, 8, .. , 474, 475, 476, 477, 478, 479]'

No data is sent to the client at all which is great.

Then the vscroll gl test application scrolls the window up by 50 pixels and repaints the bottom 50 pixels, again the server correctly identifies this:

encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 630, 480), '{0: 380, 1: 379, 2: 378, 3: .. 385, -4: 384, -3: 383, -2: 382}', [], [], {})
 will send scroll packets for yscroll={0: 380, -60: ...  -52: 426, -51: 428, -50: 430, -49: 429, -48: 428, ..}
 scroll groups for distance=-50 : '[0, 1, 2, 3, 4, 5, 6, 7, 8, .. , 424, 425, 426, 427, 428, 429]'=[(0, 430)]
 non scroll: 1 packets: '[430, 431, 432, 433, 434, 4 .. , 474, 475, 476, 477, 478, 479]'=[(430, 50)]

And the client gets the matching paint requests:

do_scroll_paints(((0, 50, 630, 430, -50),), 1)
draw_region(0, 430, 630, 50, png, 604 bytes, 0, {'compress_level': 2}, [<function record_decode_time at 0x7f95402250c8>, <function after_draw_refresh at 0x7f9540225050>])

Yet the window appears black after this first scroll event. That's weird considering that the client and the gl vscroll test app actually use the exact same code for painting. The GL paint code seems to have issues when there is more than one window shown, maybe there are other issues too. Maybe fixing the pixmap backend would help confirm that the paint data is correct (or maybe also save it to file for manual inspection).

Ideally, we would want to choose the "scroll" encoding early, but the encoding selection logic code doesn't have access to the pixels.. This may have to be changed. We could then more efficiently discard unnecessary repaints.

Wed, 22 Jun 2016 16:40:55 GMT - Antoine Martin: attachment set

attachment set to scrolling-pixmap-v2.patch

almost working pixmap implementation (needs XPRA_REPAINT_ALL=1), shows the same corruption

Sun, 26 Jun 2016 11:17:01 GMT - Antoine Martin: attachment set

attachment set to detect-scrolling-v16.patch

updated patch

Sun, 26 Jun 2016 15:51:03 GMT - Antoine Martin:

Lots of changes related to scrolling have been merged:

r12912 server side support
r12914 makes it possible to record and replay scrolling data
r12913 tests for the new motion class
r12901 sets XPRA_SCROLL_ENCODING=0 by default client-side since the gl backend has problems with more than one window..
r12894 adds horizontal scrolling support in client backends (not supported by the server as detection would be much more costly)
r12893 client backend work
r12897 + r12896 tests improvements
r12909 ensures that a refresh will also match the window size (even when the video encoder would round the dimensions down to an even number - as is the case with an xterm)

Still TODO:

fix opengl backing with multiple windows (and fix pixmap backing?)
refresh handling: maybe do something simple, like extending the current refresh regions by the maximum scroll distances
move scroll handling out of the video-only path, and make it work with the video region code
check for issues with desktop scaling or transparency
support for scrolling in the html5 client
non-scroll packets bypass too much of the send accounting logic: maybe return a list of packets from the encode function?
fix issues with b-frames #800

Maybes (see comment:2 for details):

reset the video encoder after scrolling
move scrolling outside video path
handle refresh updates "correctly", moving and splitting the list of damaged rectangles - hard
crc would be quicker if we did it whilst copying the image (when we call restride)
sub-images don't need to copy the pixels (use reference counting)
opengl: don't keep temporary fbo around for nothing, use the same code to tackle #478
merge non-scroll regions separated by small gaps
interpolated smooth scrolling

Mon, 27 Jun 2016 14:17:24 GMT - Antoine Martin:

r12920 finally fixes the opengl client, and scrolling is now enabled by default, can still be disabled by setting XPRA_SCROLL_ENCODING=0
r12921 allows us to override the default XPRA_SCROLL_MIN_PERCENT=40, below this percentage, we won't be using scroll encoding, defaulting back to video
r12928 was missing from previous commits, you need this to build the new cython "motion" code

Until this is integrated more tightly, you need to run the server with XPRA_ENCODING_STRICT_MODE=1 to trigger the video code path, which is the place where the scrolling detection is currently hooked up. You can get the scrolling debug with "-d scroll". The color shown with opengl paint box debugging is orange, it only draws around the area we copied to (not from).

Tue, 12 Jul 2016 16:52:22 GMT - Antoine Martin: milestone changed

milestone changed from 0.18 to 1.0

Milestone renamed

Sat, 20 Aug 2016 12:45:59 GMT - Antoine Martin:

r13402 adds an SSE 4.2 accelerated CRC32 version which is a lot faster than the standard zlib.crc32 on supported hardware (tested on Skylake).

Sat, 20 Aug 2016 15:38:20 GMT - Antoine Martin:

There was a silly bug in the code, which means that the performance was overestimated for crc32c. But as of r13405 we switch to python-xxhash, and we get 7.5GB/s! (with some better tests)

Tue, 23 Aug 2016 08:42:02 GMT - Antoine Martin:

Lots of updates: r13433, r13434, r13435, r13436. scrolling detection is enabled again, and seems to work well enough. As of r13442 we only enable scrolling detection if "xxhash" is available (the alternatives are just too slow). Still todo: #1257, try to manage conflicts between video regions and scrolling... very difficult to have both at the same time.

Thu, 25 Aug 2016 14:45:35 GMT - Antoine Martin: owner, status changed

owner changed from Antoine Martin to alas
status changed from assigned to new

Looks ready for a first round of testing. How I tested: using firefox and scrolling around (without video playing at the same time). You can try to see what encoding is used with opengl paint debug ("Meta+Shift+F12" to turn on for a window), but since the scrolling will accumulate previous paint boxes... it can get messy quickly. A better way is to check what encodings are actually being used with xpra info | egrep "total_frames|total_pixels". Then for extra verbose details, there's "-d scroll".

Important: you must have python-xxhash installed on the server for scroll detection.

Fri, 26 Aug 2016 09:37:32 GMT - Antoine Martin:

FYI: the win32 and osx shadow servers now take advantage of scrolling detection to save a huge amount of bandwidth and CPU, they can be useful for testing and helped uncover an absolute monster of a bug: r13479

Mon, 29 Aug 2016 06:00:33 GMT - Antoine Martin:

support for exclusion zones has been added, see #1295.
adding python-xxhash package for debian server support done in r13496

Wed, 07 Sep 2016 09:13:18 GMT - Antoine Martin:

Just hit this:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/client/gl/gl_window_backing_base.py", line 514, in do_scroll_paints
    assert y+ydelta>=0 and y+h+ydelta<=bh, "vertical scroll by %i: rectangle %s overflows the backing buffer size %s" % (ydelta, (x, y, w, h), self.size)
AssertionError: vertical scroll by 299: rectangle (17, 2, 1002, 299) overflows the backing buffer size (1021, 591)

Not sure how to reproduce it.

Thu, 08 Sep 2016 16:30:19 GMT - Antoine Martin:

The bug in comment:14 is caused by a mismatch between the server window size and the backing client-side (which can happen more easily with desktop-scaling enabled, ie: #1098), r13618 is a workaround for that. r13617 also improves error handling end reporting.

Fri, 09 Sep 2016 11:05:53 GMT - Antoine Martin:

Minor fix in r13623: we don't use scrolling for "real" video, which may use b-frames (#800). "real" video can be identified in xpra info as:

$ xpra info | grep video-mode
window.1.video_subregion.video-mode=True

Wed, 21 Sep 2016 08:14:43 GMT - Antoine Martin:

Minor change: as of r13797 + r13799, we run the scrolling detection in more cases, even when we detected a video region that really does look like real video, but not if the API has been used to set this video region.

Notes:

we don't handle auto-refresh as well as we could, see #1320.
in the future we may be able to re-use this code for shadow servers: #1321 for osx
as of r13775, we can switch to and from "true" video on the fly, see ticket:800#comment:25
as of r13980: the required python-xxhash package has been added as a dependency and should now get installed by default (#1331)

Fri, 07 Oct 2016 22:37:28 GMT - J. Max Mena: owner changed

owner changed from alas to Antoine Martin

Running a Fedora 23 r14042 trunk server with a Fedora 23 r14042 trunk client:

I'm starting my server with XPRA_SCROLL_ENCODING=1 XPRA_ENCODING_STRICT_MODE=1 xpra start :13 --no-daemon --start-child=firefox -d scroll --bind-tcp=0.0.0.0:2200 and connecting with XPRA_SCROLL_ENCODING=1 xpra attach tcp:ip:port

And, I'm not able to trigger the new encoding at all. I can tell this because running with XPRA_OPENGL_PAINT_BOX=1, upon scrolling all I see is h264, none of the new paints mentioned earlier.

Also, I keep seeing this pop up in the logs: attachment/ticket/1232/zero.log

(..)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_video_source.py", line 1120, in calculate_scaling
    ffps = int(pixels/(width*height)/(time.time() - otime))
ZeroDivisionError: integer division or modulo by zero

Passing back to you with the tracebacks.

Also, I'd like (for myself, afarr, and anyone else interested) for you to indicate what flags and options are required to get the new scrolling working. As far as I can tell all that's required is XPRA_SCROLL_ENCODING=1 and XPRA_ENCODING_STRICT_MODE=1 on the server, and just scroll encoding on the client.

Sat, 08 Oct 2016 08:29:31 GMT - Antoine Martin: attachment set

attachment set to zero.log

ZeroDivisionError? in calculate_scaling

Sat, 08 Oct 2016 09:52:34 GMT - Antoine Martin: owner changed

owner changed from Antoine Martin to alas

The zero division error should be fixed in r14053.

setting XPRA_SCROLL_ENCODING is no longer needed, it has been enabled by default since r13436
setting XPRA_ENCODING_STRICT_MODE is not needed, and definitely not recommended as this will disable scrolling detection now

Scrolling should work out of the box, without needing any special command line arguments, provided that:

the required xxhash library is installed on the server
the client uses the opengl accelerated backend

To verify that scrolling is being used:

for each window forwarded, the "scrolling" flag should be on:

$ xpra info :2 | grep -i scrolling
window.1.encoding.scrolling=True

"-d scroll" on the server will always show details when scrolling detection is firing (ie: when scrolling, holding the enter key down in an xterm is also enough to trigger it), ie:

updated scroll data
best scroll guess took 1ms, matches 96% of 312 lines: 0
encode_scrolling(XShmImageWrapper(BGRX: 17, 2, 480, 312), {..}, [], [], {'av-sync': True}) window-dimensions=(499, 316)
 will send scroll packets for yscroll=[0, 13, -13, 13, -13, 26, -26, 26, -26, 39, -39, 39, -39, 52, -52, 52, -52, -65, 65, -65, 65, -78, 78, -78, 78, -91, 91, -91, 91, -104, 104, -104, 104, -117, 117, -117, 117, 130, -130, 130, -130, 143, -143, 143, -143, 156, -156, 156, -156, 169, -169]
 301 lines have not changed
 non scroll: 1 packets: [(300, 11)]
scroll encoding took 7ms
non-scroll encoding took 1ms

"-d draw" on the client (added in r14055, much less detailed than "-d paint") will show all the draw packets received, which should include some "scroll" encoding when the server prints "encode_scrolling(..)". Note however that scrolling detection is very efficient at skipping unchanged screen updates (ie: "301 lines have not changed" in the log just above), so in many cases there will not be any scroll packets at all. "scroll" draw packets are usually accompanied by lossless packets containing the other screen updates (ie: the scrollbar unless you also use #1295), ie when scrolling with firefox, you will get a series of about 10 updates like this one for each mouse wheel movement as it does its smooth scrolling repaint in small increments:
```
process_draw:       3   arrays for window   2 using scroll encoding at         (0, 0, 737, 557) with options={'av-sync': True, 'flush': 3}
process_draw:    2037    bytes for window   2 using    png encoding at        (0, 412, 737, 11) with options={'av-sync': True, 'flush': 2}
process_draw:    1510    bytes for window   2 using    png encoding at        (0, 495, 737, 11) with options={'av-sync': True, 'flush': 1}
process_draw:    1341    bytes for window   2 using    png encoding at         (0, 551, 737, 6) with options={'av-sync': True}
```

The scroll encoding contains the motion data, the png packets contain the area scrolled into view (the last packet) and the areas that have changed (ie: the top and bottom of the scrollbar which has also moved)

One last word: although the scroll paint code will use the stippled brown as "paint box colour", you can barely see it because we usually also paint the lossless screen updates together (flush>0) and those will include the same edges.. r14058 makes it 2 pixels thicker.

Mon, 10 Oct 2016 17:50:53 GMT - J. Max Mena: owner changed

owner changed from alas to Antoine Martin

Upped server and client to r14106:

I have found a pretty clear memory leak with the new scrolling code

To track it:

Launch Firefox in Xpra with the new scrolling code enabled

Navigate to https://en.wikipedia.org/wiki/World_War_II

Scroll up and down at a slow/medium pace all the way to the bottom and the top
- To make sure we're using the scrolling encoding, scrolling too fast triggers h264 periodically

Have a second xterm handy running htop or top
- Using htop, set the sort to MEM%, and watch as Python overtakes Firefox as most memory used after a few seconds.

I verified that it was the new scrolling code by starting up a new server with XPRA_SCROLL_ENCODING=0 and re-ran those steps, and the memory didn't climb. It's very easy to notice as the memory climbs very steadily and consistently.

Also, I found that grabbing the scroll bar with the mouse and moving it up and down quickly (an extreme corner case) triggers full h264 after a short time (probably 1 full second by my estimation). Not sure if it's expected or not, just wanted to note it.

Passing back to you.

Mon, 10 Oct 2016 19:15:06 GMT - J. Max Mena:

Update:

After using the client for about a couple hours scrolling periodically, watching videos, etc etc, the memory does not drop, and continues to climb well until the machine has run out of swap/memory, locking up the session.

Tue, 11 Oct 2016 04:06:01 GMT - Antoine Martin: owner changed

owner changed from Antoine Martin to J. Max Mena

the memleak is fixed in r14107
moving quickly ("extreme corner case") is meant to use h264 since that's the best encoding there is for fast screen refreshes with no obvious scrolling detection possible.

Mon, 24 Oct 2016 19:52:23 GMT - J. Max Mena: owner changed

owner changed from J. Max Mena to Antoine Martin

Noted fix of memory leak and extreme corner case. Looks pretty fantastic to me, I haven't found any more corner cases.

Will test more thoroughly when I get the time, but it seems to be behaving very nicely.

I'll pass this back to you, to decide what to do with this ticket. My vote is close it as working, and then open new tickets in the future, but I'll defer to you.

Tue, 25 Oct 2016 03:39:12 GMT - Antoine Martin: status changed; resolution set

status changed from new to closed
resolution set to fixed

Thanks!

Closing, we'll revisit this as part of #1341.

Wed, 02 Nov 2016 02:12:57 GMT - J. Max Mena:

Jotting down notes from the meeting today:

Basically, small webpage updates after scrolling for a bit cause the whole window to get repainted as h264 in a very low quality (I've been seeing ~50-60) compared to the high 90s that the scrolling uses, creating very unexpected blurriness.

A couple of short repro steps:

Use either Reddit comment sections or Wikipedia mobile and scroll up and down normally, reading as you go (optional)

After stopping scrolling, cause a field or two to collapse

Upon doing so, the whole window gets repainted with h264, and with -d compress,x264,refresh enabled, you can grep for quality and watch it drop from 97-99 down to 50-60 for the h264 paints.

Wed, 02 Nov 2016 06:40:23 GMT - Antoine Martin: status changed; resolution deleted

status changed from closed to reopened
resolution fixed deleted

Wed, 02 Nov 2016 06:48:03 GMT - Antoine Martin: owner, status changed

owner changed from Antoine Martin to J. Max Mena
status changed from reopened to new

Re-opened because r14382 + r14383 make some changes to this core part of the code.

The big problem is that we have to spot when "video" (or scrolling in this case) stops and use a different encoding. Doing this detection is expensive to do, and hard to get right too. The auto-refresh should fix the lossy screen update, does it not?

Fri, 04 Nov 2016 19:17:43 GMT - J. Max Mena: owner changed

owner changed from J. Max Mena to Antoine Martin

Upped server and client r14397:

The changes definitely have helped.

NOTE: This will be triggerable in Chrome going forward due to how Chrome paints - even small web-page updates trigger a full-window repaint. Firefox seems to behave properly, however.

Passing back to you - my testing has shown that it's been resolved in applications that behave nicely, not sure what we want to do about Chrome, or even if there even is anything we can do.

Wed, 09 Nov 2016 08:26:00 GMT - Antoine Martin: status changed

status changed from new to assigned

#1352: it seems that there are some painting issues.

Fri, 11 Nov 2016 21:25:29 GMT - J. Max Mena:

From ticket #1352:

Okay, so I'm seeing some some widespread yet intermittent issues using the scrolling codec (hmmm maybe we should come up with a cool name).

For now I'm using a Fedora 24 trunk 1.0 r14404 server and client

For the most part the following websites are the easiest to trigger:

Craigslist

Wikipedia

Trac

Reddit (specifically /r/AskReddit threads)

Basically any site that has a good mix of text yet interactive fields causes some weird behavior. I see everything from double lines of text to images appearing over comment fields (if the image is behind a collapsible element). I also see behavior similar to the b-frames issue of the application updating but not getting the paint.

And, to make it more annoyoing, it's most prevalent in Fedora, and every time I try to open Ksnap it gets a paint refresh and goes away.

Turning on XPRA_OPENGL_PAINT_BOX I see that it occurs when there's a mix of the scrolling codec, png, and rgb24 (orange).

Repro steps are spotty, I'm trying to narrow down a valid and reliable repro but am coming up with nothing:

Start an Xpra session
Open Firefox
Navigate to one of the problem sites and interact with it as normal
Intermittent scrolling and typing will cause some weird and broken paints periodically - I see something on average once every 5 minutes or so.
- Using reddit comment sections and making them collapse and uncollapse triggers the broken paints far more often.

And, it's still an issue with B-Frames disabled, however I'm finding it much harder to get the paint bug to "stick". For now it seems to be breaking slightly and then it clears itself up.

I'll attach some screenshots and some -d compress,scroll prints to this ticket.

My repro is using Reddit comment fields:

scrolling and collapsing and uncollapsing comment fields sometimes triggers some broken paints. With B-Frames enabled they appear to stick longer and more resiliently, without B-Frmaes they tend to be a bit harder to trigger and clear themselves up more.

Fri, 11 Nov 2016 21:27:54 GMT - J. Max Mena: attachment set

attachment set to 1232dscrollcompress.txt

-d scroll,compress output - managed to get it broken similar to the screenshots, then minimized and unminimized and stopped the server

Fri, 11 Nov 2016 21:28:16 GMT - J. Max Mena: attachment set

attachment set to 1232 paint bug broken.png

Broken state after collapsing and uncollapsing the comment

Fri, 11 Nov 2016 21:28:32 GMT - J. Max Mena: attachment set

attachment set to 1232 paint bug refreshed.png

after minimizing and unminimizing

Fri, 18 Nov 2016 18:11:26 GMT - J. Max Mena:

Note:

It is notably easier to trigger on https://news.google.com

Sun, 20 Nov 2016 17:00:00 GMT - Antoine Martin: priority changed

priority changed from major to blocker

Confirmed and fairly easy to reproduce. r14465 improves the code and adds XPRA_OPENGL_SAVE_BUFFERS to the opengl client, making it possible to save the state of the client-side FBO, frame by frame.

Other TODO:

scroll data size mismatch: 876x1027 vs 876x1031 - we could trim or pad the checksum array

Mon, 21 Nov 2016 09:07:55 GMT - Antoine Martin: owner, status changed

owner changed from Antoine Martin to J. Max Mena
status changed from assigned to new

More changes in r14466, see commit message. Using the newly introduced XPRA_SAVE_VIDEO_FRAMES=jpeg server side switch and the existing XPRA_OPENGL_SAVE_BUFFERS=jpeg client side switch.

How to use this cool new feature (should go on the wiki somewhere) for debugging any video picture issues:

enable both env vars
enable "compress" debug logging (and more? "scroll", whatever)
wait for problem to occur
inspect the recent sequence of images until we see the exact point where things don't look right
grab the log sample around that point in time (the image filenames use the same format as the log timestamp)

Back to the bug: r14467 fixes that, I can no longer reproduce it. Note however that we just repaint the area, it may still appear out of order (looks like the picture is flickering a bit when it does), but the next screen refresh fixes it.

I'm still seeing the occasional lower than desired visual quality when scrolling very very fast. It should go back up more quickly than it does after you stop scrolling - but this is not really scrolling related and belongs in another ticket (related to #1257 and #1135).

It would be nice if we could also update the image checksum when we process the next auto-refresh (scrolling is currently considered lossy), but this would be quite hard to implement.

Mon, 21 Nov 2016 23:47:58 GMT - J. Max Mena:

Upped to r14467:

Honestly, it looks and behaves much better now - I'm not seeing any more of the issues I described earlier.

Even after almost a full work day of using it, I still can't find any outstanding issues.

Passing back to you; probably for closure.

Tue, 22 Nov 2016 06:10:41 GMT - Antoine Martin: status changed; resolution set

status changed from new to closed
resolution set to fixed

Let's finally close this.

Thu, 02 Feb 2017 11:33:05 GMT - Antoine Martin:

Follow up in #1426 and #1320

Thu, 25 Apr 2019 06:28:43 GMT - Antoine Martin:

See also #2248: more aggressive use of "scroll" encoding.

Fri, 08 May 2020 14:11:30 GMT - Antoine Martin:

A bug has been fixed 3 years later! #2757

Sat, 23 Jan 2021 05:18:34 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1232