Xpra: Ticket #2056: large delta buffer for applications updating slowly

See ticket:2049#comment:4.

The only way we can deal with those pathological repaints from badly written applications would be to double-buffer the whole window picture and try delta compression on it. A quick pass using lz4 will tell us if delta compression is worth doing. (if the repaint is bogus, lz4 compression will achieve ~90% compression or more effortlessly)
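
A rough sketch of the quick check described above, not xpra code. It assumes the delta is a byte-wise XOR of two equal-sized frames, and it uses zlib at level 1 as a stand-in for lz4 so the example needs only the standard library. The function name and the `min_saving` parameter are hypothetical; the 0.9 default mirrors the "~90% compression" figure above.

```python
import zlib

def delta_worth_sending(previous, current, min_saving=0.9):
    """Return True if the XOR delta of two equal-sized frames compresses well.

    zlib level 1 is used here as a fast stand-in for lz4 (assumption).
    """
    if len(previous) != len(current) or not current:
        return False
    # XOR the two buffers: bytes that did not change become zero,
    # so a bogus repaint yields long runs of zeros.
    delta = (int.from_bytes(previous, "big")
             ^ int.from_bytes(current, "big")).to_bytes(len(current), "big")
    compressed = zlib.compress(delta, 1)
    saving = 1 - len(compressed) / len(delta)
    return saving >= min_saving
```

If the check fails, the caller would fall back to encoding the frame normally, so the extra cost is bounded by one fast compression pass.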

To reduce the cost (memory bandwidth mostly, some CPU), this could be limited to:

windows without video
windows with less than N frames per second
windows below a certain size?
high quality / low speed?

Maybe merge this with the scroll detection code? (saves having the full copy in memory - but prevents delta..)
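
To illustrate the trade-off in the parenthesis: a scroll detector only needs one checksum per row of the previous frame, not the pixels themselves, which is why it is cheaper in memory but cannot produce a pixel delta. The sketch below assumes scroll detection works by comparing per-row checksums between frames; the function names are hypothetical, and `zlib.crc32` is just a convenient standard-library checksum.

```python
import zlib

def row_checksums(pixels, rowstride):
    """One checksum per row: this is all that must be kept between frames."""
    return [zlib.crc32(pixels[i:i + rowstride])
            for i in range(0, len(pixels), rowstride)]

def detect_scroll(old_rows, new_rows):
    """Return (distance, matching_rows) for the best vertical scroll.

    A negative distance means the content moved up by that many rows.
    """
    height = min(len(old_rows), len(new_rows))
    best_distance, best_matches = 0, 0
    for distance in range(-(height - 1), height):
        if distance == 0:
            continue
        # old row y would now be found at new row y + distance
        first = max(0, -distance)
        last = min(height, height - distance)
        matches = sum(1 for y in range(first, last)
                      if old_rows[y] == new_rows[y + distance])
        if matches > best_matches:
            best_distance, best_matches = distance, matches
    return best_distance, best_matches
```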

Thu, 22 Nov 2018 09:27:26 GMT - Antoine Martin: status, milestone changed

status changed from new to assigned
milestone changed from 2.5 to 3.1

Mon, 10 Dec 2018 18:41:02 GMT - Nathan Hallquist:

My users want a terminal window that has a little less latency. I've done some more testing of both gnome-terminal and xterm and I've got some questions (this is very much outside my expertise so please forgive and correct me if these questions are dumb):

The flicker between lossy and RAW can be unpleasant with terminal-like programs. Is there a way to disable that with an x-property so that it only uses RAW + LZ4? Is this a bad idea?

Does RAW support some kind of scrolling magic? I saw some tickets that seemed to imply yes, but I did some experimentation where I scrolled a few lines in an xterm window with scroll and compress debug on. My interpretation of the debug is that XPRA sends a RAW of the whole window and only does scrolling when it briefly switches to video encoding, but I'm not at all certain.

Your mention of double-buffering sounds very interesting to me. I just did an experiment: I screen grabbed my web browser with the XPRA bug tracker loaded, scrolled about 5% and did another screen grab. When I xz them individually, each PPM file compresses to ~50K. When I xz the concatenation of the two, it compresses to ~60K. Does (or can) (or would it make sense for) XPRA to exploit this kind of effect with double-buffering? (Honestly, I'm not 100% sure that what I'm saying makes sense, but the idea of compressing a "delta" sounds really nice).
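
The experiment can be reproduced in a few lines with Python's `lzma` module, which produces xz streams. This is only a sketch with synthetic data: each "frame" is made of random rows (so a single frame barely compresses at all), and the second frame is the first one scrolled by 5% with new rows at the bottom. The helper names and frame dimensions are made up for the illustration.

```python
import lzma
import random

def xz_size(data):
    """Size in bytes of the xz-compressed data."""
    return len(lzma.compress(data))

def make_rows(count, rowstride, seed):
    """Synthetic rows of pseudo-random bytes (stand-in for real screen grabs)."""
    rng = random.Random(seed)
    return [bytes(rng.getrandbits(8) for _ in range(rowstride))
            for _ in range(count)]

rows = make_rows(200, 256, seed=1)
scroll = 10  # 5% of 200 rows
frame_a = b"".join(rows)
frame_b = b"".join(rows[scroll:] + make_rows(scroll, 256, seed=2))

size_a = xz_size(frame_a)
size_b = xz_size(frame_b)
size_both = xz_size(frame_a + frame_b)
print(f"a={size_a} b={size_b} concatenated={size_both}")
```

The concatenation costs little more than one frame because the compressor can refer back to the first frame for every row the second one shares with it, which is the effect a server-side copy of the previous frame would make available.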

Alternatively I could tell my users something like "use GNU screen for text and xpra for graphics," but they all seem to dislike GNU screen, whereas they seem to like things like XPRA, NX and VNC. I might be able to sell them on this, however, because XPRA is nicely rootless.

Anyway, thanks again for all your help!

Mon, 10 Dec 2018 18:53:27 GMT - Antoine Martin:

The flicker between lossy and RAW can be unpleasant with terminal-like programs. Is there a way to disable that with an x-property so that it only uses RAW + LZ4? Is this a bad idea?

The best way would be to ensure that we tag the terminals as text applications - that should already be the case. Under normal conditions, this should result in lossless packets and therefore no flickering.

Does RAW support some kind of scrolling magic? I saw some tickets that seemed to imply yes, but I did some experimentation where I scrolled a few lines in an xterm window with scroll and compress debug on. My interpretation of the debug is that XPRA sends a RAW of the whole window and only does scrolling when it briefly switches to video encoding, but I'm not at all certain.

Scrolling detection only kicks in when there are enough updates to the same rectangle. Currently this is done by hooking into the video code path, but this could be changed.

Your mention of double-buffering sounds very interesting to me... (Honestly, I'm not 100% sure that what I'm saying makes sense, but the idea of compressing a "delta" sounds really nice).

The server-side "double-buffering" allows all sorts of interesting optimizations; choosing the right ones is what is going to be difficult. Sometimes the code will end up spending too much time figuring out how best to compress the pixels, rather than just doing it. (i.e. that's why we currently don't do scrolling detection outside the video code path)

Mon, 10 Dec 2018 18:59:30 GMT - Nathan Hallquist:

The best way would be to ensure that we tag the terminals as text applications - that should already be the case. Under normal conditions, this should result in lossless packets and therefore no flickering.

I've tried that, but I *thought* I saw it still flicker (and I think I saw it in the compress log). What is the relevant debug capture to check that I'm tagging it as "text" correctly? Also, I'll make a patch to add the xprop commands to the tray ASAP.

Mon, 10 Dec 2018 22:36:42 GMT - Nathan Hallquist:

When I set the property I get this from the metadata log. Is the "does not support" a problem?

2018-12-10 14:34:25,994 content_type=text
2018-12-10 14:34:25,994 updateprop(content-type, text) previous value=None
2018-12-10 14:34:25,994 updating metadata on WindowModel(0x800022): <GParamBoxed 'content-type'>
2018-12-10 14:34:25,995 make_metadata: client does not support 'content-type'
2018-12-10 14:34:25,995 make_metadata(2, WindowModel(0x800022), content-type)={}

Mon, 10 Dec 2018 22:38:39 GMT - Antoine Martin:

content_type=text

I assume this is for a terminal. Then that's what you want to see.

Is the "does not support" a problem?

No. That just means the client doesn't use this attribute so it isn't sent.

Tue, 11 Dec 2018 19:11:22 GMT - Nathan Hallquist:

I've had some fun figuring out some of the code path related to get_best_encoding_video. The undesirable behavior I'm seeing happens here:

if pixels_last_4secs<((3+text_hint*3)*videomin):

Namely it's not branching into return nonvideo when I think it should and it's getting h264. I've done some experimentation on xterm. I found:

I want a single "ls" to not go into video no matter how large the window is. The lossy to lossless flash is jarring. The videomin setting introduces a window size dependence that for an xterm is undesirable.

I do want video encoding when text is scrolling so fast I'd ordinarily have no hope of reading it. Again, this behavior should only be a function of scrolling not a function of window size.

After a little bit of testing I have found that this change gets me the desired behavior:

if pixels_last_4secs<((3+text_hint*2*3)*cww*cwh):
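
A worked example of the two conditions, as a sketch only. It assumes cww and cwh are the window's current width and height, and the videomin value used here is an arbitrary illustrative number, not the one xpra computes. The function names are hypothetical.

```python
def nonvideo_original(pixels_last_4secs, videomin, text_hint):
    """Original test: the threshold scales with videomin only."""
    return pixels_last_4secs < (3 + text_hint * 3) * videomin

def nonvideo_proposed(pixels_last_4secs, cww, cwh, text_hint):
    """Proposed test: the threshold scales with the window area."""
    return pixels_last_4secs < (3 + text_hint * 2 * 3) * cww * cwh

cww, cwh = 1920, 1080      # assumed window size
videomin = 640 * 480       # hypothetical value, for illustration only
one_repaint = cww * cwh    # a single "ls" repainting the whole window
ten_repaints = 10 * one_repaint

# Original: 2,073,600 is not below 6 * 307,200 = 1,843,200,
# so one full repaint of a large window already leaves the non-video path.
print(nonvideo_original(one_repaint, videomin, True))

# Proposed: the threshold is 9 full-window repaints in 4 seconds,
# whatever the window size.
print(nonvideo_proposed(one_repaint, cww, cwh, True))
print(nonvideo_proposed(ten_repaints, cww, cwh, True))
```

With the proposed form, the decision depends on how many window-sized repaints happened in the last 4 seconds rather than on the absolute pixel count.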

I've also found that webp is lossy in my configuration, so I've added a content_type check to the nonvideo codepath:

if "webp" in options and pixel_count>=16384 and ww>=2 and wh>=2 and depth in (24, 32) and not self.content_type == "text":

If these changes are a good idea in the context of xterm windows, can we add a new content-type for just xterm; or, perhaps, have it do this for all "text" windows? If this idea is bad for some reason that isn't obvious to me, what would be a better way?

Here's my total diff:

Index: xpra/server/window/window_video_source.py
===================================================================
--- xpra/server/window/window_video_source.py   (revision 21055)
+++ xpra/server/window/window_video_source.py   (working copy)
@@ -445,7 +445,7 @@
                 lim = now-4
                 pixels_last_4secs = sum(w*h for when,_,_,w,h in lde if when>lim)
                 text_hint = self.content_type=="text"
-                if pixels_last_4secs<((3+text_hint*3)*videomin):
+                if pixels_last_4secs<((3+2*text_hint*3)*cww*cwh):
                     return nonvideo(quality+30, "not enough frames")
                 lim = now-1
                 pixels_last_sec = sum(w*h for when,_,_,w,h in lde if when>lim)
@@ -488,7 +488,7 @@
             #assume that we have "turbojpeg",
             #which beats everything in terms of efficiency for lossy compression:
             return "jpeg"
-        if "webp" in options and pixel_count>=16384 and ww>=2 and wh>=2 and depth in (24, 32):
+        if "webp" in options and pixel_count>=16384 and ww>=2 and wh>=2 and depth in (24, 32) and not self.content_type == "text":
             return "webp"
         #lossless options:
         if speed==100 or (speed>=95 and pixel_count<MAX_RGB) or depth>24:

Tue, 11 Dec 2018 20:05:15 GMT - Antoine Martin: attachment set

attachment set to text-hint-less-video.patch

don't lower videomin when text hint is set

Tue, 11 Dec 2018 20:14:19 GMT - Antoine Martin: owner, status changed

owner changed from Antoine Martin to Nathan Hallquist
status changed from assigned to new

After a little bit of testing I have found that this change gets me the desired behavior:

This may have undesirable side effects for non-text apps. How about the patch above instead? (only changes the videomin value when the text hint is not set)

I've also found that webp is lossy in my configuration, so I've added a content_type check to the nonvideo codepath:

webp can do lossless, and at present it is the only picture encoding that takes the content-hint into account to better tune its compression settings. Also, the current webp encoder is meant to switch to lossless with quality values higher than 75%, which is pretty low already (lower than what we use for other encoders - that's because webp lossless is often more efficient than lossy! - we have a ticket for that somewhere). What quality settings and values were you seeing? Maybe we can lower the lossless threshold for text instead?

Tue, 11 Dec 2018 20:45:31 GMT - Antoine Martin: attachment set

attachment set to text-hint-webp-more-lossless.patch

make webp switch to lossless more readily for text content

Tue, 11 Dec 2018 20:45:51 GMT - Antoine Martin:

Maybe we can lower the lossless threshold for text instead?

See untested patch.

Tue, 11 Dec 2018 22:08:02 GMT - Nathan Hallquist:

This is good.

Everything is perfect when quality is set to "auto". Does XPRA support having different quality settings for different windows? If so, how do I go about setting it?

Tue, 11 Dec 2018 22:14:09 GMT - Nathan Hallquist:

A little more context: I've found from a lot of experimenting that if I set quality to maximum 40% on my CAD window we achieve best usability on broadband with h264. Everything stays very quick and the quality loss isn't a problem because of the lossless redraw at the end.

I'd like to be able to set the quality on just the 3d render window to something other than auto and leave everything else on "auto".

Tue, 11 Dec 2018 22:20:43 GMT - Antoine Martin: status changed; resolution set

status changed from new to closed
resolution set to fixed

Everything is perfect when quality is set to "auto".

Great: merged in r21203 + r21204.

Does XPRA support having different quality settings for different windows? If so, how do I go about setting it?

In theory, this should work:

WID=1
Q=99
xpra control :13 quality $Q $WID

But you will need r21205 or later to be able to specify the window-id.

This is not ideal because you need to know the window-id. We could support a _XPRA_QUALITY window attribute, which would be nicer. Feel free to create a new ticket for that.

Tue, 11 Dec 2018 22:37:49 GMT - Nathan Hallquist:

This ticket was originally for double-buffering and delta on gnome-terminal. I had also been asking about related performance issues on terminals in this ticket. I probably should have opened new ones, but I'm never sure whether it is rude to open too many tickets on a tracker I don't own.

Should this ticket be reopened to track double-buffering and delta compression?

Tue, 11 Dec 2018 22:38:31 GMT - Antoine Martin: status changed; resolution deleted

status changed from closed to reopened
resolution fixed deleted

Should this ticket be reopened to track double-buffering and delta compression?

Yes, my bad, sorry.

Tue, 18 Dec 2018 23:57:23 GMT - Nathan Hallquist:

How do I ask XPRA to save every frame into a separate lossless image file?

I intend to make a data set so I can experiment with a compression idea. I want to try a proof-of-concept on my hard-drive before I take your time with some wordy description. (Also, by doing it this way I can avoid embarrassing myself on a public forum if it is a dumb idea)

Fri, 28 Dec 2018 23:20:47 GMT - Nathan Hallquist:

I've been using XPRA for all of my work lately (to stress test).

I'm finding that WebP frames in xterm are mostly okay. Sometimes the color doesn't quite match lossless frames, but that's not too jarring.

However, often XPRA goes to x264 and x264 is *very* fuzzy. Sometimes x264 gets triggered as I'm using the shell a little too quickly. Sometimes after an x264 frame it doesn't get a lossless frame and it stays fuzzy and doesn't refresh until I type something. Unfortunately, I've not quite figured out how to reliably trigger this.

I'm not sure what the best fix is. Can x264 be configured by a quality setting to be less fuzzy?

If it is feasible, I will propose having xterm use different encoders for rapid changes depending on whether the mouse is activated or not. If the mouse is moving, the user is probably trying to find something on the scroll bar, and I'm not sure what the right encoding would be, but I think a video encoder like x264 might make sense here. But when the mouse is not pressed I would prefer for xterm to stay lossless and then redraw less frequently (batch more) until the picture settles.

Thu, 03 Jan 2019 10:22:30 GMT - Antoine Martin:

How do I ask XPRA to save every frame into a separate lossless image file?

This isn't builtin at the moment, though you can find some example code near SAVE_TO_FILE in the window source code. You could hook this into a number of places:

in do_damage which triggers whenever xpra is notified of a screen update
in process_damage_region which fires after the batch delay, at which point the screen updates may have been aggregated

One major problem with this approach is that this can really slow down the thread you do this in, even if you save the pixel data uncompressed (anything that does IO will be many orders of magnitude slower). So you may want to start a dedicated IO thread for this, with a queue of pending buffers and save them when the CPU has time for it - otherwise you will be affecting the system too much, which can skew the data...
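
A minimal sketch of such a dedicated IO thread, using only the standard library. Nothing here is xpra API: the `FrameSaver` class, its methods and the file naming are hypothetical, and the caller is assumed to pass in raw pixel buffers from whichever hook is chosen. The queue is bounded and frames are dropped (and counted) when it is full, so the capturing thread never blocks on disk IO.

```python
import os
import queue
import tempfile
import threading

class FrameSaver:
    """Writes raw pixel buffers to disk from a background thread."""

    def __init__(self, directory, max_pending=64):
        self.directory = directory
        self.pending = queue.Queue(maxsize=max_pending)
        self.dropped = 0
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def submit(self, sequence, width, height, pixels):
        """Called from the capture thread: never blocks on IO."""
        try:
            # copy the buffer so the caller is free to reuse it
            self.pending.put_nowait((sequence, width, height, bytes(pixels)))
        except queue.Full:
            self.dropped += 1

    def _run(self):
        while True:
            item = self.pending.get()
            if item is None:    # sentinel: stop the thread
                break
            sequence, width, height, pixels = item
            name = f"frame-{sequence:06d}-{width}x{height}.raw"
            with open(os.path.join(self.directory, name), "wb") as f:
                f.write(pixels)

    def close(self):
        """Flush the remaining frames and stop the thread."""
        self.pending.put(None)
        self.thread.join()

out_dir = tempfile.mkdtemp(prefix="frames-")
saver = FrameSaver(out_dir)
for seq in range(3):
    saver.submit(seq, 4, 2, bytes([seq]) * (4 * 2 * 4))  # 4x2 pixels, 4 bytes each
saver.close()
saved = sorted(os.listdir(out_dir))
print(saved, "dropped:", saver.dropped)
```

Recording the number of dropped frames matters for the data set: a gap in the sequence numbers should be distinguishable from a period with no screen updates.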

I intend to make a data set so I can experiment with a compression idea.

That's exactly what I did a while back, and how I came up with some of the encoding heuristics. It would be great to re-do this more properly. Some of those heuristics are bound to be out of date. (new codecs, better performance, etc) Ideally, we could generate lists of screen updates as files then feed them to the encoding logic any number of times and measure what comes out: what CPU time is used, how much bandwidth, etc.. But I digress, this belongs in a different ticket.
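
The replay idea could look something like the sketch below: the same list of recorded updates is fed to each candidate encoder, and CPU time and output size are measured per encoder. This is an illustration only. The `replay` function is hypothetical, the updates are synthetic, and zlib at two levels stands in for the real picture encoders.

```python
import time
import zlib

def replay(updates, encoders):
    """Feed every update to every encoder; report CPU time and bytes out."""
    results = {}
    for name, encode in encoders.items():
        start = time.process_time()
        bytes_out = sum(len(encode(pixels)) for pixels in updates)
        results[name] = {
            "cpu_seconds": time.process_time() - start,
            "bytes_out": bytes_out,
        }
    return results

# Synthetic stand-in for a recorded list of screen updates.
updates = [bytes([i % 7]) * 4096 + bytes(range(256)) * 16 for i in range(50)]
encoders = {
    "raw": lambda pixels: pixels,
    "zlib-1": lambda pixels: zlib.compress(pixels, 1),
    "zlib-9": lambda pixels: zlib.compress(pixels, 9),
}
results = replay(updates, encoders)
for name, stats in results.items():
    print(name, stats)
```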

I want to try a proof-of-concept on my hard-drive before I take your time with some wordy description. (Also, by doing it this way I can avoid embarrassing myself on a public forum if it is a dumb idea)

No chance! First, you're right on the money already. Second, I embarrass myself regularly by getting it wrong, it's the best way to learn!

I'm finding that WebP frames in xterm are mostly okay. Sometimes the color doesn't quite match lossless frames, but that's not too jarring.

We have had colourspace inaccuracy issues in the past: ticket:1438#comment:4 for jpeg with opengl rendering, maybe webp suffers from a similar problem (#1764).

However, often XPRA goes to x264 and x264 is *very* fuzzy.

The idea is that x264 should only trigger for video (or animation) content, where lossy quality is not a problem.

Sometimes x264 gets triggered as I'm using the shell a little too quickly.

It will be very noticeable in a terminal... so we try to avoid doing that, and when we do the auto-refresh should still kick in.

Sometimes after an x264 frame it doesn't get a lossless frame and it stays fuzzy and doesn't refresh until I type something. Unfortunately, I've not quite figured out how to reliably trigger this.

That sounds like a bug. If you can reproduce it, please include the -d damage,refresh,regionrefresh debug output.

I'm not sure what the best fix is. Can x264 be configured by a quality setting to be less fuzzy?

Try to capture xpra info when this happens and we'll be able to see what settings are used, and why the quality is too low.

If it is feasible, I will propose having xterm use different encoders for rapid changes depending on whether the mouse is activated or not. If the mouse is moving, the user is probably trying to find something on the scroll bar, and I'm not sure what the right encoding would be, but I think a video encoder like x264 might make sense here.

Or the user could be selecting text within the window. Every time the pointer crosses over to the next character this will trigger a damage event, this may also end up triggering the "video" encoder.

But when the mouse is not pressed I would prefer for xterm to stay lossless and then redraw less frequently (batch more) until the picture settles.

I am not yet convinced that the pointer status can be used as a hint. (also it may be pressed but pointing to another window, which may or may not trigger paint events in this window...)

Wed, 20 Mar 2019 05:06:15 GMT - Antoine Martin: milestone changed

milestone changed from 3.1 to 4.0

Milestone renamed

Wed, 12 Feb 2020 05:27:30 GMT - Antoine Martin: status changed; resolution set

status changed from reopened to closed
resolution set to needinfo

Sat, 23 Jan 2021 05:41:00 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/2056