Xpra: Ticket #592: Pages sometimes show blurry content

Sometimes Xpra will enter a state where contents on a webpage become unreadable. It's most reproducible with large window sizes, usually after manually resizing the window. I'll attach a couple of screenshots of various buggy states, along with a .txt containing an xpra info.

Easily reproducible steps:

If this does not induce a buggy state, try resizing the window rapidly from small to big. If it's still not buggy, try going from big to small to big again rapidly.

This state is only ever entered when using the auto quality option. Manually forcing the best quality option will not induce this blurry state. Also of note, disconnecting and reconnecting (without closing chrome) will re-enter the blurry state.



Mon, 09 Jun 2014 18:48:28 GMT - J. Max Mena: attachment set


Mon, 09 Jun 2014 18:49:02 GMT - J. Max Mena: attachment set

Xpra in a slightly blurry state. It's not as bad as I've seen it, but this is the best I could get for the xpra info.


Mon, 09 Jun 2014 18:51:19 GMT - J. Max Mena: attachment set

(this is the same file as blurry.png) A more blurry state that I screenshotted last week after it sat for several minutes.


Tue, 10 Jun 2014 01:53:52 GMT - Antoine Martin: owner changed

Please include xpra info captured when the window quality is degraded. Or even better, a few times as it is degrading.

From the screenshots, it looks like the picture is downscaled to 2:3 or even 1:2. You can verify this by running without downscaling:

XPRA_SCALING=0 xpra start ..

Tue, 10 Jun 2014 02:12:52 GMT - J. Max Mena: attachment set

Xpra Info as requested...a little bit late.


Tue, 10 Jun 2014 05:24:04 GMT - Antoine Martin:

If you're using this screen layout regularly, we should probably add a modeline for it:

client.screen[0].monitor[0].geometry=(1920, 296, 1600, 900)
client.screen[0].monitor[1].geometry=(0, 0, 1920, 1080)
client.screen[0].size=(3520, 1196)

Though the automatic mode matching found something pretty close:

server.root_window_size=(3600, 1200)

As I mentioned before, I can't reproduce this, but I don't have a 1080p mswindows client to test with (at least not easily, and not with opengl enabled). Does this also happen with Linux + opengl clients? OSX? Does it never recover? I really don't understand why re-connecting does not reset things as all the statistics do get reset in that case. Does r6682 (for trunk) or r6682 (for v0.13.x) help?

That's a pretty wide range there, from 1ms to 617ms, and the 90 percentile is quite high too (407). What makes this very odd is that the ping latency is actually very good and stable (the server is probably not far at all):

client.ping_latency.50p=2
client.ping_latency.80p=2
client.ping_latency.90p=3
client.ping_latency.avg=3
client.ping_latency.cur=2
client.ping_latency.max=56
client.ping_latency.min=1

The difference between the two is that the "client ping latency" is measured from dedicated ping packets whereas the "client latency" is measured by substracting the decoding time from the time it takes to process a paint request. The two should not be so out of whack. This could be because of the resizing (which is similar to scrolling, only worse), it could end up queuing too many paint requests on the client end, maybe we need to slow down the refresh during resizing.

Please post the server output with -d stats (or you can also enable it at runtime with xpra control :DISPLAY debug enable stats), we should see some record_latency: .. information in the server log telling us where all that time is being spent.

Another thing that is puzzling is why the auto-refresh didn't kick in. Since the picture data is downscaled, the auto-refresh should get scheduled soon after each window paint. -d refresh would tell us more about that.

So we are downscaling by 2:3 because of the low automatic quality. The quality is low (66) because of:

window[3].encoding.quality.backlog_factor=100
window[3].encoding.quality.batch_factor=33
window[3].encoding.quality.latency_factor=3273

This is the documentation around that code:

    # quality:
    #    0    for lowest quality (low bandwidth usage)
    #    100  for best quality (high bandwidth usage)
    # here we try minimize client-latency, packet-backlog and batch.delay

So, the problem comes from the client latency getting too high. If we figure out why that is, we can probably fix this.

Note: this does not explain why re-connecting does not fix the issue, it should.


Tue, 10 Jun 2014 11:30:57 GMT - Antoine Martin:

Please see comment:2, as we really need to figure out what is going on, and not just which changeset fixes the symptoms. In particular, why the auto-refresh didn't kick in, why re-connecting doesn't fix things, etc..


Tue, 10 Jun 2014 18:03:00 GMT - J. Max Mena:

I'm attaching a 592stats.txt file to the ticket. I booted it up, and connected. A few seconds after connecting it became unread-ably blurry. I tried a few tabs, and refreshing and it never went out of blurry, so everything printed is with it being blurry. I re-ran the same test with -d refresh, and it was the exact same; starts out fine, then becomes blurry. I'll upload the output with a 592refresh.txt

This was all running the r6713 Windows client. I'll test the Linux and OSX clients later today and report back.

Also of note, chrome opened up running full-screen both times(or at least really close to it, which is where it was when I last closed it).


Tue, 10 Jun 2014 18:13:16 GMT - J. Max Mena: attachment set

Re-ran the test after realizing I never ended the Xpra server just to be sure I got all the logs. This time, I connected and it became blurry, then after opening a wiki entry it snapped back into the normal readable mode. I rapidly resized it and managed to get it back into the buggered state running almost full-screen. Upon getting the buggered state to stick, I ended the session.


Tue, 10 Jun 2014 18:26:04 GMT - J. Max Mena: attachment set

re-ran the test to get some more in-depth log files. Upon connecting it started out blurry, but after browsing the wikipedia a bit, it became readable again. Once in the readable state I tried to get back to the blurry state. After leaving it on a webpage and resizing the window to a small(roughly 600x600) window, and dragging it back to fullscreen 1080p caused it to enter the blurry state again. Once in the blurry state it would snap back to readable, but then back to the blurry state. Once in the consistently burry state, I haven't found a reliable way out of it. It seems resizing the window is the most reliable way to get it into the blurry state, but I have seen it become this way without any resizing at all before.


Tue, 10 Jun 2014 20:00:09 GMT - J. Max Mena:

Retested with r6713 Fedora 20 and OSX client. I was able to reproduce with the Fedora 20 client, but unable to reproduce with the OSX client.


Wed, 11 Jun 2014 00:52:05 GMT - Antoine Martin:

The tests above were performed with an out of date server:

2014-06-10 11:17:32,859 xpra server version 0.13.1 (r6587)

and

2014-06-10 11:07:41,609 xpra server version 0.13.1 (r6587)

So we don't know anything about the fixes from yesterday. Also, please state which version and revision was used, I had to look at the download area to figure out that r6713 is probably a stable 0.13.4 build I made yesterday, and not the trunk version with the fixes..


Wed, 11 Jun 2014 03:25:18 GMT - Antoine Martin:

Thanks for the logs.

(negative number of bytes! timed as 44 years instead of mere milliseconds!) But this should have been very rare, and would probably have resolved itself after about 500 screen updates. (which comes quicker than you would think)

This should be less of a problem with the changes in trunk from comment:3, but doesn't explain why it is happening.

It looks like the problem is bandwidth related, which could explain why I can't reproduce it on my fast LAN. It is a bit odd how this seems to happen with nothing preceding it:

2014-06-10 11:18:35,577 record_latency: took    4.8 ms round trip,    4.8 for echo,    3.0 for decoding of     7290 pixels, \
    320 bytes sent over the network in    1.8 ms,    1.8 ms for echo
2014-06-10 11:18:38,521 record_latency: took 2185.6 ms round trip, 2185.5 for echo,  309.0 for decoding of   521696 pixels, \
    47284 bytes sent over the network in 1876.6 ms, 1876.5 ms for echo

Why would 47KB take almost 2 seconds to round trip? Especially when the previous update, sent 3 seconds before took no time at all (1.8ms). Could there be some random drop outs on your lan?


Wed, 11 Jun 2014 18:38:56 GMT - J. Max Mena:

Retested with r6737 Windows client against a r6737 Fedora 20 Server. I re-ran similar tests, such as perusing Wikipedia at a large resolution and looking up a few German words in wiktionary. Other than some slight blurring when scrolling or resizing the window, I could not get the blurry state to stick for a significant amount of time. At one point I thought I had it stuck but it snapped back into a readable state. It is still reproducible, but almost impossible to get the blurry state to stick. Now it will go between being blurry and readable. Sometimes it stays readable, sometimes it'll stop on blurry for a period before going back to 'hunting' for the readable state.

I'll attach some more stats and refresh log files shortly.


Wed, 11 Jun 2014 18:40:19 GMT - J. Max Mena: attachment set

-d stats piped into a .txt while perusing wikipedia. Never got the blurry state to stick around. A couple of times it jumped back into being blurry, but it recovered quickly (less than a second).


Wed, 11 Jun 2014 18:41:47 GMT - J. Max Mena: attachment set

Similar test as the stats file. Again, could not get the blurry state to stick. Sometimes when navigating back to a tab it would be blurry for a short period of time, but it recovered.


Thu, 12 Jun 2014 03:24:35 GMT - Antoine Martin:

Please check with current 0.13.x, is that good enough?

I am moving some of the remaining issues to #596


Tue, 17 Jun 2014 20:02:11 GMT - J. Max Mena:

Retested with 13.6 r6783 Win7 64-bit client and Fedora 20 server:

I can provide some more logs if requested, and screen-shots as well. Blurriness has definitely improved in this latest test.


Wed, 18 Jun 2014 10:49:04 GMT - Antoine Martin: status changed; resolution set

This will have to do for 0.13, closing, please see #596.


Sat, 23 Jan 2021 05:00:18 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/592