Is it possible somehow to get statistics on how much bandwidth an xpra shadow would require?
I am trying to do a shadow over the internet (both ends on Ethernet); however, I am getting a bunch of:
2018-11-05 21:12:56,142 client @16.500 server is not responding, drawing spinners over the windows
2018-11-05 21:12:59,770 client @20.125 server is OK again
2018-11-05 21:13:01,108 client @21.467 server is not responding, drawing spinners over the windows
2018-11-05 21:13:02,960 client @23.295 server is OK again
2018-11-05 21:13:06,143 client @26.500 server is not responding, drawing spinners over the windows
2018-11-05 21:13:08,247 client @28.608 server is OK again
2018-11-05 21:13:10,812 Warning: delayed region timeout
2018-11-05 21:13:10,812 region is 15 seconds old, will retry - bad connection?
2018-11-05 21:13:10,813 2 late responses:
2018-11-05 21:13:10,813 14 png : 16s
2018-11-05 21:13:10,813 15 png : 15s
2018-11-05 21:13:11,153 client @31.515 server is not responding, drawing spinners over the windows
2018-11-05 21:13:14,025 client @34.390 server is OK again
2018-11-05 21:13:14,583 Warning: delayed region timeout
2018-11-05 21:13:14,584 region is 15 seconds old, will retry - bad connection?
2018-11-05 21:13:14,584 5 late responses:
2018-11-05 21:13:14,584 12 png : 20s
2018-11-05 21:13:14,584 13 png : 19s
2018-11-05 21:13:14,585 14 png : 18s
2018-11-05 21:13:14,585 15 png : 17s
2018-11-05 21:13:14,585 16 h264 : 15s
2018-11-05 21:13:15,136 Warning: delayed region timeout
2018-11-05 21:13:15,137 region is 15 seconds old, will retry - bad connection?
2018-11-05 21:13:15,137 4 late responses:
2018-11-05 21:13:15,137 12 h264 : 20s
2018-11-05 21:13:15,137 13 h264 : 19s
2018-11-05 21:13:15,138 14 png : 17s
2018-11-05 21:13:15,138 15 png : 15s
2018-11-05 21:13:16,185 client @36.545 server is not responding, drawing spinners over the windows
2018-11-05 21:13:16,693 client @37.045 server is OK again
2018-11-05 21:13:21,187 client @41.545 server is not responding, drawing spinners over the windows
2018-11-05 21:13:22,221 client @42.577 server is OK again
2018-11-05 21:13:30,172 client @50.515 server is not responding, drawing spinners over the windows
2018-11-05 21:13:30,458 client @50.811 server is OK again
2018-11-05 21:13:34,376 Warning: sanitizing invalid gtk selection
2018-11-05 21:13:34,376 format=0x340957e8, type=0x2abaae0, length=-0x1
2018-11-05 21:13:35,183 client @55.545 server is not responding, drawing spinners over the windows
2018-11-05 21:13:37,291 client @57.640 server is OK again
2018-11-05 21:13:40,203 client @00.561 server is not responding, drawing spinners over the windows
2018-11-05 21:13:42,778 client @03.140 server is OK again
2018-11-05 21:15:05,525 client @25.875 server is not responding, drawing spinners over the windows
2018-11-05 21:15:05,781 client @26.140 server is OK again
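To put a number on how long the client is stalled, a log like this one can be scanned for "not responding" / "OK again" pairs. The following is only a minimal sketch: the regular expression is inferred from the excerpt above (it is not a documented xpra log format), it assumes one log entry per line, and the function name `stall_durations` is made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Pattern inferred from the client log excerpt in this ticket only;
# this is an assumption, not a documented xpra log format.
LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) "
    r"client @\S+ server is (not responding|OK again)"
)

def stall_durations(lines):
    """Return the length in seconds of each 'not responding' -> 'OK again' gap."""
    stalls = []
    started = None
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # warnings and other messages are ignored
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        stamp += timedelta(milliseconds=int(m.group(2)))
        if m.group(3) == "not responding":
            started = stamp
        elif started is not None:
            stalls.append(round((stamp - started).total_seconds(), 3))
            started = None
    return stalls

sample = [
    "2018-11-05 21:12:56,142 client @16.500 server is not responding, drawing spinners over the windows",
    "2018-11-05 21:12:59,770 client @20.125 server is OK again",
    "2018-11-05 21:13:10,812 Warning: delayed region timeout",
    "2018-11-05 21:13:16,185 client @36.545 server is not responding, drawing spinners over the windows",
    "2018-11-05 21:13:16,693 client @37.045 server is OK again",
]
print(stall_durations(sample))
```

Summing or averaging the returned values gives a rough "time spent frozen" figure that can be compared between runs with different quality settings.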
At some point, I wondered whether available bandwidth was the issue. I turned the quality down to the lowest possible setting (favoring bandwidth), and it seemed to work. Also, speedometer showed traffic dropping from a "constant" peak of ~1.1 MB/s to 380 KB/s - 640 KB/s. I believe that my connection's peak is also ~1.3 MB/s (10 Mbits), which could explain the timeouts.
Is it possible to have some estimation on how much bandwidth a shadow session would require?
My server's setup is 3 screens (1680x1050+0+30, 1920x1080+3600+0, 1920x1080+1680+0).
I have a 10/10 Mbits connection on the client; assume inf/inf at the server.
Server's altered settings are only to disable sound/mic/webcam (otherwise all are at default).
Client has disabled OpenGL (for no reason, I just don't like the warning), sound/mic/webcam, and scaled desktop (because 1600x1050 is non-standard for me, and 1920x1080 is already the host screen).
Lowest quality is not really nice if I have to discern anything smaller than 320x240 pixels
Is it possible to have some estimation on how much bandwidth a shadow session would require?
No, that depends entirely on what is happening in the session and how well the encoders deal with it.
.. or have any kind of "real-time" plottable statistics?
The session info window should have some, including a plot which you can save.
Client has disabled OpenGL (for no reason, I just don't like the warning
Which warning?
Is there any setting that would help me work under the bandwidth limitation?
From a local terminal, try this command to start the shadow server:
XPRA_SHADOW_REFRESH_DELAY=250 xpra shadow ...
This will limit the framerate to 4fps (default is 50ms which gives 20fps).
If that helps, please post the -d encoding,bandwidth
server debug log and we should be able to make the shadow framerate automatically adapt to the bandwidth conditions.
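The relation between the refresh delay and the framerate quoted above is just the reciprocal of the delay. As a small sketch (the helper name `max_fps` is made up here and is not part of xpra):

```python
def max_fps(refresh_delay_ms):
    """Upper bound on frames per second for a given polling delay in milliseconds."""
    return 1000.0 / refresh_delay_ms

# 50ms is the default mentioned above, 250ms is the suggested value.
for delay in (50, 250):
    print(f"{delay}ms -> {max_fps(delay):.0f} fps")
```

This is only an upper bound: the server may still send fewer frames if nothing on screen changes or the connection is congested.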
Replying to Antoine Martin:
.. or have any kind of "real-time" plottable statistics?
The session info window should have some, including a plot which you can save.
I tried to capture some. The with-delay_no-quality_timeout plot was taken near the end of the session. Right at the end, the red line spiked to 2M, and then the session died within ~5 seconds.
Otherwise, all "traffic" (both timed out and not) looked the same. Maybe I should've tested the bandwidth from the server side, like I did last time?
Client has disabled OpenGL (for no reason, I just don't like the warning
Which warning?
Warning: vendor 'Intel' is greylisted, you may want to turn off OpenGL if you encounter bugs
Since I don't know "what bugs to expect", I just disable it altogether
Is there any setting that would help me work under the bandwidth limitation?
From a local terminal, try this command to start the shadow server:
XPRA_SHADOW_REFRESH_DELAY=250 xpra shadow ...
This will limit the framerate to 4fps (default is 50ms which gives 20fps).
If that helps, please post the -d encoding,bandwidth server debug log and we should be able to make the shadow framerate automatically adapt to the bandwidth conditions.
It seems to me that XPRA_SHADOW_REFRESH_DELAY
does not change the result.
All the servers were started with
(XPRA_SHADOW_REFRESH_DELAY=250) xpra shadow -d encoding,bandwidth :0
and attached with
"C:\Program Files\Xpra\xpra_cmd" attach ssh://sntentos@172.16.57.121/0 --desktop-scaling=0.75 --opengl=no (--quality=10)
--quality=10
was much more of a definite factor to get it working, rather than anything else.
Both adding XPRA_SHADOW_REFRESH_DELAY
and trying to set the quality/bandwidth from the tray icon looked like no-ops.
From the log samples:
nonvideo(100, framerate lowered)
: we should probably stick with video encoders for shadow windows, you can try r21002 or later (server side update).
send_delayed for wid 1, delaying again because of backlog: 5 packets, batch delay is 497, elapsed time is 2541ms
, the automatic quality and speed heuristics keep the values way too high:
update_encoding_options(False) wid=1, want_alpha=False, speed=99, quality=99, \ bandwidth-limit=0, lossless threshold: 59 / 5, rgb auto threshold=32768 (min=2048, max=32768), \ get_best_encoding=<bound method WindowVideoSource.get_best_encoding_video...
This looks like a bug.
Can you post the server's -d encoding,bandwidth,stats
output please?
Replying to Antoine Martin:
- your OS doesn't have webp support.. so we end up using png, which uses tons of CPU and bandwidth
- upgrading to Ubuntu 18.04 or later would help with that, that explains why lowering the quality helps so much: you end up using jpeg instead, which uses a fraction of the bandwidth
Can I "skip" updating Ubuntu for now, and somehow add the webp
support?
If it has to do with "adding/missing a package": xpra routinely complains about missing packages, e.g. paramiko, numpy; however, I have numpy installed:
$ pip freeze | grep numpy
numpy==1.13.3
$ pip3 freeze | grep numpy
numpy==1.13.3
Can I "skip" updating Ubuntu for now, and somehow add the webp support?
No. In any case, fixing the "the automatic quality and speed" is more important.
xpra routinely complains about missing packages, e.g. paramiko, numpy; however, I have numpy installed:
This is bogus. See ticket:1926#comment:1.
There you go.
Updating both server and client, seemed that handling was "barely" better.
Still unworkable though.
I raised quality to 15, seemed still okay
.. you can try r21002 or later (server side update).
Updating both server and client, seemed that handling was "barely" better.
Those logs are from an older version of the server (2.5-r20979), so there are no fixes for avoiding the "framerate lowered" with shadow servers in the version you have tested with.
Thanks for posting these logs anyway, I have found many things we should be handling better already:
Updated packages posted for most distros.
Please try again and attach the -d stats,compress
server debug output (and not a zip file with extra stuff).
You should be getting much more reasonable automatic "speed" and "quality" settings out of the box.
Though you may still need to lower min-quality
down to 0 since you found that you need a really low quality setting.
More improvements in r21021 but those aren't suitable for backporting.
Replying to Antoine Martin:
.. you can try r21002 or later (server side update).
Updating both server and client, seemed that handling was "barely" better.
Those logs are from an older version of the server (2.5-r20979), so there are no fixes for avoiding the "framerate lowered" with shadow servers in the version you have tested with.
Thank you for all the fixes. I will try to update both client and server, once beta packages are out, then re-run with -d stats,compress.
I assume r21002-fixes are irrelevant to report now? Sadly, when I re-ran there wasn't a Xenial package built.
I am happy to see that a simple "C:\Program Files\Xpra\xpra_cmd" attach ssh://sntentos@172.16.57.121/0 --min-quality=15 --desktop-scaling=0.75 --opengl=no
(on a xpra shadow :0
) worked out of the box. I could almost see a video, with bandwidth "constraints" well under control.
However, was I using the new version for sure? I did see the beta/xenial having the r21016 version posted, but xpra --version
(and Session info, etc) said r20XXX. Maybe, xpra --version
doesn't come directly from the repo, and it's something you have to do manually? If so, would you consider automating it?
For the OpenGL thing: It seems I have already been self-medicating:
Remember my screen setup: 1680x1050+0+30, 1920x1080+3600+0, 1920x1080+1680+0:
(from left to right) Screen 0 "jumps up and down", Screen 1 "does weird things", and Screen 2 sometimes you click and there is "no input"
However, was I using the new version for sure?
dpkg -l xpra
but xpra --version (and Session info, etc) said r20XXX
What exact number?
xpra --version doesn't come directly from the repo, and it's something you have to do manually? If so, would you consider automating it?
It is already automated. But the packaging files may not have the same version number as the actual xpra source code.
For the OpenGL thing: It seems I have already been self-medicating: ...
I have no idea what I am looking at or what this means. But it doesn't look related to this ticket.
... worked out of the box. I could almost see a video, with bandwidth "constraints" well under control.
You probably do not need to tweak min-quality
, or anything else for that matter: leaving opengl on would give you much better client performance too.
The fixes ensure that we use video encodings more aggressively for shadow windows, which drastically reduces the amount of bandwidth used (by avoiding png), that's especially noticeable during start up. The actual automatic quality values used are hovering around ~50%, occasionally creeping up over 90% for a short while before dropping back down again as this seems to consume too much bandwidth for your connection. If you want improved picture quality, you should be able to raise the min-quality and let xpra adapt automatically, it should then reduce the framerate.
The only thing left in that log that doesn't look quite right is how when we end up selecting the "vp8" encoder, we never seem to get more than a single frame out of it, which is going to waste CPU and bandwidth. Please post the:
XPRA_DEBUG_VIDEO_CLEAN=1 xpra shadow -d compress,stats
debug log output.
Replying to Antoine Martin:
However, was I using the new version for sure?
dpkg -l xpra
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-======================================-========================-========================-=================================================================================
ii xpra 2.5-20181117r21016-1 amd64 tool to detach/reattach running X programs
but xpra --version (and Session info, etc) said r20XXX
What exact number?
$ xpra --version
xpra v2.5-r20980
For the OpenGL thing: It seems I have already been self-medicating: ...
I have no idea what I am looking at or what this means. But it doesn't look related to this ticket.
No. But it means the comment below: Replying to Antoine Martin:
... worked out of the box. I could almost see a video, with bandwidth "constraints" well under control.
You probably do not need to tweak
min-quality
, or anything else for that matter: leaving opengl on would give you much better client performance too.
... cannot be used. I can leave --min-quality
out next time
The fixes ensure that we use video encodings more aggressively for shadow windows, which drastically reduces the amount of bandwidth used (by avoiding png), that's especially noticeable during start up. The actual automatic quality values used are hovering around ~50%, occasionally creeping up over 90% for a short while before dropping back down again as this seems to consume too much bandwidth for your connection. If you want improved picture quality, you should be able to raise the min-quality and let xpra adapt automatically, it should then reduce the framerate.
Let's see :-D
The only thing left in that log that doesn't look quite right is how when we end up selecting the "vp8" encoder, we never seem to get more than a single frame out of it, which is going to waste CPU and bandwidth. Please post the:
XPRA_DEBUG_VIDEO_CLEAN=1 xpra shadow -d compress,stats
debug log output.
Removing --min-quality
appeared to make the output more "pixelated", so that text was not easily readable and/or had artifacts around it (sort-of like a pixelate filter)
Otherwise it was good enough for interaction.
No. But it means the comment below: (..) ... cannot be used. I can leave --min-quality out next time
I don't understand any of this.
Is it or is not related to this ticket?
min-quality
has nothing to do with opengl.
The log attached does not contain any "vp8" compression, so I can't debug that last issue. The contexts are being re-cycled a little bit too much, I'll see if I can make them a little bit more sticky.
The xenial repo has xpra 2.5-r21028.
Replying to Antoine Martin:
No. But it means the comment below: (..) ... cannot be used. I can leave --min-quality out next time
I don't understand any of this. Is it or is not related to this ticket?
min-quality
has nothing to do with opengl.
No, the fact that "what I look" and "what I mouse-click" is not 100% accurate, means that I cannot use that mode. It is not related to "dying out of bandwidth", but it is related to the shadow mode since I've been using it. I am sorry, but I cannot explain it any further on a comment section, if the video I sent does not "show it" to you.
The log attached does not contain any "vp8" compression, so I can't debug that last issue. The contexts are being re-cycled a little bit too much, I'll see if I can make them a little bit more sticky.
The xenial repo has xpra 2.5-r21028.
Should I force vp8 encoding (--encoding=vp8
) with the new version and try again?
Forcing vp8 seems to lag more than the default option. The output is much more visible, but there is a delay of at least 1 second (at session beginning it was MUCH more). Maybe drop the quality aggressively at the beginning of > 1 sec delay, and then ramp up?
No, the fact that "what I look" and "what I mouse-click" is not 100% accurate, means that I cannot use that mode.
Right, so this is a totally different bug, do we have a ticket number for it?
(might be related to desktop-scaling
if you use that feature)
Could be related to one or more of: #1967, #1805 / #1801, #1656 (r17150), #1567, #1469, #41 / #1247 (r14368), #1339
Should I force vp8 encoding (--encoding=vp8) with the new version and try again?
No.
The problem probably comes from the automatic encoding selection.
We need all the encodings enabled to see what that code does. Maybe even add -d score
:
XPRA_DEBUG_VIDEO_CLEAN=1 xpra shadow -d compress,stats,score
Forcing vp8 seems to lag more than the default option.
What do you mean by "lag" here? I assume you mean the 1 second delay?
The output is much more visible,
What does it mean "visible"?
but there is a delay of at least 1 second (at session beginning it was MUCH more).
In your log, I see that the first few frames for each window are compressed in ~70 to 80ms each. But they use up a lot more bandwidth than with h264: around 600KB for each frame, which means around 1.5 second's worth of bandwidth on a 10Mbps connection. So you end up with a similar problem to what you had originally with "png": the server has to wait before sending any more frames because the line is saturated. Then after that things settle down: quality and speed are lowered, differential compression kicks in and compresses better.
Maybe drop the quality aggressively at the beginning of > 1 sec delay, and then ramp up?
We already do that. Other users rightly complained that we were starting too low. It is impossible to satisfy everyone.
Also note that newer distributions ship a newer version of libvpx, which supports vp8 and vp9 and with much improved performance.
Replying to Antoine Martin:
No, the fact that "what I look" and "what I mouse-click" is not 100% accurate, means that I cannot use that mode.
Right, so this is a totally different bug, do we have a ticket number for it? (might be related to
desktop-scaling
if you use that feature) Could be related to one or more of: #1967, #1805 / #1801, #1656 (r17150), #1567, #1469, #41 / #1247 (r14368), #1339
Could be #1801 (since I never worked with the other mode), #1567, or an un-reported one, something I might have broken on my installation, Nvidia graphics card or something in-between. My attempts to debug it cannot go further than "output is distorted, output is 'shaking', mouse (x,y) seem wrong".
Should I force vp8 encoding (--encoding=vp8) with the new version and try again?
No. The problem probably comes from the automatic encoding selection. We need all the encodings enabled to see what that code does. Maybe even add -d score:
XPRA_DEBUG_VIDEO_CLEAN=1 xpra shadow -d compress,stats,score
Forcing vp8 seems to lag more than the default option.
What do you mean by "lag" here? I assume you mean the 1 second delay?
Yes
The output is much more visible,
What does it mean "visible"?
Almost no artifacts, colors are accurate.
but there is a delay of at least 1 second (at session beginning it was MUCH more).
In your log, I see that the first few frames for each window are compressed in ~70 to 80ms each. But they use up a lot more bandwidth than with h264: around 600KB for each frame, which means around 1.5 second's worth of bandwidth on a 10Mbps connection. So you end up with a similar problem to what you had originally with "png": the server has to wait before sending any more frames because the line is saturated. Then after that things settle down: quality and speed are lowered, differential compression kicks in and compresses better.
Maybe drop the quality aggressively at the beginning of > 1 sec delay, and then ramp up?
We already do that. Other users rightly complained that we were starting too low. It is impossible to satisfy everyone.
Also note that newer distributions ship a newer version of libvpx, which supports vp8 and vp9 and with much improved performance.
I will try
XPRA_DEBUG_VIDEO_CLEAN=1 xpra shadow -d compress,stats,score
at a later time
Requested diagnostics attached
From that log, it seems that at least the client is no longer dying on its own since you ended up disconnecting it yourself: Disconnecting client .. client request
at the end of the log.
How good / bad was the experience?
How well we get things to run is always going to be a challenge: you have 3 screens with two 1080p and one 1680x1050. That's 5911200 pixels in total. Each pixel is 24-bit, so that's 17MB of data, or roughly 141 Mbits per refresh to send over the network. Your line is 10Mbps, so we need to compress by a factor of ~15 (ideally more to avoid congestion issues) just to be able to do a single refresh every second. Fortunately, h264 is very efficient and can compress 100 times or more, allowing you to see more than one frame per second. We can probably do better still, just don't expect miracles - we can't work around the laws of physics.
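The arithmetic above can be reproduced with a few lines. This is only a back-of-the-envelope sketch: it assumes uncompressed 24-bit pixels, a full refresh of every screen on every frame, and that the whole 10Mbps line is available to xpra. The helper names are made up for illustration.

```python
# Screen sizes and line speed as reported in this ticket.
SCREENS = [(1680, 1050), (1920, 1080), (1920, 1080)]
BYTES_PER_PIXEL = 3          # 24-bit colour
LINK_BPS = 10_000_000        # 10 Mbps line

pixels = sum(w * h for w, h in SCREENS)
raw_bits = pixels * BYTES_PER_PIXEL * 8   # bits per full uncompressed refresh

def min_compression_ratio(fps):
    """Compression factor needed to fit `fps` full refreshes per second on the line."""
    return raw_bits * fps / LINK_BPS

def max_fps_at(ratio):
    """Full refreshes per second the line can carry at a given compression factor."""
    return LINK_BPS * ratio / raw_bits

print(pixels, "pixels per refresh")
print(round(raw_bits / 1e6, 1), "Mbits per uncompressed refresh")
print(round(min_compression_ratio(1), 1), "x compression needed for 1 fps")
print(round(max_fps_at(100), 1), "fps at 100x compression")
```

In practice most refreshes only touch part of the screen and video encoders send differences between frames, so real numbers will usually be better than this worst case.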
There are also some recent updates to the speed / quality / batch heuristics: #2061, in particular r21127 avoids subsampling with shadow servers which may help.
Replying to Antoine Martin:
From that log, it seems that at least the client is no longer dying on its own since you ended up disconnecting it yourself:
Disconnecting client .. client request
at the end of the log. How good / bad was the experience?
Client hasn't been dying by itself in some time now (noted somewhere above). Experience is "workable". I'd still wish text would render artifact-free within the first couple of seconds (instead of taking longer). However, I can read the math below and figure out that's kind of hard to do.
How well we get things to run is always going to be a challenge: you have 3 screens with two 1080p and one 1680x1050. That's 5911200 pixels in total. Each pixel is 24-bit, so that's 17MB of data, or roughly 141 Mbits per refresh to send over the network. Your line is 10Mbps, so we need to compress by a factor of ~15 (ideally more to avoid congestion issues) just to be able to do a single refresh every second. Fortunately, h264 is very efficient and can compress 100 times or more, allowing you to see more than one frame per second. We can probably do better still, just don't expect miracles - we can't workaround the laws of physics.
There are also some recent updates to the speed / quality / batch heuristics: #2061, in particular r21127 avoids subsampling with shadow servers which may help.
I did leave a comment regarding the "described" heuristics on the mentioned ticket; feel free to see if they apply at all or not.
As for further optimizations: (Maybe)
Definitions:
Assumptions:
Preconditions:
Situation: If you have quality issues (traced to bandwidth and/or network instability)
Then:
It is a very crude example I thought, but I am missing strict algorithmic head / proper knowledge. Feel free to refine it, if applicable
If my logic is sound (and applicable), I assume you can save a lot of wasted bandwidth for output user is not going to notice in the end anyway.
Client hasn't been dying by itself in some time now (noted somewhere above).
OK, thanks. Closing.
"Viewable" window: A window that is at least X% inside the "visible" area of the viewport
With compositing window managers and such, this can't really be relied upon.
You can recognize when window is active / "viewable" / minimized)
We already do:
You can recognize you have quality issues that can be traced to bandwidth and/or network instability
We already do.
Decrease FPS on non-active windows to 10~15fps (windows may be spread on multiple displays, so they still need to "feel" responsive)
We already do, the cap is at 25fps, lower if needed. There is no cap for viewable windows - you can achieve ~100fps or more.
Decrease FPS on non-"viewable"/minimized windows to 1fps (window will still have "enough" image data, when raised/become "viewable")
We already do exactly that.
Well then ... sucks to have my connection :-p
I'll try to raise min-quality (as noted elsewhere) and see if I can survive with it.
Extra idea: In both Windows 10 and in e.g. Ubuntu, there is the concept of workspaces. Can you detect the "current" workspace, and where is a window placed?
For Windows 10, there is an AutoHotKey script hooking enough in the Windows Task View / Workspace logic: https://github.com/sdias/win-10-virtual-desktop-enhancer
Extra idea: In both Windows 10 and in e.g. Ubuntu, there is the concept of workspaces. Can you detect the "current" workspace, and where is a window placed?
We already do that for X11 to slow down the rate of updates when the window is not on the visible workspace.
For Windows 10, there is an AutoHotKey script hooking enough in the Windows Task View / Workspace logic: https://github.com/sdias/win-10-virtual-desktop-enhancer
Thanks for the pointer, added to #2081
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/2029