Because we try hard not to drop any sound packets (unless we do a sound pipeline restart), we have a small-ish sound buffer at the client side, we will need to delay the video frames to match that.
Do we want to delay all screen updates or just video? Since "normal" updates could just be lossless refreshes for the video, we may have to do it for all screen updates - which is a bit of a shame. Unless we pass the video region info to the client and let it manage what to delay?
What is going to be needed:
Issues:
Here are some good test videos to use for testing:
As of r8983, we now send and "info" packet by to the parent process after each buffer we push, so we should have up to date and accurate information about the queue levels.
Another useful metric is to hit the pause button in a media player (or even youtube) and to measure for how much longer the music keeps playing on the client end. Even with a simple 440HZ Tone.
It can be useful to run the software producing the sound outside xpra using a regular desktop session's pulseaudio setup (and the --no-pulseaudio
flag when starting the xpra server) so that the request to stop the sound (ie: pause the music using a click or keypress) is processed almost instantly without going through xpra forwarding (which adds its own event processing latency).
Running the latest code with a Linux client on a GB LAN, I see:
150ms is definitely within the range that we can deal with by adding a simple paint delay. It is more difficult for me to test win32 and osx, as I run those within virtual machines which add their own latency.. (forwarding the sound back from the virtualized sound card back to the host)
Tested with an r9016 Win8.1 Client against an r9016 Fedora 20 server:
Stole the sesion with an OSX 10.10.1 r9016 client:
See also #849 (more general sound improvements)
work in progress patch
With the patch above, I can get almost perfect sync on a LAN with Linux clients. There are two parts (only one is used so far - the server side):
Other changes needed / issues remaining:
timeout_add
: we can't just use an arbitrary changing values here or we would quickly get frames out of order! (and maybe the call_in_encode_thread
function should take care of the delay, as it is already used with a delay by the window icon code)
damage_time
, those are used for auto speed and quality tuning and need to be updated to subtract the delay
updated patch, adds command line option, client delay value update batching
latest work in progress: adds ability to change the delay on the fly and keep the frame order
New issues and ideas:
encode_from_queue
?)
cancel_damage
process_damage_region
should probably always use the clipped size for video?
av-sync
should be an options flag? (and use it to expose the actual value to clients?)
rgb24
in video version of process_damage_region
r9372 implements most of what is required and seems to work quite well on my LAN, see commit message.
For debugging purposes, there is now a -d av-sync
logging flag.
To make it possible to fine tune the syncing, the environment variable XPRA_AV_SYNC_DELTA
can be used to adjust the delay, the effect on both client and server should be the same (be it is also cumulative!):
We now expose the delay via xpra info
(look for av-sync
).
Still TODO:
process_damage_region
The commit above only worked with XPRA_XSHM=0
to turn off shared memory, that's because the pixels keep getting updated after we request them so the image we grabbed via xshm was always fully up to date, instead of the older version at the time we requested it.
r9373 freezes the pixel buffer before queuing up the work. (and uses the new re-stride code from #839 to try to save space, but it will still consume a lot more cpu and memory than non-av-synced pixels...).
Finally, we need to find a way to auto-tune the XPRA_AV_SYNC_DELTA
.
That's not going to be easy as there are a number of things that will have an impact, at both ends!:
etc.
From my brief testing, I found that those values work well:
Maybe we need to tune some of those encoders better (all the 0.10 plugins can be found here: http://www.freedesktop.org/software/gstreamer-sdk/data/docs/latest/):
GstWavpackEncMode
, with the following definitions found in the source file (extra-processing
already defaults to 0, so that's fine as it is):
{GST_WAVPACK_ENC_MODE_VERY_FAST, "Very Fast Compression", "veryfast"}, {GST_WAVPACK_ENC_MODE_FAST, "Fast Compression", "fast"}, {GST_WAVPACK_ENC_MODE_DEFAULT, "Normal Compression", "normal"}, {GST_WAVPACK_ENC_MODE_HIGH, "High Compression", "high"}, {GST_WAVPACK_ENC_MODE_VERY_HIGH, "Very High Compression", "veryhigh"},
encoding-engine-quality
: Quality/speed of the encoding engine, this does not affect the bitrate'' - and in the source code we find:
LAMEMP3ENC_ENCODING_ENGINE_QUALITY_FAST = 0, LAMEMP3ENC_ENCODING_ENGINE_QUALITY_STANDARD, LAMEMP3ENC_ENCODING_ENGINE_QUALITY_HIGH
quality
: Speed versus compression tradeoff. - in the source:
{0, "0 - Fastest compression", "0"}, {1, "1", "1"}, {2, "2", "2"}, {3, "3", "3"}, {4, "4", "4"}, {5, "5 - Default", "5"}, {6, "6", "6"}, {7, "7", "7"}, {8, "8 - Highest compression", "8"}, {9, "9 - Insane", "9"}, {0, NULL, NULL}
Also, from what I am reading (Need help with using OPUS over RTP), audioconvert
is optional for some encoders/decoders (but it is required for opus for example).
Some other videos useful for visualizing the synchronization:
ugly patch for tuning the encoders to use low latency settings
The patch above reduces the latency of all the codecs it tunes:
gstreamer does calculate the latency of its pipeline: Clocks and synchronization in GStreamer, but I see no way of getting the data from the obscure message we get on the bus:
sound-source Latency message from /GstPipeline:pipeline0/GstLameMP3Enc:lamemp3enc0 (__main__.GstLameMP3Enc): <gst.Message (none) from lamemp3enc0 at 0x7ff7fc0023c0>
This function is not exposed as far as I can see: gst-audio-encoder-get-latency.
As of r9381, we auto-tune the av-sync latency based on the codec in use given this table (in milliseconds):
MP3 : 250, FLAC : 150, WAV : 0, WAVPACK : 600,
You also need r9382 to use the low-latency codec options which make flac usable. Now we need to test this code on more platforms and see what needs adjusting.
Note: we now choose flac ahead of mp3, because it doesn't really use that much more bandwidth (at least for use on a LAN), and it does lower the encoder latency by 100ms which is not insignificant, and it seems to be less prone to jitter (which raises the client side queue delay) and mp3 seems to cause some frame video frame delays (to be investigated) - here are some statistics you can get using xpra info | grep client.av-sync
:
client.av-sync.client.delay=0 client.av-sync.delta=0 client.av-sync.total=150
client.av-sync.client.delay=176 client.av-sync.delta=0 client.av-sync.total=426
That said, I've just tested again and got different results...
So for testing mp3, use the client with --speaker-codec=mp3
.
But since flac is disabled on win32... it will default to mp3 on that platform.
The plan is to allow us to use python3 + gstreamer 1.0 in 0.16 (see #849), so maybe we could re-enable flac on win32 then - the integration code is not hard (pretty much there already), but the packaging part will be a challenge!
Fixes:
As of r9517, the av-sync delay is per-window and changes more gradually to prevent the stuttering effect. New issues:
Xorg: ../../../include/privates.h:123: dixGetPrivateAddr: Assertion `key->initialized' failed. Xpra: Fatal IO error 0 (Success) on X server :2. xterm: fatal IO error 11 (Resource temporarily unavailable) or KillClient on X server ":2" Xpra: Fatal IO error 11 (Resource temporarily unavailable) on X server :2.
Founds lots of hits, but none of them seem particularly relevant.
See attachment/ticket/800/delayed-frames.patch: the video encoders will keep the delayed frames internally, so we should pass at least part of the av-sync delay to them...
See 849#comment:16
Lots of fixes and updates in r10878, in particular:
XPRA_MAX_SYNC_BUFFER_SIZE
is the maximum amount of memory we can use for buffering video frames (defaults to 256 - in MB)
XPRA_AV_SYNC_RATE_CHANGE
by how much we adjust the av-sync delay (defaults to 20 - in ms)
XPRA_AV_SYNC_TIME_CHANGE
how often we adjust the av-sync delay (defaults to 500 - in ms)
Minor debugging tweaks in r11365, and ability to fine tune the av-sync delay added in r11366.
This can be used to adjust the buffering of the video at runtime without restarting the server (same as what XPRA_AV_SYNC_DELTA
does on startup).
So for example, to delay the video by an extra 50ms:
xpra control :10 sound-output av-sync-delta "50"
This is also available via the dbus interface (see #904), which is a little bit more user-friendly.
This is not cumulative, it resets the delta value every time.
Because of the command line parser used, it is a bit difficult to specify negative values as they would get interpreted as options rather than arguments (no such problem with the dbus version). To workaround this, you can specify negative values by quoting them and adding a space, ie: xpra control :10 sound-output av-sync-delta " -100"
Tested with artificial sound jitter to cause the sound buffer levels to go up:
XPRA_SOUND_SOURCE_JITTER=300 xpra start ...
The "sending update queue.used=" message on the client when running with -d av-sync
fires a lot and then on the server we do adjust the target delay accordingly (every XPRA_AV_SYNC_TIME_CHANGE
ms):
update_av_sync_delay() current=63, target=116, adding 20 (capped to +-20 from 53)
This occasionally causes items to be delayed a bit too much as we decrease the delay:
encode_from_queue: processing item 1/5 (overdue by 157ms)
The vast majority of the encoding happens to be on time, or overdue by a negligible amount (under 50ms). More importantly, the sync test videos from comment:1 seem to play mostly in sync.
This is only ever going to work as well as permitted by the somewhat incomplete testing work done in #849, see also #999 for how to reproduce different bandwidth conditions more reliably.
Failing to detect the video region (see #410) will cause things to get out of sync, and that this is not a problem with sound sync per-se.
Having sound sync enabled will now also cause further problems with video regions being wrongly detected (see #967), as this will delay the screen updates for the wrong area of the window (or even, the whole window). Again, this is a problem with video region detection, not sound sync.
Set up a Fedora 23 trunk r11384 Server and connected a Fedora 23 trunk r11384 client(both machines hardware):
av-sync-delta
to around 250, and I'm getting it almost perfectly in sync.
I've also found that it seems to reset on reconnect. Either that or my connection conditions change slightly. Either way, I have to fiddle with it each connection, but in my setup around 250 seems to be good. Also, failing to detect the video region definitely comes into play; I've noticed that when (with opengl paint boxes enabled) it is painting the test video region with chunks of image encoders (it seems to like WebP..I think, I forget the colors), it rarely gets into sync.
I'll need to test it far more thoroughly (maybe we can setup some different network conditions), but so far it appears to have a noticeable effect.
Please always specify at the very least:
xpra info | grep av-sync
I've also found that it seems to reset on reconnect.
The delta obviously is per client connection, so multiple clients can get different av sync delays. They should all be in sync - just not with each other.
but in my setup around 250 seems to be good
Are you saying that you need to add 250ms to get things to sync?
(the point of this ticket is that the system should be able to auto-detect the delay to use).
which codec was used (vorbis?)
Woops, forgot to include that. Yes, it was using vorbis.
Are you saying that you need to add 250ms to get things to sync?
In our current networking setup that appears to work nicely with vorbis. It seems to be the most in-sync that way. I imagine different network setups would need wildly (slight exaggeration) different delay values. My recommendation is to add a man-page and wiki entry showing how to change these values so that users can tweak it as they see fit(assuming they aren't already there). 250ms with vorbis works here right now, but with networking it's always a moving target.
As of right now (really close to in sync), xpra info | grep av-sync
shows:
client.av-sync.client.delay=92 client.av-sync.delta=250 client.av-sync.total=342 features.av-sync=True window[1].av-sync.current=342 window[1].av-sync.target=342 window[25].av-sync.current=342 window[25].av-sync.target=342
I'll be looking at different codecs and delays today so I'll make a nice chart showing what works here in an "ideal" networking setup(hard network connection + non-virtual machines).
Played around with it for a good portion of today. I've found that as the session goes along it kinda varies a bit.
Speaker Codec | Best latency settings |
vorbis | 250-275 |
opus | 200-250 |
wav | 200-250 |
vorbis info in comment:21
opus xpra info | grep av-sync
output:
client.av-sync.client.delay=160 client.av-sync.delta=200 client.av-sync.total=360 features.av-sync=True window[1].av-sync.current=360 window[1].av-sync.target=360 window[25].av-sync.current=360 window[25].av-sync.target=360
Sound queue for opus:
wav xpra info | grep av-sync
output:
client.av-sync.client.delay=80 client.av-sync.delta=200 client.av-sync.total=280 features.av-sync=True window[1].av-sync.current=280 window[1].av-sync.target=280 window[25].av-sync.current=280 window[25].av-sync.target=280
Sound queue for wav:
These three were all I got to today. It is quite time consuming to get down just perfect. Plus, we had an all hands meeting...and I had a CS final this morning so I came in to the office late. I'll get to the rest tomorrow.
My recommendation is to add a man-page and wiki entry showing how to change these values so that users can tweak it as they see fit(assuming they aren't already there).
No, tweaking should be a last resort thing. The point of this ticket is to make it sync automatically, by default, without any user intervention.
250ms with vorbis works here right now, but with networking it's always a moving target.
And the heuristics are meant to adjust for that.
I am at a loss to explain why you would need a 250ms adjustment. This is a huge value, and I don't understand where this delay is coming from. This is definitely not what I am seeing here on a LAN where the defaults work out of the box.
Please include a screenshot of the sound latency graph when this happens, and maybe also a full xpra info
.
Sound queue for wav: ... .and sound quality was less than stellar.
We don't care much about wav, the prefered codec order is:
CODEC_ORDER = [VORBIS, OPUS, FLAC, MP3, WAV, WAVPACK, SPEEX] #AAC is untested
No, tweaking should be a last resort thing. The point of this ticket is to make it sync automatically, by default, without any user intervention.
Okay, I misinterpreted the point of the ticket. I'll keep this in mind going forward.
I am at a loss to explain why you would need a 250ms adjustment.
And I am as well. All day yesterday, and most of this morning it's working fine without any delta adjustment.
One important thing we've found(and Lonny pointed out to me) is that the sync is noticeably better with a constant background noise playing. On average I see it 20-50ms better when I leave a background noise source on while doing the sync tests. The background noise can be anything from music in a flash game to movies playing in VLC.
(huge bug found and fixed, see ticket:1054#comment:3 - may have caused the video to not get delayed in some corner cases: only with a video region of the same width as default xshm images, IIRC: 2048 pixel wide)
r11585 improves the handling for the freeze() failure reported in ticket:995#comment:29. (will backport) If this happens again, hopefully we can figure out why and fix it properly.
Follow up in #1103
Some significant finds for sync'ing posted in 1103:comment2 (with vorbis).
Can we please close this one?
Ooops, yes.
The sync is actually very good except in some fringe cases (4K monitors with low refresh rates connected to weak machines, when opengl is greylisted, etc.) or for those who don't think audio-late of 1-3 frames (with a 26 fps video sync test) is good enough.
Will follow up some more with #1103 (or whatever new tickets are opened in its wake). Closing this as doing quite the job for now.
Follow up in #1164.
Looks like there is a bug in this code: #1237.
r12950 makes it possible to force a fixed frame delay server side using the env var XPRA_FORCE_AV_DELAY
, this applies to all screen updates indiscriminately instead of just video regions. Useful for testing.
Some fixes:
XPRA_FORCE_AV_DELAY
for testing
Improvements in audio pipeline capture latency related to AV timestamps changes in r15986.
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/835