Xpra: Ticket #835: synchronize sound with video frames

Because we try hard not to drop any sound packets (unless we do a sound pipeline restart), we have a small-ish sound buffer at the client side. We will need to delay the video frames to match that.

Do we want to delay all screen updates or just video? Since "normal" updates could just be lossless refreshes for the video, we may have to do it for all screen updates - which is a bit of a shame. Unless we pass the video region info to the client and let it manage what to delay?

What is going to be needed:

Issues:



Sun, 12 Apr 2015 12:46:27 GMT - Antoine Martin: status changed

Here are some good test videos to use for testing:

As of r8983, we now send an "info" packet to the parent process after each buffer we push, so we should have up-to-date and accurate information about the queue levels.


Mon, 13 Apr 2015 13:21:15 GMT - Antoine Martin:

Another useful metric is to hit the pause button in a media player (or even youtube) and to measure for how much longer the music keeps playing on the client end. This works even with a simple 440Hz tone.

It can be useful to run the software producing the sound outside xpra using a regular desktop session's pulseaudio setup (and the --no-pulseaudio flag when starting the xpra server) so that the request to stop the sound (ie: pause the music using a click or keypress) is processed almost instantly without going through xpra forwarding (which adds its own event processing latency).

Running the latest code with a Linux client on a GB LAN, I see:

150ms is definitely within the range that we can deal with by adding a simple paint delay. It is more difficult for me to test win32 and osx, as I run those within virtual machines which add their own latency (forwarding the sound from the virtualized sound card back to the host).


Wed, 15 Apr 2015 22:38:30 GMT - J. Max Mena:

Tested with an r9016 Win8.1 Client against an r9016 Fedora 20 server:


Stole the session with an OSX 10.10.1 r9016 client:


Thu, 30 Apr 2015 14:41:22 GMT - Antoine Martin:

See also #849 (more general sound improvements)


Tue, 05 May 2015 09:02:52 GMT - Antoine Martin: attachment set

work in progress patch


Tue, 05 May 2015 10:04:11 GMT - Antoine Martin:

With the patch above, I can get almost perfect sync on a LAN with Linux clients. There are two parts (only one is used so far - the server side):

Other changes needed / issues remaining:


Wed, 13 May 2015 10:59:52 GMT - Antoine Martin: attachment set

updated patch, adds command line option, client delay value update batching


Thu, 14 May 2015 13:18:27 GMT - Antoine Martin: attachment set

latest work in progress: adds ability to change the delay on the fly and keep the frame order


Thu, 14 May 2015 13:25:31 GMT - Antoine Martin:

New issues and ideas:


Fri, 15 May 2015 07:22:34 GMT - Antoine Martin:

r9372 implements most of what is required and seems to work quite well on my LAN, see commit message. For debugging purposes, there is now a -d av-sync logging flag.

To make it possible to fine-tune the syncing, the environment variable XPRA_AV_SYNC_DELTA can be used to adjust the delay. The effect on both client and server should be the same (but it is also cumulative!):

We now expose the delay via xpra info (look for av-sync).

Still TODO:


Fri, 15 May 2015 10:45:15 GMT - Antoine Martin:

The commit above only worked with XPRA_XSHM=0 (which turns off shared memory). That's because the pixels keep getting updated after we request them, so the image we grabbed via xshm was always fully up to date, instead of being the older version from the time we requested it. r9373 freezes the pixel buffer before queuing up the work (and uses the new re-stride code from #839 to try to save space, but it will still consume a lot more cpu and memory than non-av-synced pixels...).
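
The difference between queuing a live buffer and a frozen copy can be illustrated with a small sketch. This is not the xpra code: LiveImage, its freeze() method and the queue below are hypothetical stand-ins, and the only assumption carried over from the ticket is that the shared pixels keep changing after the grab.

```python
from collections import deque


class LiveImage:
    """Hypothetical stand-in for an image backed by shared memory."""

    def __init__(self, shared: bytearray):
        self.pixels = shared      # still refers to the live buffer
        self.frozen = False

    def freeze(self) -> bool:
        """Copy the pixels so later screen updates no longer affect this image."""
        if not self.frozen:
            self.pixels = bytes(self.pixels)   # private snapshot
            self.frozen = True
        return self.frozen


shared = bytearray(b"old frame")
delayed = deque()

unfrozen = LiveImage(shared)
frozen = LiveImage(shared)
frozen.freeze()                   # snapshot taken at request time
delayed.append(unfrozen)
delayed.append(frozen)

shared[:3] = b"NEW"               # the screen changes while the items wait

print(bytes(delayed[0].pixels))   # reflects the later update
print(bytes(delayed[1].pixels))   # still the content at request time
```

The copy is what costs the extra cpu and memory: every delayed item now owns its pixels instead of pointing at the shared segment.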

Finally, we need to find a way to auto-tune the XPRA_AV_SYNC_DELTA. That's not going to be easy as there are a number of things that will have an impact, at both ends!:

etc.

From my brief testing, I found that those values work well:

Maybe we need to tune some of those encoders better (all the 0.10 plugins can be found here: http://www.freedesktop.org/software/gstreamer-sdk/data/docs/latest/):

Also, from what I am reading (Need help with using OPUS over RTP), audioconvert is optional for some encoders/decoders (but it is required for opus for example).

Some other videos useful for visualizing the synchronization:


Fri, 15 May 2015 11:32:46 GMT - Antoine Martin: attachment set

ugly patch for tuning the encoders to use low latency settings


Fri, 15 May 2015 11:40:39 GMT - Antoine Martin:

The patch above reduces the latency of all the codecs it tunes:


Fri, 15 May 2015 14:26:29 GMT - Antoine Martin:

gstreamer does calculate the latency of its pipeline: Clocks and synchronization in GStreamer, but I see no way of getting the data from the obscure message we get on the bus:

sound-source Latency message from /GstPipeline:pipeline0/GstLameMP3Enc:lamemp3enc0 (__main__.GstLameMP3Enc): <gst.Message (none) from lamemp3enc0 at 0x7ff7fc0023c0>

This function is not exposed as far as I can see: gst-audio-encoder-get-latency.


Fri, 15 May 2015 16:43:14 GMT - Antoine Martin:

As of r9381, we auto-tune the av-sync latency based on the codec in use given this table (in milliseconds):

        MP3         : 250,
        FLAC        : 150,
        WAV         : 0,
        WAVPACK     : 600,
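
A minimal sketch of how such a table could be used to pick a starting delay. The dictionary name, the lowercase string keys, the function name and the fallback of 0 for codecs that are not listed are all assumptions made for illustration; only the four values come from the table above.

```python
# Encoder latency in milliseconds, values copied from the table above.
# The lowercase string keys are an assumption for this sketch.
ENCODER_LATENCY = {
    "mp3": 250,
    "flac": 150,
    "wav": 0,
    "wavpack": 600,
}


def base_av_sync_delay(codec: str, default: int = 0) -> int:
    """Return the starting av-sync delay (ms) for a codec, or `default` if unknown."""
    return ENCODER_LATENCY.get(codec.lower(), default)


for name in ("mp3", "flac", "wav", "wavpack", "vorbis"):
    print(name, base_av_sync_delay(name))
```

Codecs missing from the table (vorbis in the loop above) simply fall back to the default here; what the real code does for them is not stated in this ticket.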

You also need r9382 to use the low-latency codec options which make flac usable. Now we need to test this code on more platforms and see what needs adjusting.


Note: we now choose flac ahead of mp3, because it doesn't really use that much more bandwidth (at least for use on a LAN) and it lowers the encoder latency by 100ms, which is not insignificant. It also seems to be less prone to jitter (which raises the client side queue delay), and mp3 seems to cause some video frame delays (to be investigated). Here are some statistics you can get using xpra info | grep client.av-sync:

That said, I've just tested again and got different results... So for testing mp3, use the client with --speaker-codec=mp3.

But since flac is disabled on win32... it will default to mp3 on that platform.

The plan is to allow us to use python3 + gstreamer 1.0 in 0.16 (see #849), so maybe we could re-enable flac on win32 then - the integration code is not hard (pretty much there already), but the packaging part will be a challenge!


Thu, 21 May 2015 03:41:46 GMT - Antoine Martin:

Fixes:


Mon, 25 May 2015 08:22:22 GMT - Antoine Martin:

As of r9517, the av-sync delay is per-window and changes more gradually to prevent the stuttering effect. New issues:

Found lots of hits, but none of them seem particularly relevant.


Mon, 13 Jul 2015 05:36:59 GMT - rektide: cc set


Tue, 21 Jul 2015 11:42:39 GMT - Antoine Martin:

See attachment/ticket/800/delayed-frames.patch: the video encoders will keep the delayed frames internally, so we should pass at least part of the av-sync delay to them...


Fri, 14 Aug 2015 11:45:20 GMT - Antoine Martin:

See 849#comment:16


Fri, 16 Oct 2015 15:06:58 GMT - Antoine Martin:

Lots of fixes and updates in r10878, in particular:


Mon, 07 Dec 2015 11:58:54 GMT - Antoine Martin: owner, status changed

Minor debugging tweaks in r11365, and ability to fine tune the av-sync delay added in r11366. This can be used to adjust the buffering of the video at runtime without restarting the server (same as what XPRA_AV_SYNC_DELTA does on startup). So for example, to delay the video by an extra 50ms:

xpra control :10 sound-output av-sync-delta "50"

This is also available via the dbus interface (see #904), which is a little bit more user-friendly. This is not cumulative, it resets the delta value every time. Because of the command line parser used, it is a bit difficult to specify negative values as they would get interpreted as options rather than arguments (no such problem with the dbus version). To work around this, you can specify negative values by quoting them and adding a space, ie: xpra control :10 sound-output av-sync-delta " -100"
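
When scripting this, the leading-space trick can be applied while building the argument list. A minimal sketch: the helper name is hypothetical, and the command itself is the one quoted above. Nothing is executed here, the list would be handed to something like subprocess.run().

```python
def av_sync_delta_command(display: str, delta_ms: int) -> list:
    """Build the argv for 'xpra control ... av-sync-delta' (hypothetical helper)."""
    value = str(delta_ms)
    if delta_ms < 0:
        # Leading space so the command line parser does not treat "-100" as an option.
        value = " " + value
    return ["xpra", "control", display, "sound-output", "av-sync-delta", value]


print(av_sync_delta_command(":10", 50))
print(av_sync_delta_command(":10", -100))

# Python's int() tolerates surrounding whitespace, so a receiver that
# converts the argument with int() would still read the intended value.
print(int(" -100"))
```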


Tested with artificial sound jitter to cause the sound buffer levels to go up:

XPRA_SOUND_SOURCE_JITTER=300 xpra start ...

The "sending update queue.used=" message on the client when running with -d av-sync fires a lot and then on the server we do adjust the target delay accordingly (every XPRA_AV_SYNC_TIME_CHANGE ms):

update_av_sync_delay() current=63, target=116, adding 20 (capped to +-20 from 53)
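
The capped adjustment in that log line can be sketched as follows. This is an illustration and not the actual update_av_sync_delay() code: the function name and the MAX_STEP_MS constant are assumptions, with the value 20 taken from the "+-20" in the log message.

```python
MAX_STEP_MS = 20   # assumption: matches the "+-20" cap shown in the log line


def next_av_sync_delay(current: int, target: int, max_step: int = MAX_STEP_MS) -> int:
    """Move `current` towards `target`, by at most `max_step` ms per update."""
    wanted = target - current
    step = max(-max_step, min(max_step, wanted))
    return current + step


# Replay the values from the log: current=63, target=116 (53 apart).
delay = 63
steps = [delay]
while delay != 116:
    delay = next_av_sync_delay(delay, 116)
    steps.append(delay)
print(steps)   # [63, 83, 103, 116]
```

With one update per XPRA_AV_SYNC_TIME_CHANGE interval, a 53ms change takes three intervals to apply, which is what keeps the delay from jumping and causing stutter.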

This occasionally causes items to be delayed a bit too much as we decrease the delay:

encode_from_queue: processing item 1/5 (overdue by 157ms)

The vast majority of the encoding happens to be on time, or overdue by a negligible amount (under 50ms). More importantly, the sync test videos from comment:1 seem to play mostly in sync.
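
One way to picture why lowering the delay makes queued items overdue is a FIFO where each item's due time is computed from the delay in effect when it is processed. This is a simplified model, not the encode_from_queue implementation: the class, its method names and the explicit now_ms arguments are all hypothetical.

```python
from collections import deque


class DelayedEncodeQueue:
    """Hypothetical FIFO of items that are held back for `delay_ms`."""

    def __init__(self, delay_ms: int):
        self.delay_ms = delay_ms
        self.items = deque()          # (queued_at_ms, item), oldest first

    def add(self, now_ms: int, item) -> None:
        self.items.append((now_ms, item))

    def pop_due(self, now_ms: int) -> list:
        """Return [(item, overdue_ms)] for every item whose delay has elapsed."""
        due = []
        while self.items:
            queued_at, item = self.items[0]
            overdue = now_ms - (queued_at + self.delay_ms)
            if overdue < 0:
                break                 # items behind this one were queued later
            self.items.popleft()
            due.append((item, overdue))
        return due


queue = DelayedEncodeQueue(delay_ms=200)
queue.add(0, "frame-1")
queue.add(40, "frame-2")
print(queue.pop_due(100))    # nothing is due yet
queue.delay_ms = 60          # the target delay was lowered
print(queue.pop_due(120))    # both items are now past their due time
```

Processing strictly from the front keeps the frame order intact even when the delay changes between calls.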


This is only ever going to work as well as permitted by the somewhat incomplete testing work done in #849, see also #999 for how to reproduce different bandwidth conditions more reliably.

Failing to detect the video region (see #410) will cause things to get out of sync, but this is not a problem with sound sync per se.

Having sound sync enabled will now also cause further problems with video regions being wrongly detected (see #967), as this will delay the screen updates for the wrong area of the window (or even, the whole window). Again, this is a problem with video region detection, not sound sync.


Wed, 09 Dec 2015 23:57:49 GMT - J. Max Mena:

Set up a Fedora 23 trunk r11384 server and connected a Fedora 23 trunk r11384 client (both machines are physical hardware):

I've also found that it seems to reset on reconnect. Either that or my connection conditions change slightly. Either way, I have to fiddle with it each connection, but in my setup around 250 seems to be good. Also, failing to detect the video region definitely comes into play; I've noticed that when (with opengl paint boxes enabled) it is painting the test video region with chunks of image encoders (it seems to like WebP... I think, I forget the colors), it rarely gets into sync.

I'll need to test it far more thoroughly (maybe we can set up some different network conditions), but so far it appears to have a noticeable effect.


Thu, 10 Dec 2015 05:28:06 GMT - Antoine Martin:

Please always specify at the very least:


I've also found that it seems to reset on reconnect.


The delta obviously is per client connection, so multiple clients can get different av sync delays. They should all be in sync - just not with each other.


but in my setup around 250 seems to be good


Are you saying that you need to add 250ms to get things to sync? (the point of this ticket is that the system should be able to auto-detect the delay to use).


Wed, 16 Dec 2015 19:42:48 GMT - J. Max Mena:

which codec was used (vorbis?)


Woops, forgot to include that. Yes, it was using vorbis.


Are you saying that you need to add 250ms to get things to sync?


In our current networking setup that appears to work nicely with vorbis. It seems to be the most in-sync that way. I imagine different network setups would need wildly (slight exaggeration) different delay values. My recommendation is to add a man-page and wiki entry showing how to change these values so that users can tweak it as they see fit (assuming they aren't already there). 250ms with vorbis works here right now, but with networking it's always a moving target.

As of right now (really close to in sync), xpra info | grep av-sync shows:

client.av-sync.client.delay=92
client.av-sync.delta=250
client.av-sync.total=342
features.av-sync=True
window[1].av-sync.current=342
window[1].av-sync.target=342
window[25].av-sync.current=342
window[25].av-sync.target=342
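
For comparing runs, output like the above can be parsed into a dictionary. A minimal sketch, assuming the key=value format shown; the parse_info() helper is hypothetical. In this sample the total equals the client delay plus the delta, which is an observation about these numbers and not a documented guarantee.

```python
def parse_info(text: str) -> dict:
    """Turn 'key=value' lines (as printed by xpra info) into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            info[key] = value
    return info


SAMPLE = """\
client.av-sync.client.delay=92
client.av-sync.delta=250
client.av-sync.total=342
features.av-sync=True
window[1].av-sync.current=342
window[1].av-sync.target=342
"""

info = parse_info(SAMPLE)
delay = int(info["client.av-sync.client.delay"])
delta = int(info["client.av-sync.delta"])
total = int(info["client.av-sync.total"])
print(delay, delta, total, total == delay + delta)
```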

I'll be looking at different codecs and delays today, so I'll make a nice chart showing what works here in an "ideal" networking setup (hard network connection + non-virtual machines).


Wed, 16 Dec 2015 23:58:41 GMT - J. Max Mena:

Played around with it for a good portion of today. I've found that as the session goes along it kinda varies a bit.

Speaker Codec    Best latency settings (ms)
vorbis           250-275
opus             200-250
wav              200-250

vorbis info in comment:21

opus xpra info | grep av-sync output:

client.av-sync.client.delay=160
client.av-sync.delta=200
client.av-sync.total=360
features.av-sync=True
window[1].av-sync.current=360
window[1].av-sync.target=360
window[25].av-sync.current=360
window[25].av-sync.target=360

Sound queue for opus:

wav xpra info | grep av-sync output:

client.av-sync.client.delay=80
client.av-sync.delta=200
client.av-sync.total=280
features.av-sync=True
window[1].av-sync.current=280
window[1].av-sync.target=280
window[25].av-sync.current=280
window[25].av-sync.target=280

Sound queue for wav:


These three were all I got to today. It is quite time consuming to get down just perfect. Plus, we had an all hands meeting...and I had a CS final this morning so I came in to the office late. I'll get to the rest tomorrow.


Thu, 17 Dec 2015 06:03:58 GMT - Antoine Martin:

My recommendation is to add a man-page and wiki entry showing how to change these values so that users can tweak it as they see fit (assuming they aren't already there).


No, tweaking should be a last resort thing. The point of this ticket is to make it sync automatically, by default, without any user intervention.

250ms with vorbis works here right now, but with networking it's always a moving target.


And the heuristics are meant to adjust for that.

I am at a loss to explain why you would need a 250ms adjustment. This is a huge value, and I don't understand where this delay is coming from. This is definitely not what I am seeing here on a LAN where the defaults work out of the box.

Please include a screenshot of the sound latency graph when this happens, and maybe also a full xpra info.


Sound queue for wav: ... .and sound quality was less than stellar.


We don't care much about wav, the preferred codec order is:

CODEC_ORDER = [VORBIS, OPUS, FLAC, MP3, WAV, WAVPACK, SPEEX]    #AAC is untested

Fri, 18 Dec 2015 19:22:36 GMT - J. Max Mena:

No, tweaking should be a last resort thing. The point of this ticket is to make it sync automatically, by default, without any user intervention.


Okay, I misinterpreted the point of the ticket. I'll keep this in mind going forward.


I am at a loss to explain why you would need a 250ms adjustment.


And I am as well. All day yesterday, and most of this morning it's working fine without any delta adjustment.

One important thing we've found (and Lonny pointed out to me) is that the sync is noticeably better with a constant background noise playing. On average I see it 20-50ms better when I leave a background noise source on while doing the sync tests. The background noise can be anything from music in a flash game to movies playing in VLC.


Tue, 22 Dec 2015 13:35:19 GMT - Antoine Martin:

(huge bug found and fixed, see ticket:1054#comment:3 - may have caused the video to not get delayed in some corner cases: only with a video region of the same width as default xshm images, IIRC: 2048 pixel wide)


Tue, 05 Jan 2016 08:46:40 GMT - Antoine Martin:

r11585 improves the handling for the freeze() failure reported in ticket:995#comment:29. (will backport) If this happens again, hopefully we can figure out why and fix it properly.


Thu, 28 Jan 2016 20:56:12 GMT - Antoine Martin:

Follow up in #1103


Sat, 30 Jan 2016 03:30:04 GMT - alas:

Some significant finds for sync'ing posted in 1103:comment2 (with vorbis).


Sun, 10 Apr 2016 06:10:56 GMT - Antoine Martin: priority changed

Can we please close this one?


Mon, 11 Apr 2016 18:52:24 GMT - alas: status changed; resolution set

Ooops, yes.

The sync is actually very good except in some fringe cases (4K monitors with low refresh rates connected to weak machines, when opengl is greylisted, etc.) or for those who don't think audio-late of 1-3 frames (with a 26 fps video sync test) is good enough.
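
To put the "1-3 frames" figure in milliseconds, the conversion at 26 fps is simple arithmetic, sketched here (the helper name is made up for the example):

```python
def frames_to_ms(frames: int, fps: float) -> float:
    """Duration of `frames` video frames at `fps`, in milliseconds."""
    return frames * 1000.0 / fps


for frames in (1, 2, 3):
    print(frames, round(frames_to_ms(frames, 26), 1))
# 1 38.5
# 2 76.9
# 3 115.4
```

So the remaining audio lag in those cases is roughly 38 to 115ms.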

Will follow up some more with #1103 (or whatever new tickets are opened in its wake). Closing this as doing quite the job for now.


Wed, 13 Apr 2016 14:20:07 GMT - Antoine Martin:

Follow up in #1164.


Mon, 20 Jun 2016 04:07:49 GMT - Antoine Martin:

Looks like there is a bug in this code: #1237.


Sun, 03 Jul 2016 10:00:41 GMT - Antoine Martin:

r12950 makes it possible to force a fixed frame delay server side using the env var XPRA_FORCE_AV_DELAY, this applies to all screen updates indiscriminately instead of just video regions. Useful for testing.
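
The difference between the forced delay and the normal av-sync delay can be sketched like this. Only the variable name XPRA_FORCE_AV_DELAY comes from the commit; the helper names, the millisecond unit, and the handling of missing or invalid values are assumptions for illustration.

```python
import os


def get_forced_delay(environ=os.environ) -> int:
    """Read XPRA_FORCE_AV_DELAY (assumed to be in ms); 0 means 'not forced'."""
    try:
        return max(0, int(environ.get("XPRA_FORCE_AV_DELAY", "0")))
    except ValueError:
        return 0


def update_delay(is_video_region: bool, av_sync_delay: int, forced_delay: int) -> int:
    """Delay (ms) to apply to one screen update in this simplified model."""
    if forced_delay > 0:
        return forced_delay                 # every update, video or not
    return av_sync_delay if is_video_region else 0


forced = get_forced_delay({"XPRA_FORCE_AV_DELAY": "200"})
print(update_delay(False, 150, forced))     # non-video update is delayed too
print(update_delay(False, 150, 0))          # normal case: not delayed
print(update_delay(True, 150, 0))           # normal case: video region delayed
```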


Fri, 28 Oct 2016 04:06:58 GMT - Antoine Martin:

Some fixes:


Thu, 01 Jun 2017 10:18:52 GMT - Antoine Martin:

Improvements in audio pipeline capture latency related to AV timestamps changes in r15986.


Sat, 23 Jan 2021 05:07:25 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/835