Xpra: Ticket #1103: reduce server side sound pipeline latency

Follow up from #849, #835, #1067.

We want to make the server side sound pipeline process data as quickly as possible to reduce latency. The XPRA_SOUND_QUEUE_SOURCE_TIME setting from #1067 may have made things worse.

Some pointers:

GStreamer, embedded, and low latency are a bad combination: adding a queue makes things worse at a faster rate; it is suggested to use ‘queue’ instead of ‘queue2’.
delay between speaker and microphone: It would be a good idea to set your alsasrc buffer-time if you want to have a bit of control over the latency.

Thu, 28 Jan 2016 21:24:51 GMT - Antoine Martin: owner changed

owner changed from Antoine Martin to alas

See r11771 commit message. We can now tune the "buffer-time" and "latency-time". Note: unlike the underlying gstreamer property which uses microseconds, WE use milliseconds (just like everywhere else with the sound env vars).

We now change the defaults from 200/10 to 64/32. This can be seen with the -d gstreamer server switch during sound source startup:

sound source default: buffer-time=200
sound source overriding with: buffer-time=64
sound source default: latency-time=10
sound source overriding with: latency-time=32
sound source initial actual-buffer-time: -1
sound source initial actual-latency-time: -1

This is applicable to pulsesrc, alsasrc, osssrc and oss4src at least (we use pulsesrc by default on posix). I will need to fix the win32 and osx shadow servers to get rid of the warning it is likely to generate there (minor).

And we can now also see the actual values used via xpra info:

$ xpra info | grep client.speaker.actual-
client.speaker.actual-buffer-time=63990
client.speaker.actual-latency-time=31995
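
The actual-* values appear to be reported in microseconds (63990 and 31995 sit just under the requested 64 and 32 milliseconds), whereas the environment variables use milliseconds. A minimal sketch for comparing the two, assuming the key=value output format shown above; parse_actual_times is a hypothetical helper and not part of xpra:

```python
def parse_actual_times(info_text):
    """Return {name: milliseconds} for 'client.speaker.actual-*' lines."""
    prefix = "client.speaker.actual-"
    times_ms = {}
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.startswith(prefix):
            # assumption: the reported values are in microseconds
            times_ms[key[len(prefix):]] = int(value) / 1000.0
    return times_ms


# sample text copied from the "xpra info" output above
sample = """client.speaker.actual-buffer-time=63990
client.speaker.actual-latency-time=31995"""
actual = parse_actual_times(sample)
print(actual)
```

The result can then be compared directly with the millisecond values passed via the environment variables.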

This value should probably be taken into account for av-sync (#835), and may explain some of the runtime discrepancies that have been observed.

The new environment variables introduced for fine tuning buffer and latency are:

XPRA_SOUND_SOURCE_BUFFER_TIME=64 XPRA_SOUND_SOURCE_LATENCY_TIME=32 xpra start ...

Notes:

the latency must be lower than the buffer

setting those values too low can result in errors, for example with 8/8 with vorbis I got:

sound source pipeline error: gst-stream-error-quark: Could not encode stream. (8)
sound source  gstaudioencoder.c(1368)
sound source  gst_audio_encoder_chain ()
sound source  /GstPipeline:pipeline0/GstVorbisEnc:vorbisenc0:
sound source  buffer size 2822 not a multiple of 4
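
The settings and the first note above could be checked before starting the server. This is only a sketch, not the code from r11771: sound_source_times is a hypothetical name, the 64/32 defaults are the ones quoted in this comment, and the conversion reflects the milliseconds-versus-microseconds difference mentioned earlier. It does not try to predict which low values trigger encoder errors.

```python
import os


def sound_source_times(environ=os.environ):
    """Read the buffer/latency settings (milliseconds), return them in microseconds."""
    # defaults of 64/32 are the ones quoted in this comment
    buffer_ms = int(environ.get("XPRA_SOUND_SOURCE_BUFFER_TIME", "64"))
    latency_ms = int(environ.get("XPRA_SOUND_SOURCE_LATENCY_TIME", "32"))
    # only reject latency above the buffer: whether equal values are acceptable
    # is not settled by the notes (the 8/8 test got as far as the encoder)
    if latency_ms > buffer_ms:
        raise ValueError("latency-time (%i) must not exceed buffer-time (%i)"
                         % (latency_ms, buffer_ms))
    # the underlying GStreamer properties are expressed in microseconds
    return {"buffer-time": buffer_ms * 1000, "latency-time": latency_ms * 1000}


print(sound_source_times({}))
print(sound_source_times({"XPRA_SOUND_SOURCE_BUFFER_TIME": "100",
                          "XPRA_SOUND_SOURCE_LATENCY_TIME": "50"}))
```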

@afarr: ready for testing so we can figure out if/how to use this:

what values should we be using with the most important codecs to minimize latency without introducing other problems? (vorbis, opus, mp3)
what are the limits? (the values that we cannot use because they will trigger errors)
I don't think the muxers matter (gdp #1075, mka #1090) - but it is worth a check.
does setting XPRA_SOUND_QUEUE_SOURCE_TIME=0 help or make things worse (as per #1067)? It does a similar thing to this buffer-time... I would think that setting a low 'buffer-time' should make it more likely that we'll need the queue to drop samples when under load.
does this help with av-sync and does it look like the "actual-buffer-time" needs to be taken into account for av-sync? (if we can keep the sound source 'latency-time' low enough, we can probably just ignore it)

Sat, 30 Jan 2016 03:25:09 GMT - alas:

Preliminary tests with Vorbis with 0.17.0 r11776 win32 client against 0.17.0 r11778 fedora 23 server...

It looks like the errors aren't so much a result of "setting values too low" ... as a matter of what XPRA_SOUND_SOURCE_LATENCY_TIME= is set to. I saw errors with settings of 32/16, but also with 100/48 ... as well as such diverse and random settings as 113/57, 137/24, 137/57 ... meanwhile, I was able to connect and get sound with values like 100/50, 137/52, 137/51, 126/52, 116/52, 113/52, 113/51, 137/20. It looks like there's some math I haven't fully gotten my head around relating to the XPRA_SOUND_SOURCE_LATENCY_TIME= value in conjunction with the error message of "sound source buffer size [x] not a multiple of 4".

It looks like there's a range around 20 that can be used for the XPRA_SOUND_SOURCE_LATENCY_TIME=, another range around 32, another around 50 ... etc. I can test for specifics, but if you can think of what the underlying math might suggest for hints, that might make the process less onerous.

Testing with vorbis seems to show a consistent default av-sync of about 4-6 frames audio late, whatever the XPRA_SOUND_SOURCE_BUFFER_TIME= & XPRA_SOUND_SOURCE_LATENCY_TIME= (an issue I had been trying to get my head around for #835... and consistent with the default values seen for at least the last 100-ish + revisions that I've been testing).

Based on a dumb-luck test with av-sync-delta=200, while testing for #835, seeming to improve sync - and happening to correspond with earlier server default buffer times... I tried setting the av-sync-delta values equal to the server buffer latency as set with the environment variables, and found the tests against the BBC HD audio sync test reduced from generally 4-6 frames audio late to closer to 2-4 frames... but when I set av-sync-delta to the value of the sum of the buffer and the latency times (testing against values of 100/50, 64/32, 113/52, and 137/20) the same BBC sync test site indicated audio late of 0-2 frames and other sites seemed not only perfectly sync'd, but more perfectly sync'd than my local browser (which was actually indicating about 0-3 frames video late on the BBC sync test... mac-mini bootcamping windows with Intel 4000 ...). At that 0-2 frames audio late performance, my eyes & ears can't distinguish from perfect sync, but at 3 frames it/they can...

Some very preliminary tests with opus ... with a 137/20 environment variable setting the BBC sync site indicated about 5-7 frames audio late, with av-sync-delta at 157 (sum of latency and buffer) it still indicated about 1-3 frames audio late, but with av-sync-delta upped slightly to 167 it was performing at the same 0-2 frames audio late and perfect (by my ears/eyes) sync on other sites... so it looks like opus may have some slightly more complicated ratios to test.

Will test with some mp3 when the chance presents itself.

Will also test XPRA_SOUND_QUEUE_SOURCE_TIME=0 when testing other codecs.

(Suppose I'll also add details to #835 once they are more fully ... full.)

Sat, 30 Jan 2016 05:37:35 GMT - Antoine Martin:

..to the value of the sum of the buffer and the latency times...

That doesn't make sense: those two values are the lower and upper bounds of the source buffer. Taking the sum of both is meaningless, the average would make more sense. But even more so would be to look at the actual values used, as per comment:1 ($ xpra info | grep client.speaker.actual-). Then add XPRA_SOUND_QUEUE_SOURCE_TIME.
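
To make the difference concrete, here is a rough sketch comparing the sum with the bounds and their average for the pairs tested in the previous comment. source_delay_estimates is a hypothetical helper, and queue_ms stands in for whatever XPRA_SOUND_QUEUE_SOURCE_TIME is set to (0 here, since its value is not given in this ticket). The actual-* values from xpra info, converted to milliseconds, could be substituted for the requested ones.

```python
def source_delay_estimates(buffer_ms, latency_ms, queue_ms=0):
    """Candidate delay figures (in ms) for a buffer-time/latency-time pair."""
    return {
        # what was tried in the previous comment
        "sum": buffer_ms + latency_ms,
        # latency-time and buffer-time taken as lower and upper bounds
        "lower": latency_ms + queue_ms,
        "upper": buffer_ms + queue_ms,
        # midpoint of the two bounds, plus the queue time
        "average": (buffer_ms + latency_ms) / 2 + queue_ms,
    }


# buffer/latency pairs tested in the previous comment
for buffer_ms, latency_ms in ((100, 50), (64, 32), (113, 52), (137, 20)):
    print(buffer_ms, latency_ms, source_delay_estimates(buffer_ms, latency_ms))
```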

Fixes for osx and win32 shadow servers in r12102.

Mon, 11 Apr 2016 02:18:02 GMT - Antoine Martin:

See ticket:1162#comment:2.

This is now disabled, but you may want to enable it if this runs well on your setup. Anything that lowers the latency is a good thing.

Fri, 02 Sep 2016 16:48:43 GMT - alas: owner changed

owner changed from alas to Antoine Martin

Ok, noted. I'll do a little testing on our end and see if it makes a noticeable difference.

If I notice anything worth mentioning I'll open a new ticket.

In the meantime, I'll pass this back to you to close, unless there's something more to do?

Sat, 03 Sep 2016 04:22:23 GMT - Antoine Martin: status changed; resolution set

status changed from new to closed
resolution set to wontfix

Closing as wontfix since we're not using those tweaks at all.

Sat, 23 Jan 2021 05:15:03 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1103