Xpra: Ticket #1638: Sound sync regression

After testing #1637 and #1595 I've noticed that the sound sync is getting wildly out of sync over time.

My server and client are both trunk r16825 client/server running Fedora 25. I start with the usual defaults xpra start :13 --bind-tcp=0.0.0.0: --start-child=firefox, but also I double checked using a local video in VLC and it also gets out of sync. Surprisingly, the sound sync gets out sync within a matter of a few short minutes (usually 5-7 minutes).

Also, upon connecting, the following errors print out on the server:

[max@Vorfuehreffekt ~]$ E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] ratelimit.c: 4 events suppressed
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket

I walked through all the other codecs, and I'm noting similar behavior - sound gets pretty far out of sync (up to several seconds), in addition to that, sound is also fairly choppy at times. With MP3, I notice that after a server has been running for a while, when letting the server go silent then start playing sound the sound plays a rather loud tick / noise before sound comes back.

Unfortunately I ran out of time writing this, so I'm putting it up as an FYI. I still need to check sound levels and get more useful logs, so I'll assign it to me for now. I'll be back in office on Wednesday PST.



Tue, 12 Sep 2017 02:02:29 GMT - J. Max Mena:

NOTE:

XPRA_CUTTER_THRESHOLD=0 will disable the sound packets coming in empty to test to see if that's the cause of the misbehaving sound.

edit: defaults to 0.02 or 2%


Tue, 12 Sep 2017 11:06:56 GMT - Antoine Martin:

Could also be the switch from opus / vorbis to mp3 which was also needed for cutter (see ticket:1617#comment:5).


Tue, 12 Sep 2017 14:51:56 GMT - Antoine Martin:

Please see ticket:1617#comment:7 : opus now supports cutter and is the default again, but we probably still want to know if the out-of-sync problem was caused by switching to mp3 or by adding "cutter".

The Fallig back to copying full block data over socket is unrelated, please use a separate ticket and make sure to attach the full server log. Did pulseaudio start OK? I'm not seeing that with my VM.


Wed, 13 Sep 2017 22:01:20 GMT - J. Max Mena:

Okay I'm gonna walk through the available codecs on my machine, while watching Office Space on VLC, hopefully I only have to watch it the one time. On the bright side this is an excuse to take a closer look at the state of the sound codecs as of r16839.

Codec Sync? Notes
opus Works fine until video is paused New default - works fine until video is paused then sound may or may not come back properly. Turning the speaker forwarding off and on fixes it..until you pause it again
opus+ogg no Seems fine until you pause video for a few minutes, then it's off - Sound buffer seems to hold steady at the minimum - graph is a "teeth" pattern of +-20ms (looks roughly 1 second per level)
vorbis+mka no - very off even from the start <-
mp3 yes? very scratchy when sound does come through, it was silent most of the time with the sound buffer level being below the minimum. Seemed in sync though
flac+ogg yes actually seems to work fine (need to follow up with a longer test)
aac+mpeg4 yes seems to work fine
wav+lz4 no gst-stream-error-quark
wav+lzo no sound totally cut out with buffer level at 0
wav no gst-stream-error-quark
wavpack no not scratchy or anything, just out of sync
speex+ogg slightly off slightly off but sometimes come back into sync - stable, doesn't become scratchy or cut off
opus+mka no sound buffer level holds steady at 0 - scratchy and sounds "starved" (correct term?) like the buffer is underrun
flac no same as opus+mka but not nearly as out of sync
mp3 + mpeg4 no starts out okay but gets further and further behind

And again with XPRA_CUTTER_THRESHOLD=0:

Codec Sync? Notes
opus yes sound cuts out randomly
opus+ogg yes same as opus (aren't they the same?)
vorbis+mka no very scratchy
mp3 yes? very scratchy when sound does come through, it was silent most of the time with the sound buffer level being below the minimum
flac+ogg yes same as with sound cutter set to 0.02
aac+mpeg4 yes seems to work fine
wav+lz4 no no sound at all - level looks fine, but no sound
wav+lzo no very little sound (constantly cut out) - level looks fine (after speaker restart, first connection got gst-quark)
wav no gst-stream-error-quark
wavpack no not scratchy just out of sync
speex+ogg yes seems fine
opus+mka no same as with cutter at the default, but less scratchy, but consistently at least a few seconds behind
flac no scratchy and cut out a lot; sound buffer stayed at 0
mp3 + mpeg4 no starts out okay but after a minute sound gets way behind, buffer level looks okay

NOTE: I am not nearly qualified to give a definite 'is absolutely in sync', I'm just going off what my ear tells me. When I say yes, I mean it's about the same as it was before the recent changes to sound.

That being said, flac+ogg and aac+mpeg4 seemed to actually work okay, even after a while. In addition, the error that I put in the OP didn't come back, so that was a one-time thing.


gst quark error for wav and the others:

sound output pipeline error: gst-stream-error-quark: The stream is of a different type than handled by this element. (5)
2017-09-13 14:01:51,669 sound output  appsrc name=src emit-signals=0 block=0 is-live=0 stream-type=0 format=4 ! wavparse ! audioconvert ! audioresample ! queue name=queue min-threshold-time=0 max-size-buffers=0 max-size-bytes=0 max-size-time=450000000 leaky=2 ! volume name=volume volume=0 ! pulsesink client-name="Xpra" async=True device="alsa_output.pci-0000_00_1b.0.analog-stereo" qos=True sync=False
2017-09-13 14:01:51,669 sound output  riff-read.c(262)
2017-09-13 14:01:51,669 sound output  gst_riff_parse_file_header ()
2017-09-13 14:01:51,669 sound output  /GstPipeline:pipeline0/GstWavParse:wavparse0:
2017-09-13 14:01:51,670 sound output  Stream is no RIFF stream
2017-09-13 14:01:51,670 sound output  0x15aff77

Thu, 14 Sep 2017 17:53:28 GMT - Antoine Martin:

You can now use the env var XPRA_SOUND_LOG_CUTTER=1 to see when the audio level passes the threshold (see ticket:1617#comment:8 for details).

Note: any codec option that uses a container format will not be using cutter (ie: anything with "+ogg", "+mka" or "+mpeg4"), and cutter is also disabled for "vorbis", "wav", "wavpack" and one of the "aac" encoders. So for some of the cases where you noticed a difference, then the two test runs were different in some other way since cutter was not even being used. Maybe just natural variation.

Generally speaking, some scratchiness can be expected when we use cutter as the sound queue will be completely empty when we resume the stream. The usual queue level management logic (#849) should take care of that relatively quickly. If this turns out to be too much of a problem, we could easily send a control packet to the client telling it to prepare for the stream re-start by temporarily raising the min level. (this would belong in #1617)

I'm assuming that you're still using Fedora 25 as both client and server. I've re-tested with a Fedora 26 server, and used for clients: Fedora 26 and MS Windows 7 32-bit.

So, what do I need to do to reproduce the scratchiness that you're hearing? Since you're even having problems with cutter disabled, let's first figure out why that is. Surely this must be a regression since you had tested this same list of codecs a while back? This could also be related to r16831.


Fri, 15 Sep 2017 23:32:24 GMT - J. Max Mena:

Okay, I agree we should focus on why I'm seeing so much scratchiness. Since you aren't seeing it I'm not sure what to tell you - pretty much any sound source I can think of is very very scratchy where noted - be it a music file (Rhythmbox/VLC/whatever), a video file (VLC / mplayer), sound via a browser (Pandora / Spotify (needs Widevine)), or video via a browser (Netflix / YouTube). Sound on my system is very scratchy.

I vote we start with MP3 and Opus since Opus is the new default and MP3 was the default for a short while, and MP3 the most reliable one to act up.

When running with MP3 and -d gstreamer I get these prints to flood the output constantly:

2017-09-15 15:02:16,778 sound output queue_underrun(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,807 sound output queue_running(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,807 sound output queue_pushing(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,807 sound output queue_underrun(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,819 sound output queue_running(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,819 sound output queue_pushing(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,819 sound output queue_underrun(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)

And, based on what I'm seeing on the graphs, the sound queue on my system is constantly being underrun, which would explain the scratchiness (I am pretty sure).

Now, with opus, I get the exact same prints, but they come and they go - when they stop the sound isn't scratchy. In addition, the sound did stop completely a couple of times and come back after a few seconds - and pausing the video for a period of time then coming back to it causes the sound to completely stop(during this time the underrun prints stop). Also, the sound did completely stop a few times.

I did notice while it was playing clearly that when I switched virtual desktops (I keep the Xpra windows open on the first one, and Firefox on the second one) it would produce a moment of scratchiness when I switched back to the desktop with the audio source playing. I wonder if that may have an effect since we paint windows slower when the desktop isn't in view.


Fri, 22 Sep 2017 21:35:42 GMT - J. Max Mena: owner, status changed

Upped my server to r16954:

I noticed that when using opus sound seems to behave fine, but -d gstreamer on the server side, I see it printing:

2017-09-22 14:34:43,575 sound source cutter: skipping buffer with pts=121133500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,576 sound source cutter: skipping buffer with pts=121153500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,576 sound source cutter: skipping buffer with pts=121173500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,576 sound source cutter: skipping buffer with pts=121193500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,588 sound source cutter: skipping buffer with pts=121213500000 (min-timestamp=121230000000)

Even while it should be playing sound. Could that be causing the underruns?

EDIT: (accidentally pasted over this part)

And when I see it print that sound gets scratchy on the client, with the client printing that the sound queue is getting underrun - and that's right about when the sound sync goes out the window.


Fri, 22 Sep 2017 21:37:40 GMT - J. Max Mena:

For reference, I'm watching this video https://www.youtube.com/watch?v=OTh4f7ye1IQ&index=3&list=PLOQZmjD6P2Hl3KrDBs6Vcc_5XDZbqPABC (the guy runs a YouTube? channel about popular guns in video games) in Chrome/Firefox? (checked both).


Tue, 17 Oct 2017 03:57:04 GMT - Antoine Martin: owner changed

Even while it should be playing sound.

The buffers shown there are only 20ms apart, isn't it possible for your audio to have gone silent for 5 buffers? (1/10th of a second)

Could that be causing the underruns?

Yes, that's very likely.

And when I see it print that sound gets scratchy on the client

Underruns would do that. Please include the sound buffer level graphs at that point.

and that's right about when the sound sync goes out the window.

"out of the window" - ahead or behind? I assume that the sound ends up being ahead?


Related changes:

If you still encounter problems:

Other things worth trying on the client:

On the server:


Wed, 01 Nov 2017 19:34:45 GMT - J. Max Mena:

So I upped my server and client to r17281:

I think I'll hold on to this ticket a bit longer to double check it.


Thu, 02 Nov 2017 07:47:29 GMT - Antoine Martin:

Lowering XPRA_CUTTER_THRESHOLD to 0.000001 makes ...

A value this small effectively disables cutter. With debugging enabled, you should have noticed that cutter was never firing.

Audio is recorded at 16-bit per channel, so the maximum precision in decimal is:

$ python -c "print('{0:.8f}'.format(1.0/2**16))"
0.00001526

Have you tried any of the other tunables from comment:9?

Also, these should help:

If not, please see #849 for the kind of details needed (graphs, info, tested with tc to cause latency, etc..)


Mon, 11 Dec 2017 23:06:53 GMT - J. Max Mena:

As of trunk r17606 - sound is much better - the sync is about perfect, and all the scratchiness I noted isn't apparent any more. I'm tempted to close this, but I'll see if I can test it a bit more thoroughly than watching a couple video files and some YouTube? videos.


Tue, 12 Dec 2017 22:30:56 GMT - J. Max Mena: status changed; resolution set

Upped my server and client to r17607 and done some more thorough testing today and this seems to have all but gone away - the scratchiness is gone, and sound sync is much better (as good as it was before).

I'm gonna close this.


Mon, 22 Jan 2018 09:22:27 GMT - Antoine Martin:

Note: I have now encountered the Cannot send block reference with non-registered memfd ID ... error with r18103 (for #1266) so now we also disable memfd by adding --enable-memfd=no to the pulseaudio command line.


Sun, 03 Mar 2019 12:50:58 GMT - Antoine Martin:

Follow up in #2186.


Sat, 23 Jan 2021 05:29:46 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1638