Follow up from #400, #669 - related to #835.
Some fixes (backported already, see ticket:669#comment:43) and some changes worth recording (some of which we may want to backport): r9328.
r9407 allows us to try to use gstreamer 1.x with python2 for probing and debugging (not really usable yet).
In order to completely move to a subprocess for all the gstreamer bits, we need to ensure that (almost) all calls to browser/xpra/trunk/src/xpra/sound/gstreamer_util.py go through the subprocess wrapper. Either by adding new commands to the wrapper for probing (ie: getting the list of plugins) or by ensuring the data is exported back to the client / server via info packets from the subprocess.
Grepping the source:
Packaging updates:
If this works out, we can enhance the logic to take the average value and range of the queue values to try to minimize its size (helps with #835).
XPRA_SOURCE_APPSINK="appsink name=sink emit-signals=true max-buffers=10 drop=true sync=true async=true qos=true" \
    xpra start :10
experimenting with auto-tuning the queue to minimize overruns and underruns whilst keeping the overall delay low
support for using palib instead of execing pactl and parsing the output
fixed palib: add setup.py for installation, ensure we can call it multiple times without causing problems with signals
See the work in progress patches above based on the code found in pypactl: we want the palib parts which are nice, but not the pypactl bits which are not finished and not usable. This would require packaging it, which shouldn't be too hard (GPLv3+), a setup.py has already been added.
specfile
updated source with py3k fixes
updated source with py3k fixes and server_info support
As of r9537, the code will now try to load palib and print a warning before falling back to execing pactl as before.
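For illustration only, the "try palib, warn, fall back to pactl" logic can be sketched roughly like this. This is not the code from r9537: the function names are hypothetical, and the parser only assumes that `pactl info` prints "Key: value" lines.

```python
import subprocess
import warnings


def parse_pactl_info(text):
    """Turn the 'Key: value' lines printed by `pactl info` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            info[key.strip()] = value.strip()
    return info


def query_with_pactl():
    # fallback path: exec pactl and parse its text output
    out = subprocess.check_output(["pactl", "info"], universal_newlines=True)
    return parse_pactl_info(out)


def get_pulseaudio_wrapper():
    """Return "palib" if the module can be imported, else "pactl"."""
    try:
        import palib  # noqa: F401  (optional dependency)
        return "palib"
    except ImportError:
        warnings.warn("palib not found, falling back to execing pactl")
        return "pactl"
```

The string parsing in the fallback is exactly the fragile part that the palib route avoids.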
I have posted beta RPM builds of python-palib.
Everything seems to work, and faster too. It is cleaner / safer than the previous code which relies on execing "pactl" and parsing strings (and hoping they won't change in the future!). We also get more information out of the pulseaudio server:
$ ./xpra/sound/pulseaudio_util.py
device.alsa_input.pci-0000_00_14.2.analog-stereo : Built-in Audio Analog Stereo
device.alsa_input.usb-046d_HD_Pro_Webcam_C920_15475ECF-02-C920.analog-stereo : HD Pro Webcam C920 Analog Stereo
device.alsa_output.pci-0000_00_14.2.analog-stereo : Built-in Audio Analog Stereo
device.alsa_output.pci-0000_00_14.2.analog-stereo.monitor : Monitor of Built-in Audio Analog Stereo
devices : 4
pulseaudio.cookie : 394338217
pulseaudio.default_sink_name : alsa_output.pci-0000_00_14.2.analog-stereo
pulseaudio.default_source_name : alsa_input.pci-0000_00_14.2.analog-stereo
pulseaudio.host_name : desktop
pulseaudio.id : 1000@0d37226ca0064aaab1db9e66016c45f5/2311
pulseaudio.server : {0d37226ca0064aaab1db9e66016c45f5}unix:/run/user/1000/pulse/native
pulseaudio.server_name : pulseaudio
pulseaudio.server_version : 6.0
pulseaudio.user_name : antoine
pulseaudio.wrapper : palib
about python palib: this library is old (2010), the code is ugly, badly indented, etc.. But it works.
Ideally, we could re-write the bits we use from palib using libpulseaudio - which is also ugly, less friendly (the code is generated automatically from the library) - but since we use very little and don't actually use the mainloop, this would be cleaner.
As of r9633 + r9636 (+some minor fixes in r9634 + r9635), we can run gstreamer 1.x from the GTK2 client or server (or even gstreamer 0.10 from the GTK3 client if one wanted to do so - apart from testing, I can't imagine why you would want to do that).
All calls to the sound subpackage now go through a subprocess, which can use a different gobject / glib library.
By default, we still run sound with the same python version as the client or server that controls it, but this can be overridden on posix with:
XPRA_SOUND_COMMAND="/usr/bin/python3 /usr/bin/xpra" /usr/bin/python2 xpra start ...
OSX and win32 would require a lot more work to support this as we would need to package two versions of the python interpreters (2.7.x and 3.x), each with all the dependencies.
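A minimal sketch of how such an override could be resolved into a subprocess command line. The shell-style splitting and the default script path are assumptions, and `get_sound_command` is a hypothetical name, not the function used in r9633:

```python
import os
import shlex
import sys


def get_sound_command(environ=None, default_script="/usr/bin/xpra"):
    """Return the argv prefix used to launch the sound subprocess."""
    env = os.environ if environ is None else environ
    override = env.get("XPRA_SOUND_COMMAND")
    if override:
        # split like a shell would, so "python3 /usr/bin/xpra" becomes two args
        return shlex.split(override)
    # default: same interpreter as the process that controls the sound
    return [sys.executable, default_script]
```

The subcommand and its arguments would then be appended to this list before spawning the subprocess.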
Here are some new useful hidden subcommands, used by some xpra help command line options and for populating the list of codecs on startup:
$ python2 /usr/bin/xpra _sound_query sources
sources=pulsesrc,autoaudiosrc,alsasrc,osssrc,oss4src,audiotestsrc
$ python2 /usr/bin/xpra _sound_query encoders
encoders=mp3,flac,wavpack,wav
$ python2 /usr/bin/xpra _sound_query decoders
decoders=mp3,flac,wavpack,wav
Note how you get different results when running with python3 (where we now disable flac to avoid a bug):
$ python3 /usr/bin/xpra _sound_query encoders
encoders=mp3,wavpack,wav
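To compare the two interpreters mechanically, the "name=a,b,c" output can be parsed with a few lines of python. This is only a sketch: `parse_sound_query` is a hypothetical helper and the sample strings are the outputs shown above.

```python
def parse_sound_query(output):
    """Parse one 'name=a,b,c' line as printed by `xpra _sound_query <name>`."""
    name, _, values = output.strip().partition("=")
    return name, [v for v in values.split(",") if v]


# sample outputs copied from the commands above
_, py2_encoders = parse_sound_query("encoders=mp3,flac,wavpack,wav")
_, py3_encoders = parse_sound_query("encoders=mp3,wavpack,wav")

# codecs offered by the python2 subprocess but not by the python3 one
only_py2 = [c for c in py2_encoders if c not in py3_encoders]
print(only_py2)  # ['flac']
```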
updated patch
palib leaks file descriptors and is unmaintained, we need something better. (see #912)
shows the min/max/cur values of the sound buffer as a graph on session info
the new queue level graph shown in the session info window
I've wasted time on pulseaudio for nothing, see ticket:912#comment:4 ...
But as of r10227, we now show the sound buffer level and min/max limits in use:
I want to do a bit more testing on various platforms to see how this behaves, what range of values we get, etc.. @afarr: feel free to provide feedback on this.
Then it shouldn't be too hard to adjust the levels in a more gradual way. We ought to be able to figure out how low we can get the level by keeping a history of the level range we observe. Actually lowering the level is a bit more difficult:
Problems:
example of sound source jitter causing overruns and underruns
As of r10231, it is a bit easier to stress test the sound level:
XPRA_SOUND_SOURCE_JITTER=700 xpra start ..
Will introduce random jitter in the delivery of sound buffers (given in milliseconds, the random value will be between 1 and the value specified). This will cause the client side levels to go up, and occasionally hit overruns:
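As a rough sketch of the delay selection described here (not the r10231 code, and `jitter_delay_ms` is a hypothetical name), assuming that an unset or zero value disables jitter:

```python
import os
import random


def jitter_delay_ms(environ=None, rng=random):
    """Return a random delay in ms (1..JITTER), or 0 when jitter is disabled."""
    env = os.environ if environ is None else environ
    jitter = int(env.get("XPRA_SOUND_SOURCE_JITTER", "0"))
    if jitter <= 0:
        return 0
    # randint is inclusive at both ends
    return rng.randint(1, jitter)
```

Each sound buffer would then be held back for that many milliseconds before being handed to the network layer.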
switch to 100ms sampling so we can see more clearly the changes
The data sampling is still asynchronous, but as of r10234 we run it every 100ms which makes the graph data more precise:
try to tune the queue dynamically to minimize buffering
shows how we gradually lower the buffer level by lowering the max level
harmless underruns
a very clear example of what we're trying to do with the max level: lower the current level
Recap: the point of these changes is to try to keep the buffer level low as this helps with #835.
Dealing with underruns is relatively easy: we just temporarily increase the min-level, which causes data to accumulate in the buffer (visually, the green line goes up and back down again). Going below "min" usually triggers an underrun, but that doesn't always cause the sound to stutter or stop. Here's what it looks like:
Dealing with overruns is much harder:
Those two behaviours conflict with each other.. The events are asynchronous and multi-threaded, which makes it very hard to tie things together. For example, the "Level" can go above the "max" threshold without triggering an overrun if the pipeline consumes the buffers in time... Changing the "max" level also causes the sound pipeline to stutter, so we can't change it too often.
Here's what we try to do in r10250:
(this lowering causes the range to go up... which can cause the max level to go back up, but by that point the work has been done already)
New tunables:
XPRA_SOUND_GRACE_PERIOD=2000
number of milliseconds to wait before we do anything with underruns and overruns; this allows the pipeline time to settle down a bit - probably fine as it is. (could eventually be replaced by an event, like the one we get when the decoder is processing the codec and gives us the codec name - meh)
XPRA_SOUND_MARGIN=50
(from 0 to 200), tunes how aggressively we try to lower the max level (from 0=no margin=aggressive to 200 which lets the buffer level go wild) - this one may well need tuning. Too low and we get too many overruns, too high and the buffer level is too high...
XPRA_SOUND_FAKE_OVERRUN
no longer exists..
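To make the role of the margin more concrete, here is a simplified sketch of deriving a max level from a history of observed levels. This is not the algorithm from r10250: the formula (observed range plus a percentage of headroom), the floor value and the `LevelHistory` class are all assumptions made for illustration.

```python
from collections import deque


class LevelHistory:
    """Keep recent buffer levels (ms) and suggest a max level from them."""

    def __init__(self, size=100, margin=50):
        # margin plays the role of XPRA_SOUND_MARGIN (0 to 200)
        self.levels = deque(maxlen=size)
        self.margin = max(0, min(200, margin))

    def add(self, level_ms):
        self.levels.append(level_ms)

    def level_range(self):
        if not self.levels:
            return 0
        return max(self.levels) - min(self.levels)

    def suggested_max(self, floor_ms=50):
        # observed range plus a percentage of headroom, never below floor_ms
        headroom = self.level_range() * self.margin // 100
        return max(floor_ms, self.level_range() + headroom)
```

With levels of 80, 120 and 180ms the range is 100ms: a margin of 0 would suggest a max of 100ms (aggressive), 50 gives 150ms and 200 gives 300ms.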
Testing notes / some ideas of things to try to see what effect they have:
XPRA_SOUND_SUBPROCESS_DEBUG=1 xpra attach ...
XPRA_SOUND_COMMAND="python3 /usr/bin/xpra"
(we want to switch to python3 / gstreamer 1.x sooner rather than later, see #903)
XPRA_WRAPPER_FAULT_INJECTION_RATE=200 xpra attach ...
During my testing I found (mostly with mp3, and also with vorbis):
PS:
xpra info | grep caps
GStreamer-CRITICAL **: gst_segment_to_stream_time: assertion 'segment->format == format' failed
(not much we can do about this I am afraid... I'll have to write some code to skip this combination since the warning is deep in gstreamer code and we can't really avoid it since gstreamer is the one that is generating the packet metadata that causes the problems)
And here's the proof:
gst-launch-0.10 -q audiotestsrc ! vorbisenc ! gdppay ! filesink location=/dev/stdout | gst-launch-0.10 filesrc location=/dev/stdin ! gdpdepay ! vorbisdec ! autoaudiosink
gst-launch-1.0 -q audiotestsrc ! vorbisenc ! gdppay ! filesink location=/dev/stdout | gst-launch-0.10 filesrc location=/dev/stdin ! gdpdepay ! vorbisdec ! autoaudiosink
gst-launch-1.0 -q audiotestsrc ! vorbisenc ! gdppay ! filesink location=/dev/stdout | gst-launch-1.0 filesrc location=/dev/stdin ! gdpdepay ! vorbisdec ! autoaudiosink
gst-launch-0.10 -q audiotestsrc ! vorbisenc ! gdppay ! filesink location=/dev/stdout | gst-launch-1.0 filesrc location=/dev/stdin ! gdpdepay ! vorbisdec ! autoaudiosink
@afarr: ready for some testing / feedback.
Many more improvements have been made, in particular r10305 which makes vorbis, speex and opus so good that they should now displace mp3 as the new default codec in 0.16 (and it is such a noticeable improvement that this should probably be backported too). We can keep mp3 around for the html5 (and android?) clients, opus is not available with python2 (does not work which is a shame) but we can use it server side by using python3 for the sound wrappers. So the default order should probably now be: opus, vorbis, mp3, flac, speex, wav, .. But obviously, this will require a lot more testing. mp3 has served us well until now.
flac is still disabled on win32 (see #749 for details, maybe we should also re-open #300? see also #299) Though we can now force enable any codec using:
XPRA_SOUND_CODEC_ENABLE_XXXXXXX=1
(replace XXXXXXX with the codec name in uppercase, ie: FLAC)
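Checking for such an override boils down to an environment lookup. A minimal sketch, assuming only the value "1" enables the codec (`is_codec_force_enabled` is a hypothetical name):

```python
import os


def is_codec_force_enabled(codec, environ=None):
    """True when XPRA_SOUND_CODEC_ENABLE_<CODEC>=1 is set in the environment."""
    env = os.environ if environ is None else environ
    return env.get("XPRA_SOUND_CODEC_ENABLE_%s" % codec.upper(), "0") == "1"
```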
The latency is so much better that we get sound sync (#835) almost for free as long as the network allows us to keep the client-side buffer low (see comments above for details).
I would also like to add a new capability to clients so we can use different container formats with new clients: it looks like gdppay + gdpdepay is more lightweight than using oggmux + oggdemux.
PS: the latency improvement may cost a little bit more CPU usage.
PS2: we should probably think about switching to GStreamer 1.x by default server side in 0.16, or at least have an easier way of switching (config file instead of env var).
PS3: minor updates in r10345 (commit message is wrong).
Done some thorough testing on the following client OSs against a trunk Fedora 21 Server:
I still have a few more OSs I can try, but I have a solid 'feel' for how the sound sync is working in each OS with their default codecs (my next step is to try with the different codecs in different client OSs). As far as I can tell it's pretty close. Sometimes sound can actually get slightly ahead of video - I've seen it do this in Windows and OSX. Fedora works well - although video got choppy on my low end machine. (Smo thinks it's using too much CPU, though)
The only cases where I've had a problem was after I was using a session for a couple hours (doing boring QA things...Trac among others), and the sound queue had steadily climbed to 150-200ms. Other than that, it's surprisingly good coming from 14.XX and 15.XX!
I'm not entirely sure what info you'd like, so if you want anything more concrete than first impressions (maybe figure out a screen record with sound?), let me know and I'll try and collect it.
Testing a bit with 0.16.0 win32 r10306 client against a fedora 21 0.16.0 r10306 server...
XPRA_SOUND_SOURCE_JITTER=500
, on a random mp3 playing website with chrome (http://www.mp3juices.cc/ no video playing), and it looks like the max level adjusts pretty well to the "topography" of the sound buffer levels (in the neighborhood of 500 ms). the min level doesn't seem to be adjusting at all in this case though... for the most part rolling at a flat 0 ms.
XPRA_SOUND_SUBPROCESS_DEBUG=1
client side, before re-starting the mp3 player, the client-side output is showing overrun counts = 0 consistently, but the sound levels seem to run around 400 ms (with a lot more dips to 0 ms) - but the max level, rather than the 565 ish ms, seems to have settled at 625 ms (resulting in a lot more whitespace between the sound level and the max level in the case of videos playing dialog, several tabs of youtube, but without the mp3 player).
With XPRA_SOUND_SOURCE_JITTER=0
A quick couple of milliseconds of the client-side debug output from the above (I assume it will look about like you expect):
2015-08-19 18:55:15,756 sound-sink export(info, ...)
2015-08-19 18:55:15,756 sound-sink send: adding 'info' message (0 items already in queue)
2015-08-19 18:55:15,756 sound-sink add_packet_to_queue(info ...)
2015-08-19 18:55:15,756 sound-sink processing packet add_data
2015-08-19 18:55:15,756 sound-sink add_data(522 bytes, {'duration': 26122449, 'sequence': 0, 'time': 1440035724113L}) queue_state=pushing
2015-08-19 18:55:15,756 sound-sink processing packet add_data
2015-08-19 18:55:15,770 sound-sink pushed 522 bytes, new buffer level: 78ms, queue state=pushing
2015-08-19 18:55:15,770 sound-sink set_min_level pct= 0, cmtt= 0, mtt= 0
2015-08-19 18:55:15,770 sound-sink set_max_level lrange=130, last_max_update=237s
2015-08-19 18:55:15,770 sound-sink set_max_level overrun count=1 , margin= 50, pct= 0, cmst=152, mst=186
2015-08-19 18:55:15,770 sound-sink add_data(417 bytes, {'duration': 26122449, 'sequence': 0, 'time': 1440035724113L}) queue_state=pushing
2015-08-19 18:55:15,770 sound-sink pushed 417 bytes, new buffer level: 104ms, queue state=pushing
2015-08-19 18:55:15,770 sound-sink set_min_level pct= 0, cmtt= 0, mtt= 0
2015-08-19 18:55:15,770 sound-sink set_max_level lrange=130, last_max_update=237s
2015-08-19 18:55:15,770 sound-sink set_max_level overrun count=1 , margin= 50, pct= 0, cmst=152, mst=186
2015-08-19 18:55:15,786 sound-sink processing packet add_data
2015-08-19 18:55:15,786 sound-sink add_data(522 bytes, {'duration': 26122449, 'sequence': 0, 'time': 1440035724142L}) queue_state=pushing
2015-08-19 18:55:15,786 sound-sink pushed 522 bytes, new buffer level: 130ms, queue state=pushing
2015-08-19 18:55:15,786 sound-sink set_min_level pct= 0, cmtt= 0, mtt= 0
2015-08-19 18:55:15,786 sound-sink set_max_level lrange=130, last_max_update=237s
2015-08-19 18:55:15,786 sound-sink set_max_level overrun count=1 , margin= 50, pct= 0, cmst=152, mst=186
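If numbers are easier to share than screenshots, the buffer levels can be pulled out of this kind of debug output with a small script. A sketch, assuming the "new buffer level: NNms" wording stays as in the log above (`buffer_levels` is a hypothetical helper):

```python
import re

LEVEL_RE = re.compile(r"new buffer level: (\d+)ms")


def buffer_levels(log_lines):
    """Extract the buffer levels (in ms) from sound-sink debug output."""
    levels = []
    for line in log_lines:
        match = LEVEL_RE.search(line)
        if match:
            levels.append(int(match.group(1)))
    return levels


sample = [
    "2015-08-19 18:55:15,770 sound-sink pushed 522 bytes, new buffer level: 78ms, queue state=pushing",
    "2015-08-19 18:55:15,770 sound-sink pushed 417 bytes, new buffer level: 104ms, queue state=pushing",
    "2015-08-19 18:55:15,786 sound-sink pushed 522 bytes, new buffer level: 130ms, queue state=pushing",
]
levels = buffer_levels(sample)
print(min(levels), max(levels))  # 78 130
```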
I'll see about finding some local sound & video to play (rather than websites) to see if it behaves roughly similarly. I'll also try some different jitter settings, and maybe some encodings, as soon as I get the chance.
Am I right in assuming that, currently, video encodings won't have any effect on the sound channel levels? (I can't imagine trying to encode video with rgb will be kind to processors, but is it worth trying crazy combinations like that to see what effect it has on the sound... or rather, if it has any effect?)
..but I have a solid 'feel' for how the sound sync is working..
This ticket is not sound sync.
This ticket is about more general improvements, which will help with sound sync. See comment:14 : Recap: the point of these changes is to try to keep the buffer level low as this helps with #835.
We need data on how well this is working. The latency graph was custom made for easily visualizing this data.
the sound queue had steadily climbed to 150-200ms
Did it ever correct itself back down? What does it look like? Where are the numbers?
Also, please try vorbis and opus: we want to change the default codec order, at least in trunk, see comment:16.
still doesn't seem to be adjusting upward though, in case that's significant.
No, we don't want the min level to go up permanently: it is only used briefly to force the buffers to fill up when we get underruns, to try to prevent further underruns.
That said, if in the future we find that the sound is ahead of the video then we can use it to delay the sound and bring it in sync. I don't have any numbers to judge.
Am I right in assuming that, currently, video encodings won't have any effect on the sound channel levels?
Less efficient encodings (especially at high res), like rgb without any compression, will use up more bandwidth and each picture packet will be much bigger and so they will cause more jitter in the sound buffer levels.
The jitter hack is good for seeing how well the code deals with it, but real jitter is better because it tells us how jitter happens in the field.
Since I don't see other codecs being tested and it works very very well for me, r10372 changes the preferred order to be: VORBIS, OPUS, FLAC, MP3, WAV, WAVPACK, SPEEX. So vorbis should get used and tested now by default.
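For reference, picking a codec from a preference order can be sketched as below. The order is the one from r10372; the selection logic itself (first codec in the order that both ends support) is an assumption and `pick_codec` is a hypothetical name.

```python
PREFERRED_ORDER = ["vorbis", "opus", "flac", "mp3", "wav", "wavpack", "speex"]


def pick_codec(server_encoders, client_decoders, order=PREFERRED_ORDER):
    """Return the first codec in preference order that both ends support."""
    common = set(server_encoders) & set(client_decoders)
    for codec in order:
        if codec in common:
            return codec
    return None
```

With a client that only has mp3 and wav, mp3 still gets chosen no matter what the new order says, so test results depend heavily on which codecs each build ships.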
Ideally, we want to know for each encoding and on a wide variety of platforms (maybe make a table):
Important note: r10391 switches from Bytes/s to bits/s as the unit used for sound. (more standard for sound) Vorbis seems quite happy at 96Kbit/s.
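The conversion is just a factor of 8, which is also the size of the error you get if the two units are mixed up somewhere. A tiny sketch with hypothetical helper names, taking 1 kbit as 1000 bits:

```python
def bytes_to_bits_per_second(bytes_per_second):
    return bytes_per_second * 8


def bits_to_bytes_per_second(bits_per_second):
    return bits_per_second // 8


# 96 kbit/s (taking 1 kbit = 1000 bits) is 12000 bytes/s
print(bits_to_bytes_per_second(96 * 1000))  # 12000
```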
The latency fix has been applied to v0.14.x (r10430) and v0.15.x (r10429)
work on completely removing any gstreamer imports from the client and server: everything goes through the wrapper, and we only call it once to probe
We were able to do a good bit of testing with osx 0.16.0 r10380 against fedora 21 0.16.0 r10380 with a variety of codecs. I'll see about attaching in an orderly fashion for your perusal (not sure if inline will be more clutter or more useful, so I'll leave that for later).
We'll try to run more of the same with more platforms as we can.
osx & fedora 21 0.16.0 r 10380 default ... encodings
osx & fedora 21 0.16.0 r 10380 default ... encodings graph
osx & fedora 21 0.16.0 r 10380 server & client vp8 ... encodings info
osx & fedora 21 0.16.0 r 10380 server and client vp8 ... encodings graph
Err... looks like we have been so focused on video, we tested with a variety of encodings, rather than codecs.
Will re-run with osx and other platforms focusing on different codecs, graphs... and details of issues. Soon. (Let me know if you'd like more graphs of further encodings permutations though...)
The OSX client build you used only supports mp3 and wav, that is going to seriously limit your testing. When I test my r10380 beta OSX build, I do get vorbis. Was this one of your own builds? There is a new beta (r10446) which has all the codecs available (except for opus which requires python3). Though by the time you read this, there may be new changes worth testing.
The server build has a few more (wavpack and flac) but is missing the new default (vorbis) and speex - how did you install the server package?? (the required gstreamer packages are already listed as dependencies and should have been installed automatically). (You will need the python3 xpra build to be able to test opus - documented above).
Also, those graphs don't show anything really interesting. I guess that everything was running smoothly? (it doesn't state what the client window was doing or what sort of sound was playing)
The vp8 info shows that you used encodings=vp8
rather than encoding=vp8
(the 's' makes all the difference). Was this intentional?
allows us to completely remove all gstreamer imports from the client and server, we only deal with sound properties and let the sound subprocess do what it claims to be able to handle through these properties
(not merging the patch above just yet because it could break things and I haven't tested it enough yet - but it is definitely the right way of doing things, we no longer exec the subprocess multiple times for probing, sound starts up quicker, we save memory, should make things more reliable when mixing python2 and python3 subprocess, etc..)
I ran xpra with default speaker codecs for both client and server. There was no weird behavior. The sound was great and continuous. Attached screenshots.
I suspect that osx client might've been running on a 10.6.8 old osx machine... maybe that was why the VORBIS (and others) weren't available?
Anyway, pvenkateswaralu will post some new graphs from a newer osx.
Meanwhile I did some more testing with windows client 0.16.0 r10442 against a fedora 21 0.16.0 r10306.
Started with default codecs, which ran VORBIS on both sides.
Have a screenshot for graphs with:
After starting the initial youtube video, I noticed a little stutter in the sound, and saw the following when I checked the server output (not certain it corresponded, but this was the only time I saw this output):
2015-08-26 17:02:03,870 sound-source pipeline warning: Can't record audio fast enough
2015-08-26 17:02:03,870 sound-source pipeline warning: ["gstbaseaudiosrc.c(840): gst_base_audio_src_create (): /GstPipeline:pipeline0/GstPulseSrc:pulsesrc0:\nDropped 1080640 samples. This is most likely because downstream can't keep up and is consuming samples too slowly."]
When I started the mp3 player in addition to the video, I heard another short stutter, but didn't see any sign of output either client or server side.
I then tried to start the server with --speaker-codec=mp3
, but checking in the session info I saw that it was listing the microphone codec as being mp3, but the speaker codecs as unaffected.
A quick check for info output this:
[tlaloc@jimador ~]$ xpra info :13 | grep codec
client.encoding.avcodec2.version=(56, 41, 100)
client.speaker.codec=vorbis
client.speaker.codec_description=Vorbis
server.argv=('/usr/bin/xpra', '--bind-tcp=0.0.0.0:1201', '--mdns=no', '--start-child=xterm', '--start-child=xterm', 'start', ':13', '--speaker-codec=mp3', '--daemon=no')
window[3].encoding.pipeline_option[0].csc=codec_spec(swscale)
window[3].encoding.pipeline_option[0].encoder=codec_spec(x264)
window[3].encoding.pipeline_option[1].encoder=codec_spec(x264)
window[3].encoding.pipeline_option[2].csc=codec_spec(swscale)
window[3].encoding.pipeline_option[2].encoder=codec_spec(x264)
window[3].encoding.pipeline_option[3].csc=codec_spec(swscale)
window[3].encoding.pipeline_option[3].encoder=codec_spec(x264)
window[3].encoding.pipeline_option[4].csc=codec_spec(swscale)
window[3].encoding.pipeline_option[4].encoder=codec_spec(x264)
window[3].encoding.pipeline_option[5].csc=codec_spec(cython)
window[3].encoding.pipeline_option[5].encoder=codec_spec(x264)
Also attaching an xpra info so you can check the client and server builds real quick if anything looks odd (should both be your builds, but I've burned myself with cross-wired repos that I'm still mediocre at installing).
sound output with VORBIS and just xterm
windows client with VORBIS and youtube video, not scrolling
windows client with VORBIS and youtube video, while scrolling
windows client with VORBIS and youtube video, fullscreen 4K
windows client with VORBIS, youtube video and mp3 player in second tab, scrolling on youtube tab (video offscreen)
windows client with VORBIS, youtube and mp3 player, focus on (non-video) mp3 player tab
windows client, server with --speaker-codec=mp3, changes microphone codec, info
windows client, fedora 21 server, default codecs, xpra info
@pvenkateswaralu: I don't know what the "default codec" is, as this will vary depending on what codecs you have installed and what the server supports.
sound-source pipeline warning: Can't record audio fast enough
sound-source pipeline warning: ["gstbaseaudiosrc.c(840): gst_base_audio_src_create (): \
 /GstPipeline:pipeline0/GstPulseSrc:pulsesrc0:\nDropped 1080640 samples. \
 This is most likely because downstream can't keep up and is consuming samples too slowly."]
That's interesting. If the buffers accumulate too quickly, we do drop them, but we do this in the gstreamer capture element at the end of the compression pipeline (aka "appsink") not in the soundcard capture element at the beginning of the pipeline (aka "pulsesrc" on Linux).
The pipeline goes something like this: pulsesrc -> audioconvert (changes sample rate, etc) -> volume control -> encoder -> container -> capture (appsink).
So my guess is that the compression is taking too long.. And I don't want to add a queue element in this pipeline, as this adds latency and is difficult enough to deal with client side (that's the point of this ticket). We may have an issue if the CPU is not capable of compressing in realtime. Most modern CPUs are more than fast enough to compress sound in realtime no matter what encoding is used. So I am tempted to just make a note of this and hope for the best.
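As an illustration of the stages listed above, this is roughly how such a pipeline description can be assembled as a gst-launch style string. It is a sketch, not the xpra code: `make_source_pipeline` is a hypothetical helper, and the element properties are taken from the appsink examples elsewhere in this ticket.

```python
def make_source_pipeline(encoder="vorbisenc", muxer="gdppay", source="pulsesrc"):
    """Join the capture pipeline stages into a gst-launch style description."""
    stages = [
        source,                        # soundcard capture
        "audioconvert",                # sample format conversion
        "volume name=volume volume=1.0",
        encoder,                       # compression
        muxer,                         # container / payloader
        "appsink name=sink emit-signals=true max-buffers=10 drop=true",
    ]
    return " ! ".join(stages)
```

The buffers are only dropped at the appsink end (max-buffers + drop), which is why a warning coming from pulsesrc at the other end of the pipeline is unexpected.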
When I started the mp3 player in addition to the video, I heard another short stutter, but didn't see any sign of output either client or server side.
Does this show up on the latency graph as a spike?
If you can reproduce, you may want to run the client and/or server with -d sound
and capture the few KB of log output around the time of the event, then look for underruns or overruns in the output. Those events are no longer logged at "info" level, only at "debug" level because the new sound code triggers them far too often.
I then tried to start the server with
--speaker-codec=mp3
, but checking in the session info I saw that it was listing the microphone codec as being mp3, but the speaker codecs as unaffected.
Thanks, that's fixed in r10454.
This graph shows a spike in latency (probably as you started scrolling):
This graph shows the latency going through the roof: So I don't think we should worry too much about the sound doing badly in this case, we should deal with the underlying problem instead which is that the encode+send takes far too long (over 2 seconds!). Can you please file a separate ticket for this?
All the other graphs look fairly normal to me: the latency is in the 50 to 150ms range which is fine.
The xpra info looks about right. FYI: the important bits are found using:
$ xpra info | grep client.speaker
client.speaker.buffers=4829
client.speaker.bytes=971134
client.speaker.caps={'application/x-gdp': {}}
client.speaker.codec=vorbis
client.speaker.codec_description=Vorbis
client.speaker.pid=11353
client.speaker.pipeline=pulsesrc ! volume name=volume volume=1.0 ! vorbisenc ! gdppay crc-payload=0 crc-header=0 ! appsink name=sink emit-signals=true max-buffers=10 drop=true sync=false async=false qos=false
client.speaker.state=active
client.speaker.time=1440641552
client.speaker.volume=100
As per comment:14 :
PS: r10451 is the final code cleanup which completely separates the gstreamer bits from the main process, it even prevents the gstreamer bindings from ever being imported into the main process by accident. This should have no user visible changes, but will reduce the memory footprint of the client and server quite a bit. (though the same libraries will still be imported by the sound helper subprocess when used)
Replying to antoine:
@pvenkateswaralu: I don't know what the "default codec" is, as this will vary depending on what codecs you have installed and what the server supports.
@antoine: The default speaker codec is the "Vorbis".
I did some testing with OSX client 0.16.0 r10380 against a fedora21 0.16.0 r10306.
Started with the default codecs, which ran VORBIS on both sides.
Have a screenshot for graphs with:
After starting a youtube video in one tab and a spotify mp3 in another tab, I noticed a little stutter in the beginning, but didn't see any output on either the client side or the server side.
Here's a quick check for the info output.
[maint@Fedora21-Server-289 ~]$ xpra info :13 | grep codec
client.encoding.avcodec2.version=(56, 41, 100)
client.speaker.codec=vorbis
client.speaker.codec_description=Vorbis
window[2].encoding.pipeline_option[0].csc=codec_spec(swscale)
window[2].encoding.pipeline_option[0].encoder=codec_spec(x264)
window[2].encoding.pipeline_option[1].encoder=codec_spec(x264)
window[2].encoding.pipeline_option[2].csc=codec_spec(swscale)
window[2].encoding.pipeline_option[2].encoder=codec_spec(x264)
window[2].encoding.pipeline_option[3].csc=codec_spec(swscale)
window[2].encoding.pipeline_option[3].encoder=codec_spec(x264)
window[2].encoding.pipeline_option[4].csc=codec_spec(swscale)
window[2].encoding.pipeline_option[4].encoder=codec_spec(x264)
window[2].encoding.pipeline_option[5].csc=codec_spec(cython)
window[2].encoding.pipeline_option[5].encoder=codec_spec(x264)
I did some testing with Fedora 21 client 0.16.0 r10306 against a fedora 21 0.16.0 r10306.
Started with default codecs, which ran FLAC on both sides.
Have a screenshot for graphs with:
Sound quality was found to be good for all the different possibilities tested.
When playing both the video with audio from youtube and the audio from Spotify in two different tabs simultaneously, the audio from the closed tab played in the background but the sound quality remained unchanged.
allows more detailed selection of the codec used: codec + muxer
newer work in progress patch
makes gstreamer1 the default on platforms where we can easily use it (not osx or win32), pending because it causes problems with exe and log files on win32...
NOTE: I meant to post this comment last week. Apparently the computer I was using decided it would be better to hibernate until now...
Running a Windows 8.1 trunk r10380 client against a trunk r10399 Fedora 21 server:
Default codec was Vorbis (noticeably better than MP3)
Throughout testing I did adjust the volume periodically - no noticeable effect.
Screenshots of the Graphs page can be provided upon request; in addition to media files used.
As for now, I'll start switching speaker codecs and comparing them to each other. I have a nice consistent test I can use thanks to the video and audio files. Now that I'm using my home machines I'll have access to a much wider array of media files.
This weekend I'll get around to shrinking the Windows partition on my old laptop and see about triple booting (if the bootloader will let me - I have had issues in the past) Win7/Fedora 21/Linux Mint(or something else; maybe Fedora 22) and using that as a client, as it has an AMD GPU. Of course, that's after I fix my Linux server - I broke a fan the other day and need to completely take it apart...ugh.
Screenshots of the Graphs page can be provided upon request..
Those can be useful if they show unusual or unexpected behaviour. The dozens attached to this ticket so far, not so much.
Reminder: this ticket is about managing the client-side sound queue and keeping it as low as we can. So we want to stress the system (client / server / connection / sound server / etc..) to trigger spikes in latency. Hopefully the heuristics can force overruns and push the queue levels back down. (see the second graph in comment:14 for an example of how it is supposed to work)
I am most interested in cases where this does not work as it should:
Since changing the defaults seems to be the most effective way to get some testing done on a particular option, I have now changed the default gstreamer version we run on Linux (OSX and win32 would take a lot more work to package it, but it would be worth doing - just so we stop running woefully out of date code, with libraries we are unable to build from source on win32..), this is split into multiple changesets because they all touch different areas (and we may want to undo / change them individually): r10466, r10467, r10468, r10469, r10470, r10471 + r10472.
There are beta builds available for win32 and OSX. (looks like Fedora rpmbuild is still choking on the full vp9 selftests.. will fix) PS: I have just spotted a last minute problem: the sound subprocess seems to go zombie when we exit the client on win32 (and maybe osx too?)... exiting reliably vs killing all subprocesses, pick one...
(should not be assigned to me unless there are issues to address, which is what we are trying to find - it is testing time)
Lots more improvements and fixes: r10490, r10491, r10492, r10493, r10494, r10495, r10498, r10500. Also tested on Ubuntu and Debian with the gstreamer 1.x bindings.
Running a Fedora 20 r10475 trunk server and connected a Win8.1 r10504 client:
EDIT:
end edit
Will attach screenshot.
Connected to a home server from other side of city on wi-fi with mild latency.
@maxmylyn: thanks, this one is interesting. Did it ever come back down again? The buffer levels range from about 125ms to ~425ms, it should eventually (within a minute or so) try to lower the level down to the 0-300ms range (assuming that there aren't more spikes increasing the range during that time).
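As a rough illustration of the reasoning above (this is not the actual xpra code): if the level oscillates between a low and a high value, only the spread between the two is needed to absorb the jitter, and the rest is pure delay. The function name, the sample list and the margin parameter are assumptions made for this sketch:

```python
def target_max_level(levels_ms, margin_ms=0):
    """Return a hypothetical queue ceiling (in ms) from recent level samples.

    If the level moves between lo and hi, (hi - lo) of buffering is
    enough to absorb the jitter; anything below lo is just added delay.
    """
    if not levels_ms:
        raise ValueError("need at least one sample")
    lo, hi = min(levels_ms), max(levels_ms)
    return (hi - lo) + margin_ms


# Levels observed between ~125ms and ~425ms, as in the graph discussed here:
recent = [125, 300, 425, 210, 180]
print(target_max_level(recent))  # 300
```

With these samples the ceiling comes out at 300ms, which is where the "0-300ms range" figure comes from.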
This sort of latency will probably require #835 to get the sound more or less in sync. The difficulty here is that the video will also suffer from this 300ms jitter! (was the video as far off?)
The 2Mbps is a bit high! I suspect the accounting is off by a factor of 8. (256Kbps)
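A factor of 8 is what a bits / bytes mix-up produces. A minimal sketch of the arithmetic, assuming that "2Mbps" stands for 2048Kbps (the function name is made up for this example):

```python
def undo_factor_of_eight(reported_kbps):
    """Divide a rate that was multiplied by 8 once too often (1 byte = 8 bits)."""
    return reported_kbps // 8


print(undo_factor_of_eight(2048))  # 256
```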
@pharindranath: please do not use word processors for recording comments. Having to click the link, click download, click open-with-word-processor, just to read 5 lines of plain text is cumbersome.
Here's the text found in the document I have now deleted:
Sound Quality found good for all the different possibilities tested. First Tab had youtube and second Tab had Spotify. When playing both the video-audio from youtube and the audio from Spotify in two different Tabs simultaneously, the Audio from Tab closed played in the background but the sound quality remained unchanged.
Client and server: Fedora 21, Xpra 0.16.0 r10306
What are the xterm, session info versions and "Scrolling_VideoTab.png" for? Can I delete them?
Having dozens of screenshots with very similar contents makes it very difficult to analyse the results. Is there anything noteworthy in them? All I can see is that the buffer levels stay below ~125ms, which is great but does not tell us how well the code manages the queue level when under stress.
Did it ever come back down again?
Nope. Stayed that way. For what it's worth, interacting with anything had some mild latency. When I attempted earlier while sitting on my laptop outside with a weaker Wi-Fi signal, Firefox was entirely unusable until I disabled the speaker. I suspect the 256kbps was too much bandwidth on the bare edge of the router's range. The nice thing about Xpra is that when nothing is updating, no bandwidth is used; however the speaker continues to send data even when no sound is playing - I'm not sure if this is a bug or not.
Was the video as far off?
Yes, actually. I would say most video/image updates had a similar (250-300ms) range of latency.
2Mbps is a bit high!
My bad...I apparently cannot read the "b/s", so it's actually 256kbps
I'll have an hour or two tomorrow to test this more thoroughly from the spotty Wi-Fi at my university.
EDIT:
Came back to a week old Trunk r10380 Server on my Fedora 21 VM. After reconnecting, the sound queue was a bit high and laggy, but after a minute or so of usage the levels went back down, and now the session is good as new. I'm going to stop it and update the server to the latest trunk to include the improvements, and continue testing with different codecs.
Nope. Stayed that way.
Please provide a graph and a sample -d sound output, the heuristics should try to lower the queue levels back down.
however the speaker continues to send data even when no sound is playing
From xpra, we don't know if sound is playing or not, so we just capture and forward all the time. The actual bandwidth used by "no sound" will vary depending on which codec is used, it will be higher now with the lower latency codec tweaks (many more small chunks).
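To illustrate why many more small chunks cost more bandwidth even for silence, here is a back-of-the-envelope sketch. The per-packet overhead and the payload rate are made-up numbers, not measurements from xpra or from any codec:

```python
def stream_bytes_per_second(payload_bytes_per_s, chunk_ms, overhead_bytes=50):
    """Estimate total bytes/s: codec payload plus a fixed cost per packet.

    overhead_bytes is a hypothetical per-packet cost (headers, framing).
    """
    chunks_per_second = 1000 / chunk_ms
    return payload_bytes_per_s + overhead_bytes * chunks_per_second


# Same (hypothetical) 1000 bytes/s of encoded silence, two chunk sizes:
print(stream_bytes_per_second(1000, chunk_ms=100))  # 1500.0
print(stream_bytes_per_second(1000, chunk_ms=20))   # 3500.0
```

The payload is identical in both cases, only the number of packets per second changes.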
Re-attempted a connection (from the same client against the same server) and now the connection is failing from the Win8.1 client with the following messages:
2015-09-02 13:14:32,069 unknown string message: 0xc0f4 / 198 / 0
2015-09-02 13:14:37,088 unknown string message: 0xc0f4 / 198 / 0
2015-09-02 13:14:50,142 unknown string message: 0xc0f4 / 198 / 0
2015-09-02 13:14:51,144 unknown string message: 0xc0f4 / 198 / 0
2015-09-02 13:14:52,312 failed to receive anything, not an xpra server?
2015-09-02 13:14:52,312 could also be the wrong username, password or port
2015-09-02 13:14:52,312 or maybe this server does not support 'unknown' compression or 'bencode' packet encoding?
2015-09-02 13:14:52,326 Connection lost
EDIT:
Not entirely sure why it's not working, I'll have to re-attempt it in a bit; after my class.
Okay, I've re-attempted a connection several times (mind you, this is the exact same client, server, and network location as Monday) and now the network no longer allows me to connect from here using SSH (which is what I was using before, even with encryption). And I can't test using TCP because I don't have the networking set up properly. I'll attempt again from home on the bare edge of our Wi-Fi. Until then, I think it's best to disregard what I encountered until I can recreate it somewhere else.
Is there a time frame for switching to GStreamer 1.x? Older GStreamer is going to be removed from Debian and Xpra is affected... See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785859
Is there a time frame for switching to GStreamer 1.x?
GStreamer 1.x is now the default on most platforms.
Trunk or 0.15.x or both?
Trunk, 0.15.x is in maintenance mode.
sound buffer graphs with only an xterm and gedit, but with #985 looping xpra info outputting client side
Well, I found at least one more interesting case: I connected with just an xterm and a gedit launched from a second xterm... but when I opened the session info to the graphs page, I ran into the infinite loop xpra info bug of #985... which impacted the sound buffers pretty dramatically (looks like the max buffer levels adjusted pretty well to that madness):
@afarr: the interesting part would have come after you stopped getting those messages (closing the session info window), did the sound buffer level ever go back down again?
The bug in #985 is now fixed. Interestingly, since 0.15.x one cannot simulate those long latency spikes with this switch:
XPRA_FAKE_UI_LOCKUPS=500 xpra attach ...
because the sound is now processed without any UI thread intervention - which is great for overall latency.
You should be able to reproduce those big spikes with the jitter env var from comment:12 though. Or you can try again with the "buggy", pre-r10657 versions. If this is better at generating latency spikes, I'll have to try to emulate that behaviour.
file-name says it all...
walked to next room... brief spinners/jitter
@afarr: the interesting part would have come after you stopped getting those messages (closing the session info window), did the sound buffer level ever go back down again?
Uhh... once I close the session info window, I kind of lose the graph...
Anyway... I've here attached a graphical travelogue of a lap around the building with 0.16.0 r10624 osx client (failed to get hold of a post #985 fix client, so I had to go with a pre-bug version) - and left the timestamps on the screenshots, which will hopefully be at least interesting, if not useful, information.
I may be wrong, but the interesting stuff looks like what happens when I get spinners, when the session recovers initially... and how it stabilizes in the minute or so afterward... so I'll post three or four graphs inline... and leave it to you to look through the rest (and thereby not clutter the ticket too crazily).
You might, though, also find the difference between the walking away from the router graph (15.58.37) vs. walking back toward the router (16.06.01) interesting to compare.
In any case, if there are any cases that you'd like me to double-check, let me know (I'm particularly intrigued by what was happening at 16.01.35 ... looks like the graphs when I was inside and approaching the wi-fi routers... perhaps a portion of our floor is particularly porous?).
In any case...
Walking downstairs and out the front door... I started getting some serious spinners:
Then the spinners continued for a while:
Then the session spinners recovered:
... and finally the session and buffers seemed to stabilize:
I ran some more tests - walking away from routers around one corner, then a second, then turning around and coming back.
OSX client 0.16.0 r10624 (no session info bug) v. fedora 21 0.16.0 r10655.
I ran with --opengl=on for one pass (osx laptop with Intel Iris graphics card), then without for the second, only to then notice that the 0.16 client doesn't have the OpenGL blacklisted. I did get slightly different results though (I was testing just by running cnn videos, perhaps different content produced different results?), so I'll attach both sets.
I also tried running the same test with a password enabled and with encryption on, I'll attach those also.
only to then notice that the 0.16 client doesn't have the OpenGL blacklisted
Correct, see r9291.
This is due to the number of stability improvements in the OpenGL backend and the lack of maintenance and bugs found in the pixmap one.
walk-and-return graph travelogue starting point - sitting down
travelogue point two, up and walking away from one router, toward another - buffer seems to over compensate
travelogue stop three, stopping near second router - buffer still hasn't re-adjusted
travelogue point 4, walking past second router and around a corner, signal passing through walls
travelogue point 5, turn and walk around a second corner and down half a flight of stairs, seeing some spinners but signal stabilizes quickly - buffer copes very well
travelogue point 6, returning from around second corner and down some stairs; still around one corner - buffer adjusts very well, despite renewed spinners
Looking over those several piles of screenshots, it actually looks like the entire walk across the building (past a secondary router along the way) around a couple of corners and halfway down a flight of stairs has very little impact with one of the tests, the Level and the Max running as close together as any number of the other uninteresting shots previously attached.
Using encryption didn't seem to make the graphs any more interesting.
One of the tests did seem to show the Max bar diverging from the Level bar while walking away from the routers, but recovering as I started walking back toward the routers (Why it was different than the other pass when the encryption was also off I don't know, vicissitudes of the router? My hand was covering the wireless antenna?, or not covering it?)
In any case, here's a quick posting of that somewhat interesting case.
Started with a baseline at rest.
Then started walking, captured this while about halfway between initial and second router.
As I reached that second router, the Level continued at a relatively stable range, but the Max didn't adjust back down to match.
As I then walked around a corner, there were some latency "hills", at which point the Levels seemed to climb... which the Max seemed to have anticipated well - despite the Level plummeting a few times as the latency triggered some spinners.
As I then walked around another corner, and started walking down some stairs, the Max seemed to very well match the Level, mostly.
Walking back up the stairs and to only around one corner, the Max seemed to adjust much better to the Level changes, aside from the occasional spinner.
As I then approached the routers the Max seemed to match the Level even better, so I won't bother you with any more graphs to examine.
buffer level jumps when re-starting sound with control channel
Apologies in advance... but I found another interesting case. Whenever I stop/re-start the sound via control channel, no matter which codec (unsurprisingly), I see a jump in the initial sound buffer levels. Presumably this isn't something that should be expected regularly, but I thought I'd mention it.
Whenever I stop/re-start the sound via control channel, no matter which codec (unsurprisingly), I see a jump in the initial sound buffer levels
When the sound (re)starts, we set the initial level at ~450ms. You would see the same curve if you pop up session info as soon as you start the client.
The downward curve shown above is exactly the type of curve that we want to see when we try to push the sound buffer levels back down. (only difference is that the blue curve might be higher at that point, and we try to push it down)
For testing and simulating network conditions, see ticket:999#comment:1
I tried with XPRA_SOUND_SOURCE_JITTER=500 on a random mp3 playing website with chrome (http://www.lesmp3.com/ no video playing), and it looks like the max level adjusts pretty well to the "topography" of the sound buffer levels (in the neighborhood of 500 ms). The min level doesn't seem to be adjusting at all in this case though... for the most part rolling at a flat 0 ms.
Note: the min level is only adjusted temporarily to prevent underruns, as we want to keep the queue levels low. @afarr: still waiting for reproducible test cases, or evidence that we should close or even do more on this ticket. Re-assigning to you since @ashtonbarr8800 is not updating this either.
wav, listening to videos on youtube, with latency (frame total) at +/-157
wav, listening to videos on youtube, after latency (frame total) drops back to +/- 105
wav, at time when 125ms 100ms 23percent tc delay ended
wav as latency drops after ending tc delay
wav, as latency rises without tc being used
As long as I have been having issues with osx clients with no mp3 (#970 - comment 19) and fedora 21 servers with no vorbis (#1052) - decided to test the wav heuristics a bit.
Looks like there are issues when using wav codec with latency (total frames) of 150 or more.
Testing using a tc setting of tc qdisc add dev eth0 root netem delay 125ms 100ms 23%, I ran into the following overrun and less overrun graphs with wav:
With latency higher-
As latency drops-
Then, once the tc delays are stopped-
And, as the latency settles back toward usual-
And then, interestingly, as the latency rises again without turning to tc:
Not sure how often wav should be expected to be used in a network setting with latencies like this, but it definitely looked like an area where the heuristics seem to be missing a beat.
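For reference, the three netem delay arguments used above are the base delay, the jitter and the correlation, so "125ms 100ms 23%" means 125ms plus or minus 100ms, with each value depending 23% on the previous one. The sketch below generates delays of that shape; it only approximates what netem does, and the function name and seed are assumptions made for this example:

```python
import random


def netem_like_delays(n, delay_ms, jitter_ms, correlation, seed=0):
    """Generate n delay samples within [delay - jitter, delay + jitter].

    Each sample blends the previous random value with a fresh one,
    weighted by `correlation` (0.0 to 1.0). An approximation only.
    """
    rng = random.Random(seed)
    last = 0.0
    delays = []
    for _ in range(n):
        fresh = rng.uniform(-1.0, 1.0)
        # Blend of two values in [-1, 1] stays in [-1, 1]
        last = correlation * last + (1.0 - correlation) * fresh
        delays.append(max(0.0, delay_ms + jitter_ms * last))
    return delays


samples = netem_like_delays(1000, delay_ms=125, jitter_ms=100, correlation=0.23)
print(min(samples), max(samples))
```

Every sample stays between 25ms and 225ms, which gives an idea of the swings the sound queue has to absorb under this setting.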
Checking a number of different tc settings with a connection using mp3, however, showed little of any (new) interest. I'll go through more thoroughly and then also try with a windows client a bit (with mp3 and wav and vorbis, if I can set up a server that will run it) and see if I find anything interesting.
For the most part though, it looks like the overrun buffer heuristics are doing well with un-crazy scenarios.
Follow ups: #970, #1074, #1075
One last couple of notes (Some last couple of...) — I won't bother to clutter with more screenshots though.
Testing some more with 0.17.0 clients (win32 r11649 (1 change) = 11653; osx r11687) against a 0.17.0 server (fedora 23 r11692); using tc to induce delay and drop.
50 ms 50 ms 25%: the buffers are mostly good and only occasional overruns/stutters (often accompanied by spinners... oddly better sync'd than with 3% drop). win32/osx client about the same. Hard to regularly induce 'hiccups' even at 80ms 50ms 35% delay.
80ms 50ms 35%.
50ms 25ms 25%: I start getting hiccups/stutters... mostly in conjunction with spinners, but even up to 80ms 50ms 35% the hiccups/stutters aren't very apparent ... until the tc hits a hot streak and the "level" graph starts dropping to 0 periodically (sometimes inducing spinners, sometimes just a sound stutter).
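When comparing runs like these, counting how often the level drops to 0 can be easier than eyeballing screenshots. A minimal sketch, assuming the buffer levels have been collected as a plain list of millisecond values (xpra does not export them in this form as far as this ticket shows; the function name is made up):

```python
def count_underruns(levels_ms):
    """Count transitions from a non-empty queue to an empty one (0ms)."""
    count = 0
    was_empty = True  # do not count a queue that starts out empty
    for level in levels_ms:
        empty = level <= 0
        if empty and not was_empty:
            count += 1
        was_empty = empty
    return count


print(count_underruns([120, 80, 0, 0, 60, 90, 0, 40]))  # 2
```

Consecutive zero samples count as a single underrun, so a long stall is not counted more than once.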
Ok, I don't think there's anything left to deal with in this ticket... any new issues will want a new ticket at this point.
Closing.
See also: #1617.
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/849