Xpra: Ticket #1341: html5 client improvements: refactoring, mpeg4, scrolling, etc

See also:



Thu, 20 Oct 2016 03:49:35 GMT - Antoine Martin: status, description changed

Updates:

Still todo:


Mon, 31 Oct 2016 06:14:25 GMT - Antoine Martin:

Updates:


Thu, 03 Nov 2016 14:20:28 GMT - Antoine Martin:

r14387 enables scrolling, but it won't fire as is because h264 is still disabled by default. Maybe mpeg4 video needs doing first? (#1107)


Thu, 03 Nov 2016 17:53:30 GMT - Antoine Martin: attachment set

try to use the mpeg4 video from #1107


Fri, 04 Nov 2016 18:25:21 GMT - Antoine Martin: attachment set

updated patch


Fri, 04 Nov 2016 18:31:09 GMT - Antoine Martin:

Useful pointers for html5 video:

Not video related: browser feature compat table.


Sat, 05 Nov 2016 07:00:18 GMT - Antoine Martin: attachment set

this works! (on Firefox, with one frame of latency, no cleanup code, etc..)


Sat, 05 Nov 2016 12:54:26 GMT - Antoine Martin: attachment set

support mp3 sound via MediaSource? (google chrome only...)


Mon, 07 Nov 2016 13:32:18 GMT - Antoine Martin: attachment set

try more codecs to attempt to play in all browsers


Mon, 07 Nov 2016 13:33:44 GMT - Antoine Martin: attachment set

video and audio through mediasource api


Thu, 10 Nov 2016 10:52:48 GMT - Antoine Martin: attachment set

gives 3 working audio codecs with google-chrome, still none with Firefox


Fri, 11 Nov 2016 15:23:01 GMT - Antoine Martin: attachment set

ugly PoC hack for flushing frames from the ffmpeg encoder


Fri, 11 Nov 2016 15:35:52 GMT - Antoine Martin: attachment set

updates: buffer audio (now works with firefox), flush video (video shown is up to date)


Fri, 11 Nov 2016 15:54:36 GMT - Antoine Martin:

As of the patch above:

Still TODO:

etc

PS: using XPRA_SAVE_TO_FILE=1 then using ffmpeg -i test-vp8+webm-4.webm out%03d.png I can inspect individual frames as PNG and see that the stream is always up to date. So the problem is due to video buffering on the client side...


Mon, 14 Nov 2016 08:35:45 GMT - Antoine Martin:

Large update for handling sound via mediasource with aurora as fallback done in r14412 + r14413 + r14414 (mostly svn subclipse client f--- up!)

The video still lags, but this is not a problem with the video, the same happens with jpeg or png with Firefox! (the patch above just happened to work most of the time by repainting unnecessarily)


Mon, 14 Nov 2016 17:08:24 GMT - Antoine Martin: attachment set

patch updated for already merged audio bits


Tue, 15 Nov 2016 15:59:04 GMT - Antoine Martin:

Improvements and fixes:


About the status of the Broadway decoder:


About the mediasource video api:


Still TODO:


Wed, 16 Nov 2016 10:04:09 GMT - Antoine Martin:

Lots more improvements in r14434:

etc

Notes:


Thu, 17 Nov 2016 16:19:06 GMT - Antoine Martin:

More:


Fri, 18 Nov 2016 11:35:38 GMT - Antoine Martin:

Mistake: scaling was only added today in r14444


Wed, 23 Nov 2016 11:09:44 GMT - Antoine Martin:

Important update: r14474 (+ r14476 + r14477 derp fixup) worker stuff, see commit message for details.

Still TODO:


Thu, 24 Nov 2016 05:32:17 GMT - Antoine Martin:

Fix to make sure we empty the queues: r14480. Maybe process_receive_queue should be using a while loop too, rather than recursion.
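A minimal sketch of the while-loop idea, assuming the queue is a plain array and that each packet can be handled independently. The names `drainQueue` and `handlePacket` are hypothetical and are not taken from the client code:

```javascript
// Hypothetical helper: drains a queue iteratively rather than recursively.
function drainQueue(queue, handlePacket) {
    let processed = 0;
    // Loop until the queue is empty instead of recursing once per packet,
    // so the call stack stays flat however many packets are pending.
    while (queue.length > 0) {
        const packet = queue.shift();
        try {
            handlePacket(packet);
        } catch (e) {
            // One bad packet should not leave the rest stuck in the queue.
            console.error("error processing packet", e);
        }
        processed++;
    }
    return processed;
}
```

Anything appended by `handlePacket` while the loop is running gets processed in the same pass, which is one way of making sure the queue really ends up empty.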

PS: new task: #1372, the html5 client should live in its own package.


More thoughts on the byte-copying: I'm just guessing that this is costly because it was the case with the python code when I profiled it, many years ago now. Maybe the javascript engines are really clever and optimize it somehow? But I doubt it can be as fast as memcpy, which is already a penalty. It may be that most clients have enough CPU power to spare to do all this unnecessary copying, but every little saving helps.

That's why I added code to send the packet header and data to the socket separately when the packets are big enough (r246, 5 years ago! 64KB is the threshold in the current code). This avoids concatenating them before sending, which causes memory churn. And that's using a fast memcpy for doing the copying!

One way of dealing with this would be to queue the "Uint8Array"s directly, and then work a bit harder to extract the header from them. It sounds complicated, but in almost all cases, the header will be contained in a single array, with the data either appended to it or in the next array element(s). I'm sure concatenating multiple arrays into one big array is going to be faster than byte-copying. I'm not sure how much TCP packet aggregation websockets end up doing. It probably depends on all sorts of things from OS network configuration, congestion, browser, etc. We may need to re-queue left-over data from each process_receive_queue iteration, this can be stored in a dedicated variable.
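The idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: an 8-byte header whose last four bytes hold the payload length as a big-endian integer (check the real layout against the protocol code), and the hypothetical names `ChunkQueue` and `PacketReader`. The left-over header is kept in a dedicated variable between calls:

```javascript
const HEADER_SIZE = 8; // assumed header size, for illustration only

// Hypothetical queue holding the Uint8Arrays exactly as they were received.
class ChunkQueue {
    constructor() {
        this.chunks = [];
        this.length = 0; // total number of bytes queued
    }

    push(chunk) {
        if (chunk.length > 0) {
            this.chunks.push(chunk);
            this.length += chunk.length;
        }
    }

    // Remove and return exactly n bytes, or null if fewer are queued.
    take(n) {
        if (n > this.length) return null;
        if (n === 0) return new Uint8Array(0);
        this.length -= n;
        const first = this.chunks[0];
        if (first.length === n) {
            this.chunks.shift();
            return first;
        }
        if (first.length > n) {
            // Common case: return a view (no copy), left-over stays queued.
            this.chunks[0] = first.subarray(n);
            return first.subarray(0, n);
        }
        // Rare case: the data spans several chunks, so copy them
        // array-at-a-time with set() rather than byte by byte.
        const out = new Uint8Array(n);
        let offset = 0;
        while (offset < n) {
            const chunk = this.chunks[0];
            const want = n - offset;
            if (chunk.length <= want) {
                out.set(chunk, offset);
                offset += chunk.length;
                this.chunks.shift();
            } else {
                out.set(chunk.subarray(0, want), offset);
                offset += want;
                this.chunks[0] = chunk.subarray(want);
            }
        }
        return out;
    }
}

// Hypothetical reader: extracts as many complete packets as are available.
class PacketReader {
    constructor(queue) {
        this.queue = queue;
        this.pendingHeader = null; // header whose payload has not arrived yet
    }

    read() {
        const packets = [];
        while (true) {
            if (this.pendingHeader === null) {
                this.pendingHeader = this.queue.take(HEADER_SIZE);
                if (this.pendingHeader === null) break; // need more data
            }
            const h = this.pendingHeader;
            // Assumed layout: payload size in bytes 4..7, big-endian.
            const view = new DataView(h.buffer, h.byteOffset, h.byteLength);
            const payload = this.queue.take(view.getUint32(4));
            if (payload === null) break; // payload incomplete, keep the header
            packets.push({ header: h, payload: payload });
            this.pendingHeader = null;
        }
        return packets;
    }
}
```

When a header or payload sits entirely inside one received array, `take()` returns a `subarray` view and copies nothing. Copying only happens when the data straddles a chunk boundary.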

Another note: do we really need to run the "process_xxxx_queue" timers all the time? Seems wasteful. Can't we just start scheduling the timers when we add elements to their queue (if not scheduled already) and stop them when the queue becomes empty?
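Something along these lines, as a rough sketch only. `OnDemandQueue` and its `schedule` parameter are hypothetical names; the default scheduler uses `setTimeout`, and being able to pass in a different one is just a convenience for testing:

```javascript
// Hypothetical queue that only schedules processing while it has work to do.
class OnDemandQueue {
    constructor(processItem, schedule = (fn) => setTimeout(fn, 0)) {
        this.items = [];
        this.processItem = processItem;
        this.schedule = schedule;
        this.scheduled = false; // true while a run is pending
    }

    push(item) {
        this.items.push(item);
        // Only schedule a run if one is not already pending.
        if (!this.scheduled) {
            this.scheduled = true;
            this.schedule(() => this.run());
        }
    }

    run() {
        this.scheduled = false;
        // Drain everything; nothing is re-armed once the queue is empty.
        // (An item pushed from inside processItem is drained here too and
        // also schedules one extra run, which then finds an empty queue.)
        while (this.items.length > 0) {
            this.processItem(this.items.shift());
        }
    }
}
```

With this shape there is no timer at all while the queue is idle, and a burst of `push()` calls results in a single scheduled run.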


Thu, 24 Nov 2016 12:48:33 GMT - Antoine Martin: priority changed

Much improved encoding selection for the html5 client with no video encodings available: r14485.

A bigger problem: paints have been completely broken since r14474.


Fri, 25 Nov 2016 04:42:37 GMT - Antoine Martin: owner, status changed

Paint fixed in r14490, ready for testing.


Thu, 22 Dec 2016 00:41:09 GMT - alas: owner changed

Tested some with 1.0 r14570 fedora 23 server - using both Chrome (55) on windows 8.1 and... well, Opera (41) on OSX 10.12.1.

With the default settings this html5 client is really really working well.

There are some issues that are probably not related to this ticket: scrolling events are lost when the session is shifted to a browser that's significantly smaller than the originally connected one (anecdotally, after re-sizing the browser so that the canvas was large enough for the server-side chrome browser's scroll bar to be visible, the scroll events began being caught); OSX laptop trackpad events result in dizzyingly and unmanageably fast scrolling (something about the trackpad sending more events with a smaller delta, apparently); and one has to remember that browser shortcuts, like control-t, will be captured by the local browser rather than being passed through to the server-side applications.
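If the trackpad guess is right (many events, each with a small delta), one possible mitigation is to accumulate the deltas and only forward a scroll step once a threshold is crossed. This is a sketch under that assumption, not what the client actually does; `WheelAccumulator` is a hypothetical name and the step size of 120 is an assumed value that would need tuning:

```javascript
// Hypothetical accumulator turning many small wheel deltas into whole steps.
class WheelAccumulator {
    constructor(stepSize = 120) { // assumed threshold, not from the client
        this.stepSize = stepSize;
        this.remainder = 0; // delta carried over between events
    }

    // Add one event's delta and return the number of whole steps to
    // forward (negative for the opposite direction, 0 for none yet).
    add(delta) {
        this.remainder += delta;
        const steps = Math.trunc(this.remainder / this.stepSize);
        this.remainder -= steps * this.stepSize;
        return steps;
    }
}
```

A mouse wheel that reports a full step per event still scrolls one step per event, while a trackpad sending four events of a quarter step each would only produce one.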

On to more relevant issues to this ticket...

2016-12-21 14:51:21,509 client display size is 1749x1106 with 1 screen:
2016-12-21 14:51:21,509   HTML (463x293 mm - DPI: 95x95)
2016-12-21 14:51:21,509     Canvas
2016-12-21 14:51:24,735 Error: expiring 6 missing damage ACKs,
2016-12-21 14:51:24,736  connection may be closed or closing,
2016-12-21 14:51:24,736  sequence numbers missing: 199, 200, 201, 203, 213, 217
2016-12-21 14:51:32,740 Error: expiring 1 missing damage ACK,
2016-12-21 14:51:32,740  connection may be closed or closing,
2016-12-21 14:51:32,740  sequence numbers missing: 220
2016-12-21 14:51:48,959 Warning: delayed region timeout
2016-12-21 14:51:48,960  region is 15 seconds old, will retry - bad connection?
2016-12-21 14:51:48,961  30 late responses:
2016-12-21 14:51:48,961   234 h264: 66s
2016-12-21 14:51:48,961   245 h264: 65s
2016-12-21 14:51:48,961   272 h264: 58s
2016-12-21 14:51:48,961   276 h264: 58s
2016-12-21 14:51:48,961   280 h264: 58s
2016-12-21 14:51:48,961   281 h264: 58s
2016-12-21 14:51:48,961   282 h264: 58s
2016-12-21 14:51:48,961   283 h264: 57s
2016-12-21 14:51:48,961   287 h264: 55s
2016-12-21 14:51:48,962   295 h264: 55s
2016-12-21 14:51:48,962   543 h264: 52s
2016-12-21 14:51:48,962   754 h264: 51s
2016-12-21 14:51:48,962  1022 h264: 49s
2016-12-21 14:51:48,962  1062 h264: 46s
2016-12-21 14:51:48,962  1068 h264: 42s
2016-12-21 14:51:48,962  1083 h264: 39s
2016-12-21 14:51:48,962  1103 h264: 37s
2016-12-21 14:51:48,962  1119 h264: 36s
2016-12-21 14:51:48,962  1120 h264: 36s
2016-12-21 14:51:48,963  1124 h264: 32s
2016-12-21 14:51:48,963  1129 h264: 27s
2016-12-21 14:51:48,963  1134 h264: 27s
2016-12-21 14:51:48,963  1139 h264: 27s
2016-12-21 14:51:48,963  1142 h264: 25s
2016-12-21 14:51:48,963  1147 h264: 24s
2016-12-21 14:51:48,963  1222 h264: 23s
2016-12-21 14:51:48,963  1233 h264: 22s
2016-12-21 14:51:48,963  1259 h264: 22s
2016-12-21 14:51:48,963  1284 h264: 21s
2016-12-21 14:51:48,963  1318 h264: 18s
2016-12-21 14:52:04,050 Warning: delayed region timeout
2016-12-21 14:52:04,051  region is 15 seconds old, will retry - bad connection?
2016-12-21 14:52:04,052  30 late responses:
2016-12-21 14:52:04,052   234 h264: 81s
2016-12-21 14:52:04,053   245 h264: 80s
2016-12-21 14:52:04,053   272 h264: 73s
2016-12-21 14:52:04,053   276 h264: 73s
2016-12-21 14:52:04,054   280 h264: 73s
2016-12-21 14:52:04,054   281 h264: 73s
2016-12-21 14:52:04,054   282 h264: 73s
...

... the painting at this point becomes extremely choppy and the entire session often becomes non-responsive for as much as 10 long seconds (occasionally something like a minute). I'll post a screenshot of a point I got it frozen at.

Client console logs:

startup complete
Client.js:123 audio media source open
Client.js:123 using audio codec string for vorbis+mka: audio/webm; codecs="vorbis"
Client.js:123 audio: requesting vorbis+mka stream from the server
Client.js:588 server connection is OK
Client.js:123 audio start of mediasource vorbis+mka stream
Client.js:958 audio play!
Client.js:123 close_video: video_source_buffer=undefined, media_source=undefined, video=undefined

Server logs:

2016-12-21 12:00:04,075 sound source using audio codec vorbis
2016-12-21 12:00:04,075 sound source using container format matroska
[31605:31656:1221/120010.466497:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120010.468056:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120010.468322:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120010.468485:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120010.468614:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120010.468749:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
2016-12-21 12:00:11,156 client 1: close_video: video_source_buffer=undefined, media_source=undefined, video=undefined
[1:1:1221/120201.455646:ERROR:KeyboardEventManager.cpp(424)] Not implemented reached in static bool blink::KeyboardEventManager::currentCapsLockState()
2016-12-21 12:03:42,130 client 1: close_video: video_source_buffer=undefined, media_source=undefined, video=undefined
2016-12-21 12:03:53,866 client 1: audio queue overflowing: 250, stopping
2016-12-21 12:03:53,867 client 1: close_audio: audio_source_buffer=[object SourceBuffer], media_source=[object MediaSource], video=[object HTMLAudioElement]
2016-12-21 12:03:53,873 sound source stopping
2016-12-21 12:03:54,842 client 1: audio error
2016-12-21 12:03:54,843 client 1: close_audio: audio_source_buffer=null, media_source=null, video=null
[31605:31656:1221/120408.173414:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120408.173835:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120408.173895:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120408.173930:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120408.173970:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
[31605:31656:1221/120408.174002:ERROR:browser_gpu_channel_host_factory.cc(113)] Failed to launch GPU process.
2016-12-21 12:04:08,835 client 1: close_video: video_source_buffer=undefined, media_source=undefined, video=undefined
2016-12-21 12:04:59,822 client 1: close_video: video_source_buffer=undefined, media_source=undefined, video=undefined

Refreshing the browser tab that is connecting to the server, the sound will restart... but it crashes soon after again.

Not sure if you'd like any particular logs from the server side for any of these issues... or if you're ready to start making tickets for things like trackpads or non-responsive drop menu buttons in a javascript dialog box.

I'll re-assign this to you to look at and let me know what you want to pursue here and what you might want broken out into a more specific ticket.


Thu, 22 Dec 2016 00:42:08 GMT - alas: attachment set

Screenshot of scrolling with video


Tue, 27 Dec 2016 10:53:05 GMT - Antoine Martin: owner changed


Fri, 30 Dec 2016 02:28:56 GMT - rektide: cc set


Fri, 27 Jan 2017 00:41:51 GMT - alas:

Managed to make some time to test with both OSX 10.12.1 and Windows 8.1 against a 1.0.2 r14823 fedora 25 server.

As mentioned in #845, the sound seems to have no real problems aside from the poor network I'm using.

Testing with Chrome 55 and Opera 42 on OSX and Chrome 56 (beta) on Windows 8.1, I saw about the same performance on all. Wav would cut out within seconds (not hugely surprising), while opus and vorbis both performed similarly, remaining stable until the client hit spinners and a ping timeout after about 3-5 minutes. The mp3 (no mention of legacy) seemed to just run and run, and clients didn't ping out for 10-20 minutes with that sound running.

As a comparison, I connected an OSX 1.0.1 r14723 client (with default presumably vorbis codec) and it took about the same 3-5 minutes to ping timeout.

I did run into a not-at-all-verbose sound error with the Windows 8.1 & Chrome 56, but wasn't able to reproduce at all.

2017-01-24 15:11:35,121 client 14: audio queue overflowing: 250, stopping
2017-01-24 15:11:35,126 client 14: close_audio: audio_source_buffer=[object SourceBuffer], media_source=[object MediaSource], video=[object HTMLAudioElement]
2017-01-24 15:11:35,129 client 14: audio error
2017-01-24 15:11:35,131 client 14: close_audio: audio_source_buffer=null, media_source=null, video=null
2017-01-24 15:11:35,143 sound source stopping

The scrolling, meanwhile, was very good with the defaults, but once I enabled experimental video (h264 I presume) and scrolled the canvas seemed to 'break up into little boxes', each of which seemed to be updating separately... presumably the scrolling code having issues... and I began seeing the 'delayed region timeout' errors. No sign of any new errors though.

I'll try to get a 2.0 build working so I can test that a bit too.


Tue, 31 Jan 2017 01:41:46 GMT - alas:

Ok, with a little help from Smo, I got the 2.0 (r14909) server running and, as a bonus, seem to have upgraded to a network with more bandwidth.

With that said, made some time to test sound with OSX 10.12.1 and Chrome 56 (the beta).

Looks like it's much more stable with all the codecs (well, didn't make time to try wav yet).

Vorbis and opus seemed about as stable as the (non-legacy) mp3 (vorbis didn't even timeout, must've found a hole in the bandwidth usage crowds).

Didn't manage to try with other browsers on OSX or any on windows ... but I'd be very surprised, after what I saw, to have them behave any differently (well, except maybe better).

Watching my ping times as I was testing, it looked like most of the issues I did see corresponded to super-jitter spikes and timeouts... like:

64 bytes from 10.0.32.138: icmp_seq=517 ttl=64 time=20.302 ms
Request timeout for icmp_seq 518
64 bytes from 10.0.32.138: icmp_seq=519 ttl=64 time=24.950 ms
64 bytes from 10.0.32.138: icmp_seq=520 ttl=64 time=21.777 ms
64 bytes from 10.0.32.138: icmp_seq=521 ttl=64 time=22.435 ms
64 bytes from 10.0.32.138: icmp_seq=522 ttl=64 time=22.710 ms
64 bytes from 10.0.32.138: icmp_seq=523 ttl=64 time=25.820 ms
64 bytes from 10.0.32.138: icmp_seq=524 ttl=64 time=29.762 ms
64 bytes from 10.0.32.138: icmp_seq=525 ttl=64 time=22.227 ms
Request timeout for icmp_seq 526
64 bytes from 10.0.32.138: icmp_seq=526 ttl=64 time=1012.982 ms
64 bytes from 10.0.32.138: icmp_seq=527 ttl=64 time=50.985 ms
64 bytes from 10.0.32.138: icmp_seq=528 ttl=64 time=231.962 ms

So that seems stable as is.

Turning the video encoding on, I'm still seeing the RangeError issue (as expected), and pretty regularly.

2017-01-30 17:16:33,314 Warning: client decoding error:
2017-01-30 17:16:33,315
2017-01-30 17:16:33,315 client 1: error painting
2017-01-30 17:16:33,316 client 1: h264
2017-01-30 17:16:33,316 client 1: RangeError: Source is too large

I did manage to trigger one interesting error, but otherwise the behavior seems to be exactly as with the 1.0.2 server (canvas breaking up into little boxes when scrolling, etc).

2017-01-30 17:16:33,337 Warning: client decoding error:
2017-01-30 17:16:33,338
2017-01-30 17:16:33,338 Error: failed to encode x264 video frame:
2017-01-30 17:16:33,338  image pixels does not have 3 planes! (found 921600)
2017-01-30 17:16:33,338  source: XShmImageWrapper(BGRX: 20, 131, 640, 360)
2017-01-30 17:16:33,339  options: {'av-sync': True, 'scroll': True}
2017-01-30 17:16:33,339  encoder:
2017-01-30 17:16:33,339    profile             : baseline
2017-01-30 17:16:33,339    generation          : 247
2017-01-30 17:16:33,339    delayed             : 0
2017-01-30 17:16:33,339    height              : 360
2017-01-30 17:16:33,340    max-size            : (8192, 4096)
2017-01-30 17:16:33,340    preset              : medium
2017-01-30 17:16:33,340    frames              : 0
2017-01-30 17:16:33,340    quality             : 58
2017-01-30 17:16:33,340    lossless            : False
2017-01-30 17:16:33,340    frame-types         : {}
2017-01-30 17:16:33,340    width               : 640
2017-01-30 17:16:33,340    speed               : 34
2017-01-30 17:16:33,341    source              : unknown
2017-01-30 17:16:33,341    version             : 148
2017-01-30 17:16:33,341    tune                : zerolatency
2017-01-30 17:16:33,341    src_format          : YUV420P
2017-01-30 17:16:33,341    formats             : ['YUV422P', 'RGB', 'BGRX', 'BGR', 'YUV420P', 'BGRA', 'YUV444P']
2017-01-30 17:16:33,341    b-frames            : 0
2017-01-30 17:16:33,354 client 1: error painting
2017-01-30 17:16:33,355 client 1: h264
2017-01-30 17:16:33,355 client 1: RangeError: Source is too large
2017-01-30 17:16:33,356 Warning: client decoding error:
2017-01-30 17:16:33,356
2017-01-30 17:16:33,356 client 1: error painting
2017-01-30 17:16:33,357 client 1: h264
2017-01-30 17:16:33,357 client 1: RangeError: Source is too large

I suspect that, once that new issue is looked at, this ticket might be able to be closed and replaced by specific tickets for the video and such... unless you'd like me to check with firefox as a client first?


Tue, 31 Jan 2017 05:37:59 GMT - Antoine Martin: status changed; resolution set

Let's follow up in #1424, #1463


Sat, 23 Jan 2021 05:21:36 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1341