Xpra: Ticket #1298: html5 client silently failing when connecting through the proxy

Start a server session (make sure it is the only one running for simplicity):

xpra start --start=xterm :100

Start a proxy server (same user account):

xpra proxy :10 --bind-tcp=0.0.0.0:10000 --tcp-auth=none --no-daemon

The python client can connect via the proxy without any problems:

xpra attach tcp/127.0.0.1:10000/

But the html5 client fails. Note: you have to use a custom URL to tell the proxy which username to use:

xdg-open "http://localhost:10000/?username=$USER&debug=1"

It connects to the proxy OK and the proxy starts forwarding to the real server:

proxy video encoders: x264
new proxy instance started
 for client tcp websocket: 127.0.0.1:10000 <- 127.0.0.1:59214
 and server unix-domain socket:  <- /run/user/1000/xpra/desktop-100
proxy instance now also available using unix domain socket:
 /run/user/1000/xpra/desktop-proxy-2909

The server sees the connection and responds to the hello:

Handshake complete; enabling connection
HTML5 Linux client version 1.0
 as 'antoine'
via Linux 4.6.7-300.fc24.x86_64 proxy version 1.0 on 'desktop'
 using jpeg as primary encoding also available:
  png, rgb32
 client root window size is 1468x1340 with 1 display:
  HTML (388x355 mm - DPI: 96x95)
    Canvas
server virtual display now set to 1792x1344 (best match for 1468x1340)
setting keyboard layout to 'gb'
DPI set to 96 x 96

But the connection eventually times out:

client has requested disconnection: server shutdown (client connection lost)

Looking at the traffic with ngrep:

ngrep -d lo "" port 10000

We see the websocket traffic flowing to the client, but not back:

##
T 127.0.0.1:10002 -> 127.0.0.1:59604 [AP]
  .~QhP.....Q`.\....il5:hellod19:actual_desktop_siz.....
  (....)
#
T 127.0.0.1:10000 -> 127.0.0.1:59266 [AP]
  ..P.......l16:startup-completee
##
T 127.0.0.1:10000 -> 127.0.0.1:59266 [AP]
  ..P.......l6:cursor0:e
(..)
T 127.0.0.1:10000 -> 127.0.0.1:59266 [AP]
  ..P.......l4:pingi1472960641070ee
##
T 127.0.0.1:10000 -> 127.0.0.1:59266 [AP]
  ..P.......l4:pingi1472960651076ee
##
T 127.0.0.1:10000 -> 127.0.0.1:59266 [AP]
  ..P.......l4:pingi1472960661086ee
#
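
The readable part of each payload in the capture looks like bencode (for example `l4:pingi1472960641070ee`). The leading bytes shown as dots are presumably the websocket frame and packet headers, which are not decoded here. To make the capture easier to read, here is a minimal sketch of a bencode decoder. The function name `bdecode` is hypothetical and this is not the decoder xpra itself uses:

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos.

    Returns a (value, next_position) tuple.
    Hypothetical helper for reading the capture, not xpra's own code.
    """
    c = data[pos:pos + 1]
    if c == b"i":
        # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if c == b"l":
        # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if c == b"d":
        # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if c.isdigit():
        # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid or truncated bencode data at offset %i" % pos)


# payloads copied from the ngrep output, without the leading header bytes:
ping, _ = bdecode(b"l4:pingi1472960641070ee")
cursor, _ = bdecode(b"l6:cursor0:e")
print(ping, cursor)
```

Decoded this way, the packets sent towards the client are `startup-complete`, an empty `cursor` packet, and a `ping` roughly every 10 seconds.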

The html5 console shows that the client never received the hello packet:

parameter username is antoine
?username=antoine&debug=1:185 parameter password not supplied
?username=antoine&debug=1:185 parameter sound not supplied
?username=antoine&debug=1:185 parameter encoding not supplied
?username=antoine&debug=1:182 parameter debug is 1
?username=antoine&debug=1:185 parameter normal_fullscreen not supplied
?username=antoine&debug=1:185 parameter submit not supplied
?username=antoine&debug=1:185 parameter server not supplied
?username=antoine&debug=1:185 parameter port not supplied
?username=antoine&debug=1:185 parameter encryption not supplied
?username=antoine&debug=1:185 parameter key not supplied
?username=antoine&debug=1:185 parameter keyboard_layout not supplied
xpra_client.js:151 connecting to xpra server localhost:10002 with ssl: false
xpra_client.js:164 we have webworker support
xpra_client.js:173 we can use websocket in webworker
xpra_client.js:245 received a open packet
xpra_client.js:573 sending hello
xpra_client.js:514 return all encodings
xpra_client.js:587 hello capabilities: [object Object]
(then after the 1 minute timeout:)
received a close packet
xpra_client.js:146 connection closed: Did not receive hello before timeout reached, not an Xpra server?
xpra_client.js:853 close: undefined

Connecting directly to the server (i.e. adding --bind-tcp=0.0.0.0:10001 and connecting to port 10001 from the html5 client) works fine. So there must be something different about the traffic coming from the proxy.



Sun, 04 Sep 2016 04:25:11 GMT - Antoine Martin: owner, status, description changed


Mon, 05 Sep 2016 04:30:41 GMT - Antoine Martin:

Workaround in r13564: the initial connection only works with this timeout in place: it hits the timeout, then proceeds. The value is problematic: it is too low for slower connections, and it introduces a one-second delay when connecting. Not sure what the right fix is yet.


Sun, 25 Sep 2016 11:38:55 GMT - Antoine Martin: status changed; resolution set

Fixed properly in r13862 + r13865, see also ticket:1211#comment:3


Thu, 13 Oct 2016 10:40:46 GMT - Antoine Martin: status changed; resolution deleted

Happens again as of r14142

:(


Thu, 13 Oct 2016 10:43:22 GMT - Antoine Martin:

r14143 fixes that again. Again, I'm not sure the fix is correct.


Fri, 21 Oct 2016 10:06:35 GMT - Antoine Martin:

Fixing the proxy server had broken the regular servers... r14245 uses a different value for each so both should work now. r14246 also makes the proxy server timeout value configurable using the env var XPRA_PROXY_WS_TIMEOUT (defaults to 0.0)
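
For illustration, reading a timeout from an environment variable with a fallback default can be sketched as below. Only the variable name `XPRA_PROXY_WS_TIMEOUT` and the default of 0.0 come from this ticket. The function name `get_ws_timeout` and the handling of invalid values are assumptions, not the code from r14246:

```python
import math
import os


def get_ws_timeout(environ=os.environ, default=0.0):
    """Return the timeout in seconds from XPRA_PROXY_WS_TIMEOUT.

    Hypothetical helper: falls back to `default` when the variable
    is unset, not a number, not finite, or negative.
    """
    raw = environ.get("XPRA_PROXY_WS_TIMEOUT")
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    if not math.isfinite(value) or value < 0:
        return default
    return value


print(get_ws_timeout())
```

The variable would be set in the environment of the proxy process, for example by prefixing the `xpra proxy` command above with `XPRA_PROXY_WS_TIMEOUT=1.0`.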


Thu, 24 Nov 2016 16:03:49 GMT - Antoine Martin: owner, status changed

Just when you think this can't possibly get more messed up than it is: the proxy service breaks this again, but only when started by systemd! (see #1335). It must be doing something to the process context that makes TCP sockets behave differently. We get EAGAIN and it then fails just like it did for win32 before.

r14487 uses blocking sockets on posix only, where this actually works. Tested and working on all platforms: start a proxy or shadow server on all supported platforms and you should get a working html client by hitting the server with a browser. Makes sense to test this with #1335.
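
As a rough sketch of the idea (not the actual change in r14487), choosing blocking mode on posix and a timeout elsewhere could look like this with the standard `socket` module. The names `configure_socket`, `recv_or_none` and the `fallback_timeout` value are hypothetical:

```python
import os
import socket


def configure_socket(sock, posix=(os.name == "posix"), fallback_timeout=1.0):
    """Use a blocking socket on posix and a timeout elsewhere.

    Hypothetical helper; returns the timeout now in effect
    (None means fully blocking).
    """
    if posix:
        sock.settimeout(None)
    else:
        sock.settimeout(fallback_timeout)
    return sock.gettimeout()


def recv_or_none(sock, size):
    """Return the data read, or None if the read would block.

    On a non-blocking socket, a read with no data pending fails with
    EAGAIN / EWOULDBLOCK, which Python raises as BlockingIOError.
    """
    try:
        return sock.recv(size)
    except BlockingIOError:
        return None


a, b = socket.socketpair()
try:
    a.setblocking(False)
    print(recv_or_none(a, 16))   # nothing has been sent yet
    b.sendall(b"hello")
    configure_socket(a, posix=True)
    print(recv_or_none(a, 16))   # blocking read returns once data is there
finally:
    a.close()
    b.close()
```

With a blocking socket the read simply waits for data, so the EAGAIN path is never taken. The cost is that a peer that never sends anything will hold the reading thread until the connection is closed.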


Mon, 05 Dec 2016 17:00:55 GMT - Antoine Martin: owner, status changed

Found cases where the proxy would not work, again.

Back to whackamole.


Mon, 05 Dec 2016 17:46:50 GMT - Antoine Martin: attachment set

back to a timeout - this at least seems to work everywhere?


Tue, 06 Dec 2016 10:43:25 GMT - Antoine Martin: owner, status changed

Patch applied in r14499 as I cannot find a better way to fix this without making intrusive changes to the network layer.

Ready to test as per comment:7.


Mon, 20 Feb 2017 22:49:37 GMT - Smo: status changed; resolution set

Tested this on fedora 24, fedora 25, windows 7 and windows 10.


Sat, 23 Jan 2021 05:20:27 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1298