Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#426 closed task (fixed)

multiplexing multiple xpra instances through one port

Reported by: Antoine Martin
Owned by: Antoine Martin
Priority: major
Milestone: 0.11
Component: server
Version:
Keywords:
Cc:

Description

The problem is that in some situations, the servers may well be sitting behind firewalls that only allow outgoing connections to selected ports (usually 80 and 443 for web browsing).
One solution to this problem is to use SSH or a VPN server running on one of those ports, but these options have their own problems (key exchange, shell account access, etc.).

r4326 implements a proof of concept proxy server (accessible via "xpra proxy"): the user connects to this server and, after authentication (if required, which it usually should be), the packets are forwarded to the real server. Encryption and compression are only enabled between client and proxy, not between proxy and server (though we could quite easily add options to change this).
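
To make the forwarding idea concrete, here is a minimal sketch of a bidirectional relay between two already-connected sockets. It is not the code from r4326: the real proxy works on xpra packets (and handles encryption and compression on the client side), whereas this sketch copies raw bytes. The names `pump` and `relay` are hypothetical.

```python
import socket
import threading


def pump(src, dst, bufsize=65536):
    """Copy bytes from src to dst until src reports EOF or an error occurs."""
    try:
        while True:
            data = src.recv(bufsize)
            if not data:
                break  # the peer closed its end
            dst.sendall(data)
    except OSError:
        pass  # treat socket errors like a disconnection
    finally:
        try:
            # propagate the end-of-stream to the other side
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def relay(client, server):
    """Start one thread per direction and return both threads."""
    threads = [
        threading.Thread(target=pump, args=(client, server), daemon=True),
        threading.Thread(target=pump, args=(server, client), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads
```

One thread per direction is the simplest arrangement, but it is also part of why the thread count per connection matters once many users share one proxy.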


What is still needed to make this functional:

  • threading/processes: how many concurrent connections can we handle before this becomes the bottleneck? (at the moment, the POC server can only handle a single connection)
  • handle disconnection from either end gracefully
  • how do we look up the real server to connect to?
    • a shell script could be a little expensive but is more flexible
    • a lookup file?

Other things this may be useful for:

  • this could be used for load balancing
  • security? (it is easier to isolate the servers from the clients)
  • we could make the proxy server stateful and let it deal with hardware video encoding: in VM hosted environments the guests do not have easy access to hardware, but the host does. The proxy could tell the server to send all frames as plain rgb+lz4, and then it would just replace the frames with video encoded data. (later, this could be taken one step further and the guests could use a shared memory mechanism to avoid using the virtual network for sending frames to the host)

Attachments (5)

auth-v3.patch (45.5 KB) - added by Antoine Martin 4 years ago.
splits authentication from server core, adds auth modules and keyfile so password file and encryption keyfile can be different
auth-v5.patch (61.5 KB) - added by Antoine Martin 4 years ago.
updated patch (broken multiprocessor support..)
auth-v6.patch (83.0 KB) - added by Antoine Martin 4 years ago.
updated patch using timers and custom code instead of gobject from the subprocesses
auth-v8.patch (85.1 KB) - added by Antoine Martin 4 years ago.
adds attempts at signal handling and process cleanup + 1 important server fix
auth-v10.patch (92.3 KB) - added by Antoine Martin 4 years ago.
many fixes (except encryption drop outs)

Change History (23)

comment:1 Changed 4 years ago by Antoine Martin

We probably want to make this *the* default proxy for all sessions on a local system (in TCP mode - SSH has user auth already), and allow system level authentication (via PAM on Linux, per-platform auth). It should support "xpra list" too.
So we need to specify the username, probably by adding a "--username=NNNN" switch (and/or supporting tcp:username@host:port).
Then there are threading issues: at the moment we start many threads per connection (2 for reading and 2 for writing), which is fine when you typically have just one connection active at a time, but this becomes a problem if we want to proxy for dozens of users. Even more so if the proxy handles picture encoding (one more thread..)
We also need to deal with session discoverability, and this is a good reason for moving the server sockets to /tmp/. Backwards compatibility can be achieved by symlinking to the old location, checking both locations, (and maybe even adding some code to the run-xpra script?)
Then ideally we would want some sort of privilege separation between the code that needs root (socket binding, connecting to xpra sockets in /tmp/) and the code that runs once authenticated (IO and encoding).

comment:2 Changed 4 years ago by Antoine Martin

Here's how I think this is going to work using the python multiprocessing module:

  • proxy runs as root (generally - not strictly required)
  • add an authentication option: --auth=pam|win32security|ldap|file|script...

(and maybe this can be used for regular servers too?)

  • when we receive a new connection, we process authentication via one of the auth modules
  • this auth module checks username/password/[display]?

(maybe think about modules that can use a challenge rather than using a plain password)
and returns: real uid, real server URI(s), [xpra env options], [session options]

  • the server can then launch a sub-process, passing it the socket connection (as per this example) and let it deal with changing uid, etc
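
The flow above (check the credentials, then return the real uid, the server URI(s) and the optional settings) could be modelled roughly as follows. This is only a sketch: `AuthResult`, `DictAuth` and `authenticate` are hypothetical names, not the interface used in the patches, and the in-memory dictionary stands in for a real backend such as PAM or a password file.

```python
import hmac
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AuthResult:
    """What a successful authentication hands back to the proxy."""
    uid: int
    server_uris: List[str]
    env_options: Dict[str, str] = field(default_factory=dict)
    session_options: Dict[str, str] = field(default_factory=dict)


class DictAuth:
    """Hypothetical auth module backed by an in-memory dictionary."""

    def __init__(self, users):
        # users maps username -> (password, AuthResult)
        self.users = users

    def authenticate(self, username, password) -> Optional[AuthResult]:
        entry = self.users.get(username)
        if entry is None:
            return None
        expected, result = entry
        # constant-time comparison of the two passwords
        if not hmac.compare_digest(expected.encode(), password.encode()):
            return None
        return result
```

Returning `None` on failure keeps the caller simple: the proxy only launches a sub-process when it gets an `AuthResult` back.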

Notes:

  • maybe disallow system auth if we're connecting from a non-encrypted TCP socket?
  • support "xpra list"
  • don't want to use sendmsg and completely separate processes
  • connecting to the real server: can't think of any disadvantages of doing it in the subprocess
Last edited 4 years ago by Antoine Martin

Changed 4 years ago by Antoine Martin

Attachment: auth-v3.patch added

splits authentication from server core, adds auth modules and keyfile so password file and encryption keyfile can be different

Changed 4 years ago by Antoine Martin

Attachment: auth-v5.patch added

updated patch (broken multiprocessor support..)

comment:3 Changed 4 years ago by Antoine Martin

Sigh. As explained here: Caution: python-multiprocessing, threads and glib don't mix

So the v5 patch does not run... as the idle_add calls never fire.

Changed 4 years ago by Antoine Martin

Attachment: auth-v6.patch added

updated patch using timers and custom code instead of gobject from the subprocesses

comment:4 Changed 4 years ago by Antoine Martin

The auth-v6.patch works around this by using custom code instead of gobject.
Lots of new improvements too:

  • username works
  • pam auth works, and setuid/setgid too
  • both threaded and multiprocessing modes work (controlled via env var)

What does not work yet / not done yet:

  • client connection fails most of the time (race with encryption setup, causes invalid packet)
  • hello filtering needs improvement (rencode / compression should use better values + use file overrides if we have any)
  • signal handling (not done) - see Python: Using KeyboardInterrupt with a Multiprocessing Pool
  • handle connection strings (and URIs) like: tcp:username@host:port
  • force kill subprocesses on exit?
  • restrict the subprocesses more: should not need file access (load passwords and keys beforehand?) or any new sockets, limit resource usage, prevent forking, prevent new imports, etc
  • invalid usernames should still trigger challenge (and avoid user enumeration)

etc..

Changed 4 years ago by Antoine Martin

Attachment: auth-v8.patch added

adds attempts at signal handling and process cleanup + 1 important server fix

Changed 4 years ago by Antoine Martin

Attachment: auth-v10.patch added

many fixes (except encryption drop outs)

comment:5 Changed 4 years ago by Antoine Martin

Works ok as of r4399. We have a number of auth modules we can choose from:

  • allow: always allows the user to login - dangerous / only for testing
  • fail: always fails authentication - useful for testing
  • file: looks up usernames and passwords in the password file (format changed)
  • pam: Linux PAM authentication
  • win32: win32security authentication
  • sys is a virtual module which will choose win32 or pam
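
As an illustration of the "sys" virtual module, a resolver could map it to a concrete module name based on the platform. This is a sketch, not the code from r4399: `resolve_auth_module` is a hypothetical name, and treating every non-Windows platform as "pam" is an assumption.

```python
import sys


def resolve_auth_module(name, platform=sys.platform):
    """Map the virtual "sys" module to a concrete one, leave others untouched."""
    if name == "sys":
        # assumption: anything that is not Windows uses PAM
        return "win32" if platform.startswith("win") else "pam"
    return name
```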

Once authenticated, the proxy server starts a new process as the user that successfully authenticated (with the uid and gid taken from the password database) and connects to the real server.
We choose the real display to connect to using the "display" capability (TODO: let client specify it) or choose the only session we find (if only one exists), or we fail.
The special case is with the file auth module, which allows us to specify authentication values which may not be valid system users (though a valid uid/gid pair is still required in that case) and a target display which may be a remote one (ie: "tcp:host:port")
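
The display selection rule described above (use the requested display if there is one, else the only session found, else fail) can be sketched like this. `choose_display` is a hypothetical helper, and `sessions` is assumed to be the list of live displays found for the user.

```python
def choose_display(requested, sessions):
    """Return the display to connect to, or raise ValueError if ambiguous."""
    if requested:
        # may be a remote target such as "tcp:host:port" (file auth module)
        return requested
    if len(sessions) == 1:
        return sessions[0]
    if not sessions:
        raise ValueError("no sessions found")
    raise ValueError("more than one session found, a display must be specified")
```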

Last edited 4 years ago by Antoine Martin

comment:6 Changed 4 years ago by Antoine Martin

Here's how you can use it with the file auth module (sys auth needs encryption to work as we refuse to send unencrypted system passwords over the sockets):

  • start the server
    xpra proxy :100 --bind-tcp=0.0.0.0:20000 --auth=file --password-file=./xpra-auth
    
  • add your user entries in the auth file, ie:
    echo "antoine|thepassword|1000|1000|tcp:testhost:10|ENV=VALUE|compression=0" >> ./xpra-auth
    
  • connect from the client:
    echo "thepassword" >> password.txt
    xpra attach --username=antoine --password-file=./password.txt $PROXYHOST:20000
    

This should cause the proxy to forward the connection to the display specified in the auth file (in the example above: tcp:testhost:10)

comment:7 Changed 4 years ago by Antoine Martin

Many important fixes in r4537 and r4541 should make this a lot more usable now.

If things don't work as expected, check that you haven't got an old daemon/zombie running.
Note: as of r4557, one can add session options to the auth file (only two are supported so far, as a proof of concept: compression_level and lz4), e.g.:

username|password|1000|1000|tcp:localhost:10000|ENV=VALUE|compression_level=1;lz4=0
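
For reference, a line in this format could be parsed along the following lines. This is a sketch based only on the examples in this ticket, not the parser used by the file auth module: the names are hypothetical, the field order is taken from the example above, and using ";" to separate multiple environment options is an assumption (the ticket only shows it for session options).

```python
from typing import Dict, NamedTuple


class AuthEntry(NamedTuple):
    username: str
    password: str
    uid: int
    gid: int
    display: str
    env_options: Dict[str, str]
    session_options: Dict[str, str]


def parse_options(text, sep=";"):
    """Turn "a=1;b=2" into {"a": "1", "b": "2"}, skipping empty items."""
    options = {}
    for item in text.split(sep):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        options[key] = value
    return options


def parse_auth_line(line):
    """Parse one '|'-separated auth file entry (the last two fields are optional)."""
    parts = line.strip().split("|")
    if len(parts) < 5:
        raise ValueError("expected at least 5 fields, got %i" % len(parts))
    username, password, uid, gid, display = parts[:5]
    env = parse_options(parts[5]) if len(parts) > 5 else {}
    session = parse_options(parts[6]) if len(parts) > 6 else {}
    return AuthEntry(username, password, int(uid), int(gid), display, env, session)
```

Note that with this naive split, a password containing "|" would break the entry.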

Feedback welcome!

Last edited 4 years ago by Antoine Martin

comment:8 Changed 4 years ago by Antoine Martin

Some important fixes in r4605, r4606, r4608, etc

I have identified the problem with the encryption: it isn't a problem with the encryption per se, the encryption just makes it more obvious.
When using the proxy server, we *always* end up dropping the first packet that the client sends after the hello. Normally, that's a "set_deflate" or one of two "server-settings" (if applicable) or the first of the three "clipboard-token"s...
So, when not using encryption, it's still wrong but we just don't notice because those packets aren't essential!
The AES decryption relies on the strict presence and order of the data, and the missing packet causes a corrupted stream and disconnection.

That's because when we close the proxy-side connection, we may still have a read blocked in IO wait state via socket.recv. When the next packet comes in, it gets to read it before closing down...
We either want to force exit the read loop early (not sure how), or get the data read and inject it into the subprocess (intrusive/ugly)... Or add a way to get the client to send a socket flush() (probably not enough to trigger a proxy read?) or to send a dummy unencrypted packet so we can close the connection? (also ugly but somewhat cleaner: everything exits with normal codepaths)

Last edited 4 years ago by Antoine Martin

comment:9 Changed 4 years ago by Antoine Martin

The socket race is fixed in r4614 and encryption now works in proxy mode too (still only between client and proxy - between proxy and proxied server would require more configuration options, and is not a priority at the moment)

Note: we use a socket timeout (defaults to 0.1s) to guarantee that the sockets are always in a consistent state when handing them over to the new subprocess.
This does slow down the initial connection (on average by half that delay, so about 50ms). The current value seems like a good compromise between polling too frequently (wasting CPU) and waiting too long.

r4615 allows this timeout to be configured via the XPRA_PROXY_SOCKET_TIMEOUT env var. (Setting this value too high makes the delay much more noticeable, and one can even set it so high that the connection will often time out.)
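
The polling idea can be sketched as a read loop that wakes up at every timeout to check whether it should stop, instead of staying blocked in socket.recv. This only illustrates the technique and is not the code from r4614: `read_until_stopped` and its arguments are hypothetical, and how the proxy treats data that arrives during the final poll is not covered here.

```python
import os
import socket
import threading

# same env var and default as described above
SOCKET_TIMEOUT = float(os.environ.get("XPRA_PROXY_SOCKET_TIMEOUT", "0.1"))


def read_until_stopped(sock, stop, on_data, bufsize=4096):
    """Read from sock until EOF or until the stop event is set."""
    sock.settimeout(SOCKET_TIMEOUT)
    while not stop.is_set():
        try:
            data = sock.recv(bufsize)
        except socket.timeout:
            continue  # nothing arrived: go round and re-check the stop flag
        if not data:
            break  # the peer closed the connection
        on_data(data)
```

With this loop, setting the `threading.Event` and joining the reader thread returns within roughly one timeout period, which matches the trade-off described above between polling too frequently and waiting too long.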


What is left for this release (the rest can go in an enhancement ticket for another release):

  • signal handling and subprocess exit
  • performance/testing

comment:10 Changed 4 years ago by Antoine Martin

Most of the documentation found in this ticket has been added to the wiki:

comment:11 Changed 4 years ago by Antoine Martin

Owner: changed from Antoine Martin to alas

As of r4735, the proxy server should be able to exit cleanly.
"xpra stop" now works against the main proxy process (one must be authenticated as the same user that runs that process)

I think that's enough for this ticket, please test and close if it all works as expected.
Please verify that the connection from the proxy to the real xpra server uses rencode and not bencode.


What we may want to add (in a new ticket):

  • proxy video encoding (#504), for taking advantage of #370 on the host since a VM will not have direct access to the hardware
  • optimize packet handling (avoid decoding then re-encoding things)
  • password authentication and encryption between the proxy server and the real servers
  • better support for "xpra detach" so we can force kill connections (since the proxy will be long lived)
Last edited 4 years ago by Antoine Martin

comment:12 Changed 4 years ago by Smo

Owner: changed from alas to Smo

This has been tested but not extensively. We are going to be testing this with 10+ clients and making sure there is nothing broken.

comment:13 Changed 4 years ago by Antoine Martin

With r5375 one can see a new socket for each proxy instance (this broke older versions which will need r5373 backported):

$ xpra list
Found the following xpra sessions:
	LIVE session at :proxy-28752
	LIVE session at :10
	LIVE session at :20

Which gives us an easier way of interacting and collecting information from proxy instances. It supports: "info", "version" and "stop".

comment:14 Changed 4 years ago by Smo

Is this normal when using the proxy?

]$ xpra list
Found the following xpra sessions:
        LIVE session at :100
        LIVE session at :17
        LIVE session at :proxy-20954
]$ xpra --username=username --password-file=./password.txt info :proxy-20954
server requested disconnect: this socket only handles 'hello', 'version' and 'stop' requests

comment:15 Changed 4 years ago by Antoine Martin

Hmmm, the warning message was wrong (fixed in r5878): "info" is handled, that's one of the main purposes of the proxy socket.

It works fine here... (as usual)

Is there anything in the proxy log? All I see (since it works):

New proxy instance control connection received: SocketConnection(/home/antoine/.xpra/desktop-proxy-25522)
Connection lost

comment:16 Changed 4 years ago by Antoine Martin

Owner: changed from Smo to Antoine Martin
Status: changed from new to assigned

Got it: don't use --username or --password-file. The proxy instance does not support any authentication at present (and I hope it never needs it). It is on a unix domain socket only, so regular unix permissions should be sufficient, unless someone uses the proxy server and shared group sockets with --socket-dir...

  • r5880 gives a more helpful error message if you try to use authentication
  • r5881 adds information to the man page
Last edited 4 years ago by Antoine Martin

comment:17 Changed 4 years ago by Smo

Resolution: fixed
Status: changed from assigned to closed

Tested this with 8 connections through the proxy with no issues.

comment:18 Changed 3 years ago by Antoine Martin

Some improvements worth mentioning here:

  • r9164: using blocking sockets after the connection is established (fewer timer wakeups)
  • r9163: re-compress window icon (avoids warning)

Both could be backported, but no rush. See also: ticket:838#comment:12

Last edited 3 years ago by Antoine Martin