Xpra: Ticket #172: remove the need to choose an unused DISPLAY when starting xpra

I am not aware of any solution that isn't racy or just plain wrong (making assumptions about where sockets should be and who can list/access them)

What we can do however, is try to get this patch merged into Xorg so we can just make our own randomly generated socket each time (patch from lindi):

diff --git a/src/xcb_util.c b/src/xcb_util.c
index 996ff25..fb0cd3d 100644
--- a/src/xcb_util.c
+++ b/src/xcb_util.c
@@ -139,6 +139,12 @@ static int _xcb_open(char *host, char *protocol, const int display)
     char file[sizeof(base) + 20];
     int filelen;
+    if(protocol && strcmp("unix-with-path", protocol) == 0)
+    {
+        /* display specifies Unix socket with explicit path */
+        return  _xcb_open_unix("unix", host);
+    }
+
     if(*host)
     {
 #ifdef DNETCONN


Sun, 05 Aug 2012 14:08:05 GMT - Antoine Martin: attachment set

X11 unix-with-path protocol patch


Tue, 23 Oct 2012 14:26:13 GMT - grumpfl:

How does this work with the ssh -X command? Is it possible to implement something similar here?


Tue, 23 Oct 2012 16:08:56 GMT - Antoine Martin:

Err no, ssh -X uses your local display, and we also connect to that with the client.

What we need is a virtual display on the server end, something that ssh does not, and almost certainly will not ever do.


Thu, 25 Oct 2012 13:11:07 GMT - grumpfl:

Right. But when I ssh -X into another machine, the DISPLAY environment variable is set to something local, like :3 in my test case. How does the server-side sshd obtain this display number? Perhaps xpra could do just the same?


Thu, 25 Oct 2012 15:25:39 GMT - Antoine Martin:

No it cannot, the "fake" display number used by ssh is forwarded via the ssh tunnel to the real display on your client box. One of the most important features of xpra is its ability to detach and re-attach later, this sort of connection forwarding would preclude that.


Fri, 09 Aug 2013 21:52:05 GMT - krlmlr:

This is unsolved for a year now. What about picking a display number at random, initializing the random number generator

?

Collision probability is very low, and a collision could perhaps be detected and then the next random display number is tried. In the second case (user name hash), the output of xpra list could be analyzed to avoid useless probing.


Sat, 10 Aug 2013 02:42:41 GMT - Antoine Martin:

This is unsolved for a year now

This is unsolved for much longer than that.

It is not as simple as you think it is.


Sat, 10 Aug 2013 20:59:19 GMT - krlmlr:

Well, I'm ready to learn something new.

Let's put it another way: Assume I already have an xpra session running for display :5557 on my local machine. When I do

xpra start :5557 --no-daemon

I receive the following error message:

xpra: error: You already have an xpra server running at :5557
  (did you want 'xpra upgrade'?)

Apparently, xpra internally executes the equivalent of xpra list and sees that there's another session running. Now, if we cheat xpra to use another socket-dir so that xpra list will be empty:

xpra start :5557 --no-daemon --socket-dir=/tmp/test-xpra

the output changes:

Fatal server error:
Server is already active for display 5557
        If this server is no longer running, remove /tmp/.X5557-lock
        and start again.

This is what I call a "collision" in my previous post. To me, it seems that such collisions can be detected very well, and xpra could handle them by simply trying another display.

What am I missing?


Sat, 10 Aug 2013 21:35:38 GMT - krlmlr:

Also, following http://stackoverflow.com/a/12169545/946850, recent X servers (1.13) support the -displayfd <fd> command line option: It will make the X server choose the display itself and write the display number back to the file descriptor <fd>.

On my Ubuntu 13.04 machine, with an active display 0, the command

Xvfb -displayfd 1

executes without error and writes 1 to stdout.

Corresponding SO answer: http://stackoverflow.com/a/18166600/946850


Sun, 11 Aug 2013 06:26:10 GMT - Antoine Martin:

Letting the Xorg server choose the display number using the

-displayfd

switch is totally the right way of doing it (the same process that creates the lock is also the one that checks it), it's a shame they haven't bothered to document it:

$ Xorg -version |& grep X.Org
X.Org X Server 1.14.2
$ Xorg -h |& grep -i displayfd | wc -l
0
$ man Xorg |& grep -i displayfd | wc -l
0



As for the difference between the xpra socket (which is used as some sort of lock) and the Xorg lockfile, they serve slightly different purpose unfortunately. One could conceivably run an xpra server on display NN and also an xpra shadow of the same display for example.


Looking at the implementation hurdles, OTOH:

etc..

So really, it looks doable, but I don't think I'll have the time to deal with it in the near future - so I am keeping the milestone set to "future", but feel free to submit a patch for merging sooner.


Sun, 11 Aug 2013 14:06:29 GMT - krlmlr:

Replying to antoine:

it's a shame they haven't bothered to document it:


They did, actually -- perhaps not everywhere where it would be relevant?

$ man Xserver |% grep displayfd
       -displayfd fd
               finding a free one, will write the display number back on this file descriptor as a newline-terminated string. \
               The -pn option is ignored when using -displayfd.


Looking at the implementation hurdles, OTOH:


Just a matter of checking the length of args in run_remote_server?


Mutually exclusive with --use-display, of course. Should be supported with upgrade as well. Why would shadow be a problem?


Why is this important? (On a side note: Currently, xpra will happily daemonize (and immediately terminate) when I run xpra start :0 on a graphical work station where display 0 is used.)


Indeed.

etc..

So really, it looks doable, but I don't think I'll have the time to deal with it in the near future - so I am keeping the milestone set to "future", but feel free to submit a patch for merging sooner.


Currently, my racy hack works for me, and I'll need to wait until Wheezy's successor to benefit from an implementation. Still, good to know that a "proper" solution is possible after all.


Mon, 12 Aug 2013 06:14:16 GMT - Antoine Martin:

we currently daemonize before we start the display - not sure how tricky that is to re-order

Why is this important?


Because the log filename (and socket filename) contain the DISPLAY value in them, and it is created when we daemonize.


Fri, 15 Nov 2013 14:10:31 GMT - Antoine Martin: owner, status, milestone changed

Seems worth doing, will try for 0.12


Sun, 16 Mar 2014 09:07:14 GMT - Antoine Martin: milestone changed

Too late for this release, will try for the next one, sorry.


Fri, 09 May 2014 14:19:42 GMT - Antoine Martin: milestone changed

out of time...


Sat, 24 May 2014 12:11:44 GMT - Benoit Gschwind: attachment set

patch enabling selection of DISPLAY automaticaly


Sat, 24 May 2014 13:27:45 GMT - Benoit Gschwind: attachment set

make xpra backward compatible


Mon, 26 May 2014 11:22:36 GMT - Antoine Martin: owner, status changed

Comments:


Mon, 26 May 2014 13:33:22 GMT - Benoit Gschwind:

Replying to totaam:

Comments:


Make it consistent with others xpra files (metadata_path and socket file)


I do not understand the issue, you mean that a user can add is current DISPLAY to the list for display in password-file ? I can re-add this, but we have to check for os.environ.has_key("DISPLAY") before, I will do so.


session_id is ":10" for legacy session and "S10" for new session, thus the log still being backward compatible if you use the legacy mode (if I didn't made mistake) Does it make sense to have new session mode backward compatible to legacy mode ? I prefer to encourage the use of the new session (for reasons discussed latter)


In conservative point of view I say yes, we can have a small patch. I will do so :D. But imo I will prefer to use the session id fashion. Using DISPLAY term is confusing because it's used to identify a session but also a X11 DISPLAY, which lead to confusion for us (developers) and for users. At this point I think about screen, which never speak about ttyX or virtual terminal.


Fine for me but I do not have any idea how to do so, maybe check Xorg --version or something like that (I would prefer a run time check because user can update Xorg while keeping old Xpra)


Because it like screen, and as using display it like calling a cat by is color instead of give him a name. Another point is that this allow the proxy to give a virtual session ID for different host but with same DISPLAY. You can also allow the user to give a nice name like 'my-shiny-xterm-session' at the end. This a lot more versatile.


I will do :)


as you prefer, I didn't find a good place


At this moment no one, but xpra list may do the job


I agree with you about it, I made this in that sense, I will update it to key-value file.


An other way to get the information may be to use the socket, I choose to not use it to avoid deadlock, but if we want we can remove the metadata file and store it globally withing Xpra server process and add a way to ask the server via the socket. (maybe this mechanism already exist)


which kind of session_string ? I will fix it if I have the whole use case view :)


I will fix it


Yes I didn't see where it is, but the use of display is bad idea in the code imo, I prefer to use it only to reference the real display.


I will fix it


Added to todo list


Understand a code from others is some time hard :D


Mon, 26 May 2014 14:10:56 GMT - Antoine Martin:

Make it consistent with others xpra files (metadata_path and socket file)


Exactly: it was consistent before, and now with this patch applied it isn't any more!

I do not understand the issue, you mean that a user can add is current DISPLAY to the list for display in password-file ? I can re-add this, but we have to check for os.environ.has_key("DISPLAY") before, I will do so.


Ah, I understand now: this was a bug, fixed in r6565 for trunk. (will backport later) This is important to prevent the proxy from trying to connect to itself in a loop, and must not be removed!

thus the log still being backward compatible


You're right, my bad, sorry.

Using DISPLAY term is confusing because it's used to identify a session but also a X11 DISPLAY, which lead to confusion for us (developers) and for users


I think many would argue that it keeps things simple, as there is only one identifier for two things that are very tightly coupled together!

Fine for me but I do not have any idea how to do so, maybe check Xorg --version or something like that (I would prefer a run time check because user can update Xorg while keeping old Xpra)

No problem, I can take care of this. The same way we already detect the availability of Xorg Xdummy and fallback to Xvfb. Those messing with versions (a very small minority of users) can take the responsability of updating their config files. Usually they'll be upgrading which is backwards compatible (whereas downgrading might break things). Runtime check is too onerous, especially for those starting lots of sessions all the time (ie: me!)

Because it like screen.. You can also allow the user to give a nice name like 'my-shiny-xterm-session' at the end. This a lot more versatile.


I agree this would be valuable, in fact it has been on my todo list for a while now. The way I was going to do this was using the "session-name" from xpra info.. But this patch doesn't do this yet, right? (just allows it)

At this moment no one, but xpra list may do the job


Good point. Assuming we go down that route, I still think it needs more than that, like an exit hook to cleanup the file when we know the session is terminating.

An other way to get the information may be to use the socket


It didn't before, but it does now: r6566 adds the "server.display" to xpra info:

xpra info  | grep -i display


Understand a code from others is some time hard :D


I know! So if there are any areas that aren't clear, post them and I can easily add more docstrings to trunk. Making the code more readable is always worth spending a few minutes doing, I just don't know what needs documenting better because I've spent so much time on it!


Tue, 27 May 2014 11:47:08 GMT - Benoit Gschwind: attachment set

enable Xpra to select display if user do not specify one (minimalist one)


Thu, 29 May 2014 09:42:16 GMT - Antoine Martin:

Reviewed:

In light of the amount of stuff that is clearly broken, did you really test this properly?


Thu, 29 May 2014 15:25:26 GMT - Benoit Gschwind: attachment set


Fri, 30 May 2014 10:02:52 GMT - Antoine Martin:

If you want me to review or merge something, please re-assign tickets to me.

I assume this was the case here. Comments:

Finally, there is a blocker problem: The Xorg logfile is created when we start the Xorg process, and so it comes up as ~/.xpra/Xorg.${DISPLAY}.log since we don't have the display number ready.


Fri, 30 May 2014 10:23:15 GMT - Antoine Martin: attachment set

updated patch with most of the comments addressed


Fri, 30 May 2014 10:34:09 GMT - Benoit Gschwind:

Replying to totaam:

If you want me to review or merge something, please re-assign tickets to me.

I assume this was the case here. Comments:


Code was tested, time not need to being imported, why ? I do not know.


I agree this is not needed.


3 second in select + 1 second in sleep make 4 second ;)


Was fixed by select but comment was not updated.

Finally, there is a blocker problem: The Xorg logfile is created when we start the Xorg process, and so it comes up as ~/.xpra/Xorg.${DISPLAY}.log since we don't have the display number ready.


Yes this is quite complicated to fix this issue

Best regards


Fri, 30 May 2014 10:35:01 GMT - Benoit Gschwind: owner changed


Fri, 30 May 2014 10:35:44 GMT - Benoit Gschwind:

totaam your are wellcome to merge any patch you need :)

Best regards


Fri, 30 May 2014 10:45:50 GMT - Benoit Gschwind:

The subprocess.Popen flag close_fds has been changed, maybe we should close all the other fds ourselves (all but the pipes we need) - I believe those were closed for a reason (wrong process getting the signals?)

I forgot to reply this, I do not know how to close file for the subprocess without closing those file for the current process


Fri, 30 May 2014 10:52:20 GMT - Antoine Martin: owner changed

A solution you can try to implement for the log file name problem: in the new "automatic display" mode, use a large and unique temporary value for the string substitution of DISPLAY in the xvfb command string (a uuid.uuid4() for example), then once the display is up you can easily find the log file created in sockdir (if one exists) and rename it to its "correct" name

I don't really like it, but I can't think of anything better, and something has to be done before I can merge this.

Then assuming this works out ok, we only need documentation updates to merge this.

(please submit patches re-based on the updated patch)


Fri, 30 May 2014 11:29:50 GMT - Benoit Gschwind:

I remembered why I forced 1 second of waiting outside select:

select can unblock imediatly which cannot leave the time to the subprocess to write the entire display name. if we unblock in chain for any reason (file interupt for exemple) we wait 0 sec. and get any result, with 1 seconde fo sleep you are sure to get at less 3 sec. for the child to process.


Fri, 30 May 2014 15:25:44 GMT - Antoine Martin:

Sure, then change it to a time based while loop instead, no need for sleep. Something like:

while elasped<10:
   select & read()

Sat, 31 May 2014 11:26:15 GMT - Benoit Gschwind: attachment set

fix the logfile location


Sat, 31 May 2014 11:29:35 GMT - Benoit Gschwind: owner changed

I fixed the location of logfile and the select


Sat, 31 May 2014 12:56:30 GMT - Antoine Martin:

Some notes:

man page updates and I think I can merge this.


Sat, 31 May 2014 13:13:56 GMT - Benoit Gschwind:

I made a mistake the timeout in select is invalid :

maybe using 1 second will be valid.


Sun, 01 Jun 2014 06:44:20 GMT - Antoine Martin:

Merged in r6613 (general info in commit message)

Including the changes from comment:28 and also:

Things left to do:


Sun, 01 Jun 2014 11:41:00 GMT - Antoine Martin:

File descriptor fixes in r6624


Tue, 10 Jun 2014 15:06:31 GMT - Antoine Martin:

Another fix: restore "skip display_name_check when shadowing or proxying" in r6712


Sat, 14 Jun 2014 12:58:36 GMT - Benoit Gschwind: attachment set


Sat, 14 Jun 2014 12:58:47 GMT - Benoit Gschwind: attachment set


Sat, 14 Jun 2014 13:00:01 GMT - Benoit Gschwind:

I added proposal to build setup.py from template and detect Xdummy / Xvfb and displayfd


Sat, 14 Jun 2014 13:12:26 GMT - Antoine Martin:

I'll have to check but I don't think this is going to work when building packages with rpmbuild / debuild, which do not allow the source to be modified in place.. and I don't know how to get distutils to generate files during the "install" phase.

(otherwise I would have replaced the existing xpra.conf template directories with a solution similar to this one a long time ago..)


Sat, 14 Jun 2014 13:14:06 GMT - Benoit Gschwind:

for ssh string maybe :

I do not like this syntax but at less it resolve the issue


Sat, 14 Jun 2014 15:54:41 GMT - Benoit Gschwind: attachment set


Sat, 14 Jun 2014 15:57:05 GMT - Benoit Gschwind:

This last patch is a draft of a build and install template.

It may be usefull to submit a proper build_tpl and install_tpl module to setuptools but I'm not an expert of setuptools


Mon, 16 Jun 2014 08:53:36 GMT - Antoine Martin:

Awesome stuff! So this is how you override the build or install steps (which I could not figure out how to do).

Merged in r6814 with the following tweaks:

The remaining TODO items are nice-to-have, but not showstoppers.


Tue, 17 Jun 2014 02:25:15 GMT - Antoine Martin:

bsd path fix in r6816.


Tue, 17 Jun 2014 13:18:30 GMT - Antoine Martin:

build fixes for rpmbuild / debuild in r6823.


Fri, 20 Jun 2014 09:45:27 GMT - Antoine Martin:

(ugly) build fix for win32 in r6853


Thu, 10 Jul 2014 21:33:23 GMT - Antoine Martin: status changed; resolution set

This will do for this release. Moving the ssh bits to #612.


Sun, 13 Jul 2014 16:06:36 GMT - Antoine Martin:

The bug that keeps on giving: I needed r6895 to get the xpra.conf installed into /etc/xpra/ when doing a simple: sudo ./setup.py install


Fri, 29 Aug 2014 06:01:40 GMT - Antoine Martin:

Yet another bug: Debian Wheezy ships with Xorg 1.12, but does not support -displayfd. We cannot assume Xdummy support implies displayfd support.


Thu, 20 Nov 2014 02:18:00 GMT - Antoine Martin:

Yet another missing case: r8126, backport in r8129.


Mon, 18 May 2015 04:18:17 GMT - Antoine Martin:

Yet another missing case: r9401, backport in r9402.


Tue, 02 Jun 2015 14:54:09 GMT - Antoine Martin:

Forgot to fix Debian Wheezy, now done in r9574.

Note: we just bump the version check to Xorg 1.13, Fedora supported it with 1.12 but that must have been using some patching. And that Fedora version is EOL, so we don't care.


Mon, 02 May 2016 05:03:14 GMT - Antoine Martin:

This has been broken by the 1.18.1 xorg update due to a bug in their new automatic log file renaming which duplicates what we already do, but gets it wrong - see #1192.


Tue, 30 Aug 2016 04:36:11 GMT - Antoine Martin:

r13502 removes the displayfd option since all the distros we support have it (also simplifies the code)


Mon, 12 Nov 2018 11:41:17 GMT - Antoine Martin:

From the ticket that keeps on giving, another bug: #2032.


Mon, 16 Nov 2020 09:59:07 GMT - Antoine Martin:

8 years later and it turns out that we were 100% correct: Allow display names to be arbitrary paths: This lets the local listening socket live in a protected directory where other users cannot spoof it.


Sat, 23 Jan 2021 04:47:29 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/172