I am not aware of any solution that isn't racy or just plain wrong (making assumptions about where sockets should be and who can list/access them)
What we can do however, is try to get this patch merged into Xorg so we can just make our own randomly generated socket each time (patch from lindi):
diff --git a/src/xcb_util.c b/src/xcb_util.c index 996ff25..fb0cd3d 100644 --- a/src/xcb_util.c +++ b/src/xcb_util.c @@ -139,6 +139,12 @@ static int _xcb_open(char *host, char *protocol, const int display) char file[sizeof(base) + 20]; int filelen; + if(protocol && strcmp("unix-with-path", protocol) == 0) + { + /* display specifies Unix socket with explicit path */ + return _xcb_open_unix("unix", host); + } + if(*host) { #ifdef DNETCONN
X11 unix-with-path protocol patch
How does this work with the ssh -X
command? Is it possible to implement something similar here?
Err no, ssh -X
uses your local display, and we also connect to that with the client.
What we need is a virtual display on the server end, something that ssh does not, and almost certainly will not ever do.
Right. But when I ssh -X
into another machine, the DISPLAY
environment variable is set to something local, like :3
in my test case. How does the server-side sshd
obtain this display number? Perhaps xpra
could do just the same?
No it cannot, the "fake" display number used by ssh is forwarded via the ssh tunnel to the real display on your client box. One of the most important features of xpra is its ability to detach and re-attach later, this sort of connection forwarding would preclude that.
This is unsolved for a year now. What about picking a display number at random, initializing the random number generator
$USER
?
Collision probability is very low, and a collision could perhaps be detected and then the next random display number is tried. In the second case (user name hash), the output of xpra list
could be analyzed to avoid useless probing.
This is unsolved for a year now
This is unsolved for much longer than that.
It is not as simple as you think it is.
Well, I'm ready to learn something new.
Let's put it another way: Assume I already have an xpra session running for display :5557 on my local machine. When I do
xpra start :5557 --no-daemon
I receive the following error message:
xpra: error: You already have an xpra server running at :5557 (did you want 'xpra upgrade'?)
Apparently, xpra internally executes the equivalent of xpra list
and sees that there's another session running. Now, if we cheat xpra to use another socket-dir
so that xpra list
will be empty:
xpra start :5557 --no-daemon --socket-dir=/tmp/test-xpra
the output changes:
Fatal server error: Server is already active for display 5557 If this server is no longer running, remove /tmp/.X5557-lock and start again.
This is what I call a "collision" in my previous post. To me, it seems that such collisions can be detected very well, and xpra could handle them by simply trying another display.
What am I missing?
Also, following http://stackoverflow.com/a/12169545/946850, recent X servers (1.13) support the -displayfd <fd>
command line option: It will make the X server choose the display itself and write the display number back to the file descriptor <fd>
.
On my Ubuntu 13.04 machine, with an active display 0, the command
Xvfb -displayfd 1
executes without error and writes 1
to stdout.
Corresponding SO answer: http://stackoverflow.com/a/18166600/946850
Letting the Xorg server choose the display number using the
-displayfd
switch is totally the right way of doing it (the same process that creates the lock is also the one that checks it), it's a shame they haven't bothered to document it:
$ Xorg -version |& grep X.Org X.Org X Server 1.14.2 $ Xorg -h |& grep -i displayfd | wc -l 0 $ man Xorg |& grep -i displayfd | wc -l 0
As for the difference between the xpra socket (which is used as some sort of lock) and the Xorg lockfile, they serve slightly different purpose unfortunately. One could conceivably run an xpra server on display NN and also an xpra shadow of the same display for example.
Looking at the implementation hurdles, OTOH:
--use-display
? upgrade
? and xpra shadow
?
setsid
and close_fds=True
so passing the file descriptor is going to be more complicated
CentOS 6.3
and older, Debian wheezy, Ubuntu precise, etc)
etc..
So really, it looks doable, but I don't think I'll have the time to deal with it in the near future - so I am keeping the milestone set to "future", but feel free to submit a patch for merging sooner.
Replying to antoine:
it's a shame they haven't bothered to document it:
They did, actually -- perhaps not everywhere where it would be relevant?
$ man Xserver |% grep displayfd -displayfd fd finding a free one, will write the display number back on this file descriptor as a newline-terminated string. \ The -pn option is ignored when using -displayfd.
Looking at the implementation hurdles, OTOH:
- command line parsing will need to change as the display value becomes optional - unless we define some sort of magic value to imply this "auto" mode
Just a matter of checking the length of args
in run_remote_server
?
- potential problems/conflicts with
--use-display
?upgrade
? andxpra shadow
?
Mutually exclusive with --use-display
, of course. Should be supported with upgrade
as well. Why would shadow
be a problem?
- we currently daemonize before we start the display - not sure how tricky that is to re-order
Why is this important? (On a side note: Currently, xpra will happily daemonize (and immediately terminate) when I run xpra start :0
on a graphical work station where display 0 is used.)
- we run the vfb via
setsid
andclose_fds=True
so passing the file descriptor is going to be more complicated
Indeed.
- man page updates, "xpra --help" updates, wiki updates
- dealing with platforms without the feature: probably via a build time switch to disable the feature (ie:
CentOS 6.3
and older, Debian wheezy, Ubuntu precise, etc)etc..
So really, it looks doable, but I don't think I'll have the time to deal with it in the near future - so I am keeping the milestone set to "future", but feel free to submit a patch for merging sooner.
Currently, my racy hack works for me, and I'll need to wait until Wheezy's successor to benefit from an implementation. Still, good to know that a "proper" solution is possible after all.
we currently daemonize before we start the display - not sure how tricky that is to re-order
Why is this important?
Because the log filename (and socket filename) contain the DISPLAY
value in them, and it is created when we daemonize.
Seems worth doing, will try for 0.12
Too late for this release, will try for the next one, sorry.
out of time...
patch enabling selection of DISPLAY automaticaly
make xpra backward compatible
Comments:
dotxpra.log_path
returns a path, why change it to a filename?
proxy_virtual_display
loop detection in proxy_server
? Isn't this potentially dangerous?
s/display/session/
)
Xvfb
and also older versions of Xorg
which do not have the -displayfd
command line option: this should probably be a flag in /etc/xpra/xpra.conf
, we can detect it at install time. Then if the feature is not supported, the display will remain a required argument for starting the server.
save_display_name
and load_display_name
: maybe those belong in dotxpra
instead?
display=VALUE
? If we ever need to re-use the file, then we can just add properties and remain backwards compatible, otherwise... problems later.
or "unknown"
added to them)
pick_session
probably breaks connecting to single proxy sockets which will not match the regular expression added
normalize_local_session_id
: I like the use of re
a lot (no idea why the code was this convoluted!). But, it needs to include an error message as before for the failure case, and it must return the normalized string (ie: :0
for :0.0
)
local_display = path[len(base):]
- at that point, it's no longer a display, right? should probably be renamed to server_socket
or something, very minor detail compared to the comments above
xpra/scripts/server.py
imports load_display_name
but does not use it (another detail)
print "YYYY"
, print socket_path
, (no problem ;)
Replying to totaam:
Comments:
dotxpra.log_path
returns a path, why change it to a filename?
Make it consistent with others xpra files (metadata_path and socket file)
- why remove
proxy_virtual_display
loop detection inproxy_server
? Isn't this potentially dangerous?
I do not understand the issue, you mean that a user can add is current DISPLAY to the list for display in password-file ?
I can re-add this, but we have to check for os.environ.has_key("DISPLAY") before, I will do so.
- the patch renames the log file to use the session id, it should continue to use the display to keep things backwards compatible
session_id is ":10" for legacy session and "S10" for new session, thus the log still being backward compatible if you use the legacy mode (if I didn't made mistake)
Does it make sense to have new session mode backward compatible to legacy mode ? I prefer to encourage the use of the new session (for reasons discussed latter)
- can't we split this patch and just deal with choosing a display first? Just the very simple case: when you don't specify a display for the start command and one is given to you, nothing else. Then we can worry about exposing the chosen display somehow. Why is necessary to do it all in one? It would make it much more manageable and easier to merge (a few dozen lines vs hundreds). (even if half of it is just semi automatic search and replace
s/display/session/
)
In conservative point of view I say yes, we can have a small patch. I will do so :D. But imo I will prefer to use the session id fashion. Using DISPLAY term is confusing because it's used to identify a session but also a X11 DISPLAY, which lead to confusion for us (developers) and for users. At this point I think about screen, which never speak about ttyX or virtual terminal.
- we need to deal with
Xvfb
and also older versions ofXorg
which do not have the-displayfd
command line option: this should probably be a flag in/etc/xpra/xpra.conf
, we can detect it at install time. Then if the feature is not supported, the display will remain a required argument for starting the server.
Fine for me but I do not have any idea how to do so, maybe check Xorg --version
or something like that (I would prefer a run time check because user can update Xorg while keeping old Xpra)
- why do you want to identify sessions using this new session id? (a bit like screen) It seems to me like this is a step backwards in that the pid will be harder to identify than the display.
Because it like screen, and as using display it like calling a cat by is color instead of give him a name. Another point is that this allow the proxy to give a virtual session ID for different host but with same DISPLAY. You can also allow the user to give a nice name like 'my-shiny-xterm-session' at the end. This a lot more versatile.
- TODO: add time out to avoid infite blocking - important!
I will do :)
save_display_name
andload_display_name
: maybe those belong indotxpra
instead?
as you prefer, I didn't find a good place
- who is going to cleanup those metadata files? when? (especially stray ones..)
At this moment no one, but xpra list may do the job
- if we use this approach, I'm not sure I like assuming only the display is in that file... if we are going to have to carry extra metadata, we might as well make it a more generic solution, like a properties file with any key-value pairs (like the config files), and use
display=VALUE
? If we ever need to re-use the file, then we can just add properties and remain backwards compatible, otherwise... problems later.
I agree with you about it, I made this in that sense, I will update it to key-value file.
- what if the metadata file is missing? what happens? (at the very least the info messages need a
or "unknown"
added to them)
An other way to get the information may be to use the socket, I choose to not use it to avoid deadlock, but if we want we can remove the metadata file and store it globally withing Xpra server process and add a way to ask the server via the socket. (maybe this mechanism already exist)
- as it is,
pick_session
probably breaks connecting to single proxy sockets which will not match the regular expression added
which kind of session_string ? I will fix it if I have the whole use case view :)
normalize_local_session_id
: I like the use ofre
a lot (no idea why the code was this convoluted!). But, it needs to include an error message as before for the failure case, and it must return the normalized string (ie::0
for:0.0
)
I will fix it
local_display = path[len(base):]
- at that point, it's no longer a display, right? should probably be renamed toserver_socket
or something, very minor detail compared to the comments above
Yes I didn't see where it is, but the use of display is bad idea in the code imo, I prefer to use it only to reference the real display.
xpra/scripts/server.py
importsload_display_name
but does not use it (another detail)
I will fix it
- man page updates, wiki updates, etc.. lots more work there (assuming that we do want to go ahead with this approach - and I am not convinced yet)
Added to todo list
print "YYYY"
,print socket_path
, (no problem ;)
Understand a code from others is some time hard :D
Make it consistent with others xpra files (metadata_path and socket file)
Exactly: it was consistent before, and now with this patch applied it isn't any more!
I do not understand the issue, you mean that a user can add is current DISPLAY to the list for display in password-file ? I can re-add this, but we have to check for os.environ.has_key("DISPLAY") before, I will do so.
Ah, I understand now: this was a bug, fixed in r6565 for trunk. (will backport later)
This is important to prevent the proxy from trying to connect to itself in a loop, and must not be removed!
thus the log still being backward compatible
You're right, my bad, sorry.
Using DISPLAY term is confusing because it's used to identify a session but also a X11 DISPLAY, which lead to confusion for us (developers) and for users
I think many would argue that it keeps things simple, as there is only one identifier for two things that are very tightly coupled together!
Fine for me but I do not have any idea how to do so, maybe check Xorg --version or something like that (I would prefer a run time check because user can update Xorg while keeping old Xpra)
No problem, I can take care of this. The same way we already detect the availability of Xorg Xdummy and fallback to Xvfb. Those messing with versions (a very small minority of users) can take the responsability of updating their config files. Usually they'll be upgrading which is backwards compatible (whereas downgrading might break things). Runtime check is too onerous, especially for those starting lots of sessions all the time (ie: me!)
Because it like screen.. You can also allow the user to give a nice name like 'my-shiny-xterm-session' at the end. This a lot more versatile.
I agree this would be valuable, in fact it has been on my todo list for a while now.
The way I was going to do this was using the "session-name
" from xpra info
..
But this patch doesn't do this yet, right? (just allows it)
At this moment no one, but xpra list may do the job
Good point. Assuming we go down that route, I still think it needs more than that, like an exit hook to cleanup the file when we know the session is terminating.
An other way to get the information may be to use the socket
It didn't before, but it does now: r6566 adds the "server.display
" to xpra info:
xpra info | grep -i display
Understand a code from others is some time hard :D
I know!
So if there are any areas that aren't clear, post them and I can easily add more docstrings to trunk. Making the code more readable is always worth spending a few minutes doing, I just don't know what needs documenting better because I've spent so much time on it!
enable Xpra to select display if user do not specify one (minimalist one)
Reviewed:
sys.sleep
" there is no such thing in python (you meant time.sleep
)
len(buff)
" - there is no "buff
", there is a "buf
"
fix_log_file
should be a no-op when we have the display name already (which will still be the case for many): the log file should be created with the correct name in that case, we only do work and rename files for the "automatic" case
In light of the amount of stuff that is clearly broken, did you really test this properly?
If you want me to review or merge something, please re-assign tickets to me.
I assume this was the case here. Comments:
main.py
should not have been included
time
module, but without importing it - code was not tested?
sleep
, for what purpose? select
is enough.
os.read
will return something immediately with at least one byte
subprocess.Popen
flag close_fds
has been changed, maybe we should close all the other fds ourselves (all but the pipes we need) - I believe those were closed for a reason (wrong process getting the signals?)
Xvfb
does support -displayfd
, despite being completely undocumented anywhere that I could find
select_log_file
and printing log location messages was a bit wrong
sys.stderr
and sys.stdout
and of their aliases stderr
and stdout
Finally, there is a blocker problem:
The Xorg logfile is created when we start the Xorg process, and so it comes up as ~/.xpra/Xorg.${DISPLAY}.log
since we don't have the display number ready.
updated patch with most of the comments addressed
Replying to totaam:
If you want me to review or merge something, please re-assign tickets to me.
I assume this was the case here. Comments:
- changes to
main.py
should not have been included- you were using the
time
module, but without importing it - code was not tested?
Code was tested, time not need to being imported, why ? I do not know.
- code comments said "gracious 1.0 second" near
sleep
, for what purpose?select
is enough.
I agree this is not needed.
- docstring said "4.0" seconds, but used the value 3.0
3 second in select + 1 second in sleep make 4 second ;)
- you should check the file descriptors returned by select to decide if something can be read
- TODO: add time out to avoid infite blocking - why would it block? If select, says we can read from it,
os.read
will return something immediately with at least one byte
Was fixed by select but comment was not updated.
- The
subprocess.Popen
flagclose_fds
has been changed, maybe we should close all the other fds ourselves (all but the pipes we need) - I believe those were closed for a reason (wrong process getting the signals?)- note: on my system,
Xvfb
does support-displayfd
, despite being completely undocumented anywhere that I could find- the logic around
select_log_file
and printing log location messages was a bit wrong- inconsistent use of
sys.stderr
andsys.stdout
and of their aliasesstderr
andstdout
- the message about log file location should be the same as before
- some of the docstrings were incorrect (and not just the spelling)
Finally, there is a blocker problem: The Xorg logfile is created when we start the Xorg process, and so it comes up as
~/.xpra/Xorg.${DISPLAY}.log
since we don't have the display number ready.
Yes this is quite complicated to fix this issue
Best regards
totaam your are wellcome to merge any patch you need :)
Best regards
The subprocess.Popen flag close_fds has been changed, maybe we should close all the other fds ourselves (all but the pipes we need) - I believe those were closed for a reason (wrong process getting the signals?)
I forgot to reply this, I do not know how to close file for the subprocess without closing those file for the current process
A solution you can try to implement for the log file name problem: in the new "automatic display" mode, use a large and unique temporary value for the string substitution of DISPLAY
in the xvfb
command string (a uuid.uuid4()
for example), then once the display is up you can easily find the log file created in sockdir
(if one exists) and rename it to its "correct" name
I don't really like it, but I can't think of anything better, and something has to be done before I can merge this.
Then assuming this works out ok, we only need documentation updates to merge this.
(please submit patches re-based on the updated patch)
I remembered why I forced 1 second of waiting outside select:
select can unblock imediatly which cannot leave the time to the subprocess to write the entire display name. if we unblock in chain for any reason (file interupt for exemple) we wait 0 sec. and get any result, with 1 seconde fo sleep you are sure to get at less 3 sec. for the child to process.
Sure, then change it to a time based while loop instead, no need for sleep. Something like:
while elasped<10: select & read()
fix the logfile location
I fixed the location of logfile and the select
Some notes:
time.time()
instead of datetime
?
print xorg_log_file
and print f0, f1
xvfb_cmd[xvfb_cmd.index('-logfile')+1]
possible out of bounds (only if the command is invalid, but still - an assert would be nice)
man page updates and I think I can merge this.
I made a mistake the timeout in select is invalid :
maybe using 1 second will be valid.
Merged in r6613 (general info in commit message)
Including the changes from comment:28 and also:
DISPLAY
will be substituted in all cases, not just when -logfile
is found! (amongst other benefits: more readable code, re-used in select_log_file
, etc)
setsid
)
time.time()
and prevent tiny unlikely race leading to negative timeout using max
Things left to do:
displayfd=yes
to the global config file during install step (in a way that does not conflict with packaging: without modifying the files in place..)
displayfd
, and if it doesn't we need to detect that this is what happened and tell the user
File descriptor fixes in r6624
Another fix: restore "skip display_name_check when shadowing or proxying" in r6712
I added proposal to build setup.py from template and detect Xdummy / Xvfb and displayfd
I'll have to check but I don't think this is going to work when building packages with rpmbuild / debuild, which do not allow the source to be modified in place.. and I don't know how to get distutils to generate files during the "install" phase.
(otherwise I would have replaced the existing xpra.conf template directories with a solution similar to this one a long time ago..)
for ssh string maybe :
ssh:host:port:
=> select display automatically
ssh:host:display
ssh:host:port:display
I do not like this syntax but at less it resolve the issue
This last patch is a draft of a build and install template.
It may be usefull to submit a proper build_tpl and install_tpl module to setuptools but I'm not an expert of setuptools
Awesome stuff! So this is how you override the build or install steps (which I could not figure out how to do).
Merged in r6814 with the following tweaks:
xpra.conf
xpra.conf
, tried to separate client and server parts, and added more comments to it, also removed client only config (rarely ever used anyway)
detect_xorg_setup
returns a boolean, transformed into a string later instead
xpra_conf = None
- did nothing
shutil
import in opengl codepath (now global import)
displayfd
support to Xorg server version 1.12 onwards (tested with Fedora 17 / Xorg version 1.12)
/etc/xpra
not /etc/
os.path.join
rather than +
The remaining TODO items are nice-to-have, but not showstoppers.
bsd path fix in r6816.
build fixes for rpmbuild / debuild in r6823.
(ugly) build fix for win32 in r6853
This will do for this release. Moving the ssh bits to #612.
The bug that keeps on giving: I needed r6895 to get the xpra.conf
installed into /etc/xpra/
when doing a simple: sudo ./setup.py install
Yet another bug: Debian Wheezy ships with Xorg 1.12, but does not support -displayfd
. We cannot assume Xdummy support implies displayfd support.
Yet another missing case: r8126, backport in r8129.
Yet another missing case: r9401, backport in r9402.
Forgot to fix Debian Wheezy, now done in r9574.
Note: we just bump the version check to Xorg 1.13, Fedora supported it with 1.12 but that must have been using some patching. And that Fedora version is EOL, so we don't care.
This has been broken by the 1.18.1 xorg update due to a bug in their new automatic log file renaming which duplicates what we already do, but gets it wrong - see #1192.
r13502 removes the displayfd option since all the distros we support have it (also simplifies the code)
From the ticket that keeps on giving, another bug: #2032.
8 years later and it turns out that we were 100% correct: Allow display names to be arbitrary paths: This lets the local listening socket live in a protected directory where other users cannot spoof it.
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/172