xpra icon
Bug tracker and wiki

Opened 12 months ago

Closed 5 months ago

#1654 closed defect (needinfo)

xpra start with proxy fails / xvfb error

Reported by: mviereck Owned by: mviereck
Priority: critical Milestone: 2.3
Component: server Version: 2.2.x
Keywords: start proxy xvfb Cc:

Description

System: debian 9 with xpra v2.1.2-r16903
A regular start of xpra fails. Disabling the proxy mode with --start-via-proxy=no works fine.

$ xpra start :30
2017-10-01 23:12:33,925 server failure: disconnected before the session could be established
2017-10-01 23:12:33,926 server requested disconnect: server error (failed to start a new session)
Warning: cannot use the system proxy for 'start' subcommand,
 FAILURE
Entering daemon mode; any further errors will be reported to:
  /run/user/1000/xpra/:30.log
$ cat   /run/user/1000/xpra/:30.log
2017-10-01 23:12:34,931 Warning: xpra start from an existing 'xfce' desktop session
2017-10-01 23:12:34,931  pulseaudio forwarding may not work
2017-10-01 23:12:34,932  try using a clean environment, a dedicated user,
2017-10-01 23:12:34,932  or disable xpra's "pulseaudio" option
(EE) 
Fatal server error:
(EE) Server is already active for display 30
	If this server is no longer running, remove /tmp/.X30-lock
	and start again.
(EE) 
(EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
(EE) 
2017-10-01 23:12:34,956 
2017-10-01 23:12:34,957 Xvfb command has terminated! xpra cannot continue
2017-10-01 23:12:34,957  if the display is already running, try a different one,
2017-10-01 23:12:34,957  or use the --use-display flag
2017-10-01 23:12:34,957 
2017-10-01 23:12:34,958 killing xvfb with pid 6704
2017-10-01 23:12:34,958 failed to kill xvfb process with pid 6704:
2017-10-01 23:12:34,958  [Errno 3] No such process
# systemctl status xpra
● xpra.service - Xpra System Server
   Loaded: loaded (/lib/systemd/system/xpra.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-10-01 15:38:52 CEST; 7h ago
     Docs: https://xpra.org/trac/wiki/Service
           man:xpra
 Main PID: 613 (xpra)
    Tasks: 1 (limit: 4915)
   Memory: 118.4M
      CPU: 18.407s
   CGroup: /system.slice/xpra.service
           └─613 /usr/bin/python /usr/bin/xpra proxy :14500 --daemon=no --tcp-auth=sys --ssl=on --ssl-cert=/etc/xpra/ssl-cert.pem --bind=none --auth=peercred --socket-dirs=/run/xpra --socket-permissions=666 --log-

Okt 01 16:54:37 debian9 xpra[613]:  display :5 is already active!
Okt 01 23:12:18 debian9 xpra[613]: New unix-domain connection received on /run/xpra/system
Okt 01 23:12:19 debian9 python[6457]: PAM _pam_load_conf_file: unable to open /etc/pam.d/system-auth
Okt 01 23:12:19 debian9 python[6457]: PAM _pam_load_conf_file: unable to open /etc/pam.d/system-auth
Okt 01 23:12:19 debian9 python[6457]: PAM _pam_load_conf_file: unable to open /etc/pam.d/system-auth
Okt 01 23:12:19 debian9 python[6457]: PAM _pam_load_conf_file: unable to open /etc/pam.d/postlogin
Okt 01 23:12:19 debian9 xpra[613]: Entering daemon mode; any further errors will be reported to:
Okt 01 23:12:19 debian9 xpra[613]:   /run/user/1000/xpra/:30.log
Okt 01 23:12:33 debian9 xpra[613]: Error: failed to start server subprocess:
Okt 01 23:12:33 debian9 xpra[613]:  failed to identify the new server display!

If I try the same display number afterwards, xpra start :30 --start-via-proxy=no fails, the display number is already in use. xpra start :31 --start-via-proxy=no works fine.

Attachments (1)

xpra_info_socket.log (15.7 KB) - added by mviereck 7 months ago.

Download all attachments as: .zip

Change History (13)

comment:1 Changed 12 months ago by mviereck

As a strange side effect, I can not start X applications on display :0 afterwards. No protocol specified. XAUTHORITY is set and the file still contains a cookie. Maybe xpra changed and damaged the cookie?

comment:2 Changed 11 months ago by Antoine Martin

Owner: changed from Antoine Martin to mviereck

Sorry, works fine here.
Please try commenting out DEBUG=auth,proxy in /etc/default/xpra and restarting the service:

systemctl restart xpra.service

You should then be able to see much more debug information using journalctl.

You can also check the health of the system wide proxy server using:

xpra info socket:/run/xpra/system

comment:3 Changed 11 months ago by Antoine Martin

I missed the really obvious part where you already had an X11 server running on display :30.
I think that xpra is working as intended here: if you wan't to re-use an existing display server, specify the --use-display flag.
If there is a bug, it might actually be that the non-proxy start re-uses the display without needed the --use-display flag, not the other way around.

If somehow xpra failed to kill the display when stopping, this would also be a different bug: we would need the debug log of the xpra server shutting down to diagnose that.

(you can enable debugging after the server is started using "xpra control" or the server's dbus interface)

comment:4 Changed 11 months ago by Antoine Martin

Resolution: needinfo
Status: newclosed

2.2 is nearly ready

comment:5 Changed 7 months ago by mviereck

The bug still persists, now tested with xpra v2.3-r18051.

I missed the really obvious part where you already had an X11 server running on display :30.

I did not run X on :30. Only xpra can have done that. It's the same regardless which display number I choose.

xpra server log of xpra start :40:

2018-02-27 14:27:45,092 Warning: xpra start from an existing 'xfce' desktop session
2018-02-27 14:27:45,092  pulseaudio forwarding may not work
2018-02-27 14:27:45,093  try using a clean environment, a dedicated user,
2018-02-27 14:27:45,093  or disable xpra's "pulseaudio" option
2018-02-27 14:27:45,099 cannot access python uinput module:
2018-02-27 14:27:45,099  No module named uinput
(EE) 
Fatal server error:
(EE) Server is already active for display 40
	If this server is no longer running, remove /tmp/.X40-lock
	and start again.
(EE) 
(EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
(EE) 
2018-02-27 14:27:45,139 
2018-02-27 14:27:45,139 Xvfb command has terminated! xpra cannot continue
2018-02-27 14:27:45,140  if the display is already running, try a different one,
2018-02-27 14:27:45,140  or use the --use-display flag
2018-02-27 14:27:45,140 
2018-02-27 14:27:45,140 killing xvfb with pid 2495
2018-02-27 14:27:45,141 failed to kill xvfb process with pid 2495:
2018-02-27 14:27:45,141  [Errno 3] No such process

journalctl -u xpra after enabling DEBUG=auth,proxy and xpra start :40:

Feb 27 14:24:57 debian9 xpra[633]: New unix-domain connection received on /run/xpra/system
Feb 27 14:24:57 debian9 xpra[633]: Error: failed to start server subprocess:
Feb 27 14:24:57 debian9 xpra[633]:  display :30 is already active!
Feb 27 14:27:27 debian9 xpra[633]: New unix-domain connection received on /run/xpra/system
Feb 27 14:27:29 debian9 python[1308]: PAM _pam_load_conf_file: unable to open /etc/pam.d/system-auth
Feb 27 14:27:29 debian9 python[1308]: PAM _pam_load_conf_file: unable to open /etc/pam.d/system-auth
Feb 27 14:27:29 debian9 python[1308]: PAM _pam_load_conf_file: unable to open /etc/pam.d/system-auth
Feb 27 14:27:29 debian9 python[1308]: PAM _pam_load_conf_file: unable to open /etc/pam.d/postlogin
Feb 27 14:27:29 debian9 xpra[633]: Entering daemon mode; any further errors will be reported to:
Feb 27 14:27:29 debian9 xpra[633]:   /run/user/1000/xpra/:40.log
Feb 27 14:27:42 debian9 xpra[633]: Error: failed to start server subprocess:
Feb 27 14:27:42 debian9 xpra[633]:  failed to identify the new server display!
Feb 27 14:30:06 debian9 systemd[1]: Stopping Xpra System Server...
Feb 27 14:30:06 debian9 xpra[633]: got signal SIGTERM, exiting
Feb 27 14:30:07 debian9 systemd[1]: Stopped Xpra System Server.
Feb 27 14:30:07 debian9 systemd[1]: Started Xpra System Server.
Feb 27 14:30:09 debian9 xpra[11524]: wrote pid 11524 to '/run/xpra.pid'
Feb 27 14:30:10 debian9 xpra[11524]: serving html content from: /usr/share/xpra/www
Feb 27 14:30:10 debian9 xpra[11524]: xpra proxy version 2.3-r18051 64-bit
Feb 27 14:30:10 debian9 xpra[11524]:  uid=0 (root), gid=0 (root)
Feb 27 14:30:10 debian9 xpra[11524]:  running with pid 11524 on Linux Debian stable-updates sid
Feb 27 14:30:10 debian9 xpra[11524]: xpra is ready.
Feb 27 14:30:25 debian9 xpra[11524]: New unix-domain connection received on /run/xpra/system
Feb 27 14:30:25 debian9 xpra[11524]: processing structured info request from unix-domain socket:/run/xpra/system

Attachment: Output of xpra info socket:/run/xpra/system

Changed 7 months ago by mviereck

Attachment: xpra_info_socket.log added

comment:6 Changed 7 months ago by Antoine Martin

Priority: majorcritical
Resolution: needinfo
Status: closedreopened

So it looks like the proxy server starts a session but never finds it, gives up, then the "xpra start" command falls back to a direct start, but then it finds that the vfb server is already active (leftover from the proxy start) and fails.

I am tempted to just turn off the proxy on Debian and be done with it.

comment:7 Changed 7 months ago by Antoine Martin

Status: reopenednew

I managed to reproduce the problem on one Debian system.
I believe this is caused by xauth timeouts, r18643 tries to workaround that problem and free the locks if possible. (see also r18641)
Found the hint by looking at the backed up X11 log file (not the one from the second attempt to start the vfb), and this answer: Why am I getting this message from xauth: “timeout in locking authority file /home/<user>/.Xauthority”?

Starting a session may still fail the first time you run it, but hopefully not after that. If you still have timeouts, you may want to verify that your $XAUTHORITY files have the correct permissions.

This is likely to have been caused by running versions without this fix: r18040 (2.2.0 and 2.2.1), which could have left the xauth file owned by root.

The alternative fix would be to always use our own XAUTHORITY file.

Last edited 7 months ago by Antoine Martin (previous) (diff)

comment:8 Changed 7 months ago by Antoine Martin

r18646 might be a better fix (in trunk) and it is simple but I am reluctant to backport it as the impact could be big and cause problems on some platforms. (for example: logind / systemd might decide to cleanup this directory in some circumstances)

Moved it to a ticket: #1776

Last edited 7 months ago by Antoine Martin (previous) (diff)

comment:9 Changed 7 months ago by mviereck

you may want to verify that your $XAUTHORITY files have the correct permissions.

The permissions seem to be ok, the file is recreated on every gdm login. I did not find a lock file for the cookie.

-rwx------ 1 lauscher lauscher 102 Feb 28 12:10 /run/user/1000/gdm/Xauthority

After failing xpra start :40:

-rw------- 1 lauscher lauscher 104 Feb 28 13:46 /run/user/1000/gdm/Xauthority

The alternative fix would be to always use our own XAUTHORITY file.

I assume that is the best way to go. I always use a self-created XAUTHORITY at a custom location if I set up another X server. If I want to change the authentication file from display :0 I use a copy.

logind / systemd might decide to cleanup this directory in some circumstances

hm, ok, I cannot assess that.

If the whole issue breaks down to cookie permissions than the bug is not as hard as it seems to me first. I am already used to always run xpra with --start-via-proxy=no. If the proxy works well in future versions, I may get better startup performance as I don't need to run several xpra server instances.

Just I see r18646 is already in the repository. I've installed it and check again after next reboot. Is it intended that the latest beta is not above the latest stable now (apt automatically gives me the latest stable instead of the latest beta)? It has been different in the past iirc.

comment:10 Changed 7 months ago by mviereck

Now with r18646. The system proxy still fails, but the fallback works:

lauscher@debian9:~$ xpra start :50
2018-02-28 14:23:18,484 connection timed out
Warning: cannot use the system proxy for 'start' subcommand,
 TIMEOUT
 more information may be available in your system log
lauscher@debian9:~$ Entering daemon mode; any further errors will be reported to:
  /run/user/1000/xpra/:50.log

lauscher@debian9:~$ cat /run/user/1000/xpra/:50.log
2018-02-28 14:23:20,989 Warning: xpra start from an existing 'xfce' desktop session
2018-02-28 14:23:20,990  pulseaudio forwarding may not work
2018-02-28 14:23:20,990  try using a clean environment, a dedicated user,
2018-02-28 14:23:20,990  or disable xpra's "pulseaudio" option
2018-02-28 14:23:20,995 cannot access python uinput module:
2018-02-28 14:23:20,995  No module named uinput

X.Org X Server 1.19.2
Release Date: 2017-03-02
X Protocol Version 11, Revision 0
Build Operating System: Linux 4.9.0-4-amd64 x86_64 Debian
Current Operating System: Linux debian9 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u2 (2018-02-21) x86_64
Kernel command line: BOOT_IMAGE=/vmlinuz-4.9.0-6-amd64 root=UUID=0096536e-ebbe-4bf7-9fce-bc8f2eece1f9 ro quiet
Build Date: 16 October 2017  08:19:45AM
xorg-server 2:1.19.2-1+deb9u2 (https://www.debian.org/support) 
Current version of pixman: 0.34.0
	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(++) Log file: "/run/user/1000/xpra/Xorg.:50.log", Time: Wed Feb 28 14:23:21 2018
(++) Using config file: "/etc/xpra/xorg.conf"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"

TIMEOUT
more information may be available in your system log

I am not sure where to look. If you need more information, please tell me.

comment:11 Changed 7 months ago by Antoine Martin

Milestone: 2.22.3
Version: 2.1.x2.2.x

The permissions seem to be ok

OK, just a bit strange to have the executable bit on a data file.

The alternative fix would be to always use our own XAUTHORITY file.

I assume that is the best way to go

Done, see #1776.

I may get better startup performance as I don't need to run several xpra server instances.

Using the system wide proxy does not help with startup performance, if anything it can make things a tad slower.
It is meant to:

  • integrate with logind for proper multi-seat integration: #1105
  • allow better device emulation: #1611

Is it intended that the latest beta is not above the latest stable now (apt automatically gives me the latest stable instead of the latest beta)?

That depends on your repository config.
As of today (2018-03-01), the beta repo should give you the latest 2.3, the stable repo should give you 2.2.4.
Works for me here. (there was a time last year when a beta build slipped into the stable repo for one distro, can't remember which one - for about a day before I noticed)

I am not sure where to look.

The system log you can usually watch with journalctl.
But it is likely to just tell you the same thing: to look at /run/user/1000/xpra/:50.log.
The trick is that the fallback startup makes a backup of that log file before creating a new one, so you should be able to find a /run/user/1000/xpra/:50.log.bak with the server log from the first attempt, the one from the proxy start that failed.

xauth issues can manifest themselves as:

  • xauth warnings in that file
  • a long delay between the initial logging messages before the vfb + xauth start command, and the remainder of the server log

If that doesn't help, set:

DEBUG=auth,proxy,util,x11

in /etc/defaults/xpra, and then systemctl daemon-reload && systemctl restart xpra.socket.
Then journalctl will show a lot more details about how it:

  • receives the request to start a new session on behalf of the user
  • starts the new session
  • identifies its sockets
  • communicates the results with the "xpra start" command

comment:12 Changed 5 months ago by Antoine Martin

Resolution: needinfo
Status: newclosed

Not heard back, closing.

Note: See TracTickets for help on using tickets.