Xpra: Ticket #2795: can't recover crashed xpra server

I've been running an xpra server on RHEL 6.6 for a while. I recently upgraded my Windows 10 client from xpra 2.x to xpra 3.x. I don't know if the problem I'm seeing is related to the new client. In the past, I was able to maintain the same xpra session for months. I pretty much leave the Windows 10 client connected 24/7, even when my laptop is locked for the evening.

The problem is that relatively frequently now (about once a week), the xpra server process crashes while I'm away from my computer, causing the client to disconnect. At the time that this happens, I may for example be running a job overnight which is periodically printing to my gnome-terminal session. Initially, I'm not aware that the server has died, so I attempt to reconnect. Once I can't reconnect, I ssh to the machine running the server and confirm that the xpra server process has exited entirely, leaving the associated Xorg process running.

If I try to recover the Xorg session with the --use-display command line switch to xpra start, the server fails to start due to an XError:

2020-06-02 23:41:05,505 Error: cannot start the xpra server
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xpra/scripts/server.py", line 1532, in run_server
  File "/usr/lib64/python2.6/site-packages/xpra/server/server_base.py", line 272, in init_components
  File "/usr/lib64/python2.6/site-packages/xpra/x11/server.py", line 407, in load_existing_windows
  File "/usr/lib64/python2.6/site-packages/xpra/gtk_common/error.py", line 168, in __exit__
  File "/usr/lib64/python2.6/site-packages/xpra/gtk_common/error.py", line 100, in _exit
    raise XError(get_X_error(error))
XError: XError: BadDamage (invalid Damage parameter)
2020-06-02 23:41:05,508 XError: BadDamage (invalid Damage parameter)

Interestingly, I am able to start an xpra shadow session attached to the running instance of Xorg, but since there is no window manager, it is not very usable.

I also tried killing ALL X11 applications connected to the running Xorg instance, except for my gnome-terminal, but I still get the error when trying to start a new xpra server and recover my Xorg session.

I'd really like to figure out some way to recover my Xorg session by restarting xpra when this happens, because it seems to be occurring more and more often all of a sudden.

Wed, 03 Jun 2020 06:18:41 GMT - Antoine Martin: owner, description changed

"the xpra server process crashes while I'm away from my computer"

When the server crashes, what's in your server log? (could be something like the memleak reported in #2730)

When you can't recover with --use-display, please try adding -d all (or at least -d x11) and posting the log file.
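For reference, a minimal sketch of how the recovery command with debug logging could be assembled. It only uses the subcommand and switches named in this ticket (start, --use-display, -d); the display number ":100" and the helper names are placeholders, not taken from the ticket.

```python
import shlex


def build_recovery_command(display, debug="x11"):
    """Build the argv for restarting xpra on an X11 display that is still running."""
    # "display" is whatever display the orphaned Xorg process is serving.
    return ["xpra", "start", display, "--use-display", "-d", debug]


def as_shell(argv):
    """Quote the argv so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # ":100" is a placeholder display number (an assumption for this example).
    print(as_shell(build_recovery_command(":100", debug="all")))
```

Passing debug="all" gives the most verbose log; debug="x11" limits the output to the X11 category mentioned above.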

Looks like there's an override-redirect window in a weird state. r26594 should help with that: at least the other windows should be loaded properly and the server should start (you can apply it by hand to your server version). But I still need to understand the root error to fix it properly.
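The general idea of letting the other windows load even if one of them is in a bad state can be sketched as below. This is not the r26594 patch and does not use xpra's real classes: XError, load_existing_windows, demo_manage and the window id 0x42 are all hypothetical stand-ins chosen for illustration.

```python
import logging

log = logging.getLogger("recover-sketch")


class XError(Exception):
    """Stand-in for an X11 protocol error (hypothetical, not xpra's class)."""


def load_existing_windows(window_ids, manage_window):
    """Try to manage every window; skip the ones that fail instead of aborting."""
    loaded, skipped = [], []
    for xid in window_ids:
        try:
            manage_window(xid)
        except XError as e:
            # One window in a bad state should not prevent server startup.
            log.warning("skipping window %#x: %s", xid, e)
            skipped.append(xid)
        else:
            loaded.append(xid)
    return loaded, skipped


def demo_manage(xid):
    """Hypothetical callback: pretend window 0x42 is in a bad state."""
    if xid == 0x42:
        raise XError("BadDamage (invalid Damage parameter)")
```

Calling load_existing_windows([0x1, 0x42, 0x2], demo_manage) returns the two healthy windows as loaded and reports 0x42 as skipped, instead of letting the single error propagate and stop the whole startup.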

Thu, 04 Jun 2020 17:48:14 GMT - Thomas Esposito: attachment set

Thu, 04 Jun 2020 17:52:39 GMT - Thomas Esposito:

I was able to recover my Xorg instance by manually applying your patch. I have attached the log file with '-d all' for my recovery attempt prior to the patch.

I had already logged in through PuTTY and killed as many X11 applications as possible, hoping to kill the one that was causing the problem. At this point, I have my gnome-terminal back, but I'm not sure which other application window was causing the problem. Could it have been something transient like an error dialog or a tooltip?

Thu, 04 Jun 2020 17:58:04 GMT - Thomas Esposito:

I'll try to remember to check the server log next time it crashes.

Also, I actually have a "prefixed" installation of xpra that I "installed" by manually extracting rpms to a user-writeable directory. This "prefix", along with the username, hostname, and other sensitive info are edited out of the log.

Fri, 05 Jun 2020 02:28:49 GMT - Antoine Martin: status changed; resolution set

Thanks for the logs.

Here's the cause:

2020-06-04 13:23:16,008 Warning: failed to manage client window 0x735ae4:
2020-06-04 13:23:16,008  window 0x735ae4 disappeared already
2020-06-04 13:23:16,008
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xpra/x11/gtk2/wm.py", line 352, in _manage_client
    win = WindowModel(self._root, gdkwindow, desktop_geometry)
  File "/usr/lib64/python2.6/site-packages/xpra/x11/gtk2/models/window.py", line 163, in __init__
    self.call_setup()
  File "/usr/lib64/python2.6/site-packages/xpra/x11/gtk2/models/core.py", line 229, in call_setup
    self._composite.setup()
  File "/usr/lib64/python2.6/site-packages/xpra/x11/gtk2/composite.py", line 55, in setup
    WindowDamageHandler.setup(self)
  File "/usr/lib64/python2.6/site-packages/xpra/x11/gtk2/window_damage.py", line 62, in setup
    raise Unmanageable("window %#x disappeared already" % xid)
Unmanageable: window 0x735ae4 disappeared already
2020-06-04 13:23:16,079 XGetSelectionOwner(_XSETTINGS_S0)=0
2020-06-04 13:23:16,079 Fetching current XSettings data, owner=None
2020-06-04 13:23:16,079 default_xsettings=None
2020-06-04 13:23:16,079 Error: cannot start the xpra server
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xpra/scripts/server.py", line 1531, in run_server
    app.init(opts)
  File "/usr/lib64/python2.6/site-packages/xpra/x11/server.py", line 195, in init
    X11ServerBase.init(self, opts)
  File "/usr/lib64/python2.6/site-packages/xpra/x11/x11_server_base.py", line 94, in init
    self.x11_init()
  File "/usr/lib64/python2.6/site-packages/xpra/gtk_common/error.py", line 168, in __exit__
    trap._exit(True)
  File "/usr/lib64/python2.6/site-packages/xpra/gtk_common/error.py", line 100, in _exit
    raise XError(get_X_error(error))
XError: XError: BadWindow
2020-06-04 13:23:16,126 XError: BadWindow

I am going to close this as fixed, even though the underlying bug is not really fixed. But until I can reproduce it myself, this will have to do.

Sat, 23 Jan 2021 06:01:04 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/2795