Xpra: Ticket #2304: 2.5+ individual windows become frozen and unkillable

Client: xpra 3.0-20190506r22647-1 Xubuntu 18.04.2

Server: Xubuntu 16.04.6 xpra 2.5.1-r22431-1 and/or 3.0-20190506r22647-1

If I start up an xfce4-terminal inside xpra and "mess about" quickly with the menus, I get to a point where I can't type into the terminal or properly close it. Clicking the close button seems to eventually kill the underlying process, but the window remains visible in the client. Clicking in the zombie window causes underlying windows to pop up, maybe suggesting that the window no longer exists on the server?

I have also seen this with gnucash and Google Chrome, but it seems to be most easily reproducible with xfce4-terminal.

Sat, 18 May 2019 11:21:13 GMT - tc424: attachment set

(Attempted) demonstration

Sat, 18 May 2019 11:22:10 GMT - tc424: attachment set

Server log file

Sat, 18 May 2019 11:27:47 GMT - Antoine Martin: owner changed

Please capture xpra info after this happens. Does the server still respond? Does it happen if you start the server with:

XPRA_SYNCHRONIZE=1 xpra start ..

Does it happen if you run the server on Ubuntu 18.04 or newer?

Sat, 18 May 2019 11:33:04 GMT - tc424: attachment set

Sat, 18 May 2019 11:38:50 GMT - tc424:

Yes, seems to happen if I start a server on the 18.04 machine as well (interesting as I believe xfce4-terminal is Gtk3 based on 18.04, as opposed to using Gtk2 on 16.04.)

Using --sync-xvfb and x11vnc on the 16.04 machine seems to confirm that the server and the client get out of sync - the client thinks there's still a highlight active in the menu bar, which isn't there on Xvfb. Clicking to close the window closes it on Xvfb but not on the client.

Sat, 18 May 2019 11:41:18 GMT - tc424:

And yes, still seems to happen with XPRA_SYNCHRONIZE on.

Sat, 18 May 2019 11:41:59 GMT - tc424:

Oh, and yes, the server is otherwise responsive - other applications work, it shuts down when requested, etc.

Sat, 18 May 2019 11:47:22 GMT - Antoine Martin:

Does this happen with the gtk3 client? (Install python3-xpra and run as python3 /usr/bin/xpra.) Try XPRA_SYNCHRONIZE with the client. Also try enabling / disabling opengl.

Sat, 18 May 2019 12:09:38 GMT - tc424:

No difference with/without opengl.

No difference with XPRA_SYNCHRONIZE on the client.

The python3 client DOES make a difference - haven't yet managed to reproduce, will keep trying. Subjectively performance feels very slightly slower generally with python3, and specifically quite a lot worse when playing with the menus in xfce4-terminal.

I also have a font hinting issue with python3, but that'll be another ticket I guess..

Sat, 18 May 2019 12:15:26 GMT - tc424:

(Just realised I have sync-xvfb still on, which may partly explain performance issue.)

Much weirdness though - I left the xfce4-terminal I was playing with open, disconnected the python3 client, reconnected with the python2 client - and discovered the terminal had keyboard input locked out. Disconnecting and reconnecting with the python3 client again, the keyboard input is fine.

So seems like the different clients are interpreting the server state differently?

Sat, 18 May 2019 16:01:46 GMT - Antoine Martin: owner, priority, status changed

I can reproduce the problem on Fedora so I should be able to fix it.

The server log is full of client decoding error: unknown cause.

Thu, 23 May 2019 15:00:24 GMT - Antoine Martin:

More complete server log output:

2019-05-23 21:55:38,070 Warning: client decoding error:
2019-05-23 21:55:38,070  unknown cause
(... many more ...)
2019-05-23 21:55:57,593 Warning: mmap area is full!
2019-05-23 21:55:57,594  we need to store 2372664 bytes but only have 1733442 free space left

So not only is there a decoding error but we fail to free the mmap area which eventually causes it to overflow.
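To illustrate why an area that is never freed ends up overflowing, here is a toy model in Python. It is only a sketch: the class and method names (MmapArea, store, release) are hypothetical and are not xpra's API, and the sizes are made up. It only shows that if every stored frame is released the area never fills, whereas if releases are skipped a later store fails.

```python
class MmapArea:
    """Toy fixed-size area (hypothetical, not xpra's implementation)."""

    def __init__(self, size):
        self.size = size
        self.used = 0

    def free_space(self):
        return self.size - self.used

    def store(self, nbytes):
        # Refuse to store more than the remaining free space.
        if nbytes > self.free_space():
            raise MemoryError(
                "need to store %i bytes but only have %i free"
                % (nbytes, self.free_space())
            )
        self.used += nbytes

    def release(self, nbytes):
        # Called once the data has been consumed.
        self.used -= nbytes


def send_frames(area, frame_size, count, release_after_use):
    """Store `count` frames; return how many were stored before the area filled."""
    stored = 0
    for _ in range(count):
        try:
            area.store(frame_size)
        except MemoryError:
            break
        stored += 1
        if release_after_use:
            area.release(frame_size)
    return stored
```

With a 10 byte area and 4 byte frames, send_frames(MmapArea(10), 4, 5, True) stores all five frames, while the same call with release_after_use=False stops after two.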

The strange thing is that the client output is clean, without any warnings there. The server is responding and xpra info looks healthy. One thing that does look odd is that the xfce4-terminal window doesn't seem to have widget-focus with python3: the menus are slightly greyed out.

Thu, 23 May 2019 16:12:50 GMT - Antoine Martin: status changed; resolution set

The mmap bug is unrelated and is only made easier to trigger by this bug. It is fixed in r22770.

This bug was quite hard to find. It was caused by r22384 and is fixed in r22773. Re-using the same variable for an inner iterator caused the wrong window to be removed from our active list, so all subsequent paint events for that window ended up being dropped.
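For readers unfamiliar with this class of bug, the pattern can be sketched in a few lines of Python. This is a hypothetical illustration, not the actual code from r22384 or r22773: the function names, the set of window ids and the paint check are all made up. In Python a for loop rebinds its target variable in the enclosing function scope, so reusing an argument name as the loop variable silently changes what that name refers to after the loop.

```python
def remove_window_buggy(active, wid):
    """Remove window `wid` from the active set (buggy sketch)."""
    for wid in sorted(active):  # BUG: the loop rebinds the `wid` argument
        pass  # per-window housekeeping elided
    # `wid` is now the last id iterated, not the one that was requested
    active.discard(wid)
    return active


def remove_window_fixed(active, wid):
    """Same as above, with a distinct name for the inner loop variable."""
    for other_wid in sorted(active):
        pass  # per-window housekeeping elided
    active.discard(wid)
    return active


def deliver_paint(active, wid):
    """Paint events for windows not in the active set are dropped."""
    return wid in active


buggy = remove_window_buggy({1, 2, 3}, 1)
fixed = remove_window_fixed({1, 2, 3}, 1)
```

In the buggy version, asking to remove window 1 removes window 3 instead, so deliver_paint(buggy, 3) is False and that window stops updating even though it still exists. In the fixed version only window 1 is removed.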

I will spin up new builds with this fix.

Thanks for the bug report!

Sat, 22 Jun 2019 12:39:41 GMT - tc424:

Just to confirm this doesn't seem to be reproducible here any more - thank you once again :)

Sat, 23 Jan 2021 05:47:43 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/2304