Opened 2 months ago

Last modified 2 months ago

#2211 assigned defect

gtkperf + xpra info hangs

Reported by: Antoine Martin
Owned by: Antoine Martin
Priority: critical
Milestone: 2.5
Component: server
Version: 2.4.x
Keywords:
Cc:

Description (last modified by Antoine Martin)

Running gtkperf -a in a loop while running xpra info in another loop in parallel eventually causes the server to hang.
The server still logs new connections, but they all time out.

Running gtkperf -a in a loop alone does not crash.

Running xpra info alone in a loop does not crash or leak any memory.
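
The reproduction described above could be scripted roughly as follows. This is only a sketch: `gtkperf -a` and `xpra info` are the commands named in this ticket, but the display `:20`, passing it to `xpra info`, the `DISPLAY` handling, and the helper names `run_in_loop` and `reproduce` are assumptions made for illustration.

```python
import functools
import os
import subprocess
import threading


def run_in_loop(cmd, stop, runner=subprocess.call):
    """Run cmd repeatedly until the stop event is set; return the iteration count."""
    count = 0
    while not stop.is_set():
        runner(cmd)
        count += 1
    return count


def reproduce(display=":20", duration=600.0):
    """Run both loops in parallel for `duration` seconds (hypothetical helper)."""
    stop = threading.Event()
    # Assumption: gtkperf must be pointed at the xpra display via DISPLAY.
    env = dict(os.environ, DISPLAY=display)
    gtkperf_runner = functools.partial(subprocess.call, env=env)
    workers = [
        threading.Thread(target=run_in_loop,
                         args=(["gtkperf", "-a"], stop, gtkperf_runner)),
        # Assumption: the display is passed explicitly to "xpra info".
        threading.Thread(target=run_in_loop,
                         args=(["xpra", "info", display], stop)),
    ]
    for worker in workers:
        worker.start()
    stop.wait(duration)  # nothing sets the event early, so this waits out the timeout
    stop.set()
    for worker in workers:
        worker.join()

# Example (not run here): reproduce(display=":20", duration=600.0)
```

Running only one of the two loops, by starting a single worker, matches the two control cases above that do not hang.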

Attachments (3)

info.txt (81.8 KB) - added by Antoine Martin 2 months ago.
xpra info of a hung python2 server after ~10000 gtkperf windows are shown
info-py3.txt (172.7 KB) - added by Antoine Martin 2 months ago.
python3 server just before it hung
verify-main-thread.patch (6.9 KB) - added by Antoine Martin 2 months ago.
verify that we are in the main thread in more places

Change History (7)

comment:1 Changed 2 months ago by Antoine Martin

Description: modified (diff)
Status: changed from new to assigned

comment:2 Changed 2 months ago by Antoine Martin

The server goes AWOL following this message in the server log:

(Xpra:7593): Gdk-WARNING **: 21:51:39.364: GdkWindow 0x400d81 unexpectedly destroyed
/usr/lib64/python3.7/site-packages/xpra/x11/models/window.py:327: Warning: g_object_get_qdata: assertion 'G_IS_OBJECT (object)' failed
  self.corral_window = None
/usr/lib64/python3.7/site-packages/xpra/x11/models/window.py:327: Warning: g_object_set_qdata_full: assertion 'G_IS_OBJECT (object)' failed
  self.corral_window = None
/usr/lib64/python3.7/site-packages/xpra/x11/models/window.py:327: Warning: g_object_remove_toggle_ref: assertion 'G_IS_OBJECT (object)' failed
  self.corral_window = None
2019-03-18 21:51:48,306 Warning: timeout on screen updates for window 1,
2019-03-18 21:51:48,306  already delayed for more than 15 seconds

(Xpra:7593): Gdk-WARNING **: 21:51:53.976: GdkWindow 0x400def unexpectedly destroyed
2019-03-18 21:51:55,098 Error: connection timed out: unix-domain socket:/run/user/1000/xpra/desktop-20

Even with python2, things can degenerate, though not as fast and not as badly.

Changed 2 months ago by Antoine Martin

Attachment: info.txt added

xpra info of a hung python2 server after ~10000 gtkperf windows are shown

Changed 2 months ago by Antoine Martin

Attachment: info-py3.txt added

python3 server just before it hung

Changed 2 months ago by Antoine Martin

Attachment: verify-main-thread.patch added

verify that we are in the main thread in more places

comment:3 Changed 2 months ago by Antoine Martin

r22114 adds extra thread checking to the gtk X11 context manager code, and the patch above adds yet more checks. But none of those trigger any warnings.
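
For readers unfamiliar with this kind of check, a generic main-thread assertion can be sketched as below. This is not the code from r22114 or from the attached patch: the names `is_main_thread` and `main_thread_only` are hypothetical, and the sketch raises an exception where the real checks emit warnings.

```python
import functools
import threading


def is_main_thread():
    """Return True if the caller is running in the interpreter's main thread."""
    return threading.current_thread() is threading.main_thread()


def main_thread_only(fn):
    """Decorator: refuse to run fn from any thread other than the main thread."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if not is_main_thread():
            raise RuntimeError(
                "%s called from thread %r, expected the main thread"
                % (fn.__name__, threading.current_thread().name))
        return fn(*args, **kwargs)
    return wrapper
```

A check of this kind only catches calls made from the wrong thread. It cannot detect ordering problems between callbacks that all run on the main thread.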

Could it be a race condition that is made more likely by the extra delay introduced by the "xpra info" query?

comment:4 Changed 2 months ago by Antoine Martin

Priority: changed from blocker to critical

When the server seems stuck, it is consuming a lot of CPU time in the X11 event loop.
Simply killing the gtkperf client loop and waiting for things to settle eventually brings it back to life, so I'm lowering the priority as it's not a crash, "just" an X11 DoS.

Is GTK3 that much slower (new ticket: #2219)? Is it less efficient, or does it receive more events?

The hang is often preceded by:

2019-03-18 23:56:53,637 Warning: timeout on screen updates for window 1,
2019-03-18 23:56:53,637  already delayed for more than 15 seconds

This could happen if the main thread is too busy to respond.

How can we avoid getting in this situation, or detect it and deal with it?
Maybe as part of the move to Wayland, we can move away from GTK? We already have window models, so do we really need GDK windows to wrap the X11 windows?
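
One possible way to detect the "main thread too busy" condition is a heartbeat watchdog, sketched below. Nothing here is existing xpra code: the class name `MainLoopWatchdog`, the `max_delay` default, and the idea of driving `beat()` from a periodic main-loop timer (for example `GLib.timeout_add` in a GTK server) are all assumptions.

```python
import threading
import time


class MainLoopWatchdog:
    """Track how long ago the main loop last proved it was alive (hypothetical)."""

    def __init__(self, max_delay=5.0, clock=time.monotonic):
        self.max_delay = max_delay
        self.clock = clock
        self.lock = threading.Lock()
        self.last_beat = clock()

    def beat(self):
        """Call this from the main loop, e.g. from a periodic timer callback."""
        with self.lock:
            self.last_beat = self.clock()

    def delay(self):
        """Seconds elapsed since the main loop last called beat()."""
        with self.lock:
            return self.clock() - self.last_beat

    def is_stalled(self):
        """True if the main loop has not called beat() within max_delay seconds."""
        return self.delay() > self.max_delay


def watch(watchdog, on_stall, stop, interval=1.0):
    """Run in a helper thread: call on_stall(delay) whenever the loop looks stuck."""
    while not stop.wait(interval):
        if watchdog.is_stalled():
            on_stall(watchdog.delay())
```

What `on_stall` should do is the open question raised above. Logging the delay is the least intrusive option. Throttling or refusing new clients would be a policy decision that this sketch does not attempt.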
