xpra icon
Bug tracker and wiki

Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#420 closed defect (fixed)

Xpra does not hold the gtk global mutex while calling into gtk

Reported by: thefloweringash Owned by: Antoine Martin
Priority: major Milestone:
Component: platforms Version: 0.8.x
Keywords: freebsd threads Cc:

Description

Xpra 0.8.8 on FreeBSD crashes on startup with an error from pthread_mutex_unlock (ports/177800).

The gtk documentation on threads says:

GTK+ is "thread aware" but not thread safe — it provides a global lock controlled by gdk_threads_enter()/gdk_threads_leave() which protects all use of GTK+. That is, only one thread can use GTK+ at any given time.

However I can't find any use of gdk_threads_enter() or gdk_threads_leave() within Xpra's source.

This crash on FreeBSD is caused by gtk_main() calling GDK_THREADS_LEAVE() to unlock the global mutex, with the assumption that it is currently locked by the caller. When attempting to unlock an unlocked mutex (or a mutex held by another thread), FreeBSD returns EPERM, which gtk quickly promotes to an abort() (in g_mutex_unlock(), g_thread_abort()).

Xpra works on Linux since Linux (apparently?) returns 0 in the case of unlocking an unlocked mutex, and as far as I'm aware, POSIX leaves this as undefined behavior (http://stackoverflow.com/a/1778821).

Applying the obvious patch (https://gist.github.com/thefloweringash/6302597) allows Xpra to start and accept an xterm, but I haven't yet been able to attach due to a socket.error (Socket is not connected), which I'm assuming is unrelated.

Change History (13)

comment:1 Changed 7 years ago by Antoine Martin

You do not need gdk_threads_enter() / gdk_threads_leave() if your application only accesses GTK from the UI thread.
This is what xpra does unless proven otherwise.
You never need to add those calls around gtk.main, for sure.


Looks to me like something is broken on your build (GTK? Python?).
Please try the latest releases, 0.8.x is not supported, and even 0.9.x is unlikely to get another update, so you should be testing with 0.10.1 or later (trunk).

comment:2 Changed 7 years ago by thefloweringash

The gtk examples in the threads page do wrap gtk_main() with gtk_threads_enter()/gtk_threads_leave(), and though I can't quickly find anything definitive on the subject, this post turned up by a quick google seems to agree with my analysis.

http://blogs.operationaldynamics.com/andrew/software/gnome-desktop/gtk-thread-awareness

I tested with 0.8.8 since I defaulted to the version in ports. I'll test 0.10.1.

comment:3 Changed 7 years ago by Antoine Martin

For more information on how xpra uses the UI thread, see:

Note the absence of enter/leave anywhere.

  • I am using a separate thread to run my code, but the application (or the UI) hangs:
    • "1. Allow only the main thread to touch the GUI (gtk) part, while letting other threads do background work" - is what we do
    • "2. Allow any thread to do GUI stuff. Warning: people doing win32 pygtk programming have said that having non-main threads doing GUI stuff in win32 doesn't work. So this programming style is really not recommended." - we cannot use this approach because it would break win32 (and probably others too)

Now it is possible that there is some code somewhere that improperly calls some gtk ui code from a non-main thread, if that is the case we really want to hunt it down and fix it, you should be able to get a backtrace for us using gdb.

What can be misleading on threading with (py)gtk is posts like this one: Multi-threaded GTK applications – Part 1: Misconceptions which state: It turns out that you’re supposed to put gdk_threads_enter()/leave() around the call you make to gtk_main(). - that is only true if you intend to use threading (multiple threads modifying the UI) - which is not our case (for the reasons explained above).

Please also note: no matter if you are doing PyGTK calls from a separate thread or not, you must compile PyGTK with --enable-threads - I assume this is your case (I would have expected the enter/leave calls to error out if it was not the case)

Last edited 7 years ago by Antoine Martin (previous) (diff)

comment:4 Changed 7 years ago by thefloweringash

gdb session of xpra 0.10.1 https://gist.github.com/thefloweringash/6303230

EDIT: a gdb session for the following simplified version https://gist.github.com/thefloweringash/6303264

#!/usr/local/bin/python2.7
import gtk.gdk
gtk.gdk.threads_init()
gtk.main()
Last edited 7 years ago by thefloweringash (previous) (diff)

comment:5 Changed 7 years ago by Antoine Martin

Resolution: invalid
Status: newclosed

Then the problem is with FreeBSD, this code is valid and can be found in many applications, including the textbook examples from the pygtk FAQ linked above.

comment:6 Changed 7 years ago by thefloweringash

Short version

FreeBSD compatibility for 0.10.1 is a very small patch: https://gist.github.com/thefloweringash/6315767. Since 0.8.8, most of the gtk.gdk.threads_init() calls have been changed to the more precise glib.threads_init() (formerly gobject.threads_init()). After applying this change to server.py I can successfully interact with an xterm over xpra.

Long version

After investigating further I have to disagree. The PyGTK samples run without aborting on FreeBSD, and I find none of the posts linked to be in error. I do not believe the code sample in #comment:4 is valid. At the risk of being overly verbose, I hope a somewhat thorough exploration might still be generally useful, since there seems to be some disagreement on this topic.

I propose the following condition:

  • You must wrap gdk_threads_enter() and gdk_threads_leave() around gtk_main() if you have previously called gdk_threads_init().

or for the sake of avoiding ambiguity, in pygtk terms

  • You must wrap gtk.gdk.threads_enter() and gtk.gdk.threads_leave() around gtk.main() if you have previously called gtk.gdk.threads_init().

It is important to note that glib.threads_init() (source) (docs) corresponds to g_thread_init(), and does not call gdk_threads_init(). That is, it is enabling glib thread safety, and not engaging the gtk global mutex.

In the PyGTK FAQ 20.6. The approach labelled 1 satisfies the condition as it does not call gtk.gdk.threads_init(). The approach labelled 2, as demoed on the mailing list also satisfies the condition the as it both calls gtk.gdk.threads_init() and wraps gtk.main() with gtk.gdk.threads_enter() and gtk.gdk.threads_leave().

The "Misconceptions" post explains that after gdk_threads_init(), the call to gtk_main() will unlock the global mutex, which will already be unlocked without gdk_threads_enter(). In my opinion unlocking an unlocked mutex is a bug, and on glib on FreeBSD it's an abort(). The comments for g_mutex_unlock() make it quite explicitly undefined behavior.

comment:7 Changed 7 years ago by Antoine Martin

Thanks for the details and the patches.

The straightforward build fixes are merged in r4216 and I will apply them to the 0.10.x branch before the next point release.

The threading patch was merged in r4217, I am hoping this one will get more testing on a variety of platforms (especially win32 which has had problems in the past) before I apply to 0.10.x - this sort of change makes me nervous.

(I have briefly read the links and it looks like you are right about enter/leave being needed once we call gtk.gdk.threads_init())

comment:8 Changed 7 years ago by Antoine Martin

r4790 is required to prevent a server crash on OpenBSD (and probably FreeBSD and others too) - will backport to v0.10.x.

comment:9 Changed 7 years ago by Antoine Martin

  • r4794 does the same thing for the gtk2 client (now also works on FreeBSD - although there are unrelated libav problems there...)
  • r4796 adds the same code to the launcher and does some refactoring


All backported to v0.10.x in r4799

Last edited 7 years ago by Antoine Martin (previous) (diff)

comment:10 Changed 7 years ago by Antoine Martin

Resolution: invalid
Status: closedreopened

Re-opening so I can change the resolution: this bug was *not* invalid.

comment:11 Changed 7 years ago by Antoine Martin

Resolution: fixed
Status: reopenedclosed

comment:12 Changed 7 years ago by Antoine Martin

Damn: this caused a regression, see #485

comment:13 Changed 7 years ago by Antoine Martin

And yet another one: #497

So. Much. Pain.

Note: See TracTickets for help on using tickets.