This undesirable behaviour appears to be new-ish (last 2 weeks or so).
With Ubuntu 16.10 on both client and server, xpra server will crash unexpectedly. LibreOffice? application is particularly vulnerable: open a spreadsheet containing data, try to scroll using mouse-wheel. Slow scrolling often works; faster scrolling kills the server.
Sample log file entry from server shows version number at top, X problems at bottom:
$ tail -n 60 \:82.log 2017-03-27 07:39:59,925 New unix-domain connection received on /home/loew/.xpra/infty-82 2017-03-27 07:39:59,930 Handshake complete; enabling connection 2017-03-27 07:40:00,017 Python/Gtk2 Linux Ubuntu 16.04 xenial client version 2.0-r15319 64-bit 2017-03-27 07:40:00,017 connected from 'mpdhost' as 'loew' - 'Philip Loewen' 2017-03-27 07:40:00,017 using rgb as primary encoding 2017-03-27 07:40:00,017 also available: 2017-03-27 07:40:00,017 h264, vp9, vp8, png, png/P, png/L, rgb24, jpeg, rgb32 2017-03-27 07:40:00,018 client root window size is 1680x1050 with 1 display: 2017-03-27 07:40:00,019 :0.0 (444x277 mm - DPI: 96x96) workarea: 1680x1024 2017-03-27 07:40:00,019 monitor 1 (474x296 mm - DPI: 90x90) 2017-03-27 07:40:00,019 monitor 2 1280x720 (339x190 mm - DPI: 95x96) 2017-03-27 07:40:00,046 setting key repeat rate from client: 500ms delay / 30ms interval 2017-03-27 07:40:00,047 setting keymap: rules=evdev, model=pc105, layout=us 2017-03-27 07:40:00,144 client 1: Attached to ssh/loew@infty.math.ubc.ca/:82 (press Control-C to detach) 2017-03-27 07:40:01,133 client 1: Error: printing disabled: 2017-03-27 07:40:01,133 client 1: No module named cups 2017-03-27 07:40:03,145 client 1: Warning: the sound output process has failed to start 2017-03-27 07:40:03,965 client 1: Warning: the opus+ogg sound sink has stopped 2017-03-27 07:40:05,329 Introspect error on :1.6:/org/libreoffice: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files 2017-03-27 07:40:05,330 Executing introspect queue due to error 2017-03-27 07:40:05,330 Introspect error on :1.6:/org/libreoffice/window/14680136: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files 2017-03-27 07:40:05,330 Executing introspect queue due to error 2017-03-27 07:40:05,332 Introspect error on :1.6:/org/libreoffice/menus/appmenu: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files 2017-03-27 07:40:05,332 Executing introspect queue due to error 2017-03-27 07:40:05,332 Error: failed to query org.gtk.Actions at /org/libreoffice: 2017-03-27 07:40:05,333 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:05,333 Error: failed to query org.gtk.Actions at /org/libreoffice: 2017-03-27 07:40:05,333 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:05,333 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680136: 2017-03-27 07:40:05,333 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:05,333 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680136: 2017-03-27 07:40:05,333 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:05,334 Error: failed to query org.gtk.Menus at /org/libreoffice/menus/appmenu: 2017-03-27 07:40:05,334 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:08,588 Introspect error on :1.6:/org/libreoffice: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files 2017-03-27 07:40:08,588 Executing introspect queue due to error 2017-03-27 07:40:08,588 Introspect error on :1.6:/org/libreoffice/window/14680365: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files 2017-03-27 07:40:08,588 Executing introspect queue due to error 2017-03-27 07:40:08,588 Introspect error on :1.6:/org/libreoffice/menus/appmenu: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files 2017-03-27 07:40:08,589 Executing introspect queue due to error 2017-03-27 07:40:08,589 Error: failed to query org.gtk.Actions at /org/libreoffice: 2017-03-27 07:40:08,589 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:08,590 Error: failed to query org.gtk.Actions at /org/libreoffice: 2017-03-27 07:40:08,590 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:08,591 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680365: 2017-03-27 07:40:08,591 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:08,591 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680365: 2017-03-27 07:40:08,591 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:40:08,591 Error: failed to query org.gtk.Menus at /org/libreoffice/menus/appmenu: 2017-03-27 07:40:08,591 (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),) 2017-03-27 07:47:30,588 client 1: ignoring draw received for a window which is not realized yet! The program 'Xpra' received an X Window System error. This probably reflects a bug in the program. The error was 'BadDrawable (invalid Pixmap or Window parameter)'. (Details: serial 83017 error_code 9 request_code 14 minor_code 0) (Note to programmers: normally, X errors are reported asynchronously; that is, you will receive the error a while after causing it. To debug your program, run it with the --sync command line option to change this behavior. You can then get a meaningful backtrace from your debugger if you break on the gdk_x_error() function.)
I get this with client "xpra v2.0-r15319" on Ubuntu 16.10, but also with a recent xpra client on Mac OSX. So far I have seen it *only* when running LibreOffice? ... in both spreadsheet and word processing modes.
I think this problem started with a comparatively recent xpra update.
Here is the command I use when launching the server:
dbus-launch xpra start \ --start-child=/usr/bin/urxvt \ --encoding=rgb \ --opengl=no \ --pulseaudio=no \ --mdns=no \ :82 2> /dev/null &
The LibreOffice? version on the server is 5.1.6.2, Build ID: 1:5.1.6~rc2-0ubuntu1~xenial1, CPU Threads: 4; OS Version: Linux 4.4; UI Render: default; Locale: en-CA (en_CA.UTF-8); Calc: group
Faster scrolling triggers the "new and enhanced" scrolling detection code: #1426. Does this go away if you start the server with:
XPRA_SCROLL_ENCODING=0 xpra start ...
If this is reproducible, I should be able to fix that quickly. Strange thing is that the scrolling detection code got tested pretty thoroughly. Mostly with Firefox here, but the application that triggers the scrolling detection code shouldn't make any difference.
FYI minor things you may not know:
dbus-launch xpra
should not be needed - the default config should have dbus-launch=dbus-launch ...
(see xpra showconfig | grep dbus
)
opengl=no
doesn't do anything on the server
start-child=CMD
is only useful with exit-with-children
, you may want to use start=CMD
instead
Regrettably, the suggested change to an environment variable did not solve the problem. (I accepted all other suggestions in comment:1.)
Further info from the client side: sometimes the last thing printed on the controlling terminal window before the server dies is a line like this:
2017-03-27 09:55:43,112 ignoring draw received for a window which is not realized yet!
ubuntu 16.04 crash log with "-d all"
This bug can be reproduced, but not very reliably... sometimes the bug triggers immediately, sometimes it takes much longer. It also only triggers on Ubuntu and not Fedora because the Ubuntu version uses short-lived tooltip windows showing the row number - this is what triggers the bug.
Adding debug statements to all the calls following do_xpra_child_map_event
also seems to make the problem go away! (Heisenbug) Making it harder to track down.
This sort of low level X11 bug usually happens when we make X11 calls without synchronizing (calling the gdk X11 flush logic). Problem is that I cannot see any naked X11 calls in there, all the X11 calls in the window model code is already guarded by "xsync" context managers. So r15436 may well "fix" the problem in that it will be much harder to hit, but not necessarily completely fixed either.. At least I can't seem to hit the bug anymore.
Not sure why this only started occurring in the last 2 weeks, as this code has not really changed for years...
@Philip D Loewen: does that fix the problem for you? (new beta builds posted for Ubuntu 16.x)
Installed r15436 and tested lightly -- looks good so far. Will test aggressively through the day (UTC-7 here) and report again.
(I crashed the server also with the LibreOffice word processor and even the Thunderbird email client -- so the tooltips in LO Calc would have been a coal-mine canary of sorts rather than the single key issue. Heisenbug indeed.)
Normal usage that formerly crashed the server now just works. I didn't test very aggressively, but I would optimistically call this issue fixed. Thanks a lot, Antoine, for whatever you did. (And, as usual, for the vast amount of work you did building this excellent system in the first place!)
Thanks!
These symptoms re-appeared with considerable intensity about a month ago, but I was too busy to report. The only new item in the logs is this:
2018-03-25 20:25:04,003 Missing property or wrong property type WM_TRANSIENT_FOR (window) Traceback (most recent call last): File "/usr/lib/python2.7/dist-packages/xpra/x11/gtk_x11/prop.py", line 153, in prop_get data = X11Window.XGetWindowProperty(get_xwindow(target), key, atom, etype) File "xpra/x11/bindings/window_bindings.pyx", line 915, in xpra.x11.bindings.window_bindings._X11WindowBindings.XGetWindowProperty (xpra/x11/bindings/window_bindings.c:9883) PropertyError: no such window 2018-03-25 20:32:05,044 Missing property or wrong property type WM_TRANSIENT_FOR (window) Traceback (most recent call last): File "/usr/lib/python2.7/dist-packages/xpra/x11/gtk_x11/prop.py", line 153, in prop_get data = X11Window.XGetWindowProperty(get_xwindow(target), key, atom, etype) File "xpra/x11/bindings/window_bindings.pyx", line 915, in xpra.x11.bindings.window_bindings._X11WindowBindings.XGetWindowProperty (xpra/x11/bindings/window_bindings.c:9883) PropertyError: no such window The program 'Xpra' received an X Window System error. This probably reflects a bug in the program. The error was 'BadWindow (invalid Window parameter)'. (Details: serial 393361 error_code 3 request_code 3 minor_code 0) (Note to programmers: normally, X errors are reported asynchronously; that is, you will receive the error a while after causing it. To debug your program, run it with the --sync command line option to change this behavior. You can then get a meaningful backtrace from your debugger if you break on the gdk_x_error() function.)
This is Ubuntu 16.04.4 LTS ... counting down to the 18.04 LTS upgrade!
It is harder to reproduce, but I am seeing the bug.
I was hoping that the missing wrapping added in r18865 would fix this, but it doesn't.
I am getting a fairly reliable SIGBUS
doing memcpy-avc-unaligned
from mmap_write
which is running in the encode thread. As if the XShmImageWrapper
buffer we copy from was no longer valid.
Philip D Loewen: the SIGBUS
may not be related, so it is worth applying r18865 and turning off mmap if you hit it. I am now unable to reproduce the crash - but I'll keep trying.
If you can still reproduce the crash, please run the server with:
XPRA_X_SYNC=1 xpra start -d all
And post the last ~100 lines of debug output.
Instead of manually inspecting all the places where we can inadvertently end up calling an X11 function, r18869 adds a context checker in all the X11 functions that can raise an error and fixes some naked calls detected that way (some backported: r18874 for v2.2.x and r18875 for v1.0.x). Nothing major was found but this should prevent future bugs as any unmanaged calls will now trigger a scary error message. The runtime cost should be minimal.
PS: more in r18894, r18921, r18910, r18905, r18932. See also #1798
Sorry for going quiet -- I had come disconnected from this ticket and was not aware that progress was being made. I'll pay special attention to the performance of version 2.2.6-r18969-1 over the next few days and report back on what I observe. Thanks a lot!
New since 2.2.6: r19002, r19055, r19125.
Feel free to re-open if you can reproduce the problems or if you see one of those scary warnings. (and please include the stacktrace that goes with it)
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1475