Opened 8 months ago

Closed 8 months ago

#1475 closed defect (fixed)

X errors crash server

Reported by: Philip D Loewen
Owned by: Philip D Loewen
Priority: critical
Milestone: 2.1
Component: core
Version: trunk
Keywords: X Window System Error, LibreOffice
Cc:

Description

This undesirable behaviour appears to be new-ish (last 2 weeks or so).

With Ubuntu 16.10 on both client and server, the xpra server crashes unexpectedly. LibreOffice is particularly vulnerable: open a spreadsheet containing data and try to scroll using the mouse wheel. Slow scrolling often works; faster scrolling kills the server.

A sample log excerpt from the server shows the version number near the top and the X errors at the bottom:

 $ tail -n 60 \:82.log
2017-03-27 07:39:59,925 New unix-domain connection received on /home/loew/.xpra/infty-82
2017-03-27 07:39:59,930 Handshake complete; enabling connection
2017-03-27 07:40:00,017 Python/Gtk2 Linux Ubuntu 16.04 xenial client version 2.0-r15319 64-bit
2017-03-27 07:40:00,017  connected from 'mpdhost' as 'loew' - 'Philip Loewen'
2017-03-27 07:40:00,017  using rgb as primary encoding
2017-03-27 07:40:00,017  also available:
2017-03-27 07:40:00,017   h264, vp9, vp8, png, png/P, png/L, rgb24, jpeg, rgb32
2017-03-27 07:40:00,018  client root window size is 1680x1050 with 1 display:
2017-03-27 07:40:00,019   :0.0 (444x277 mm - DPI: 96x96) workarea: 1680x1024
2017-03-27 07:40:00,019     monitor 1 (474x296 mm - DPI: 90x90)
2017-03-27 07:40:00,019     monitor 2 1280x720 (339x190 mm - DPI: 95x96)
2017-03-27 07:40:00,046 setting key repeat rate from client: 500ms delay / 30ms interval
2017-03-27 07:40:00,047 setting keymap: rules=evdev, model=pc105, layout=us
2017-03-27 07:40:00,144 client 1: Attached to ssh/loew@infty.math.ubc.ca/:82 (press Control-C to detach)
2017-03-27 07:40:01,133 client 1: Error: printing disabled:
2017-03-27 07:40:01,133 client 1:  No module named cups
2017-03-27 07:40:03,145 client 1: Warning: the sound output process has failed to start
2017-03-27 07:40:03,965 client 1: Warning: the opus+ogg sound sink has stopped
2017-03-27 07:40:05,329 Introspect error on :1.6:/org/libreoffice: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files
2017-03-27 07:40:05,330 Executing introspect queue due to error
2017-03-27 07:40:05,330 Introspect error on :1.6:/org/libreoffice/window/14680136: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files
2017-03-27 07:40:05,330 Executing introspect queue due to error
2017-03-27 07:40:05,332 Introspect error on :1.6:/org/libreoffice/menus/appmenu: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files
2017-03-27 07:40:05,332 Executing introspect queue due to error
2017-03-27 07:40:05,332 Error: failed to query org.gtk.Actions at /org/libreoffice:
2017-03-27 07:40:05,333  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:05,333 Error: failed to query org.gtk.Actions at /org/libreoffice:
2017-03-27 07:40:05,333  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:05,333 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680136:
2017-03-27 07:40:05,333  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:05,333 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680136:
2017-03-27 07:40:05,333  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:05,334 Error: failed to query org.gtk.Menus at /org/libreoffice/menus/appmenu:
2017-03-27 07:40:05,334  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:08,588 Introspect error on :1.6:/org/libreoffice: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files
2017-03-27 07:40:08,588 Executing introspect queue due to error
2017-03-27 07:40:08,588 Introspect error on :1.6:/org/libreoffice/window/14680365: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files
2017-03-27 07:40:08,588 Executing introspect queue due to error
2017-03-27 07:40:08,588 Introspect error on :1.6:/org/libreoffice/menus/appmenu: dbus.exceptions.DBusException: org.freedesktop.DBus.Error.ServiceUnknown: The name :1.6 was not provided by any .service files
2017-03-27 07:40:08,589 Executing introspect queue due to error
2017-03-27 07:40:08,589 Error: failed to query org.gtk.Actions at /org/libreoffice:
2017-03-27 07:40:08,589  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:08,590 Error: failed to query org.gtk.Actions at /org/libreoffice:
2017-03-27 07:40:08,590  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:08,591 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680365:
2017-03-27 07:40:08,591  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:08,591 Error: failed to query org.gtk.Actions at /org/libreoffice/window/14680365:
2017-03-27 07:40:08,591  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:40:08,591 Error: failed to query org.gtk.Menus at /org/libreoffice/menus/appmenu:
2017-03-27 07:40:08,591  (DBusException(dbus.String(u'The name :1.6 was not provided by any .service files'),),)
2017-03-27 07:47:30,588 client 1: ignoring draw received for a window which is not realized yet!
The program 'Xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadDrawable (invalid Pixmap or Window parameter)'.
  (Details: serial 83017 error_code 9 request_code 14 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
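
The numeric codes in the "Details" line above can be decoded against the core X11 protocol tables. The following is a minimal sketch, assuming the request code refers to a core protocol request (codes below 128) rather than an extension; the function name `decode_x_error` and the lookup tables are illustrative helpers, not part of xpra or GDK, and the request table lists only a handful of window-related opcodes.

```python
import re

# Core X11 protocol error codes (complete list for the core protocol).
X_ERROR_NAMES = {
    1: "BadRequest", 2: "BadValue", 3: "BadWindow", 4: "BadPixmap",
    5: "BadAtom", 6: "BadCursor", 7: "BadFont", 8: "BadMatch",
    9: "BadDrawable", 10: "BadAccess", 11: "BadAlloc", 12: "BadColor",
    13: "BadGC", 14: "BadIDChoice", 15: "BadName", 16: "BadLength",
    17: "BadImplementation",
}

# A few core request opcodes (deliberately partial, illustration only).
X_REQUEST_NAMES = {
    3: "GetWindowAttributes", 8: "MapWindow", 10: "UnmapWindow",
    12: "ConfigureWindow", 14: "GetGeometry", 15: "QueryTree",
}

DETAILS_RE = re.compile(
    r"serial (\d+) error_code (\d+) request_code (\d+) minor_code (\d+)"
)

def decode_x_error(line):
    """Parse a GDK 'Details:' line; return a dict, or None if it does not match."""
    match = DETAILS_RE.search(line)
    if match is None:
        return None
    serial, error_code, request_code, minor_code = map(int, match.groups())
    return {
        "serial": serial,
        "error": X_ERROR_NAMES.get(error_code, "unknown"),
        "request": X_REQUEST_NAMES.get(request_code, "unknown"),
        "minor_code": minor_code,
    }

details = "(Details: serial 83017 error_code 9 request_code 14 minor_code 0)"
print(decode_x_error(details))
```

Read this way, the crash corresponds to a BadDrawable error raised by a GetGeometry request, which would fit a query against a window that no longer exists.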

I get this with client "xpra v2.0-r15319" on Ubuntu 16.10, but also with a recent xpra client on Mac OS X. So far I have seen it *only* when running LibreOffice, in both spreadsheet and word-processing modes.

I think this problem started with a comparatively recent xpra update.

Here is the command I use when launching the server:

dbus-launch xpra start              \
  --start-child=/usr/bin/urxvt      \
  --encoding=rgb                    \
  --opengl=no                       \
  --pulseaudio=no                   \
  --mdns=no                         \
  :82 2> /dev/null &

The LibreOffice version on the server is 5.1.6.2, Build ID: 1:5.1.6~rc2-0ubuntu1~xenial1, CPU Threads: 4; OS Version: Linux 4.4; UI Render: default; Locale: en-CA (en_CA.UTF-8); Calc: group

Attachments (1)

ubuntu-16.04-dall-crash.log (76.0 KB) - added by Antoine Martin 8 months ago.
ubuntu 16.04 crash log with "-d all"

Change History (7)

comment:1 Changed 8 months ago by Antoine Martin

Milestone: 2.1
Priority: major → critical
Status: new → assigned

Faster scrolling triggers the "new and enhanced" scrolling detection code: #1426.
Does this go away if you start the server with:

XPRA_SCROLL_ENCODING=0 xpra start ...

If this is reproducible, I should be able to fix that quickly. Strange thing is that the scrolling detection code got tested pretty thoroughly. Mostly with Firefox here, but the application that triggers the scrolling detection code shouldn't make any difference.


FYI minor things you may not know:

  • dbus-launch xpra should not be needed - the default config should have dbus-launch=dbus-launch ... (see xpra showconfig | grep dbus)
  • opengl=no doesn't do anything on the server
  • start-child=CMD is only useful with exit-with-children, you may want to use start=CMD instead
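
Applied to the launch command from the description, the suggestions above could look roughly like the following. This is a sketch only: it builds the argument list and environment without starting anything, the helper name `build_xpra_start` is hypothetical, and it assumes the remaining options from the original command are kept unchanged.

```python
import os
import shlex

def build_xpra_start(display=":82", disable_scroll_encoding=True):
    """Return (argv, env) for the revised launch; nothing is executed here."""
    argv = [
        "xpra", "start",            # no dbus-launch prefix
        "--start=/usr/bin/urxvt",   # start= instead of start-child=
        "--encoding=rgb",
        "--pulseaudio=no",
        "--mdns=no",
        display,                    # --opengl=no dropped: no effect on the server
    ]
    env = dict(os.environ)
    if disable_scroll_encoding:
        # Diagnostic switch suggested in comment:1
        env["XPRA_SCROLL_ENCODING"] = "0"
    return argv, env

argv, env = build_xpra_start()
prefix = "XPRA_SCROLL_ENCODING=0 " if env.get("XPRA_SCROLL_ENCODING") == "0" else ""
print(prefix + " ".join(shlex.quote(arg) for arg in argv))
```

To actually launch the server, the pair could be passed to `subprocess.Popen(argv, env=env)`.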

comment:2 Changed 8 months ago by Philip D Loewen

Regrettably, the suggested change to an environment variable did not solve the problem. (I accepted all other suggestions in comment:1.)

Further info from the client side: sometimes the last thing printed on the controlling terminal window before the server dies is a line like this:

2017-03-27 09:55:43,112 ignoring draw received for a window which is not realized yet!

Changed 8 months ago by Antoine Martin

Attachment: ubuntu-16.04-dall-crash.log added

ubuntu 16.04 crash log with "-d all"

comment:3 Changed 8 months ago by Antoine Martin

Owner: changed from Antoine Martin to Philip D Loewen
Status: assigned → new

This bug can be reproduced, but not very reliably: sometimes it triggers immediately, sometimes it takes much longer. It also triggers only on Ubuntu and not on Fedora, because the Ubuntu version uses short-lived tooltip windows showing the row number, and these are what trigger the bug.
Adding debug statements to all the calls following do_xpra_child_map_event also seems to make the problem go away (a Heisenbug), which makes it harder to track down.

This sort of low-level X11 bug usually happens when we make X11 calls without synchronizing (calling the gdk X11 flush logic).
The problem is that I cannot see any naked X11 calls in there: all the X11 calls in the window model code are already guarded by "xsync" context managers. So r15436 may well "fix" the problem in that it will be much harder to hit, without necessarily fixing it completely. At least I can't seem to hit the bug anymore.
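
For readers unfamiliar with the pattern, the idea behind guarding X11 calls with a synchronizing context manager can be illustrated with a toy model. This is not xpra's actual "xsync" implementation and makes no real X11 or GDK calls: `FakeDisplay`, `XError` and `xsync_sketch` are hypothetical names, and the display is simulated so that errors only surface when the request queue is synchronized, mirroring the asynchronous error reporting described in the crash message.

```python
from contextlib import contextmanager

class XError(Exception):
    """Raised when the simulated X server reports an error."""

class FakeDisplay:
    """Toy stand-in for an X11 connection with asynchronous error reporting."""

    def __init__(self):
        self.live_windows = set()
        self._queue = []

    def get_geometry(self, xid):
        # Asynchronous: the request is only queued, nothing is checked yet.
        self._queue.append(("GetGeometry", xid))

    def sync(self):
        # Process every queued request and collect the errors it caused.
        errors = [
            f"BadDrawable from {request} on {xid:#x}"
            for request, xid in self._queue
            if xid not in self.live_windows
        ]
        self._queue.clear()
        return errors

@contextmanager
def xsync_sketch(display):
    """Run a block of X11 calls, then synchronize and raise on any error."""
    yield
    errors = display.sync()
    if errors:
        raise XError("; ".join(errors))

display = FakeDisplay()
display.live_windows.add(0x400001)

with xsync_sketch(display):
    display.get_geometry(0x400001)      # window exists: no error

try:
    with xsync_sketch(display):
        display.get_geometry(0xE00008)  # e.g. a tooltip window already gone
except XError as exc:
    print("trapped:", exc)
```

Because the error is raised at the end of the guarded block, it can be caught as an ordinary Python exception close to the call that caused it, instead of arriving later and terminating the process.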

Not sure why this only started occurring in the last 2 weeks, as this code has not really changed for years...

@Philip D Loewen: does that fix the problem for you?
(new beta builds posted for Ubuntu 16.x)

comment:4 Changed 8 months ago by Philip D Loewen

Installed r15436 and tested lightly -- looks good so far. Will test aggressively through the day (UTC-7 here) and report again.

(I crashed the server also with the LibreOffice word processor and even the Thunderbird email client -- so the tooltips in LO Calc would have been a coal-mine canary of sorts rather than the single key issue. Heisenbug indeed.)

comment:5 Changed 8 months ago by Philip D Loewen

Normal usage that formerly crashed the server now just works. I didn't test very aggressively, but I would optimistically call this issue fixed. Thanks a lot, Antoine, for whatever you did. (And, as usual, for the vast amount of work you did building this excellent system in the first place!)

comment:6 Changed 8 months ago by Antoine Martin

Resolution: fixed
Status: new → closed

Thanks!
