
Opened 12 days ago

Closed 10 days ago

#2024 closed defect (fixed)

xpra crashes when showing File Dialog of Mono Applications

Reported by: Alexey Stukalov
Owned by: Alexey Stukalov
Priority: critical
Milestone: 2.5
Component: server
Version: 2.4.x
Keywords:
Cc:

Description

I am trying to run MaxQuant (as LD_PRELOAD=/usr/lib/libasan.so xpra start :69 --start=/usr/bin/maxquant), which is a .NET Framework 4.5 application, under ArchLinux using Mono 5.16. Locally it runs fine, but under both xpra 2.3.4 and 2.4 it crashes when I try to open the "File..." dialog. Here's the stack trace:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==12405==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fb953f1a715 bp 0x7ffc5e85aad0 sp 0x7ffc5e85a248 T0)
==12405==The signal is caused by a READ memory access.
==12405==Hint: address points to the zero page.
    #0 0x7fb953f1a714 in __strlen_avx2 (/usr/lib/libc.so.6+0x15f714)
    #1 0x7fb95405950b in __interceptor_strlen /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:299
    #2 0x7fb953a2e510 in PyString_FromString (/usr/lib/libpython2.7.so.1.0+0xb6510)
    #3 0x7fb9465b82b2 in __pyx_pf_4xpra_3x11_8bindings_13core_bindings_16_X11CoreBindings_12XGetAtomName xpra/x11/bindings/core_bindings.c:3496
    #4 0x7fb9465b7cf7 in __pyx_pw_4xpra_3x11_8bindings_13core_bindings_16_X11CoreBindings_13XGetAtomName xpra/x11/bindings/core_bindings.c:3437
    #5 0x7fb953a5e6d6 in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe66d6)
    #6 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #7 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #8 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #9 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #10 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #11 0x7fb953a6397e in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xeb97e)
    #12 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #13 0x7fb953a6397e in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xeb97e)
    #14 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #15 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #16 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #17 0x7fb953a63dbe in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xebdbe)
    #18 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #19 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #20 0x7fb953a4ae0e in function_call.lto_priv.233 (/usr/lib/libpython2.7.so.1.0+0xd2e0e)
    #21 0x7fb953a020e2 in PyObject_Call (/usr/lib/libpython2.7.so.1.0+0x8a0e2)
    #22 0x7fb953a6e49e in instancemethod_call.lto_priv.148 (/usr/lib/libpython2.7.so.1.0+0xf649e)
    #23 0x7fb953a020e2 in PyObject_Call (/usr/lib/libpython2.7.so.1.0+0x8a0e2)
    #24 0x7fb953aa6de3 in slot_tp_init.lto_priv.1132 (/usr/lib/libpython2.7.so.1.0+0x12ede3)
    #25 0x7fb953a202c4 in type_call.lto_priv.59 (/usr/lib/libpython2.7.so.1.0+0xa82c4)
    #26 0x7fb953a020e2 in PyObject_Call (/usr/lib/libpython2.7.so.1.0+0x8a0e2)
    #27 0x7fb953a6380d in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xeb80d)
    #28 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #29 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #30 0x7fb953a4ae0e in function_call.lto_priv.233 (/usr/lib/libpython2.7.so.1.0+0xd2e0e)
    #31 0x7fb953a020e2 in PyObject_Call (/usr/lib/libpython2.7.so.1.0+0x8a0e2)
    #32 0x7fb953a6e49e in instancemethod_call.lto_priv.148 (/usr/lib/libpython2.7.so.1.0+0xf649e)
    #33 0x7fb953a020e2 in PyObject_Call (/usr/lib/libpython2.7.so.1.0+0x8a0e2)
    #34 0x7fb953ab87d0 in PyEval_CallObjectWithKeywords (/usr/lib/libpython2.7.so.1.0+0x1407d0)
    #35 0x7fb94a8f0cdb  (/usr/lib/python2.7/site-packages/gobject/_gobject.so+0x17cdb)
    #36 0x7fb94a8b33d4 in g_closure_invoke (/usr/lib/libgobject-2.0.so.0+0x303d4)
    #37 0x7fb94a89f99e  (/usr/lib/libgobject-2.0.so.0+0x1c99e)
    #38 0x7fb94a8a66df in g_signal_emitv (/usr/lib/libgobject-2.0.so.0+0x236df)
    #39 0x7fb94a8e9337  (/usr/lib/python2.7/site-packages/gobject/_gobject.so+0x10337)
    #40 0x7fb945ebb644 in __Pyx_PyObject_Call xpra/x11/gtk2/gdk_bindings.c:21153
    #41 0x7fb945e66299 in __pyx_f_4xpra_3x11_4gtk2_12gdk_bindings__maybe_send_event xpra/x11/gtk2/gdk_bindings.c:11188
    #42 0x7fb945e6ed34 in __pyx_f_4xpra_3x11_4gtk2_12gdk_bindings__route_event xpra/x11/gtk2/gdk_bindings.c:12154
    #43 0x7fb945e80d40 in __pyx_f_4xpra_3x11_4gtk2_12gdk_bindings_x_event_filter xpra/x11/gtk2/gdk_bindings.c:13863
    #44 0x7fb949b45e1e  (/usr/lib/libgdk-x11-2.0.so.0+0x56e1e)
    #45 0x7fb949b4717f  (/usr/lib/libgdk-x11-2.0.so.0+0x5817f)
    #46 0x7fb949b48c89  (/usr/lib/libgdk-x11-2.0.so.0+0x59c89)
    #47 0x7fb949b48d2e  (/usr/lib/libgdk-x11-2.0.so.0+0x59d2e)
    #48 0x7fb94adda3ce in g_main_context_dispatch (/usr/lib/libglib-2.0.so.0+0x6b3ce)
    #49 0x7fb94addbf88  (/usr/lib/libglib-2.0.so.0+0x6cf88)
    #50 0x7fb94addcf61 in g_main_loop_run (/usr/lib/libglib-2.0.so.0+0x6df61)
    #51 0x7fb949ecedf2 in gtk_main (/usr/lib/libgtk-x11-2.0.so.0+0x12bdf2)
    #52 0x7fb94a5688c9  (/usr/lib/python2.7/site-packages/gtk-2.0/gtk/_gtk.so+0x1888c9)
    #53 0x7fb953a63a42 in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xeba42)
    #54 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #55 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #56 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #57 0x7fb953a63dbe in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xebdbe)
    #58 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #59 0x7fb953a6397e in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xeb97e)
    #60 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #61 0x7fb953a63dbe in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xebdbe)
    #62 0x7fb953a5e9cf in PyEval_EvalFrameEx (/usr/lib/libpython2.7.so.1.0+0xe69cf)
    #63 0x7fb953ab90d9 in PyEval_EvalCodeEx (/usr/lib/libpython2.7.so.1.0+0x1410d9)
    #64 0x7fb953ad8309 in PyEval_EvalCode (/usr/lib/libpython2.7.so.1.0+0x160309)
    #65 0x7fb953ae3a80 in run_mod (/usr/lib/libpython2.7.so.1.0+0x16ba80)
    #66 0x7fb953ae5396 in PyRun_FileExFlags (/usr/lib/libpython2.7.so.1.0+0x16d396)
    #67 0x7fb953ae5c83 in PyRun_SimpleFileExFlags (/usr/lib/libpython2.7.so.1.0+0x16dc83)
    #68 0x7fb953ac1082 in Py_Main (/usr/lib/libpython2.7.so.1.0+0x149082)
    #69 0x7fb953ddf222 in __libc_start_main (/usr/lib/libc.so.6+0x24222)
    #70 0x55eecd54b779 in _start (/usr/bin/python2.7+0x779)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/lib/libc.so.6+0x15f714) in __strlen_avx2
==12405==ABORTING

So it seems X11's XGetAtomName() returns NULL, which crashes v[:]. Indeed, when I tried to fix it by returning None for v==NULL, the crash was gone. But I still don't see the dialog window: the UI goes into modal mode for the invisible window, which I can close by pressing "Esc" on the keyboard.
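
For illustration only, here is a minimal Python sketch of the kind of NULL guard described above. It is not the actual change to the Cython bindings: `atom_name_or_none` is a hypothetical helper, and a `ctypes.c_char_p` stands in for the `char*` that XGetAtomName() returns.

```python
import ctypes


def atom_name_or_none(raw):
    """Convert a C string result into a Python str, or None for NULL.

    `raw` is a ctypes.c_char_p standing in for the char* handed back by
    XGetAtomName(); the helper name is hypothetical, not part of xpra.
    """
    if raw.value is None:
        # NULL pointer: there is no name for this atom, so do not
        # try to build a string from it (that is what crashed in strlen)
        return None
    return raw.value.decode("latin-1")


print(atom_name_or_none(ctypes.c_char_p(b"_NET_WM_STATE")))
print(atom_name_or_none(ctypes.c_char_p(None)))
```

The caller then has to cope with a None name, which is why the crash disappears while the dialog problem remains.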

Change History (12)

comment:1 Changed 12 days ago by Antoine Martin

Owner: changed from Antoine Martin to Alexey Stukalov

alyst: does r20915 fix the crash the same way you did?

The documentation for XGetAtomName says nothing about NULL return values, but that seems reasonable - it is just strange that we never hit it before.

Do you have a more simple test case I could use to reproduce the problem with the modal window? maxquant is not open-source.
If not, it is going to be very hard to debug. You can always attach the -d metadata,window log to this ticket.

comment:2 in reply to:  1 Changed 12 days ago by Alexey Stukalov

Replying to Antoine Martin:

> alyst: does r20915 fix the crash the same way you did?

Yes. But then these modal dialogs do not show up.

> The documentation for XGetAtomName says nothing about NULL return values, but that seems reasonable - it is just strange that we never hit it before.

The Xlib Reference Manual (R5) mentions the NULL return value (which also generates a BadAtom error).

> Do you have a more simple test case I could use to reproduce the problem with the modal window? maxquant is not open-source.
> If not, it is going to be very hard to debug. You can always attach the -d metadata,window log to this ticket.

I'll try to reproduce it with some simple opensource dotnet app.

comment:3 Changed 11 days ago by Alexey Stukalov

I can reproduce this behaviour with this minimal Folder dialog Mono app. With the patched (r20915 + r20918) xpra the dialog doesn't show up, although the app goes into modal mode. Locally the dialog pops up normally.

comment:4 Changed 11 days ago by Antoine Martin

Please provide steps to build and run this example on Fedora, where there is no "csc" and "mcs" barfs out.

comment:5 Changed 11 days ago by Alexey Stukalov

Unfortunately, I don't have Fedora at hand, but this worked for me (ArchLinux, Mono 5.16.0.179-1):

mcs /reference:System.Drawing.dll /reference:System.Windows.Forms.dll mono_test.csc

comment:6 Changed 11 days ago by Antoine Martin

Thanks, turns out that there's an ABI problem, worked around with:

TERM=xterm mcs ...

I can reproduce the problem, will fix.
The window looks like this:

process_new_common: [37, 3690, 1130, 292, 273, \
    {
    'size-constraints': {'position': (22, 22), 'minimum-size': (102, -5)}, \
    'xid': '0xe00011', 'title': 'FolderBrowserDialog', 'client-machine': 'desktop', \
    'icon-title': '', 'group-leader-xid': 665, 'window-type': ('NORMAL',), 'decorations': 126, 
    'class-instance': (), 'set-initial-position': True
    }], metadata={..}, OR=False

comment:7 Changed 11 days ago by Antoine Martin

Owner: changed from Alexey Stukalov to Antoine Martin
Priority: changed from major to critical
Status: changed from new to assigned

comment:8 Changed 10 days ago by Antoine Martin

When the window shows up properly (which is most of the time), the file selection dialog window looks like this:

client @16.663 process_new_common: [71, 3675, 1105, 322, 324, \
    {
    'size-constraints': {'position': (3675, 1104), 'minimum-size': (302, 227)}, \
    'xid': '0xe00020', 'title': 'Browse For Folder', 'client-machine': 'desktop', \
    'icon-title': '', 'group-leader-xid': 665, 'window-type': ('NORMAL',), 'decorations': 30, \
    'skip-taskbar': True, 'class-instance': (), 'set-initial-position': True \
   }], metadata={..}, OR=False

When it doesn't work, the server log shows:

2018-11-04 02:14:45,797 Missing property or wrong property type _NET_WM_WINDOW_TYPE (['atom'])
2018-11-04 02:14:45,797 Missing property or wrong property type WM_TRANSIENT_FOR (window)
2018-11-04 02:14:45,797 Missing property or wrong property type _MOTIF_WM_HINTS (motif-hints)
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/x11/models/window.py", line 369, in do_xpra_property_notify_event
    BaseWindowModel.do_xpra_property_notify_event(self, event)
  File "/usr/lib64/python2.7/site-packages/xpra/x11/models/core.py", line 478, in do_xpra_property_notify_event
    self._handle_property_change(str(event.atom))
  File "/usr/lib64/python2.7/site-packages/xpra/x11/models/core.py", line 490, in _handle_property_change
    handler(self)
  File "/usr/lib64/python2.7/site-packages/xpra/x11/models/core.py", line 519, in _handle_protocols_change
    protocols = X11Window.XGetWMProtocols(self.xid)
  File "/usr/lib64/python2.7/site-packages/xpra/gtk_common/error.py", line 170, in __exit__
    trap._exit()
  File "/usr/lib64/python2.7/site-packages/xpra/gtk_common/error.py", line 102, in _exit
    raise XError(get_X_error(error))
xpra.gtk_common.error.XError: XError: BadWindow

r20925 + r20945: improves the handling for those errors.

When things are OK, the window tree looks like this:

  Root window id: 0x299 (the root window) "Xpra"
  Parent window id: 0x0 (none)
     16 children:
     0x400158 "Xpra-CorralWindow-0xe00020": ()  322x324+2581+189  +2581+189
        1 child:
        0xe00020 "Browse For Folder": ()  322x324+0+0  +2581+189
           1 child:
           0xe00021 (has no name): ()  330x351+0+0  +2581+189
              5 children:
              0xe00027 "Cancel": ()  80x23+227+285  +2808+474
                 1 child:
                 0xe00028 (has no name): ()  80x23+0+0  +2808+474
              0xe00029 "OK": ()  80x23+135+285  +2716+474
                 1 child:
                 0xe0002a (has no name): ()  80x23+0+0  +2716+474
              0xe0002b "Make New Folder": ()  105x23+15+285  +2596+474
                 1 child:
                 0xe0002c (has no name): ()  105x23+0+0  +2596+474
              0xe0002d (has no name): ()  292x212+15+60  +2596+249
                 1 child:
                 0xe0002e (has no name): ()  288x208+2+2  +2598+251
                    1 child:
                    0xe0002f (has no name): ()  16x208+272+0  +2870+251
                       1 child:
                       0xe00030 (has no name): ()  16x208+0+0  +2870+251
              0xe00031 (has no name): ()  292x40+15+14  +2596+203
                 1 child:
                 0xe00032 (has no name): ()  292x40+0+0  +2596+203
     0x40014c "Xpra-CorralWindow-0xe00011": ()  292x273+2933+226  +2933+226
        1 child:
        0xe00011 "FolderBrowserDialog": ()  292x273+0+0  +2933+226
           1 child:
           0xe00012 (has no name): ()  300x300+0+0  +2933+226
              2 children:
              0xe00013 (has no name): ()  292x22+0+251  +2933+477
                 1 child:
                 0xe00014 (has no name): ()  292x22+0+0  +2933+477
              0xe00015 (has no name): ()  292x26+0+0  +2933+226
                 1 child:
                 0xe00016 (has no name): ()  292x26+0+0  +2933+226

It can go MIA just by focusing away.
We still have a model, it's still meant to be shown, but it's not there...

But when it's hidden from the start, it comes up without being reparented into one of our corral windows:

  Root window id: 0x299 (the root window) "Xpra"
  Parent window id: 0x0 (none)
     16 children:
     0x40002b "Xpra-CorralWindow-0xc00022": ()  499x316+2131+257  +2131+257
        1 child:
        0xc00022 "antoine@desktop:~/Downloads": ("xterm" "XTerm")  499x316+0+0  +2131+257
           1 child:
           0xc0002d (has no name): ()  499x316+0+0  +2131+257
              1 child:
              0xc00033 (has no name): ()  14x316+-1+-1  +2130+256
     0xe00072 "Browse For Folder": ()  322x324+3675+1105  +3675+1105
        1 child:
        0xe00073 (has no name): ()  330x351+0+0  +3675+1105
           5 children:
           0xe00079 "Cancel": ()  80x23+227+285  +3902+1390
              1 child:
              0xe0007a (has no name): ()  80x23+0+0  +3902+1390
           0xe0007b "OK": ()  80x23+135+285  +3810+1390
              1 child:
              0xe0007c (has no name): ()  80x23+0+0  +3810+1390
           0xe0007d "Make New Folder": ()  105x23+15+285  +3690+1390
              1 child:
              0xe0007e (has no name): ()  105x23+0+0  +3690+1390
           0xe0007f (has no name): ()  292x212+15+60  +3690+1165
              1 child:
              0xe00080 (has no name): ()  288x208+2+2  +3692+1167
                 1 child:
                 0xe00081 (has no name): ()  16x208+272+0  +3964+1167
                    1 child:
                    0xe00082 (has no name): ()  16x208+0+0  +3964+1167
           0xe00083 (has no name): ()  292x40+15+14  +3690+1119
              1 child:
              0xe00084 (has no name): ()  292x40+0+0  +3690+1119

Maybe because we hit errors during window setup? (the property errors?)

With lots of debug enabled, I see this:

2018-11-04 23:11:20,800 Window.read_initial_X11_properties()
2018-11-04 23:11:20,800 Warning: failed to manage client window 0xe00020:
2018-11-04 23:11:20,800  XError: BadAtom
2018-11-04 23:11:20,800 
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/x11/gtk_x11/wm.py", line 335, in _manage_client
    win = WindowModel(self._root, gdkwindow, desktop_geometry, self.size_constraints)
  File "/usr/lib64/python2.7/site-packages/xpra/x11/models/window.py", line 175, in __init__
    self.call_setup()
  File "/usr/lib64/python2.7/site-packages/xpra/x11/models/core.py", line 232, in call_setup
    raise Unmanageable(e)
Unmanageable: XError: BadAtom

comment:9 Changed 10 days ago by Antoine Martin

The problem comes from the fact that mono apps set an invalid value (zero) for one of the atoms in the _NET_WM_STATE window property. When we try to manage the window, we parse the properties and fail the whole window because of this one property.
As per the docs, this property is managed by the window manager and not by the application itself: see _NET_WM_STATE.

That said, we should handle bogus values more gracefully.
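
As a rough sketch of what "more gracefully" could mean here (not the actual xpra code): skip the atoms that cannot be resolved, log a warning, and keep the rest of the property. `resolve_state_atoms`, the `get_atom_name` callable and the atom ids in `FAKE_ATOMS` are all made up for illustration.

```python
import logging

log = logging.getLogger("net_wm_state_sketch")


def resolve_state_atoms(atom_ids, get_atom_name):
    """Translate raw atom ids into names, skipping any that cannot be resolved.

    get_atom_name is a hypothetical callable standing in for the X11 lookup;
    it is expected to return None (or raise LookupError) for invalid atoms.
    """
    names = []
    for atom in atom_ids:
        if atom == 0:
            # zero is never a valid atom: warn and carry on
            log.warning("ignoring invalid zero atom in _NET_WM_STATE")
            continue
        try:
            name = get_atom_name(atom)
        except LookupError:
            name = None
        if name is None:
            log.warning("ignoring unknown atom %#x in _NET_WM_STATE", atom)
            continue
        names.append(name)
    return names


# stand-in atom table: the numeric ids are invented for this example
FAKE_ATOMS = {301: "_NET_WM_STATE_MODAL", 302: "_NET_WM_STATE_SKIP_TASKBAR"}
print(resolve_state_atoms([301, 0, 302], FAKE_ATOMS.get))
```

With this approach one bogus entry costs a warning in the log instead of making the whole window unmanageable.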


comment:10 Changed 10 days ago by Antoine Martin

Owner: changed from Antoine Martin to Alexey Stukalov
Status: changed from assigned to new

The minimal fix to get things working with mono applications is in r20953.
We log a warning and continue.

r20954 changes the X11 property access code to use synchronized calls, so the errors will be emitted at that point and we can continue to ignore them during the initial window setup.
This change is potentially more problematic: it can reduce server throughput in corner cases (e.g. gtkperf) and could also cause regressions.
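
To illustrate why synchronizing matters, here is a toy model in Python. It does not use xpra's or Xlib's real API: `FakeDisplay`, `FakeXError`, `synced` and `read_property` are all hypothetical names. The idea is that requests on an asynchronous connection only queue their errors, so an error surfaces wherever the next sync happens unless each property read forces its own sync.

```python
from contextlib import contextmanager


class FakeXError(Exception):
    """Stand-in for an X11 protocol error such as BadAtom."""


class FakeDisplay:
    """Toy model of an asynchronous connection: requests only queue errors."""

    def __init__(self):
        self.pending = []

    def request(self, name, fails_with=None):
        # the request returns immediately; any error is reported later
        if fails_with:
            self.pending.append(fails_with)

    def sync(self):
        # flush the queue and raise the first pending error, if any
        errors, self.pending = self.pending, []
        if errors:
            raise FakeXError(errors[0])


@contextmanager
def synced(display):
    """Run the block, then force a sync so that its errors surface here."""
    yield
    display.sync()


def read_property(display, name, fails_with=None):
    # returns None instead of raising when the property cannot be read
    try:
        with synced(display):
            display.request(name, fails_with)
    except FakeXError as e:
        print("Warning: ignoring %s while reading %s" % (e, name))
        return None
    return name


display = FakeDisplay()
read_property(display, "_NET_WM_STATE", fails_with="BadAtom")
read_property(display, "WM_NAME")
```

The cost is one round trip per property read, which is consistent with the throughput concern mentioned above.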

@alyst: does that work for you?

comment:11 Changed 10 days ago by Alexey Stukalov

Yes, thanks a lot! I've just built r20955 version and now the file/folder dialogs are displayed.

There are other minor artifacts: I don't see button text in the file/folder dialogs of the MaxQuant app (the simple test works fine, and the labels are present when running locally); sometimes these dialog boxes behave strangely when resized; sometimes dropdown menus are aligned to the center of the window rather than to the menubar. I haven't discovered a reproducible pattern so far.

comment:12 Changed 10 days ago by Antoine Martin

Resolution: fixed
Status: changed from new to closed

> Yes, thanks a lot! I've just built r20955 version and now the file/folder dialogs are displayed.

r20953 backports this to the v2.4.x branch.

> There are other minor artifacts

If you can provide reproducible test cases for those, we should be able to fix them.
