Xpra: Ticket #64: xpra server dies on client's connection attempt

I've just downloaded & built xpra on an OpenBSD system with Python 2.6

In one xterm, I run:

$ python install/bin/xpra start :13
$ export DISPLAY=:13
$ cat ~/.xpra/pinky.yary.ack.org-13.log
Xlib:  extension "RANDR" missing on display ":13.0".
Xlib:  extension "RANDR" missing on display ":13.0".
Xlib:  extension "RANDR" missing on display ":13.0".
Randr not supported: X server does not support required extension Randr
randr enabled: False
failed to load dbus notifications forwarder: No module named dbus.service
xpra is ready.
$ ps|grep -i xp
21818 ??  S       0:00.60 /usr/local/bin/python install/bin/xpra start :13
31898 ??  S       0:01.09 Xvfb-for-Xpra-:13 +extension Composite -screen 0 3840
$ xclock &

In another xterm, I try to attach, hoping to see the xclock:

$ install/bin/xpra attach :13

cannot import pynotify wrapper (turning notifications off) : No module named dbus.glib
cannot import pynotify wrapper (turning notifications off) : No module named pynotify
'['setxkbmap', '-print']' failed with exit code 252
your keyboard mapping will probably be incorrect unless you are using a 'us' layout
'['setxkbmap', '-query']' failed with exit code 252
connection lost: Error writing to connection: [Errno 32] Broken pipe
Connection lost
$ ps|grep -i xp
31898 ??  S       0:01.24 Xvfb-for-Xpra-:13 +extension Composite -screen 0 3840

It appears that trying to attach kills the xpra server (but not Xvfb; perhaps that should be another ticket, since Xvfb should go away if xpra does). There are some warnings, but I don't know which are informational and which are fatal. Can you help me?

Thanks



Tue, 03 Jan 2012 19:41:31 GMT - Antoine Martin: owner, status changed

Hah, it has been a while since I last tried to run on *BSDs. Can you try running setxkbmap -print and setxkbmap -query by hand to see what the problem is? It shouldn't cause the Broken pipe error, but still, it might give us a clue. It is strange that there aren't any messages at the server end. It might help if you ran the xpra server and client with "-d all".

As for the Xvfb not going away, you are right about that, it is right at the top of my TODO list already (but feel free to create a ticket).


Tue, 03 Jan 2012 19:42:46 GMT - Antoine Martin:

If you have access to a Linux machine, it might be useful to try that as both a server and a client, so we will then know which end is causing the problem on *BSD.


Tue, 03 Jan 2012 21:08:51 GMT - Yary:

$ setxkbmap -print
Couldn't find rules file (base)

$ setxkbmap -query
Couldn't find rules file (base)

Looks like a problem with my X setup? I can run xclock, xterm, emacs etc. OK... I wonder if I can fix this keyboard issue. Got a hint from the web, trying it out:

$ setxkbmap -rules /etc/X11/xkb/rules/xfree98

$ setxkbmap -print

xkb_keymap {
        xkb_types     { include "complete"      };
        xkb_compat    { include "complete"      };
};

$ setxkbmap -query

rules:      /etc/X11/xkb/rules/xfree98
model:      pc105
layout:     us

Still the same errors on trying to attach a client after running that, though. :-(

I don't have a linux box around here. If I set one up, I'll install xpra on it and try it from there.


Tue, 03 Jan 2012 21:13:12 GMT - Yary:

The errors aren't exactly the same after setting xkbmap:

$ install/bin/xpra attach :13

cannot import pynotify wrapper (turning notifications off) : No module named dbus.glib
cannot import pynotify wrapper (turning notifications off) : No module named pynotify
connection lost: Error writing to connection: [Errno 32] Broken pipe
Connection lost

$ cat ~/.xpra/pinky.yary.ack.org-13.log

Xlib:  extension "RANDR" missing on display ":13.0".
Xlib:  extension "RANDR" missing on display ":13.0".
Xlib:  extension "RANDR" missing on display ":13.0".
Randr not supported: X server does not support required extension Randr
randr enabled: False
failed to load dbus notifications forwarder: No module named dbus.service
xpra is ready.
New connection received

Tue, 03 Jan 2012 21:25:00 GMT - Antoine Martin:

You can safely ignore the pynotify messages: those are just warnings.

It is now showing that it received the client connection.. but not much else. Adding -d all to the command lines will help.


Tue, 03 Jan 2012 23:16:06 GMT - Yary: attachment set

Logs for server & client with "-d all" on


Tue, 03 Jan 2012 23:17:22 GMT - Yary:

Turned on -d all. At this point, I should attach the files, they're verbose... done.


Wed, 04 Jan 2012 10:35:55 GMT - Antoine Martin:

Will take a look asap, in the meantime maybe you can try to disable as many features as possible and see if that helps:

etc. Maybe one of those causes problems during initialization, maybe mmap?


Wed, 04 Jan 2012 14:17:46 GMT - Antoine Martin:

Hmmm, it looks like the server never gets past the "hello" stage:

New connection received
read thread: got data 'l5:hellod20:__prerelease_version8:0.0.7.324:belli1e9:clipboardi1e7:cursorsi1e15:damage_sequencei1e7:'...

Can you run the server with --no-daemon in gdb and get a backtrace? (please also try comment:7 above first)


Wed, 04 Jan 2012 17:55:55 GMT - Yary:

I tried running install/bin/xpra start :13 --no-randr --disable-mmap --no-clipboard --no-pulseaudio with install/bin/xpra attach :13 --disable-mmap --no-clipboard --no-pulseaudio --no-keyboard-sync and the server still exits when the client attaches.

I started with --no-daemon and attached with gdb:

$ install/bin/xpra  start :13 --no-randr --disable-mmap --no-clipboard --no-pulseaudio --no-daemon
$ gdb /usr/local/bin/python 29092

Attaching with install/bin/xpra attach :13 --disable-mmap --no-clipboard --no-pulseaudio --no-keyboard-sync gave me this in gdb:

New connection received
Program received signal SIGSEGV, Segmentation fault.
[Switching to process 29092, thread 0x85738400]
0x07dbab3f in vgetargskeywords () from /usr/local/lib/libpython2.6.so.1.0
(gdb) bt
#0  0x07dbab3f in vgetargskeywords () from /usr/local/lib/libpython2.6.so.1.0
#1  0x07dbb24b in _PyArg_ParseTupleAndKeywords_SizeT ()
   from /usr/local/lib/libpython2.6.so.1.0
#2  0x07de8636 in pattern_match () from /usr/local/lib/libpython2.6.so.1.0
#3  0x07d544fc in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#4  0x07dacdca in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#5  0x07dad9a1 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#6  0x07dad9a1 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#7  0x07dad9a1 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#8  0x07dad9a1 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
...

continues in PyEval_EvalFrameEx for many frames, then:

...
#248 0x07dad9a1 in PyEval_EvalFrameEx ()
   from /usr/local/lib/libpython2.6.so.1.0
#249 0x07dad9a1 in PyEval_EvalFrameEx ()
   from /usr/local/lib/libpython2.6.so.1.0
#250 0x07dad9a1 in PyEval_EvalFrameEx ()
   from /usr/local/lib/libpython2.6.so.1.0
#251 0x07dae9f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#252 0x07d43619 in function_call () from /usr/local/lib/libpython2.6.so.1.0
#253 0x07d1a818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#254 0x07daa55b in PyEval_EvalFrameEx ()
   from /usr/local/lib/libpython2.6.so.1.0
#255 0x07dad9a1 in PyEval_EvalFrameEx ()
   from /usr/local/lib/libpython2.6.so.1.0
#256 0x07dad9a1 in PyEval_EvalFrameEx ()
   from /usr/local/lib/libpython2.6.so.1.0
#257 0x07dae9f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#258 0x07d4352a in function_call () from /usr/local/lib/libpython2.6.so.1.0
#259 0x07d1a818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#260 0x07d29468 in instancemethod_call ()
   from /usr/local/lib/libpython2.6.so.1.0
#261 0x07d1a818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#262 0x07da76d4 in PyEval_CallObjectWithKeywords ()
   from /usr/local/lib/libpython2.6.so.1.0
#263 0x07dda63b in t_bootstrap () from /usr/local/lib/libpython2.6.so.1.0
#264 0x0f2d95be in _thread_start ()
    at /usr/src/lib/libpthread/uthread/uthread_create.c:242
#265 0x0000002b in ?? ()
#266 0x00000000 in ?? ()

Thu, 05 Jan 2012 11:54:52 GMT - Antoine Martin:

I installed OpenBSD 5.0 in a VM and gave it a quick test, and got the same result.

I also tried running without any of the C bindings (applying the "no-server" patch) and it still crashed! (patches pending for that - current trunk needs fixing for this patch to work again)

Looks to me like it is crashing in plain standard python code, not much to do with xpra at all as far as I can see. Sorry, I am out of ideas. The same code works fine on OSX, Win32, many different types of Linux, (even Solaris and FreeBSD last time I tried those)


Thu, 05 Jan 2012 15:39:18 GMT - Yary:

I'm not much of a Python hacker. Could you run xpra start :13 --no-daemon under idle or with pdb to see which line crashes? The "pattern_match" and "_PyArg_ParseTupleAndKeywords_SizeT" frames make me think that, when it's reading the first client command, Python is asked either to read from an uninitialized string or to write to unallocated memory or too small a buffer, perhaps a string that is coming from C code. That could happen if the code returns a string allocated automatically on the stack rather than malloc'ed on the heap. Pure speculation on my part.


Thu, 05 Jan 2012 18:22:41 GMT - Antoine Martin:

It dies here when calling self._read_decoder.process(). I suspect this is because the bencode code uses recursion too much and it hit the limit which must be lower on OpenBSD. The strange thing is that I then added some new tests to test_bencode.py in r399, using the same data it had earlier crashed on (I also tried with larger datasets as per the test), and it managed to run them fine. So whatever is going on, it's obscure and only affects OpenBSD...
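
To illustrate the recursion concern, here is a minimal sketch of a bencode decoder that keeps its own explicit stack instead of recursing, so nesting depth no longer translates into interpreter call depth. This is not the xpra bencode module and not the patch attached to this ticket: the function name bdecode and its return convention are assumptions made for the example, and it is written for Python 3 rather than the Python 2.6 used here.

```python
_NO_KEY = object()  # marks "this dict is waiting for its next key"


def bdecode(data):
    """Decode one bencoded value from bytes without recursion.

    Returns (value, bytes_consumed). Hypothetical helper, not xpra's API.
    """
    stack = []  # open containers as [container, pending_dict_key]
    pos = 0
    while True:
        c = data[pos:pos + 1]
        if c == b"":
            raise ValueError("truncated input")
        if c == b"l":
            stack.append([[], None])
            pos += 1
            continue
        if c == b"d":
            stack.append([{}, _NO_KEY])
            pos += 1
            continue
        if c == b"e":
            if not stack:
                raise ValueError("unexpected end marker")
            value, key = stack.pop()
            if isinstance(value, dict) and key is not _NO_KEY:
                raise ValueError("dict key without a value")
            pos += 1
        elif c == b"i":
            end = data.index(b"e", pos)
            value = int(data[pos + 1:end])
            pos = end + 1
        elif c.isdigit():
            colon = data.index(b":", pos)
            length = int(data[pos:colon])
            start = colon + 1
            value = data[start:start + length]
            if len(value) != length:
                raise ValueError("truncated string")
            pos = start + length
        else:
            raise ValueError("invalid type marker at offset %d" % pos)

        # Attach the finished value to the enclosing container, if any.
        if not stack:
            return value, pos
        parent = stack[-1]
        if isinstance(parent[0], list):
            parent[0].append(value)
        elif parent[1] is _NO_KEY:
            if not isinstance(value, bytes):
                raise ValueError("dict keys must be strings")
            parent[1] = value
        else:
            parent[0][parent[1]] = value
            parent[1] = _NO_KEY
```

Fed a shortened, hand-closed version of the hello packet from the log above, b"l5:hellod4:belli1e9:clipboardi1eee", this returns a list holding b"hello" and a dict with the bell and clipboard keys. The same loop handles thousands of nested lists, because depth only grows the Python list used as a stack.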


Thu, 05 Jan 2012 19:10:17 GMT - Yary:

That info about recursion is helpful, and I should have seen that with the deep backtrace. My stack size is limited to 4MB, but I can change that... let's try a 64MB stack (ulimit -s 65336 in bash)... hmm still a segfault. Oh well. Thanks for all the help.
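
One possible reason a larger ulimit -s made no difference: the backtrace above ends in t_bootstrap and _thread_start, so the crash happened in a Python thread, and on some platforms the stack of a non-main thread is sized by the threading library rather than by the shell limit. Whether that applies to this OpenBSD setup is an assumption, not something confirmed in this ticket. The sketch below (Python 3, Unix only; the helper names are made up for the example) shows how to inspect both limits and how to run a function in a thread with a larger stack.

```python
import resource
import sys
import threading


def describe_limits():
    """Collect the limits that bound recursion depth in this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        "main_stack_soft": soft,                  # what "ulimit -s" sets
        "main_stack_hard": hard,
        "thread_stack": threading.stack_size(),   # 0 means platform default
        "recursion_limit": sys.getrecursionlimit(),
    }


def run_in_big_thread(func, stack_bytes=16 * 1024 * 1024):
    """Run func() in a new thread created with a larger stack."""
    previous = threading.stack_size()
    threading.stack_size(stack_bytes)  # applies to threads created from now on
    box = []
    try:
        worker = threading.Thread(target=lambda: box.append(func()))
        worker.start()
        worker.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return box[0]
```

Calling describe_limits() before and after changing ulimit -s shows whether the shell limit reaches the process at all; run_in_big_thread(some_function) is a quick way to test whether thread stack size is the limiting factor.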


Sat, 07 Jan 2012 10:44:19 GMT - Antoine Martin: attachment set

remove incremental bdecoder stuff we don't need


Sat, 07 Jan 2012 10:45:46 GMT - Antoine Martin:

I just could not leave this alone... so the good news is that I have a workaround (apply the patch above - to trunk, probably applies cleanly to 0.0.7.32 too - haven't checked), the bad news is that I don't really want to merge it as-is just yet. Something will have to be done though because this is just too inefficient and ugly.. Let me know if that works for you.


Sat, 07 Jan 2012 10:47:29 GMT - Antoine Martin:

Oh, and connection from OpenBSD is very very slow - not sure why, will take a look at that too.


Wed, 11 Jan 2012 11:06:12 GMT - Antoine Martin: status changed; resolution set

merged in r408, works fine on OpenBSD now.


Thu, 12 Jan 2012 23:03:16 GMT - Yary:

The server works, and I can get xclock from my OpenBSD box to display on my Windows box using xpra v0.0.7.32 client over plink. Yay! Thanks for the work on that.

But, when I try to start an xterm, it displays for a second, and then the xpra server crashes again :-( even with the --no-randr --disable-mmap --no-clipboard --no-pulseaudio --no-daemon flags.

I've started using a mix of tmux- a little too lightweight- and TightVNC- a little too much- for my persistent session needs on that box.


Fri, 13 Jan 2012 05:42:49 GMT - Antoine Martin:

Can you confirm which svn revision you are using on the server? The server crashes? Have you got a backtrace for that? Can you re-connect the client, or is the server completely gone? Is the Xvfb still there? (FYI: if so, you can use --use-display to re-connect an xpra server to it.)

I have tested OpenBSD with xterm and glxgears quite a bit and although r408 introduced its own set of problems (which could cause the connection to drop at seemingly random intervals), these were fixed as of r421


Sat, 14 Jan 2012 20:02:05 GMT - Yary:

My xpra is at r432

Interesting, this time I was able to start an xterm, and it crashed when I hit "ctrl-D" inside the xterm to close it. It did leave Xvfb behind. Stacktrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to process 11574, thread 0x8af10800]
0x0068b99e in __pyx_f_8wimpiggy_8lowlevel_8bindings_argbdata_to_pixdata ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
(gdb) bt
#0  0x0068b99e in __pyx_f_8wimpiggy_8lowlevel_8bindings_argbdata_to_pixdata ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#1  0x0068c833 in __pyx_pf_8wimpiggy_8lowlevel_8bindings_58get_cursor_image ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#2  0x027f29fa in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#3  0x027f49f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#4  0x0278952a in function_call () from /usr/local/lib/libpython2.6.so.1.0
#5  0x02760818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#6  0x0276f468 in instancemethod_call ()
   from /usr/local/lib/libpython2.6.so.1.0
#7  0x02760818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#8  0x027ed6d4 in PyEval_CallObjectWithKeywords ()
   from /usr/local/lib/libpython2.6.so.1.0
#9  0x027611fc in PyObject_CallObject ()
   from /usr/local/lib/libpython2.6.so.1.0
#10 0x0e4ad3c4 in init_gobject ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gobject/_gobject.so
#11 0x0f816e33 in g_closure_invoke ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#12 0x0f82ebb8 in g_signal_handlers_block_matched ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#13 0x0f830f60 in g_signal_emitv ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#14 0x0e4a46b6 in init_gobject ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gobject/_gobject.so
#15 0x0279a4e6 in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#16 0x02760818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#17 0x0069560f in __pyx_pf_8wimpiggy_8lowlevel_8bindings_74_maybe_send_event ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#18 0x0279a4fc in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#19 0x02760818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#20 0x00696918 in __pyx_pf_8wimpiggy_8lowlevel_8bindings_75_route_event ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#21 0x0279a4fc in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#22 0x02760818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#23 0x0069f8aa in __pyx_f_8wimpiggy_8lowlevel_8bindings_x_event_filter ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#24 0x032f6bc1 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#25 0x032f8444 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#26 0x032fa218 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#27 0x032fa68f in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#28 0x0170ee47 in g_main_context_dispatch ()
   from /usr/local/lib/libglib-2.0.so.2600.0
#29 0x017127be in g_main_context_prepare ()
   from /usr/local/lib/libglib-2.0.so.2600.0
#30 0x01712bc7 in g_main_loop_run () from /usr/local/lib/libglib-2.0.so.2600.0
#31 0x05141de4 in gtk_main () from /usr/local/lib/libgtk-x11-2.0.so.2200.0
#32 0x014b1d10 in init_gtk ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gtk/_gtk.so
#33 0x027f29fa in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#34 0x027f39a1 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#35 0x027f49f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#36 0x027f2501 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#37 0x027f49f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#38 0x027f2501 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#39 0x027f49f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#40 0x027f4b33 in PyEval_EvalCode () from /usr/local/lib/libpython2.6.so.1.0
#41 0x028100e6 in run_mod () from /usr/local/lib/libpython2.6.so.1.0
#42 0x0281019e in PyRun_FileExFlags () from /usr/local/lib/libpython2.6.so.1.0
#43 0x0281189f in PyRun_SimpleFileExFlags ()
   from /usr/local/lib/libpython2.6.so.1.0
#44 0x02811fea in PyRun_AnyFileExFlags ()
   from /usr/local/lib/libpython2.6.so.1.0
#45 0x0281e549 in Py_Main () from /usr/local/lib/libpython2.6.so.1.0
#46 0x1c0008a2 in main ()

Trace from xpra server ends with

keycode_from_name(None,d,['control']) level=0, group=0, keycode=40
keycode_from_name(None,d,['control']) level=0, group=0, keycode=40
keycode_from_name(None,Control_L,[]) level=0, group=0, keycode=37
Segmentation fault (core dumped)

It seems with just an xterm running, xpra works until I dismiss the xterm process.

Now I've restarted xpra server with --use-display and am re-running the original test:

start xclock & - displays OK
start xterm - displays OK
click on xterm window to bring it into focus - server crashes

Windows client says: connection lost: empty marker in read queue

Backtrace:

(gdb) c
Continuing.
[New process 2486, thread 0x8960f000]
[New process 2486]
[New process 2486, thread 0x82ee4800]
Program received signal SIGSEGV, Segmentation fault.
[Switching to process 2486, thread 0x8960f000]
0x06ab299e in __pyx_f_8wimpiggy_8lowlevel_8bindings_argbdata_to_pixdata ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
(gdb) bt
#0  0x06ab299e in __pyx_f_8wimpiggy_8lowlevel_8bindings_argbdata_to_pixdata ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#1  0x06ab3833 in __pyx_pf_8wimpiggy_8lowlevel_8bindings_58get_cursor_image ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#2  0x0a6af9fa in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#3  0x0a6b19f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#4  0x0a64652a in function_call () from /usr/local/lib/libpython2.6.so.1.0
#5  0x0a61d818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#6  0x0a62c468 in instancemethod_call ()
   from /usr/local/lib/libpython2.6.so.1.0
#7  0x0a61d818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#8  0x0a6aa6d4 in PyEval_CallObjectWithKeywords ()
   from /usr/local/lib/libpython2.6.so.1.0
#9  0x0a61e1fc in PyObject_CallObject ()
   from /usr/local/lib/libpython2.6.so.1.0
#10 0x02c333c4 in init_gobject ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gobject/_gobject.so
#11 0x01246e33 in g_closure_invoke ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#12 0x0125ebb8 in g_signal_handlers_block_matched ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#13 0x01260f60 in g_signal_emitv ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#14 0x02c2a6b6 in init_gobject ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gobject/_gobject.so
#15 0x0a6574e6 in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#16 0x0a61d818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#17 0x06abc60f in __pyx_pf_8wimpiggy_8lowlevel_8bindings_74_maybe_send_event ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#18 0x0a6574fc in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#19 0x0a61d818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#20 0x06abd918 in __pyx_pf_8wimpiggy_8lowlevel_8bindings_75_route_event ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#21 0x0a6574fc in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#22 0x0a61d818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#23 0x06ac68aa in __pyx_f_8wimpiggy_8lowlevel_8bindings_x_event_filter ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#24 0x0a464bc1 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#25 0x0a466444 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#26 0x0a468218 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#27 0x0a46868f in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#28 0x02e4ce47 in g_main_context_dispatch ()
   from /usr/local/lib/libglib-2.0.so.2600.0
#29 0x02e507be in g_main_context_prepare ()
   from /usr/local/lib/libglib-2.0.so.2600.0
#30 0x02e50bc7 in g_main_loop_run () from /usr/local/lib/libglib-2.0.so.2600.0
#31 0x0d161de4 in gtk_main () from /usr/local/lib/libgtk-x11-2.0.so.2200.0
#32 0x0eee4d10 in init_gtk ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gtk/_gtk.so
#33 0x0a6af9fa in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#34 0x0a6b09a1 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#35 0x0a6b19f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#36 0x0a6af501 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#37 0x0a6b19f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#38 0x0a6af501 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#39 0x0a6b19f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#40 0x0a6b1b33 in PyEval_EvalCode () from /usr/local/lib/libpython2.6.so.1.0
#41 0x0a6cd0e6 in run_mod () from /usr/local/lib/libpython2.6.so.1.0
#42 0x0a6cd19e in PyRun_FileExFlags () from /usr/local/lib/libpython2.6.so.1.0
#43 0x0a6ce89f in PyRun_SimpleFileExFlags ()
   from /usr/local/lib/libpython2.6.so.1.0
#44 0x0a6cefea in PyRun_AnyFileExFlags ()
   from /usr/local/lib/libpython2.6.so.1.0
#45 0x0a6db549 in Py_Main () from /usr/local/lib/libpython2.6.so.1.0
#46 0x1c0008a2 in main ()

Sun, 15 Jan 2012 06:38:01 GMT - Antoine Martin: status changed; resolution deleted

Looks like premultiply_argb_in_place causes the segfault (strange how I didn't encounter it during my testing), so the temporary solution is just to disable custom cursors for now (untested), the easiest way is to just add a:

        return None

right at the top of the do_wimpiggy_cursor_event method in xpra/server.py, in here: server.py

I'll make a proper fix for this asap, will probably just need to copy the cursor data to a temporary buffer (no biggie since the cursor data is quite small).


Tue, 17 Jan 2012 18:32:41 GMT - Antoine Martin:

It might have been dereferencing a NULL pointer in that method, r437 should prevent that.

Sorry but I cannot test this as installing all the X11 development headers would take far too long (unless there is a "pkg_add -r libX11-dev" or something like this that I have missed?)


Tue, 17 Jan 2012 19:27:01 GMT - Yary:

I think all the X11 headers are under /usr/X11R6/include, which would be installed as part of the xbase package. If you don't have it, get the correct xbase.tgz for your platform from a mirror (see OpenBSD FAQ 4.11).

Updated to r439 but still crashing. Backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to process 20564, thread 0x8a24e000]
0x0f301cd6 in __pyx_f_8wimpiggy_8lowlevel_8bindings_argbdata_to_pixdata ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
(gdb) bt
#0  0x0f301cd6 in __pyx_f_8wimpiggy_8lowlevel_8bindings_argbdata_to_pixdata ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#1  0x0f302bcc in __pyx_pf_8wimpiggy_8lowlevel_8bindings_60get_cursor_image ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#2  0x002f69fa in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#3  0x002f89f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#4  0x0028d52a in function_call () from /usr/local/lib/libpython2.6.so.1.0
#5  0x00264818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#6  0x00273468 in instancemethod_call ()
   from /usr/local/lib/libpython2.6.so.1.0
#7  0x00264818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#8  0x002f16d4 in PyEval_CallObjectWithKeywords ()
   from /usr/local/lib/libpython2.6.so.1.0
#9  0x002651fc in PyObject_CallObject ()
   from /usr/local/lib/libpython2.6.so.1.0
#10 0x0459c3c4 in init_gobject ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gobject/_gobject.so
#11 0x05bfae33 in g_closure_invoke ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#12 0x05c12bb8 in g_signal_handlers_block_matched ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#13 0x05c14f60 in g_signal_emitv ()
   from /usr/local/lib/libgobject-2.0.so.2600.0
#14 0x045936b6 in init_gobject ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gobject/_gobject.so
#15 0x0029e4e6 in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#16 0x00264818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#17 0x0f30b97d in __pyx_pf_8wimpiggy_8lowlevel_8bindings_76_maybe_send_event ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#18 0x0029e4fc in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#19 0x00264818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#20 0x0f30cc86 in __pyx_pf_8wimpiggy_8lowlevel_8bindings_77_route_event ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#21 0x0029e4fc in PyCFunction_Call () from /usr/local/lib/libpython2.6.so.1.0
#22 0x00264818 in PyObject_Call () from /usr/local/lib/libpython2.6.so.1.0
#23 0x0f315c18 in __pyx_f_8wimpiggy_8lowlevel_8bindings_x_event_filter ()
   from /home/yary/xpra/install/lib/python/wimpiggy/lowlevel/bindings.so
#24 0x03c3ebc1 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#25 0x03c40444 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#26 0x03c42218 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#27 0x03c4268f in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#28 0x0b56ae47 in g_main_context_dispatch ()
   from /usr/local/lib/libglib-2.0.so.2600.0
#29 0x0b56e7be in g_main_context_prepare ()
   from /usr/local/lib/libglib-2.0.so.2600.0
#30 0x0b56ebc7 in g_main_loop_run () from /usr/local/lib/libglib-2.0.so.2600.0
#31 0x05735de4 in gtk_main () from /usr/local/lib/libgtk-x11-2.0.so.2200.0
#32 0x01e48d10 in init_gtk ()
   from /usr/local/lib/python2.6/site-packages/gtk-2.0/gtk/_gtk.so
#33 0x002f69fa in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#34 0x002f79a1 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#35 0x002f89f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#36 0x002f6501 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#37 0x002f89f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#38 0x002f6501 in PyEval_EvalFrameEx () from /usr/local/lib/libpython2.6.so.1.0
#39 0x002f89f8 in PyEval_EvalCodeEx () from /usr/local/lib/libpython2.6.so.1.0
#40 0x002f8b33 in PyEval_EvalCode () from /usr/local/lib/libpython2.6.so.1.0
#41 0x003140e6 in run_mod () from /usr/local/lib/libpython2.6.so.1.0
#42 0x0031419e in PyRun_FileExFlags () from /usr/local/lib/libpython2.6.so.1.0
#43 0x0031589f in PyRun_SimpleFileExFlags ()
   from /usr/local/lib/libpython2.6.so.1.0
#44 0x00315fea in PyRun_AnyFileExFlags ()
   from /usr/local/lib/libpython2.6.so.1.0
#45 0x00322549 in Py_Main () from /usr/local/lib/libpython2.6.so.1.0
#46 0x1c0008a2 in main ()

Tue, 17 Jan 2012 19:51:37 GMT - Antoine Martin:

I'll give it a go, have you tried disabling this method entirely as per comment 20?


Tue, 17 Jan 2012 20:23:44 GMT - Antoine Martin:

OK thanks - now up and running, I got a half-decent framerate running glxgears with mmap enabled.

How do you manage to get those beautifully informative stack traces from gdb? (mine aren't human readable) I hope I won't have to build python and the rest from source?

With r442 and the hack to ignore all cursor events, I don't get any crashes once the connection is made (not yet anyway - briefly tested) - although it does fail many connection attempts (no pattern identified yet - may depend on the number of X11 clients running), will look into why that is. I will also try to fix the cursor stuff, since the problem isn't a null dereference.

That's an OpenBSD 5.0 server with the client on the same machine. With remote Linux clients the connection fails every time, which should help me narrow it down. Latency/timing possibly; it seems unhappy about the socket state. That's the second part.


Wed, 18 Jan 2012 00:12:59 GMT - Yary:

For gdb stack traces- I am using python as installed by pkg_add. I don't think I did anything special. (Hmm, there is also a /usr/bin/python which I am not using, maybe that's why?) First I start xpra, I get the process number via ps -auxw|grep python though there is probably a better way. Then I run gdb /usr/local/bin/python 12345 where 12345 is the xpra/python pid. At the gdb prompt I hit "c" to continue, then I connect a client, wait for a crash, and type "bt".

I did not try disabling the method as per comment 20... sounds like you have a decent way to reproduce the troubles now, let me know if you still want me to or need the pretty backtraces for further versions.


Wed, 18 Jan 2012 08:46:45 GMT - Antoine Martin:

The cursor stuff is fixed in r443: the C code was requesting 4 times as much memory as had been provided by the X11 cursor code (this may be worth a security advisory). Now on to the socket stuff, which has decided not to happen very often today, making it harder for me to debug :(
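
For readers unfamiliar with this class of bug, here is a minimal sketch of premultiplying ARGB cursor pixels into a freshly allocated buffer, with the output size derived from the pixel count and the input length checked before anything is read. It is not the wimpiggy Cython code from r443: the function name, the one-integer-per-pixel input and the A, R, G, B output byte order are all assumptions made for the example.

```python
def premultiply_argb(pixels, width, height):
    """Return premultiplied pixel bytes for a width x height cursor.

    pixels: sequence of 32-bit ARGB integers, one per pixel (assumed layout).
    The output is a new buffer, so the caller's data is never modified.
    """
    expected = width * height
    if len(pixels) != expected:
        # Refuse to read past what the caller actually provided.
        raise ValueError("expected %d pixels, got %d" % (expected, len(pixels)))
    out = bytearray(expected * 4)  # 4 output bytes per input pixel
    for i, argb in enumerate(pixels):
        a = (argb >> 24) & 0xFF
        r = (argb >> 16) & 0xFF
        g = (argb >> 8) & 0xFF
        b = argb & 0xFF
        # Scale each colour channel by alpha / 255.
        out[4 * i:4 * i + 4] = bytes((a, r * a // 255, g * a // 255, b * a // 255))
    return bytes(out)
```

The key point is that the pixel count and the byte count are kept as separate quantities: the loop runs over pixels, and only the output buffer is multiplied by four.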


Wed, 18 Jan 2012 16:49:44 GMT - Antoine Martin:

I believe the second part is due to some signal being received by the socket thread, which is why it is intermittent. Although I would like to remove child processes completely from the keymap code, this can't be done just yet (compatibility with older versions for one - see #57 for planning details) so the proper fix is something like SA_RESTART so the select() and read()/write() calls will continue after the interruption.

References:

Now, I would much prefer figuring out what signal/command is causing the problem and ignoring it there rather than wrapping all the network code with try/catch looking for EINTR as was done with subprocess (ie: here)
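
The two options discussed here can be sketched as follows. This is only an illustration in Python 3, not the change made in xpra; both helper names are hypothetical. The first helper is the "wrap the call and retry on EINTR" approach; the second uses signal.siginterrupt, which is Python's way of asking for SA_RESTART-like behaviour for one signal. Note that Python 3.5 and later already retry most interrupted system calls automatically, so the wrapper mainly matters on older interpreters such as the 2.6 used in this ticket.

```python
import errno
import signal


def retry_on_eintr(func, *args, **kwargs):
    """Call func, repeating it while it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise  # any other error is a real failure


def restart_syscalls_after(signum):
    """Ask the OS to restart system calls interrupted by signum.

    Must be called after the handler is installed, because installing a
    handler with signal.signal() resets this flag. Which signal is the
    culprit here is not known from the ticket.
    """
    signal.siginterrupt(signum, False)
```

A socket read would then be written as retry_on_eintr(sock.recv, 4096), where sock is whatever connection object the caller holds.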


Wed, 18 Jan 2012 17:45:54 GMT - Antoine Martin:

r449 deals with EINTR

I can't get it to crash or misbehave, so I believe OpenBSD is now a supported platform!

(please confirm so I can close this ticket)


Wed, 18 Jan 2012 22:59:11 GMT - Yary:

Thanks for all the work- alas I can still make it crash! I've updated to r452. I start a new xpra server and run an xterm. Then I connect using Windows xpra client over ssh, using plink. I hit a control key and hold it down, and the server exits, saying:

keycode_from_name(None,Control_L,['control']) level=0, group=0, keycode=37
keycode_from_name(,Control_L,['control']) level=0, group=0, keycode=37
The program 'xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue (integer parameter out of range for operation)'.
  (Details: serial 68776 error_code 2 request_code 132 minor_code 2)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
Program exited with code 01.

I'd like to follow the instructions but nothing seems to take the --sync option.


Thu, 19 Jan 2012 09:11:16 GMT - Antoine Martin:

Are you using the latest native Xpra released yesterday on MS Windows (v0.0.7.34)? Or cygwin? Either way, it should not be using the method keycode_from_name(). If you aren't, maybe r454 will avoid this particular crash (it would be interesting to try this first to confirm the problem).

Obviously, I have tried to reproduce with v0.0.7.34 and it refuses to crash or misbehave. (The only slight difference is that I use TCP mode rather than plink, not that it should make much difference, as this is transparent to the application.)

As for debugging with X11 synchronous calls, the way to do it (poorly documented in gtk/gdk) is:

GDK_SYNCHRONIZE=1 xpra ...

Even then, in my experience there is no guarantee that the X11 error will really be the root cause.
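
When the server is launched from a script rather than a shell, the same variable can be set programmatically. This is only a sketch: `synchronous_x11_env` and `run_synchronized` are hypothetical helper names, and the xpra command line in the comment is just the one from the original report.

```python
import os
import subprocess


def synchronous_x11_env(base_env=None):
    """Return a copy of the environment with GDK_SYNCHRONIZE=1 added."""
    env = dict(os.environ if base_env is None else base_env)
    env["GDK_SYNCHRONIZE"] = "1"
    return env


def run_synchronized(command):
    """Run command (a list of arguments) with synchronous X11 calls enabled."""
    return subprocess.call(command, env=synchronous_x11_env())


# Hypothetical usage:
#   run_synchronized(["xpra", "start", ":13"])
```

Copying the environment rather than modifying os.environ keeps the (much slower) synchronous mode confined to the one process being debugged.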


Thu, 19 Jan 2012 09:33:28 GMT - Antoine Martin: attachment set

shows win32 client connect to openbsd server with Control_L pressed


Thu, 19 Jan 2012 09:35:01 GMT - Antoine Martin:

xpra/gtk_view_keyboard.py, as shown in the screenshot above, can be useful for seeing which keys are being pressed and released (as seen by the client application running on the server).


Thu, 19 Jan 2012 15:32:58 GMT - Yary:

I was using Windows native xpra v0.0.7.32

So let's try upgrading the server to r454 and rebuilding, then connecting with that older client... still "an X Window System error." It seems to be something about a repeated key: the first key event doesn't crash it.

Upgrading the Windows client to xpra v0.0.7.34: no keyboard crashes, hooray! Starting an emacs session, though, mouseover tooltips now stop xpra with "an X Window System error." It seems I can hover over "Emacs Tutorial" on the home screen, then hover on the line below, and the xpra server exits with

The error was 'BadWindow (invalid Window parameter)'.
  (Details: serial 3478 error_code 3 request_code 12 minor_code 0)
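
The numbers in these "Details" lines can be decoded with the core X11 protocol tables. The sketch below is a hypothetical helper (`decode_x_error` is not part of xpra or gdk) and only lists a handful of codes. Request codes of 128 and above belong to extensions, whose numbering is assigned per server, so the 132 from the earlier BadValue crash cannot be named without asking the X server.

```python
import re

# A few core X11 protocol error codes.
CORE_ERRORS = {1: "BadRequest", 2: "BadValue", 3: "BadWindow",
               8: "BadMatch", 9: "BadDrawable", 11: "BadAlloc"}

# A few core X11 request opcodes (major opcodes below 128).
CORE_REQUESTS = {3: "GetWindowAttributes", 12: "ConfigureWindow",
                 14: "GetGeometry", 38: "QueryPointer"}

DETAILS_RE = re.compile(
    r"serial (\d+) error_code (\d+) request_code (\d+) minor_code (\d+)")


def decode_x_error(line):
    """Turn a gdk 'Details: ...' line into a dict with readable names."""
    match = DETAILS_RE.search(line)
    if match is None:
        return None
    serial, error, request, minor = (int(g) for g in match.groups())
    if request >= 128:
        request_name = "extension request (major opcode %d)" % request
    else:
        request_name = CORE_REQUESTS.get(request, "core request %d" % request)
    return {
        "serial": serial,
        "error": CORE_ERRORS.get(error, "error %d" % error),
        "request": request_name,
        "minor": minor,
    }


print(decode_x_error(
    "(Details: serial 3478 error_code 3 request_code 12 minor_code 0)"))
```

For the line above this reports BadWindow on a ConfigureWindow request.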

Grrr. Let's try X11 debugging.

(gdb) break gdk_x_error
Function "gdk_x_error" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (gdk_x_error) pending.
(gdb) break gdk_x_error()
Function "gdk_x_error()" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (gdk_x_error()) pending.
(gdb) c
Continuing.

And the program exits on error; apparently there really is no gdk_x_error to break on.

Sorry for the headaches


Thu, 19 Jan 2012 15:46:38 GMT - Antoine Martin:

It's quite possible that old clients make the server press the same key twice. Why that would crash the Xvfb server/xpra on OpenBSD is beyond me.

As for gdk_x_error(), you may be able to get the X11 related backtrace without it (iirc, that's how I got them previously), just using 'bt' from when the program stops.

I'm going to try to find a simple test case to reproduce this, as I don't use emacs. The tooltips sound like a floating window (aka 'Override Redirect'). Those are typically short-lived and can disappear whilst we're requesting their contents. This happens fairly regularly but does not normally cause crashes (generally just a 'wtf, Pixmap is None' message or similar).


Thu, 19 Jan 2012 16:12:52 GMT - Yary:

The program has exited by the time the error message shows. Hey, I can trap exit and _exit! Here it is running with GDK_SYNCHRONIZE=1

The error was 'BadWindow (invalid Window parameter)'.
  (Details: serial 17429 error_code 3 request_code 12 minor_code 0)
Breakpoint 1, exit (status=1) at /usr/src/lib/libc/stdlib/exit.c:57
57              register struct atexit *p, *q;
(gdb) bt
#0  exit (status=1) at /usr/src/lib/libc/stdlib/exit.c:57
#1  0x06124410 in gdk_drag_action_get_type ()
   from /usr/local/lib/libgdk-x11-2.0.so.2200.0
#2  0x08dd3190 in _XError () from /usr/X11R6/lib/libX11.so.14.0
#3  0x08dda90a in handle_error () from /usr/X11R6/lib/libX11.so.14.0
#4  0x08dda948 in handle_response () from /usr/X11R6/lib/libX11.so.14.0
#5  0x08ddaf3e in _XReply () from /usr/X11R6/lib/libX11.so.14.0
#6  0x08dc88c0 in XQueryPointer () from /usr/X11R6/lib/libX11.so.14.0
#7  0x06130ac6 in gdk_drag_action_get_type () from /lib/libgdk-x11-2.0.so.2200.0
#8  0x060de08b in gdk_device_get_core_pointer () from /lib/libgdk-x11-2.0.so.2200.0
#9  0x060fd8fc in gdk_window_get_pointer () from /lib/libgdk-x11-2.0.so.2200.0
#10 0x05376b56 in init_gtk () from /lib/python2.6/site-packages/gtk-2.0/gtk/_gtk.so
#11 0x089679fa in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#12 0x089699f8 in PyEval_EvalCodeEx () from /lib/libpython2.6.so.1.0
#13 0x08967501 in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#14 0x089699f8 in PyEval_EvalCodeEx () from /lib/libpython2.6.so.1.0
#15 0x08967501 in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#16 0x089689a1 in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#17 0x089689a1 in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#18 0x089699f8 in PyEval_EvalCodeEx () from /lib/libpython2.6.so.1.0
#19 0x088fe52a in function_call () from /lib/libpython2.6.so.1.0
#20 0x088d5818 in PyObject_Call () from /lib/libpython2.6.so.1.0
#21 0x088e4468 in instancemethod_call () from /lib/libpython2.6.so.1.0
#22 0x088d5818 in PyObject_Call () from /lib/libpython2.6.so.1.0
#23 0x089626d4 in PyEval_CallObjectWithKeywords () from /lib/libpython2.6.so.1.0
#24 0x088d61fc in PyObject_CallObject () from /lib/libpython2.6.so.1.0
#25 0x0b7e01eb in _pyglib_handler_marshal () from /lib/libpyglib-2.0-python2.6.so.1.0
#26 0x0cea7151 in g_source_is_destroyed () from /lib/libglib-2.0.so.2600.0
#27 0x0cea8e47 in g_main_context_dispatch () from /lib/libglib-2.0.so.2600.0
#28 0x0ceac7be in g_main_context_prepare () from /lib/libglib-2.0.so.2600.0
#29 0x0ceacbc7 in g_main_loop_run () from /lib/libglib-2.0.so.2600.0
#30 0x0d97ade4 in gtk_main () from /lib/libgtk-x11-2.0.so.2200.0
#31 0x0535ad10 in init_gtk () from /lib/python2.6/site-packages/gtk-2.0/gtk/_gtk.so
#32 0x089679fa in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#33 0x089689a1 in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#34 0x089699f8 in PyEval_EvalCodeEx () from /lib/libpython2.6.so.1.0
#35 0x08967501 in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#36 0x089699f8 in PyEval_EvalCodeEx () from /lib/libpython2.6.so.1.0
#37 0x08967501 in PyEval_EvalFrameEx () from /lib/libpython2.6.so.1.0
#38 0x089699f8 in PyEval_EvalCodeEx () from /lib/libpython2.6.so.1.0
#39 0x08969b33 in PyEval_EvalCode () from /lib/libpython2.6.so.1.0
#40 0x089850e6 in run_mod () from /lib/libpython2.6.so.1.0
#41 0x0898519e in PyRun_FileExFlags () from /lib/libpython2.6.so.1.0
#42 0x0898689f in PyRun_SimpleFileExFlags () from /lib/libpython2.6.so.1.0
#43 0x08986fea in PyRun_AnyFileExFlags () from /lib/libpython2.6.so.1.0
#44 0x08993549 in Py_Main () from /lib/libpython2.6.so.1.0
#45 0x1c0008a2 in main ()

I suppose _XError is the routine to break on instead of gdk_x_error.


Thu, 19 Jan 2012 16:43:17 GMT - Antoine Martin:

The plot thickens, this looks like #3

Found some other crashes via google that look similar (calls to gdk_display_get_pointer), like this one for emacs: https://bugs.kde.org/show_bug.cgi?id=193281

The big problem here is that there is no Xpra code between gtk_main and gdk_display_get_pointer, so I don't think I can trap the error and ignore it.


Fri, 20 Jan 2012 18:59:08 GMT - Antoine Martin:

Found some more:

So I am very tempted to close this ticket, as this needs to be dealt with in gtk.


Mon, 20 Feb 2012 19:36:13 GMT - Antoine Martin: milestone changed


Thu, 01 Mar 2012 17:07:27 GMT - Antoine Martin:

Can you give me some precise steps to reproduce this crash? I've tried really hard to make the OpenBSD client or server crash and have not had any luck (at r577), see for example ticket #3 comment:7

Please also see the note regarding Debugging with Gdb: knowing where in the python code the PyEval_EvalFrameEx lines come from would help.


Fri, 09 Mar 2012 11:05:27 GMT - Antoine Martin: status changed; resolution set

feel free to re-open - can't reproduce :(


Tue, 07 Apr 2020 08:23:08 GMT - Antoine Martin: description changed

(formatting)


Sat, 23 Jan 2021 04:44:33 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/64