#1456 closed defect (fixed)
Server crashes on launching application (inconsistently)
Reported by: | psycho_zs | Owned by: | psycho_zs |
---|---|---|---|
Priority: | blocker | Milestone: | 2.0 |
Component: | server | Version: | trunk |
Keywords: | Cc: |
Description
This does not happen every time I launch thunderbird, but sometimes.
The following error shows up in server's log before the crash:
2017-03-01 21:23:10,833 started command 'thunderbird' with pid 7119 [xcb] Unknown sequence number while processing queue [xcb] Most likely this is a multi-threaded client and XInitThreads has not been called [xcb] Aborting, sorry about that.
I was able to revive the session with xpra upgrade
afterwards
Attachments (3)
Change History (17)
comment:1 Changed 5 years ago by
Milestone: | → 2.0 |
---|---|
Status: | new → assigned |
Changed 5 years ago by
Attachment: | xcb_error.log added |
---|
comment:2 Changed 5 years ago by
Attached last 1k lines of log. With -d all
the only new info at the moment of the crash is the python error.
Changed 5 years ago by
Attachment: | io_error.log added |
---|
comment:3 Changed 5 years ago by
Another crash upon connecting, with IO error 11. X server is alive and session can be restored with xpra upgrade
comment:4 Changed 5 years ago by
...also I can not seem to launch xpra on server anymore, it outputs this:
$ xpra start :100 using systemd-run to wrap 'start' server command 'systemd-run' '--description' 'xpra-start' '--scope' '--user' '/usr/bin/xpra' 'start' ':100' '--systemd-run=no' Failed to create bus connection: No such file or directory
Remote start works though.
comment:5 Changed 5 years ago by
Lots of goodies in those log samples.
What distro is this? What exact command lines?
- attachment/ticket/1456/io_error.log shows calls to xsettings just before the IO error and sure enough the X11 calls are unguarded - that's been the case since the very first version of xpra! r15207 fixes that. (will be backported to v1.0.x)
- attachment/ticket/1456/xcb_error.log has more issues in it:
Here's a list of significant events leading up to this last crash:
- mouse events (motion and clicks)
- client closes a window and we send the message asking it to close on the server:
02:14:57,051 client closed window 2 - WindowModel(0x4000b4) 02:14:57,051 sending WM_DELETE_WINDOW to 0x4000b4
- we reset the focus to the world window
- the application unmaps the window and we stop managing it:
02:14:57,195 x_event_filter event=('xpra-unmap-event', 'xpra-child-unmap-event')/UnmapNotify window=0x4000b4 02:14:57,197 unmanaging window: WindowModel(0x4000b4) \ (<gtk.gdk.Window object at 0x7f772463fa00 (GdkWindow at 0x5627d55ea000)> - \ <gtk.gdk.Window object at 0x7f772463f6e0 (GdkWindow at 0x5627d5597480)>)
- we try to cleanup the window state (and get lots of errors since it's gone - which is expected):
02:14:57,200 XError XError: BadWindow detected while already in unwind; discarding
- we tell the client that the window is gone:
02:14:57,200 add_packet_to_queue(lost-window ...)
- the window in focus changes, and so does the cursor so we send the new cursor pixel data to the client:
02:14:57,233 x_event_filter event=('xpra-focus-out-event', None)/FocusOut took 4.7ms 02:14:57,233 get_cursor_data() cursor=[1072, 542, 16, 16, 7, 7, 1L, '1024 bytes', ''] 02:14:57,234 get_cursor_data(): default cursor - clearing it
- a system tray window is destroyed and we tell the client (I assume thunderbird has some sort of system tray icon):
02:14:58,819 x_event_filter event=('xpra-destroy-event', None)/DestroyNotify took 0.2ms 02:14:58,819 x_event_filter event=('xpra-unmap-event', 'xpra-child-unmap-event')/UnmapNotify window=0x4000f3 02:14:58,819 SystemTray.do_xpra_unmap_event(<X11:UnmapNotify \ {'delivered_to': '0x4000f3', 'send_event': 0, 'window': '0x4000f3', 'serial': '0x175c', 'type': 18, \ 'display': <gtk.gdk.DisplayX11 object at 0x7f773896fa00 (GdkDisplayX11 at 0x5627d4cf7000)>}>) \ container window=<gtk.gdk.Window object at 0x7f772463f500 (GdkWindow at 0x5627d5597ea0)> 02:14:58,821 lost_window: 1 - SystemTrayWindowModel(0x20002e) 02:14:58,821 add_packet_to_queue(lost-window ...)
- again, lots of errors we ignore as we receive events for the already-destroyed window / try to clean it up
- the thunderbird child process terminates:
02:14:58,835 sigchld(17, <frame object at 0x7f7710003610>) 02:14:58,835 poll() procinfo list: [ProcInfo({'returncode': None, 'name': 'thunderbird', 'process': <subprocess.Popen object at 0x7f77232b1450>, \ 'pid': 9936, 'dead': False, 'ignore': False, 'callback': None, 'command': ['thunderbird'], 'forget': False})] 02:14:58,835 add_dead_process(ProcInfo({'returncode': None, 'name': 'thunderbird', 'process': <subprocess.Popen object at 0x7f77232b1450>, \ 'pid': 9936, 'dead': False, 'ignore': False, 'callback': None, 'command': ['thunderbird'], 'forget': False})) 02:14:58,835 add_dead_process returncode=0, dead=True, callback=None 02:14:58,835 child 'thunderbird' with pid 9936 has terminated
- the client disconnects 4 seconds later:
02:15:03,579 read thread: eof
- a new client connects:
02:15:05,543 processing packet hello
- it requests the server to launch thunderbird:
02:15:19,201 will process ui packet start-command 02:15:19,201 start new command: ['start-command', 'thunderbird', 'thunderbird', False, True]
It would save me a lot of time to know that I only really care about the last few entries!
comment:6 Changed 5 years ago by
This was done on Current Debian testing, 64bit, xpra 2.0 r15180. Thunderbird with firetray extension.
Server launched with remote xpra start ssh:...:100 -d all
At some point it was re-launched after crash with the same X session by xpra upgrade :100
locally on the server, if I remember correctly. (local xpra start
does not work).
comment:7 Changed 5 years ago by
Priority: | major → blocker |
---|
I have seen the crash - not sure how to reproduce it reliably yet.
Hints would be welcome.
comment:8 Changed 5 years ago by
Owner: | changed from Antoine Martin to psycho_zs |
---|---|
Status: | assigned → new |
@psycho_zs: I am unable to make it crash now that I am ready to investigate it :(
I've stopped and restarted thunderbird dozens of times using xpra control :3 start thunderbird
.
Then I hit a different bug when running debian stretch: we seem to be crashing handling the initial info request, it could be the same bug.
If you can reproduce the thunderbird one easily, can you try:
- apply this to the client:
--- xpra/client/ui_client_base.py (revision 15219) +++ xpra/client/ui_client_base.py (working copy) @@ -1981,7 +1981,7 @@ gui_ready() if self.tray: self.tray.ready() - self.send_info_request() + #self.send_info_request() def handshake_complete(self): oh = self._on_handshake
- running with "--no-tray" to see if that helps
- running r15207 or later to see if that helps
comment:9 Changed 5 years ago by
comment:10 Changed 5 years ago by
Bug squashed, see ticket:1430#comment:4.
This affected both 1.0.x and trunk.
@psycho_zs: I would really appreciate if you could try to break it again now that I have committed those fixes. I am quite confident the fixes are correct, but there may still be another bug lurking somewhere.
To make it easier I have posted updated Debian Stretch beta packages (both trunk and v1.0.4): http://xpra.org/beta.
Changed 5 years ago by
Attachment: | encoder_error.log added |
---|
comment:11 Changed 5 years ago by
no crash yet, local start works.
But server drains CPU when client is connected, attached new error spam.
comment:12 Changed 5 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
@psycho_zs: great news - thank you, the error you are seeing is unrelated to the crash (was caused by r15205, fixed in r15222)
I am closing this ticket for now, but feel free to re-open it.
I will still look into this as part of #1430.
Note: the only way you can hit this codepath is if you have opengl client rendering disabled... which is not recommended.
comment:13 Changed 5 years ago by
With opengl on I have redraw issues on window resize (just re-checked on Intel, mesa 17.0.0), so I leave that disabled. Will see if I'm able to catch the crash again.
comment:14 Changed 16 months ago by
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1456
I have seen this very occasionally and indicates a bug in xpra.
Unfortunately, I have not been able to reproduce it reliably.
If you can reproduce it, please attach the end of the server's "-d all" log.