Xpra: Ticket #924: OSX client - "Xpra quit unexpectedly"

I'm observing the following about once per day. This may be right after returning from sleep mode.

I do not recall seeing this on 15.2.

Crash Dump:

Time Awake Since Boot: 350000 seconds
Time Since Wake:       12 seconds
Crashed Thread:        0  Dispatch queue: com.apple.main-thread
Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000175
Application Specific Information:
objc_msgSend() selector name: handleWakeNotification:
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libobjc.A.dylib               	0x948e40ab objc_msgSend + 27
1   com.apple.CoreFoundation      	0x99560c34 __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ + 20
2   com.apple.CoreFoundation      	0x99440901 _CFXNotificationPost + 3713
3   com.apple.Foundation          	0x9439f224 -[NSNotificationCenter postNotificationName:object:userInfo:] + 92
4   com.apple.AppKit              	0x9560e541 powerSubsystemPostNotification + 131
5   com.apple.AppKit              	0x9560e2f1 powerSubsystemCallback + 228
6   com.apple.framework.IOKit     	0x9be69a03 IODispatchCalloutFromCFMessage + 254
7   com.apple.CoreFoundation      	0x994b751d __CFMachPortPerform + 317
8   com.apple.CoreFoundation      	0x994b73d5 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 53
9   com.apple.CoreFoundation      	0x994b733e __CFRunLoopDoSource1 + 510
10  com.apple.CoreFoundation      	0x994a8750 __CFRunLoopRun + 2624
11  com.apple.CoreFoundation      	0x994a7aa6 CFRunLoopRunSpecific + 390
12  com.apple.CoreFoundation      	0x994a790b CFRunLoopRunInMode + 123
13  com.apple.HIToolbox           	0x9397d8f8 RunCurrentEventLoopInMode + 262
14  com.apple.HIToolbox           	0x9397d631 ReceiveNextEventCommon + 494
15  com.apple.HIToolbox           	0x9397d42c _BlockUntilNextEventMatchingListInModeWithFilter + 99
16  com.apple.AppKit              	0x94d34b41 _DPSNextEvent + 742
17  com.apple.AppKit              	0x94d341e5 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] + 350
18  libgdk-quartz-2.0.0.dylib     	0x03c4c78e poll_func + 300
19  libglib-2.0.0.dylib           	0x0311107d g_main_context_poll + 81
20  libglib-2.0.0.dylib           	0x031109d9 g_main_context_iterate + 414
21  libglib-2.0.0.dylib           	0x03110e3f g_main_loop_run + 476
22  libgtk-quartz-2.0.0.dylib     	0x0385b563 gtk_main + 239
23  _gtk.so                       	0x03609c43 _wrap_gtk_main + 129
24  libpython2.7.dylib            	0x000c759b PyEval_EvalFrameEx + 24891
25  libpython2.7.dylib            	0x000c96bf PyEval_EvalFrameEx + 33375
26  libpython2.7.dylib            	0x000c96bf PyEval_EvalFrameEx + 33375
27  libpython2.7.dylib            	0x000c9f3c PyEval_EvalCodeEx + 2012
28  libpython2.7.dylib            	0x000c7ccb PyEval_EvalFrameEx + 26731
29  libpython2.7.dylib            	0x000c9f3c PyEval_EvalCodeEx + 2012
30  libpython2.7.dylib            	0x000ca087 PyEval_EvalCode + 87
31  libpython2.7.dylib            	0x000ef72d PyRun_StringFlags + 285
32  libpython2.7.dylib            	0x000ef83e PyRun_SimpleStringFlags + 78
33  libpython2.7.dylib            	0x001069fc Py_Main + 1564
34  Xpra Launcher                 	0x00002fb6 start + 54
Model: MacBookPro11,2, BootROM MBP112.0138.B15, 4 processors, Intel Core i7, 2.2 GHz, 16 GB, SMC 2.18f15
Graphics: Intel Iris Pro, Intel Iris Pro, Built-In


Thu, 23 Jul 2015 14:45:04 GMT - Antoine Martin: status, description changed

Very likely to be related to suspend and resume. Not much we can do about this since this is deep in the GTK event loop.


Fri, 23 Oct 2015 06:25:55 GMT - Antoine Martin: owner, status changed

Actually, we may be able to do something as the fix may well be similar to #901. We would need to find out how to get notifications about power events and/or screensaver events and add the same hooks as already used on win32 (#901) and Linux (#492).

@kverne / @afarr: can you reproduce? (I would guess: suspend+resume a laptop whilst it is showing a fast moving window like glxgears)


Fri, 23 Oct 2015 16:09:57 GMT - Antoine Martin:

Update from kverne received by email:

Sure. I'll install the latest and send you the dump.

glxgears shouldn't be necessary to repro. All I usually had up was a few xterms and maybe Slickedit. Every morning I'd get the error after opening the laptop. Just temporarily closing the laptop doesn't trigger it, so perhaps there's a deep-sleep problem.

(Btw, X2GoClient also crashes on revival, so it may be a common issue.)


My reply: not sure what the problem is with dependencies, this is an OSX client problem, you should be able to run 0.15 or even 0.16 beta clients against your existing 0.14 servers.


Mon, 02 Nov 2015 19:20:48 GMT - alas: owner changed

This is interesting. We see this error related to sleep handling with our own setup on OSX clients when they deal with the sleep handlers. I see it on my own OSX with the same objc_msgSend() selector name: handleWakeNotification:, and another user seems to stumble across it with objc_msgSend() selector name: handleSleepNotification:.

I haven't been able to repro it with non-customized xpra clients, however... neither 0.14.x, 0.15.x, nor 0.16.x - and not for lack of trying.

I would guess that kverne must also be wrapping his client in some other process(es), which I suspect are the real culprit(s), as with ours.

I don't think we've managed to pin the problem down yet though.

About the only thing I can say is, when trying to collect logs (no particular flags) I see nothing unusual client side until the end - Bus error: 10.

Server side (think this was with our 0.14.21 server) I see the following useless info:

May  6 19:07:39 dhclient[502]: bound to x.x.x.x -- renewal in 1159 seconds.
May  6 19:23:28 2015-05-06 19:23:28,013 Connection lost
May  6 19:23:28 2015-05-06 19:23:28,014 xpra client disconnected.
May  6 19:23:28  Server: Event connection-lost
May  6 19:23:28  Server: connection lost
Write failed: Broken pipe

I can repro it quite easily though... if you can suggest any particular logging flags that might be useful?


Tue, 03 Nov 2015 16:01:35 GMT - Antoine Martin: owner changed

Got another update via email (@kverne: please use the tracker): This may have a non-Xpra cause. As corporate dictates what version of software is loaded, I may just have to wait until they upgrade us before this will go away.

So I took another look at the power event handling code. It is also used to trigger a re-initialization of all the windows, just as we do when we get monitor hotplug events (see #980), and this could cause problems on its own. Disabling opengl should turn off this particular codepath. Does it help prevent the crash?

Also made some minor improvements:

You can find a beta with these changes here: http://xpra.org/beta/osx/.


Wed, 04 Nov 2015 02:30:24 GMT - Antoine Martin:

More updates, still received by email...

@kverne:

r11121 has been applied to v0.14.x and v0.15.x, hopefully this will fix the sleep/resume issues for those versions.


Thu, 05 Nov 2015 05:08:25 GMT - Antoine Martin: owner changed

@kverne: please use the ticketing system, copying your emails into it is tedious.

Latest update:

Unfortunately it crashed again on 11122.


Please try XPRA_OSX_SLEEP_HANDLER=0 as per comment:5.


Thu, 05 Nov 2015 12:43:28 GMT - Kerry:

Sorry guys, I thought this was automated.

Fuzzy graphics and scaling were most likely due to the preferences changeover between 14 and 15.

Now trying to repro with 'XPRA_OSX_SLEEP_HANDLER=0'.


Thu, 05 Nov 2015 18:31:19 GMT - alas:

For what it's worth - I tested with our osx client with --opengl=off, still crashes with reference to:

Application Specific Information:
objc_msgSend() selector name: handleWakeNotification:

... so OpenGL isn't the culprit.

Though, again, I can't reproduce with any xpra without our extra bells and whistles.


Mon, 16 Nov 2015 07:32:22 GMT - Antoine Martin:

@kverne: does XPRA_OSX_SLEEP_HANDLER=0 make any difference?


Mon, 16 Nov 2015 15:24:52 GMT - Kerry:

Build: Xpra11122

Cmd: open -n ./Xpra.app --args XPRA_OSX_SLEEP_HANDLER=0 ./Xpra.app/Contents/MacOS/Xpra ...

Edit: moved crash dump to an attachment: attachment/ticket/924/osx-powersubsystem-crashdump.txt


Mon, 16 Nov 2015 15:30:23 GMT - Antoine Martin: attachment set

moving long comment to an attachment


Mon, 16 Nov 2015 15:34:39 GMT - Antoine Martin: attachment set

original crash dump data


Mon, 16 Nov 2015 15:42:26 GMT - Antoine Martin: description changed

(edited the ticket description and comment above to try to make it more readable - the full crash dumps are still there as attachments, I've kept the more interesting bits)

@kverne: I'm not sure the syntax you've used in comment:12 will do what is needed. Can you please try this instead:

XPRA_OSX_SLEEP_HANDLER=0 ./Xpra.app/Contents/MacOS/Xpra attach ...
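
The difference between the two command forms can be illustrated with a small sketch. With `open ... --args`, everything after `--args` is handed to the application as command-line arguments, whereas the `VAR=value command` shell form places the variable in the process environment. The Python below is only an illustration: it assumes the client reads XPRA_OSX_SLEEP_HANDLER from its environment (as the suggested syntax implies), and it uses a throwaway child process rather than Xpra itself.

```python
import os
import subprocess
import sys

# Child program: report what it sees in its environment and its argument list.
CHILD = (
    "import os, sys; "
    "print(os.environ.get('XPRA_OSX_SLEEP_HANDLER'), sys.argv[1:])"
)

def run_child(extra_env=None, args=()):
    """Run a child Python process and return its stdout as a stripped string."""
    env = dict(os.environ)
    env.pop("XPRA_OSX_SLEEP_HANDLER", None)  # start from a clean state
    env.update(extra_env or {})
    out = subprocess.check_output([sys.executable, "-c", CHILD, *args], env=env)
    return out.decode().strip()

# Passed as an argument: the child only sees it in sys.argv.
as_argument = run_child(args=["XPRA_OSX_SLEEP_HANDLER=0"])
# Passed through the environment: os.environ picks it up.
as_environment = run_child(extra_env={"XPRA_OSX_SLEEP_HANDLER": "0"})
print(as_argument)
print(as_environment)
```

In the first case the variable is absent from the child's environment, so any code that checks the environment would behave as if it had never been set.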

Mon, 16 Nov 2015 18:38:38 GMT - Kerry:

With the following command, xpra simply exits when the server times out. No crash message displayed.

XPRA_OSX_SLEEP_HANDLER=0 ./Xpra.app/Contents/MacOS/Xpra attach ssh:server1:100
2015-11-16 11:05:28,370 Xpra gtk2 client version 0.16.0-r11122
2015-11-16 11:05:28,371  running on Mac OSX
2015-11-16 11:05:28,992 OpenGL_accelerate module loaded
2015-11-16 11:05:28,998 OpenGL enabled with Intel Iris Pro OpenGL Engine
2015-11-16 11:05:29,021  using default keyboard settings
2015-11-16 11:05:29,037  desktop size is 4000x1440 with 1 screen(s):
<repeating idle messages removed>
2015-11-16 11:31:33,936  UI thread polling waited 9.4 seconds longer than intended (9.9 vs 0.5)
2015-11-16 11:31:33,936 UI thread is now blocked
2015-11-16 11:31:33,937 UI thread is running again, resuming
2015-11-16 11:31:39,966 UI thread is now blocked
2015-11-16 11:31:40,022 UI thread is running again, resuming
2015-11-16 11:31:42,276 server is not responding, drawing spinners over the windows
Write failed: Broken pipe
2015-11-16 11:32:00,797 Connection lost
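
The "waited 9.4 seconds longer than intended (9.9 vs 0.5)" line above is consistent with a watchdog that compares how long a short poll actually took against its intended interval. How Xpra computes this is not shown in the ticket, so the following is only a minimal sketch of the arithmetic, with hypothetical function names and a hypothetical tolerance.

```python
import time

def polling_overrun(expected, started, finished):
    """Return how many seconds a poll took beyond its intended interval."""
    return (finished - started) - expected

def is_blocked(expected, started, finished, tolerance=1.0):
    """Treat an overrun larger than `tolerance` as a stall (hypothetical threshold)."""
    return polling_overrun(expected, started, finished) > tolerance

def timed_wait(interval, clock=time.time, sleep=time.sleep):
    """Sleep for `interval` seconds and return the measured overrun.

    The wall clock is used by default because monotonic clocks may not
    advance while the system is suspended on some platforms.
    """
    started = clock()
    sleep(interval)
    return polling_overrun(interval, started, clock())

# Figures from the log line above: a 0.5 second poll that took 9.9 seconds.
overrun = polling_overrun(0.5, 100.0, 109.9)
print(round(overrun, 1))
```

A check of this kind can notice that the process was stalled, for example across a suspend, without relying on any operating system power notification, though wall-clock adjustments could cause false positives.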

Tue, 17 Nov 2015 02:45:52 GMT - Antoine Martin: owner, status, milestone changed

@kverne: many thanks!

r11252 makes XPRA_OSX_SLEEP_HANDLER=0 the default - I will backport it.
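
As an illustration of what a default-off switch of this kind can look like, here is a minimal sketch. It is not the code from r11252: the helper names `env_flag` and `setup_power_handlers` are hypothetical, and only the variable name XPRA_OSX_SLEEP_HANDLER comes from this ticket.

```python
import os

def env_flag(name, default, environ=None):
    """Interpret an environment variable as an on/off switch (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")

def setup_power_handlers(register, environ=None):
    """Call `register` only when the handler has been explicitly enabled."""
    # Default is off, mirroring the decision in this ticket to disable
    # the sleep handler unless the user opts back in.
    if env_flag("XPRA_OSX_SLEEP_HANDLER", False, environ):
        register()
        return True
    return False

registered = []
print(setup_power_handlers(lambda: registered.append("wake"), {}))
print(setup_power_handlers(lambda: registered.append("wake"),
                           {"XPRA_OSX_SLEEP_HANDLER": "1"}))
```

With this shape, users who still want to exercise the handler can opt in through the environment while everyone else avoids the crashing codepath.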

I am keeping this ticket open because we do want to fix the power event handler, but this is too late for this release so re-scheduling it.


Sat, 27 Feb 2016 13:32:05 GMT - Antoine Martin: status changed; resolution set

Closing this for now since the crashes have been fixed (backports in r11265), created a new ticket for the power event handler: #1137.


Tue, 29 Mar 2016 07:19:54 GMT - Antoine Martin:

@kverne: please see ticket:1137#comment:2.


Tue, 05 Apr 2016 12:55:27 GMT - Kerry:

Unfortunately, (for me) all the 17.x beta installs fail due to plist errors. I'll file another ticket if you want the full dump, but here's the latest:

4/5/16 8:55:23.773 AM Finder[318]: There was an error parsing the Info.plist for the bundle at URL Contents/Info.plist -- file:///Applications/Xpra.app/Contents/Xpra_NoDock.app/
 The data couldn’t be read because it isn’t in the correct format.
 <CFBasicHash 0x610000e6b940 [0x7fff747b1ed0]>{type = immutable dict, count = 2,
entries =>
	0 : <CFString 0x7fff7477c660 [0x7fff747b1ed0]>{contents = "NSDebugDescription"} = <CFString 0x610000e48640 [0x7fff747b1ed0]>{contents = "Encountered unexpected EOF"}
	1 : <CFString 0x7fff747896c0 [0x7fff747b1ed0]>{contents = "kCFPropertyListOldStyleParsingError"} = Error Domain=NSCocoaErrorDomain Code=3840 "The data couldn’t be read because it isn’t in the correct format." (Malformed data byte group at line 1; invalid hex) UserInfo=0x61000066f200 {NSDebugDescription=Malformed data byte group at line 1; invalid hex}
}

Tue, 05 Apr 2016 13:02:06 GMT - Antoine Martin:

@kverne: I've tested a few of these images on various virtual machines and cannot reproduce any problems whatsoever with any of them. I very strongly suspect that you've corrupted the image somehow, through lack of disk space or partially overwriting a previous installation. Try deleting the Xpra.app or just running directly from the DMG.


Tue, 05 Apr 2016 13:27:40 GMT - Kerry:

My laptop is very consistent at corrupting the image, and always UserInfo.

When I run from the package, it fails to connect the first two times, then appears to run with no displayed windows.

(15.7 has no issues connecting.)


Sat, 27 Jun 2020 16:53:05 GMT - Antoine Martin:

Similar crash in #2822


Sat, 23 Jan 2021 05:09:55 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/924