Opened 3 years ago

Closed 3 years ago

Last modified 18 months ago

#492 closed defect (fixed)

suspending a local client with opengl windows can show corrupted pixels

Reported by: Antoine Martin Owned by: alas
Priority: critical Milestone:
Component: client Version:
Keywords: opengl Cc:

Description (last modified by Antoine Martin)

This is only relevant to local servers: resuming a client connected to a remote server should break the connection (eventually - we may want to break it quicker then) which is fine.

On Linux, we should be able to get the event from the UPower Resuming dbus signal. Found some example code:

We could use this same code to force a server encoder refresh too, as hardware encoders (nvenc / opencl / cuda) tend to get messed up during suspend-resume, but we only notice next time we try to use them.

On win32, we could detect WM_POWERBROADCAST events.

Then we can just ask for a server lossless refresh to make sure the windows display clean contents.

This looks like a driver bug to me: the GPU buffers should be preserved, maybe it is the OpenGL paint state that is inconsistent?

Attachments (3)

dbus-suspendresume-notifications.patch (2.2 KB) - added by Antoine Martin 3 years ago.
dbus hooks for trying to get suspend/resume notifications
watch_PrepareForSleep.py (689 bytes) - added by Antoine Martin 3 years ago.
dbus script using the new login1 interface
xterm-resume.png (459.4 KB) - added by Antoine Martin 3 years ago.
this is what my xterm looked like when I resumed


Change History (20)

comment:1 Changed 3 years ago by Antoine Martin

Owner: changed from Antoine Martin to Antoine Martin
Status: new → assigned

Easy to reproduce, and should be easy to fix too.

comment:2 Changed 3 years ago by Antoine Martin

Description: modified

comment:3 Changed 3 years ago by Antoine Martin

The dbus approach sounds nice, except it doesn't work... I can't get any of the code examples to fire. This is also meant to fire the same Resuming signal (found in /usr/lib/systemd/system/upower.service), but does nothing:

dbus-send --system --type=signal --dest=org.freedesktop.UPower \
    /org/freedesktop/UPower org.freedesktop.UPower.Resuming

Posted a question here: system suspend - dbus upower signals are not seen

I have now also created a Fedora ticket for this: bugzilla 1064906

Last edited 3 years ago by Antoine Martin

Changed 3 years ago by Antoine Martin

Attachment: dbus-suspendresume-notifications.patch added

dbus hooks for trying to get suspend/resume notifications

comment:4 Changed 3 years ago by Antoine Martin

According to this answer: Newer upower versions no longer emit that signal since this is handled by systemd. Now I have to ask systemd what we're supposed to do... this is not going to make the code any nicer!

comment:5 Changed 3 years ago by Antoine Martin

The systemd / logind equivalent is PrepareForSleep:
The PrepareForShutdown() resp. PrepareForSleep() signals are sent right before (with the argument True) and after (with the argument False) the system goes down for reboot/poweroff, resp. suspend/hibernate.

So it looks like we need to look for logind and listen for this new signal, and fall back to upower otherwise.
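For illustration, the logind-plus-upower approach could be sketched roughly as below. This is not the attached patch or script: the function names are hypothetical, it assumes dbus-python with a GLib main loop, and for simplicity it subscribes to both services at once rather than probing for logind first. The legacy UPower signal names (Sleeping / Resuming) only exist in older upower releases.

```python
def make_sleep_handler(on_suspend, on_resume):
    """Adapt logind's PrepareForSleep(bool) signal to two plain callbacks."""
    def prepare_for_sleep(going_to_sleep):
        # logind sends True right before suspend/hibernate, False after resume
        if going_to_sleep:
            on_suspend()
        else:
            on_resume()
    return prepare_for_sleep


def watch_sleep_signals(on_suspend, on_resume):
    """Subscribe on the system bus (hypothetical helper, needs dbus-python).

    The caller is expected to run a GLib main loop afterwards,
    otherwise no signal will ever be delivered.
    """
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SystemBus()
    # systemd-logind: a single signal carrying a boolean argument
    bus.add_signal_receiver(
        make_sleep_handler(on_suspend, on_resume),
        signal_name="PrepareForSleep",
        dbus_interface="org.freedesktop.login1.Manager",
        bus_name="org.freedesktop.login1",
        path="/org/freedesktop/login1",
    )
    # older upower: two separate signals without arguments
    for name, callback in (("Sleeping", on_suspend), ("Resuming", on_resume)):
        bus.add_signal_receiver(
            callback,
            signal_name=name,
            dbus_interface="org.freedesktop.UPower",
            bus_name="org.freedesktop.UPower",
            path="/org/freedesktop/UPower",
        )
    return bus
```

Keeping the boolean dispatch in its own small function means the suspend / resume logic can be exercised without a running system bus.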

Last edited 3 years ago by Antoine Martin

Changed 3 years ago by Antoine Martin

Attachment: watch_PrepareForSleep.py added

dbus script using the new login1 interface

comment:6 Changed 3 years ago by Antoine Martin

Owner: changed from Antoine Martin to alas
Status: assigned → new

r5821 simply logs the suspend and resume events, like so:

2014-03-17 18:55:55,400 system is suspending
2014-03-17 20:09:40,209 system resumed, was suspended for 1:13:44

afarr: please test that the messages show up on the platforms that are meant to be supported already, which I am unable to test myself because virtualbox does not support OS-level suspend and resume:

  • win32
  • linux without systemd (ie: debian or ubuntu)
  • linux with systemd is tested already (Fedora 20)

Eventually, we may also fire other actions from those callbacks to notify the server or re-connect if necessary.

r5824 contains some critical fixes, and r5826 fires the window refresh.
It works here. Bug fixed.
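As a rough sketch of what the callbacks described above amount to (not the code from r5821 / r5826; the class and the refresh_windows callback are hypothetical names), one can record the suspend time, log the elapsed time on resume in the same H:MM:SS form as the log lines above, and then fire the window refresh:

```python
import datetime
import logging

log = logging.getLogger("power")


class SuspendTracker:
    """Logs suspend / resume events and triggers a refresh on resume."""

    def __init__(self, refresh_windows, now=datetime.datetime.now):
        # refresh_windows: hypothetical callback asking for a lossless refresh
        # now: injectable clock, which makes the class easy to test
        self._refresh_windows = refresh_windows
        self._now = now
        self._suspended_at = None

    def suspending(self):
        self._suspended_at = self._now()
        log.info("system is suspending")

    def resumed(self):
        elapsed = None
        if self._suspended_at is None:
            # resume seen without a matching suspend event
            log.info("system resumed")
        else:
            delta = self._now() - self._suspended_at
            # drop the sub-second part so the value prints as H:MM:SS
            elapsed = datetime.timedelta(seconds=int(delta.total_seconds()))
            log.info("system resumed, was suspended for %s", elapsed)
        self._suspended_at = None
        self._refresh_windows()
        return elapsed
```

Note that this measures wall-clock time, so a clock adjustment during the suspend would skew the reported duration.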

As for OSX... it's never simple, and again I won't be able to test with virtualbox, here are some pointers:

Last edited 3 years ago by Antoine Martin

comment:7 Changed 3 years ago by Antoine Martin

OSX is done in r5828, it's not pretty but it works!

So please test this too. On virtualbox, suspending via the apple menu shows "suspending", followed by "resuming" just 2 seconds later, so it seems to be working.

comment:8 Changed 3 years ago by alas

I don't currently have access to a debian or ubuntu system for testing.

  • win32 - I think I'm missing something about how I can test this.

You explicitly mention the following:

This is only relevant to local servers: resuming a client connected to a remote server should break the connection (eventually - we may want to break it quicker then) which is fine.

... which I will confirm is the case. With a win32 (0.12.0 r5828) client attached to a fedora 19 server, when I suspend (sleep) the windows 7 machine the connection is nearly instantly severed. The server session carries on happily, but the client disconnects.

However, when I try to run a "local server" - I am informed that "(This xpra installation does not support starting local servers.)"

C:\Program Files (x86)\Xpra>xpra_cmd.exe --no-daemon --bind-tcp=0.0.0.0:1201 --start-child=xterm --start-child=xterm start :17
Usage:
        xpra_cmd.exe attach [DISPLAY]
        xpra_cmd.exe detach [DISPLAY]
        xpra_cmd.exe screenshot filename [DISPLAY]
        xpra_cmd.exe info [DISPLAY]
        xpra_cmd.exe control DISPLAY command [arg1] [arg2]..
        xpra_cmd.exe version [DISPLAY]
        xpra_cmd.exe shadow [DISPLAY]
(This xpra installation does not support starting local servers.)
  • Is there a windows package/option that does support starting a local server?

comment:9 Changed 3 years ago by Antoine Martin

Sorry I should have made this clearer: although the visual corruption is only relevant to local servers, as only local servers will still be connected when resumed (usually - but this also works with virtual machines on the same host), the suspend & resume state detection code is what I am interested in.
The lines:

system is suspending
system resumed, was suspended for XX:XX:XX

And whether the state detection is timely and accurate.

I don't currently have access to a debian or ubuntu system for testing


I believe smo does, you can re-assign to him once you have tested the platforms you do have.

FYI: it may be used in the future (ie: #493), and will probably be used in this release to warn the server that a disconnection event is likely, and stop wasting bandwidth sending data that will never arrive at its destination - as per #543. I can only add this code once I am confident that the suspend and resume events are received reliably.

Last edited 3 years ago by Antoine Martin

comment:10 Changed 3 years ago by Antoine Martin

Priority: minor → critical

Raising as this is blocking #543

comment:11 Changed 3 years ago by alas

Trying to test with windows 7, I'm not seeing any suspend messages.

  • Setting the sleep time to 1 minute and waiting, when the machine goes to sleep the connection is maintained, but there are no messages (neither on the client-side nor server-side). The closest thing I see, client-side, is: unexpected message: 50006 / 0 / 0.
  • Trying to use the start menu to force sleep and then restart it promptly, I get nothing server-side, and client-side I see something along the following lines:
    2014-03-20 12:13:11,664 re-starting speaker because of overrun
    2014-03-20 12:13:12,351 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
    2014-03-20 12:13:38,335 unexpected message: WM_POWERBROADCAST / 4 / 0
    2014-03-20 12:13:40,029 re-starting speaker because of overrun
    2014-03-20 12:13:40,717 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
    2014-03-20 12:13:49,749 unexpected message: WM_POWERBROADCAST / 18 / 0
    2014-03-20 12:13:49,815 unexpected message: WM_POWERBROADCAST / 7 / 0
    2014-03-20 12:13:49,826 unexpected message: WM_TIMECHANGE / 0 / 0
    2014-03-20 12:13:49,977 server is not responding, drawing spinners over the windows
    2014-03-20 12:13:57,947 server is OK again
    2014-03-20 12:13:57,960 re-starting speaker because of overrun
    2014-03-20 12:13:59,098 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
    
  • If I use start menu to induce sleep, and leave it asleep for a few minutes, I lose the connection, again with no system suspension messages client or server side. Client message is as follows:
    2014-03-20 12:18:29,937 unexpected message: WM_POWERBROADCAST / 4 / 0
    2014-03-20 12:19:50,506 server ping timeout - waited 60 seconds without a response
    2014-03-20 12:19:52,766 server is not responding, drawing spinners over the windows
    2014-03-20 12:19:52,811 Connection lost
    2014-03-20 12:19:52,953 server is not responding, drawing spinners over the windows
    

Is there a different suspend mode that you have in mind for the windows client while connected, other than sleep?

  • Trying the Hibernate option, again I just lose the connection:
    2014-03-20 12:24:54,490 unexpected message: WM_POWERBROADCAST / 4 / 0
    2014-03-20 12:24:58,250 unexpected message: WM_NCCALCSIZE / 1 / 1635532
    2014-03-20 12:24:58,286 unexpected message: WM_WINDOWPOSCHANGED / 0 / 1635572
    2014-03-20 12:24:58,349 unexpected message: WM_NCCALCSIZE / 1 / 1634520
    2014-03-20 12:24:58,414 unexpected message: 798 / 0 / 0
    2014-03-20 12:27:31,762 unexpected message: WM_TIMECHANGE / 0 / 0
    2014-03-20 12:27:32,118 unexpected message: WM_POWERBROADCAST / 7 / 0
    2014-03-20 12:27:33,349 server is not responding, drawing spinners over the windows
    2014-03-20 12:27:33,960 unexpected message: WM_POWERBROADCAST / 18 / 0
    2014-03-20 12:27:39,076 unexpected message: WM_NCCALCSIZE / 1 / 1635532
    2014-03-20 12:27:39,085 unexpected message: WM_WINDOWPOSCHANGED / 0 / 1635572
    2014-03-20 12:27:39,098 unexpected message: WM_NCCALCSIZE / 1 / 1634520
    2014-03-20 12:27:39,108 unexpected message: 798 / 0 / 0
    2014-03-20 12:27:42,506 unexpected message: WM_WININICHANGE / 47 / 582344
    2014-03-20 12:27:44,555 unexpected message: WM_WININICHANGE / 47 / 582344
    

Server side, I just get the "Disconnecting ... reason is: client ping timeout, - waited 60 seconds without a response" message.
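For reference, the numbers in the "unexpected message: WM_POWERBROADCAST / N / 0" lines above can be decoded with a small helper like the one below. It assumes the middle field of the log line is the message's wParam; the helper name is hypothetical, and only the PBT_* values from the Windows headers that are relevant here are listed.

```python
import re

WM_POWERBROADCAST = 0x0218

# wParam values of WM_POWERBROADCAST (subset)
PBT_EVENTS = {
    0x0004: "PBT_APMSUSPEND",            # system is about to suspend
    0x0007: "PBT_APMRESUMESUSPEND",      # resumed, triggered by user input
    0x000A: "PBT_APMPOWERSTATUSCHANGE",  # power status has changed
    0x0012: "PBT_APMRESUMEAUTOMATIC",    # resumed, sent on every wake-up
}

LINE_RE = re.compile(r"unexpected message: WM_POWERBROADCAST / (\d+) / (\d+)")


def decode_power_line(line):
    """Return the PBT_* name for a client log line, or None if it is not one."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    wparam = int(match.group(1))
    return PBT_EVENTS.get(wparam, "unknown (%d)" % wparam)
```

Read this way, the logs above show a suspend event (4) followed, on wake-up, by the two resume events (18 and 7), so the client was receiving the power events and simply did not handle them yet.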

comment:12 Changed 3 years ago by alas

I think I found the problem - just noticed the previous testing was with r5444, repeating with r5828...

  • Setting the sleep time, there is a new unexpected message (still nothing server-side):
    2014-03-20 13:36:30,243 unexpected message: 49841 / 0 / 0
    2014-03-20 13:38:04,336 unexpected message: 49841 / 0 / 0
    
  • Forcing a sleep, followed by prompt awakening - I get the suspend message client-side:
2014-03-20 13:40:49,585 system is suspending
2014-03-20 13:40:54,286 server is not responding, drawing spinners over the windows
2014-03-20 13:40:57,993 system resumed, was suspended for 0:00:08
2014-03-20 13:40:59,197 server is OK again
2014-03-20 13:40:59,891 re-starting speaker because of overrun
2014-03-20 13:41:03,323 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:41:05,269 re-starting speaker because of overrun
2014-03-20 13:41:06,025 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
  • Forcing a longer sleep - I still get the suspension messages, along with disconnection:
2014-03-20 13:42:21,933 system is suspending
2014-03-20 13:42:24,628 server is not responding, drawing spinners over the windows
2014-03-20 13:43:40,279 system resumed, was suspended for 0:01:18
2014-03-20 13:43:40,358 WM_TIMECHANGE: time change event: 0 / 0
2014-03-20 13:43:40,390 server ping timeout - waited 60 seconds without a response
2014-03-20 13:43:41,920 Connection lost

comment:13 Changed 3 years ago by alas

Testing with osx r5458 ...

  • With a timer induced sleep, I get no messages at all.
  • With a short sleep I get the suspension messages:
2014-03-20 13:54:15,269 system is suspending
2014-03-20 13:54:18,129 re-starting speaker because of overrun
2014-03-20 13:54:26,124 server is not responding, drawing spinners over the windows
2014-03-20 13:54:42,131 system resumed, was suspended for 0:00:26
2014-03-20 13:54:47,912 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:54:47,918 server is OK again
2014-03-20 13:54:47,920 re-starting speaker because of overrun
2014-03-20 13:54:50,492 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
  • With a longer sleep I don't get the server resume message, but I get the suspension message:
2014-03-20 13:57:35,566 system is suspending
2014-03-20 13:58:54,053 server is not responding, drawing spinners over the windows
2014-03-20 13:59:00,285 read connection reset for SocketConnection(('10.0.11.191', 51408) - ('10.0.32.172', 1201))
2014-03-20 13:59:00,287 connection lost: read connection reset: [Errno 54] Connection reset by peer
2014-03-20 13:59:00,289 Connection lost

comment:14 Changed 3 years ago by Antoine Martin

I've added a test application ("Events_Test.exe") for win32 in r5873, which should make it easier to investigate power events.


I've hooked power events into the window refresh code in r5875 - see #543.
More follow up work in #540.


afarr: The OpenGL issue remains fixed, so please just check that a quick suspend-resume cycle works as well as it did before and then close this ticket.

comment:15 Changed 3 years ago by maxmylyn

Tested with r5903:

  • OSX and Windows behavior works as well as it did.

However, on my laptop I was not able to get a resume message even with a short sleep cycle. Instead, only the following errors were printed, regardless of sleep length (I have pasted only the relevant output).

2014-03-24 15:55:42,141 system is suspending
2014-03-24 15:55:43,796 read error for SocketConnection(('10.0.11.77', 54092) - ('10.0.32.172', 1200))
Traceback (most recent call last):
  File "xpra\net\protocol.pyc", line 606, in _io_thread_loop
  File "xpra\net\protocol.pyc", line 660, in _read
  File "xpra\net\bytestreams.pyc", line 117, in read
  File "xpra\net\bytestreams.pyc", line 60, in _read
  File "xpra\net\bytestreams.pyc", line 52, in untilConcludes
  File "xpra\net\bytestreams.pyc", line 22, in untilConcludes
error: [Errno 10053] An established connection was aborted by the software in your host machine
2014-03-24 15:55:43,798 connection lost: read error on connection: [Errno 10053] An established connection was aborted by the software in your host machine
2014-03-24 15:55:43,798 Connection lost

This seems to be related only to this individual laptop's sleep cycle. If it seems worth pursuing, let me know and I'll test it further; otherwise this is good to be closed.

comment:16 Changed 3 years ago by Antoine Martin

Resolution: fixed
Status: new → closed

Judging by the Microsoft KB entry, the error above is a WSAECONNABORTED: An established connection was aborted by the software in your host computer, possibly due to a data transmission time-out or protocol error. It may be specific to the machine drivers or BIOS. The full list is here: Windows Sockets Error Codes

In any case, this led to a "connection lost", which is fine. (what we don't want is for the connection to stay up after we told the server to slow down, without telling it to speed up again)

r5904 should remove the ugly stacktrace (we add a bunch of win32 specific error codes to the ignore list)
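The idea of an ignore list can be sketched as follows. This is only an illustration: the ticket does not say which codes r5904 actually added, so the set below is an assumption (only 10053 comes from the log above), and the function name is hypothetical.

```python
# Winsock error codes that usually just mean "the connection is gone"
WSAECONNABORTED = 10053   # the error seen in the traceback above
WSAECONNRESET = 10054
WSAETIMEDOUT = 10060

# assumed contents, not necessarily what r5904 uses
QUIET_WIN32_ERRORS = frozenset((WSAECONNABORTED, WSAECONNRESET, WSAETIMEDOUT))


def is_expected_disconnect(exc):
    """True if exc is a socket error we can report without a stacktrace."""
    # winerror only exists on Windows, errno is available everywhere
    code = getattr(exc, "winerror", None) or getattr(exc, "errno", None)
    return code in QUIET_WIN32_ERRORS
```

A read loop could then log a one-line "connection lost" message for these errors and keep the full traceback for anything else.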

Last edited 3 years ago by Antoine Martin

Changed 3 years ago by Antoine Martin

Attachment: xterm-resume.png added

this is what my xterm looked like when I resumed

comment:17 Changed 20 months ago by Antoine Martin

r10573 should finally fix this properly: refreshing the pixels is not always enough, we may have to also reinitialize the window backing.

See also: #901, #924

Last edited 18 months ago by Antoine Martin