xpra icon
Bug tracker and wiki

Opened 3 years ago

Closed 3 years ago

#861 closed defect (fixed)

xor error with differing pixel format

Reported by: Antoine Martin Owned by: Antoine Martin
Priority: blocker Milestone: 0.16
Component: encodings Version: 0.15.x
Keywords: Cc: onlyjob@…

Description

Testing 0.15 through the proxy server, got this error:

error processing draw packet
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/client/ui_client_base.py", line 1975, in _draw_thread_loop
    self._do_draw(packet)
  File "/usr/lib64/python2.7/site-packages/xpra/client/ui_client_base.py", line 2021, in _do_draw
    window.draw_region(x, y, width, height, coding, data, rowstride, packet_sequence, options, [record_decode_time])
  File "/usr/lib64/python2.7/site-packages/xpra/client/client_window_base.py", line 423, in draw_region
    self._backing.draw_region(x, y, width, height, coding, img_data, rowstride, options, callbacks)
  File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 473, in draw_region
    self.paint_rgb24(img_data, x, y, width, height, rowstride, options, callbacks)
  File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 264, in paint_rgb24
    rgb24_data = self.process_delta(raw_data, width, height, rowstride, options)
  File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 186, in process_delta
    rgb_data = xor_str(img_data, ldata)
  File "xpra/codecs/xor/cyxor.pyx", line 19, in xpra.codecs.xor.cyxor.xor_str (xpra/codecs/xor/cyxor.c:736)
    assert len(buf)==len(xor), "cyxor cannot xor strings of different lengths (%s:%s vs %s:%s)" % (type(buf), len(buf), type(xor), len(xor))
AssertionError: cyxor cannot xor strings of different lengths (<type 'str'>:38192 vs <type 'str'>:28644)

Which is what you would expect if you were xoring RGBX with RGB (25% bigger).
I don't see how that is possible because we verify the pixel format AND the size of the buffer before we xor on the server side...

Attachments (1)

strict-delta-coding.patch (2.1 KB) - added by Antoine Martin 3 years ago.
ensures we never mix encodings with delta - which could make us mix pixel formats on the receiving end

Download all attachments as: .zip

Change History (13)

comment:1 Changed 3 years ago by Antoine Martin

Status: newassigned

Although I cannot reproduce this right now, I suspect that I hit this issue whilst testing through the proxy server, which does things with "draw" packets.
If I cannot find the bug, we can always disable delta buckets (set it to 0) when going through the proxy.

comment:2 Changed 3 years ago by Antoine Martin

Priority: criticalminor

Managed to reproduce, at least something similar...
with:

  • server started with:
    xpra start :10 --bind-tcp=0.0.0.0:10000 --start-child=xterm
    
  • proxy server started with:
    XPRA_PROXY_PASSTHROUGH=1 xpra proxy :20 --bind-tcp=0.0.0.0:10001 --auth=allow -d encoding
    
  • client started with:
    xpra attach tcp:192.168.42.100:10001 --no-mmap --password-file=./password.txt
    

Notes:

  • r9434 makes it easier to debug encoding related issues with the proxy server (adds -d encoding support)
  • r9399 adds XPRA_PROXY_PASSTHROUGH to make it easier to trigger and debug the passthrough code. Which normally only fires when the video context creation fails and PIL is missing... (I believe I was hitting the first condition because of r9413, not sure about the second!?)

The fix for this bug is in r9435, which also makes the code more generic: we no longer assume the pixels format is BGRX or RGBX format (with X aka alpha at the end).
Backported to v0.14.x and v0.15.x in r9438.

I am keeping this ticket open because I am not convinced that the "25% bigger" in the ticket description is necessarily the same bug.
In any case, we should also not assume that "rgb32" is available in all clients.
But since I can no longer trigger any proxy encoding bugs, I am also lowering the priority.

comment:3 Changed 3 years ago by Antoine Martin

Priority: minorblocker

I have hit this again, without the proxy... raising.

Changed 3 years ago by Antoine Martin

Attachment: strict-delta-coding.patch added

ensures we never mix encodings with delta - which could make us mix pixel formats on the receiving end

comment:4 Changed 3 years ago by Antoine Martin

Resolution: fixed
Status: assignedclosed

Maybe r9238 is wrong because maybe we do care what encoding is used. Different encodings use different decoders, and some might give us RGB, others RGBX... for the same input pixel format. Which could explain the differing sizes.
So r9448 makes the checks more strict. (I couldn't get it to trigger during testing, but we should not be making assumptions like these)

But that's not all, setting XPRA_MAX_DELTA_HITS low enough does trigger the problem.
The fix is obvious and embarrassing: see r9449.

Backports in r9450.

Last edited 3 years ago by Antoine Martin (previous) (diff)

comment:5 Changed 3 years ago by Antoine Martin

Resolution: fixed
Status: closedreopened

Seen the bug again.

Stacktraces always seem to come in pairs. Could be related to the window edges?

Last edited 3 years ago by Antoine Martin (previous) (diff)

comment:6 Changed 3 years ago by Antoine Martin

More fixes for multi delta and the expiry code, see #866.

Those should be different from the "25% bigger" bug reported here though...
If it does occur again, we can also enable XPRA_INTEGRITY_HASH=1 as per ticket:866#comment:7 (maybe the debug output will help)

comment:7 Changed 3 years ago by Antoine Martin

Maybe only occurring with opengl=on and rgb24 as per ticket:891#comment:10. If that's true, I have no idea why.

Last edited 3 years ago by Antoine Martin (previous) (diff)

comment:8 Changed 3 years ago by onlyjob

Cc: onlyjob@… added

comment:9 Changed 3 years ago by Antoine Martin

Found the problem, we sometimes call rgb_reformat to save space (from say BGRX to RGB - which is 25% smaller!), and the client may not need to convert it back: with opengl we can just upload RGB to the gpu.
But for xoring, we need the same format as the original...
This is not new to 0.15, but it just makes it easier to hit.

Fix on its way, but if it's too big the backport may more brutal: maybe just disable delta altogether!

Last edited 3 years ago by Antoine Martin (previous) (diff)

comment:10 Changed 3 years ago by Antoine Martin

Now I know:

  • why this only affects OR windows like menus: we send the pixels for those windows before the client maps the window, and therefore before we know the full list of pixel modes which are going to be available - and so we use conservative values.. like RGB - by the time the second window update comes along, we have the full list, and we may send the pixels in a different format
  • this only affects opengl=on, because the non opengl backends don't support BGRX directly

comment:11 Changed 3 years ago by Antoine Martin

Fixed in r9970: we just don't bother with delta if the pixel format is going to require reformatting as it's too difficult to guarantee that the client will be able to reconstruct the same buffer format at the other end in that case. (we often drop the unused 'X' part for example).
Backports to v0.14.x and v0.15.x in r9971.

I still want to investigate why those errors didn't trigger a paint refresh. All paint errors should at worst cause a log message and slower repaint, not persistent visual corruption.

comment:12 Changed 3 years ago by Antoine Martin

Resolution: fixed
Status: reopenedclosed

r9978 ensures we refresh the whole window if we ever get client-side decoding errors (previously this only worked for video modes) - this may be worth applying to 0.15 at some point.

I think this ticket is done, feel free to re-open if you ever hit it again.
Note: as of r9976, trunk is likely to fail with a more precise error message which will look like this: delta region uses XXXX format, was expecting XXXX (where the XXXXs are usually BGRX and RGB)

Note: See TracTickets for help on using tickets.