While testing #1593 (now closed) I kept running into the same error over and over again that led to a number of client crashes.
My server is a trunk r16446 and the server and client machines are both running Fedora 25.
I started my server with xpra start :13 --html=on --no-daemon --bind-tcp=0.0.0.0: --start-new-commands=yes --start-child=firefox --start-via-proxy=no
and connected via the HTML5 client using Firefox.
To get the crash I've found that just simply scrolling up and down rather vigorously on the Fedora homepage https://start.fedoraproject.org will usually cause the crash after a bit of time. Otherwise I've ran into it intermittently while looking for other bugs. Just before the crash the logs will usually print:
uncaught exception: invalid packet header format: 140: (unknown) uncaught exception: we can't handle this protocol flag yet, sorry
Sadly enabling the debugging option doesn't print anything more useful.
Slight update:
In the time that I took to write this ticket I managed to repro it just by switching tabs in Firefox.
led to a number of client crashes ... connected via the HTML5 client using Firefox.
Please describe the "client crash": do you seen the spinners, do you get disconnected or does the browser actually crash?
To get the crash I've found that just simply scrolling up and down rather vigorously...
That doesn't cause any problems here, no matter how vigorously or how long I scroll. (tested with both Fedora 25 and 26)
Just before the crash the logs will usually print:
uncaught exception: invalid packet header format: 140: (unknown)
Is the invalid packet header always 140? Did you get any other codes?
In the time that I took to write this ticket I managed to repro it just by switching tabs in Firefox.
Doesn't do anything here.
Since I very much doubt that the server is suddenly starting to send invalid packet headers to the HTML5 client, something else is either corrupting the websocket stream or Firefox is just dropping it - for whatever reason. Websockify is very well tested and used by many other projects, so that's unlikely to be mishandling the packets. There could be a firewall or a proxy or something on your network, try to see if using wss helps, or maybe a secure tunnel - anything to rule that out.
If that still doesn't help, please try to provide:
Since I couldn't reproduce it the "normal" way, r16448 makes it possible to send invalid raw packets:
XPRA_SEND_INVALID_PACKET_DATA="Whatever" XPRA_SEND_INVALID_PACKET=10 xpra ...
This will send "Whatever" as a raw packet 10 seconds after the connection is made.
With this in place and a client that sends lots of screen updates (ie: glxgears), I noticed that we could get stuck in the packet processing class whenever we receive a broken packet (trying to process the same one over and over again) - so r16449 ensures we trigger the close / re-connection, and ignore any further events from this dead connection. With this in place, I see a quick screen flicker as the client re-connects (every 10 seconds with the env var above)
Please describe the "client crash": do you seen the spinners, do you get disconnected or does the browser actually crash?
I get disconnected; the browser doesn't crash.
That being said, now that I've upped my server to r16498, I have been able to reproduce the crash I got in the original post, but instead of my client getting disconnected entirely, I get a slight screen flicker. Looks like that fix is what did it.
Closing.
This is not fixed, the flicker is caused by the workaround. We still don't know why the invalid packet occurs.
Okay I upped my server to r16516 and ran it with -d network
until it disconnected me.
Here's the logs:
2017-07-25 10:42:11,857 will process ui packet sound-control 2017-07-25 10:42:11,861 peer: (0, -1, -1) 2017-07-25 10:42:11,861 peer: (0, -1, -1) 2017-07-25 10:42:11,861 new_connection((1, <socket._socketobject object at 0x7f5ec46fd590>)) sock=<socket._socketobject object at 0x7f5eb34a0c90>, timeout=0.1, sockname=('192.168.1.138', 2200), address=('192.168.1.140', 50304), peername=('192.168.1.140', 50304). timeout=0.1 2017-07-25 10:42:11,862 192.168.1.140: Plain non-SSL (ws://) WebSocket connection 2017-07-25 10:42:11,863 192.168.1.140: Version hybi-13, base64: 'False' 2017-07-25 10:42:11,863 io_thread_loop(read, <bound method Protocol._read of Protocol(tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50304)>) loop starting 2017-07-25 10:42:11,874 read_parse_thread_loop starting 2017-07-25 10:42:11,875 processing packet hello 2017-07-25 10:42:11,875 will process ui packet hello 2017-07-25 10:42:11,875 enable_compressor_from_caps(..) options=['lz4', 'lzo', 'zlib'] 2017-07-25 10:42:11,875 enable_compressor(lz4): <function lz4_compress at 0x7f5ec59abc80> 2017-07-25 10:42:11,875 get_enabled_encoders(['rencode', 'bencode', 'yaml']) enabled=['rencode', 'bencode'] 2017-07-25 10:42:11,875 enable_encoder_from_caps(..) options=['rencode', 'bencode'] 2017-07-25 10:42:11,875 get_enabled_encoders(['bencode', 'rencode', 'yaml']) enabled=['rencode', 'bencode'] 2017-07-25 10:42:11,875 enable_encoder(bencode): <function do_bencode at 0x7f5ec59b0b18> 2017-07-25 10:42:11,876 Handshake complete; enabling connection 2017-07-25 10:42:11,876 disconnect_client(Protocol(tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50242), new client, ('this session does not allow sharing',)) 2017-07-25 10:42:11,876 disconnect_protocol(Protocol(tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50242), new client, ('this session does not allow sharing',)) 2017-07-25 10:42:11,876 Disconnecting client 192.168.1.140:50242: 2017-07-25 10:42:11,876 new client (this session does not allow sharing) 2017-07-25 10:42:11,876 flush_then_close(['disconnect', 'new client', 'this session does not allow sharing'], None) closed=False 2017-07-25 10:42:11,876 flush_then_close: wait_for_write_lock() 2017-07-25 10:42:11,876 flush_then_close: acquired the write lock 2017-07-25 10:42:11,876 flush_then_close: queue is now empty, sending the last packet and closing 2017-07-25 10:42:11,876 cleanup_protocol(Protocol(tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50242)) 2017-07-25 10:42:11,877 flush_then_close: packet_queued() closed=False 2017-07-25 10:42:11,878 flush_then_close: wait_for_packet_sent() queue.empty()=True, closed=False 2017-07-25 10:42:11,879 flush_then_close: wait_for_packet_sent() close_and_release() 2017-07-25 10:42:11,879 Protocol.close() closed=False, connection=tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50242 2017-07-25 10:42:11,882 Protocol.close() calling <bound method WebSocketConnection.close of tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50242> 2017-07-25 10:42:11,882 tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50242.close() for socket={'fileno': 16, 'type': 'UNIX', 'family': 'DGRAM', 'timeout': 0, 'proto': 0} 2017-07-25 10:42:11,882 tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50242.close() done 2017-07-25 10:42:11,882 terminate_queue_threads() 2017-07-25 10:42:11,883 Protocol.close() done 2017-07-25 10:42:11,883 parse thread: empty marker, exiting 2017-07-25 10:42:11,883 flush_then_close: done, callback=None 2017-07-25 10:42:11,883 io_thread_loop(write, <bound method Protocol._write of Protocol(None)>) loop ended, closed=True 2017-07-25 10:42:11,884 cleanup_source(X11ServerSource(1 : Protocol(None))) remaining sources: [] 2017-07-25 10:42:11,884 xpra client 1 disconnected. 2017-07-25 10:42:11,885 accept_client(Protocol(tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50304), {'named_cursors': 0, 'encoding.h264.deblocking-filter': 0, 'bell': 1, 'lz4': 1, 'yaml': 0, 'keyboard': 1, 'file-transfer': 1, 'sound.receive': 1, 'digest': ['hmac', 'hmac+md5', 'xor', 'hmac+md5', 'hmac+sha1', 'hmac+sha256'], 'xkbmap_print': '', 'platform.platform': '5.0 (X11)', 'clipboard.want_targets': 1, 'uuid': '038d816b-d4f0-ae56-ba2a-e4638c455cc6', 'encodings.core': ['jpeg', 'png', 'rgb', 'rgb32'], 'encoding.vp8+webm.score-delta': 50, 'encoding.mpeg4+mp4.score-delta': 50, 'encoding.h264+mp4.YUV420P.level': '3.0', 'sound.server_driven': 1, 'session-type.full': 'Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0', 'platform': 'Posix', 'encoding.scrolling': 1, 'encodings.rgb_formats': ['RGBX', 'RGBA'], 'encodings.cursor': ['png'], 'randr_notify': 1, 'sound.send': 0, 'screen_sizes': [['HTML', 1877, 1109, 497, 293, [['Canvas', 0, 0, 1877, 1109, 497, 293]], 0, 0, 1877, 1109]], 'username': '', 'auto_refresh_delay': 500, 'encoding.h264.YUV420P.profile': 'baseline', 'encoding.transparency': 1, 'xkbmap_layout': 'us', 'notifications': 1, 'xkbmap_query': '', 'encoding.rgb24zlib': 1, 'metadata.supported': ['fullscreen', 'maximized', 'above', 'below', 'title', 'size-hints', 'class-instance', 'transient-for', 'window-type', 'decorations', 'override-redirect', 'tray', 'modal', 'opacity'], 'lzo': 0, 'encoding.x264.YUV420P.profile': 'baseline', 'clipboard_enabled': 1, 'rencode': 0, 'window.raise': 1, 'encoding.client_options': 1, 'platform.name': 'Posix', 'desktop_mode_size': [1877, 1109], 'client_type': 'HTML5', 'window.initiate-moveresize': 1, 'steal': 1, 'dpi': 96, 'encoding.video_scaling': 1, 'zlib': 1, 'desktop_size': [1877, 1109], 'encoding.h264+mp4.score-delta': 50, 'share': 0, 'lz4.js.version': '0.5.1', 'sound.decoders': [], 'cursors': 1, 'encodings': ['jpeg', 'png', 'rgb', 'rgb32'], 'encoding.h264.cabac': 0, 'generic-rgb-encodings': 1, 'xkbmap_keycodes': [[8, 'BackSpace', 8, 0, 0], [9, 'Tab', 9, 0, 0], [12, 'KP_Begin', 12, 0, 0], [13, 'Return', 13, 0, 0], [16, 'Shift_L', 16, 0, 0], [17, 'Control_L', 17, 0, 0], [18, 'Alt_L', 18, 0, 0], [19, 'Pause', 19, 0, 0], [20, 'Caps_Lock', 20, 0, 0], [27, 'Escape', 27, 0, 0], [31, 'Mode_switch', 31, 0, 0], [32, 'space', 32, 0, 0], [33, 'Prior', 33, 0, 0], [34, 'Next', 34, 0, 0], [35, 'End', 35, 0, 0], [36, 'Home', 36, 0, 0], [37, 'Left', 37, 0, 0], [38, 'Up', 38, 0, 0], [39, 'Right', 39, 0, 0], [40, 'Down', 40, 0, 0], [42, 'Print', 42, 0, 0], [45, 'Insert', 45, 0, 0], [46, 'Delete', 46, 0, 0], [48, '0', 48, 0, 0], [49, '1', 49, 0, 0], [50, '2', 50, 0, 0], [51, '3', 51, 0, 0], [52, '4', 52, 0, 0], [53, '5', 53, 0, 0], [54, '6', 54, 0, 0], [55, '7', 55, 0, 0], [56, '8', 56, 0, 0], [57, '9', 57, 0, 0], [58, 'colon', 58, 0, 0], [59, 'semicolon', 59, 0, 0], [60, 'less', 60, 0, 0], [61, 'equal', 61, 0, 0], [62, 'greater', 62, 0, 0], [63, 'question', 63, 0, 0], [64, 'at', 64, 0, 0], [65, 'a', 65, 0, 0], [66, 'b', 66, 0, 0], [67, 'c', 67, 0, 0], [68, 'd', 68, 0, 0], [69, 'e', 69, 0, 0], [70, 'f', 70, 0, 0], [71, 'g', 71, 0, 0], [72, 'h', 72, 0, 0], [73, 'i', 73, 0, 0], [74, 'j', 74, 0, 0], [75, 'k', 75, 0, 0], [76, 'l', 76, 0, 0], [77, 'm', 77, 0, 0], [78, 'n', 78, 0, 0], [79, 'o', 79, 0, 0], [80, 'p', 80, 0, 0], [81, 'q', 81, 0, 0], [82, 'r', 82, 0, 0], [83, 's', 83, 0, 0], [84, 't', 84, 0, 0], [85, 'u', 85, 0, 0], [86, 'v', 86, 0, 0], [87, 'w', 87, 0, 0], [88, 'x', 88, 0, 0], [89, 'y', 89, 0, 0], [90, 'z', 90, 0, 0], [91, 'Menu', 91, 0, 0], [92, 'Menu', 92, 0, 0], [93, 'KP_Enter', 93, 0, 0], [96, '0', 96, 0, 0], [97, '1', 97, 0, 0], [98, '2', 98, 0, 0], [99, '3', 99, 0, 0], [100, '4', 100, 0, 0], [101, '5', 101, 0, 0], [102, '6', 102, 0, 0], [103, '7', 103, 0, 0], [104, '8', 104, 0, 0], [105, '9', 105, 0, 0], [106, 'KP_Multiply', 106, 0, 0], [107, 'KP_Add', 107, 0, 0], [109, 'KP_Subtract', 109, 0, 0], [110, 'KP_Delete', 110, 0, 0], [111, 'KP_Divide', 111, 0, 0], [112, 'F1', 112, 0, 0], [113, 'F2', 113, 0, 0], [114, 'F3', 114, 0, 0], [115, 'F4', 115, 0, 0], [116, 'F5', 116, 0, 0], [117, 'F6', 117, 0, 0], [118, 'F7', 118, 0, 0], [119, 'F8', 119, 0, 0], [120, 'F9', 120, 0, 0], [121, 'F10', 121, 0, 0], [122, 'F11', 122, 0, 0], [123, 'F12', 123, 0, 0], [124, 'F13', 124, 0, 0], [125, 'F14', 125, 0, 0], [126, 'F15', 126, 0, 0], [127, 'F16', 127, 0, 0], [128, 'F17', 128, 0, 0], [129, 'F18', 129, 0, 0], [130, 'F19', 130, 0, 0], [131, 'F20', 131, 0, 0], [132, 'F21', 132, 0, 0], [133, 'F22', 133, 0, 0], [134, 'F23', 134, 0, 0], [135, 'F24', 135, 0, 0], [144, 'Num_Lock', 144, 0, 0], [145, 'Scroll_Lock', 145, 0, 0], [160, 'dead_circumflex', 160, 0, 0], [161, 'exclam', 161, 0, 0], [162, 'quotedbl', 162, 0, 0], [163, 'numbersign', 163, 0, 0], [164, 'dollar', 164, 0, 0], [165, 'percent', 165, 0, 0], [166, 'ampersand', 166, 0, 0], [167, 'underscore', 167, 0, 0], [168, 'parenleft', 168, 0, 0], [169, 'parenright', 169, 0, 0], [170, 'asterisk', 170, 0, 0], [171, 'plus', 171, 0, 0], [172, 'bar', 172, 0, 0], [173, 'minus', 173, 0, 0], [174, 'braceleft', 174, 0, 0], [175, 'braceright', 175, 0, 0], [176, 'asciitilde', 176, 0, 0], [186, 'semicolon', 186, 0, 0], [187, 'equal', 187, 0, 0], [188, 'comma', 188, 0, 0], [189, 'minus', 189, 0, 0], [190, 'period', 190, 0, 0], [191, 'slash', 191, 0, 0], [192, 'grave', 192, 0, 0], [219, 'bracketleft', 219, 0, 0], [220, 'backslash', 220, 0, 0], [221, 'bracketright', 221, 0, 0], [222, 'apostrophe', 222, 0, 0]], 'encoding.icons.max_size': [30, 30], 'raw_window_icons': 1, 'system_tray': 1, 'namespace': 1, 'sound.bundle-metadata': 1, 'encoding.generic': 1, 'version': '2.1', 'encoding.rgb_zlib': 1, 'encoding.full_csc_modes': {'h264': ['YUV420P'], 'mpeg4+mp4': ['YUV420P'], 'h264+mp4': ['YUV420P'], 'vp8+webm': ['YUV420P']}, 'encoding.h264.YUV420P.level': '2.1', 'encoding.rgb_lz4': 1, 'bencode': 1, 'encoding.h264+mp4.YUV420P.profile': 'main', 'notify-startup-complete': 1, 'encoding.h264.score-delta': -20, 'argv': ['http://192.168.1.138:2200/'], 'server-window-resize': 1, 'clipboard.selections': ['CLIPBOARD', 'PRIMARY'], 'clipboard.greedy': 1, 'printing': 1, 'file-size-limit': 10, 'encoding.color-gamut': '', 'windows': 1, 'encodings.window-icon': ['png'], 'compression_level': 1, 'session-type': 'Firefox', 'platform.processor': 'Linux x86_64', 'encoding.csc_atoms': 1}) 2017-07-25 10:42:11,886 set_socket_timeout(tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50304, None) 2017-07-25 10:42:11,887 HTML5 Posix Firefox client version 2.1 2017-07-25 10:42:11,887 automatic picture encoding enabled 2017-07-25 10:42:11,887 also available: 2017-07-25 10:42:11,887 jpeg, png, rgb32 2017-07-25 10:42:11,887 will process ui packet connection-lost 2017-07-25 10:42:11,888 Protocol.close() closed=True, connection=None 2017-07-25 10:42:11,888 client root window size is 1877x1109 with 1 display: 2017-07-25 10:42:11,888 HTML (497x293 mm - DPI: 95x96) 2017-07-25 10:42:11,888 Canvas 2017-07-25 10:42:11,889 best resolution matching 1877x1109 is unchanged: 1920x1152 2017-07-25 10:42:11,891 keyboard mapping already configured (skipped) 2017-07-25 10:42:11,892 get_enabled_encoders(['bencode', 'rencode', 'yaml']) enabled=['rencode', 'bencode'] 2017-07-25 10:42:11,893 write_format_thread_loop starting 2017-07-25 10:42:11,894 add_packet_to_queue(hello ...) 2017-07-25 10:42:11,897 io_thread_loop(write, <bound method Protocol._write of Protocol(tcp websocket: 192.168.1.138:2200 <- 192.168.1.140:50304)>) loop starting
I think the sound packet may be of interest - I haven't enabled sound in this session.
@antoine - are these logs enough, or do you want me to pipe my entire session into a file and post that?
I can also get the wireshark as well.
@maxmylyn: this log starts near:
new_connection((1, <socket._socketobject object at 0x7f5ec46fd590>)) ...
Which means that the client has already decided to re-connect. We still don't know why it did. Maybe the javascript debug console can tell us? (enable debug mode in the HTML5 client) If that doesn't clarify things, a wireshark capture may be required.
But ideally, I would be able to reproduce it myself - that would make it much easier to fix. What am I doing wrong? Can you reduce your setup to something minimal I can easily replicate?
As of trunk r16825 I'm unable to repro this in the same manner as I was able to before.
However, after connecting I get this error flooding the console output of the server:
2017-09-11 10:18:02,043 Warning: client decoding error: 2017-09-11 10:18:02,043 window not found
Please see comment:7, we need the javascript console output in debug mode, and / or the complete server output (before new_connection(...)
.
Even better would be steps so that I can reproduce the problem.
Upped server and client to r16911:
I am definitely unable to reproduce the original error I mentioned in the description and haven't gotten disconnected even after fairly thoroughly testing this.
I'm going to close this ticket if there's no objections.
Okay I'm still definitely unable to reproduce the hard crash or the unexpected uncaught exceptions, as of r16954. Closing.
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1595