#1089 closed defect (invalid)
OSX 0.17.0 client periodically "failing to receive anything" from server when trying to connect.
Reported by: | alas | Owned by: | alas |
---|---|---|---|
Priority: | critical | Milestone: | 0.17 |
Component: | client | Version: | trunk |
Keywords: | Cc: |
Description
OSX 0.17.0 clients (r11640, r11653, perhaps others) are intermittently failing to connect on initial attempts, with the following client-side output:
Schadenfreude:MacOS Schadenfreude$ ./xpra attach --opengl=on
2016-01-12 13:29:23,321 Xpra gtk2 client version 0.17.0-r11653
2016-01-12 13:29:23,322 running on Mac OSX
2016-01-12 13:29:23,611 GStreamer version 1.4 for Python 2.7
2016-01-12 13:29:23,935 OpenGL_accelerate module loaded
2016-01-12 13:29:23,948 OpenGL enabled with Intel Iris OpenGL Engine
2016-01-12 13:29:23,970 using default keyboard settings
2016-01-12 13:29:24,251 desktop size is 2560x2490 with 1 screen:
2016-01-12 13:29:24,251 schadenfreude.local (903x878 mm - DPI: 72x72)
2016-01-12 13:29:24,252 monitor 1 1680x1050 at 880x1440 (592x370 mm - DPI: 72x72)
2016-01-12 13:29:24,252 monitor 2 2560x1440 (903x508 mm - DPI: 72x72)
2016-01-12 13:29:24,278 failed to receive anything, not an xpra server?
2016-01-12 13:29:24,278 could also be the wrong username, password or port
2016-01-12 13:29:24,278 Connection lost
I haven't yet been able to reliably reproduce, but it is happening mildly regularly (seems to be especially when a new client or server is tried... or re-tried, after a while?). I'll try to repro with -d auth,client, but if there's a better flag to try to use, I'd be happy to try that instead.
I do notice that the client does seem to be outputting desktop and display size information, suggesting that it has at least made an initial connection... but I'm not sure what is happening after that. (Could it be related to #1029 or #876?)
In any case, trying to connect a second time always works, so this is just a nuisance, really... unless the connection is automated.
Attachments (1)
Change History (11)
comment:1 Changed 5 years ago by
comment:2 Changed 5 years ago by
Owner: | changed from Antoine Martin to alas |
---|
There is absolutely nothing suspicious there.
Are you sure this is a regression? Has anything changed in your server-side setup?
The "failed to receive anything, not an xpra server?" message is only printed when we get disconnected by the server (or by a firewall or something else sitting in between the client and the server) and we have never received a single packet back before that.
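To make that condition concrete, here is a minimal sketch of the general pattern: a peer that closes the connection before sending a single byte. It is not xpra's actual client code; `first_read` is a hypothetical helper, and a local socket pair stands in for the real client/server link.

```python
import socket

def first_read(sock):
    """Report whether the peer sent anything before disconnecting.

    Hypothetical helper for illustration; not part of xpra.
    """
    data = sock.recv(4096)
    if not data:
        # recv() returning an empty bytes object means the peer closed
        # the connection; nothing was ever received on this socket.
        return "failed to receive anything"
    return "received %d bytes" % len(data)

# Simulate a peer that disconnects without sending anything.
client, server = socket.socketpair()
server.close()
print(first_read(client))
client.close()
```

The same empty read would be seen if a firewall or other device in between tore the connection down, which is why the message alone cannot say which side is at fault.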
comment:4 Changed 5 years ago by
I was just re-testing this... thought I hadn't seen it for a while, but no such luck.
It seems to happen consistently with freshly "installed" 0.17.0 clients.
I might note that, rather than "properly" installing the osx .dmgs, I have a folder on the desktop with a subfolder for each revision I mean to use... which I wouldn't think would make a difference (since it didn't with any version previous to 0.17.x), but I mention it in case it seems like something you think might be at issue.
To repro:
- Build/download/install a new 0.17.0 osx client.
- Use that client to connect to a server.
On the first attempt, you should see:
2016-04-21 17:36:19,582 failed to receive anything, not an xpra server?
2016-04-21 17:36:19,582 could also be the wrong username, password or port
2016-04-21 17:36:19,582 Connection lost
- Try a second (or subsequent) time and it should connect with no hitches.
Testing to see if it is a regression, I tried a 0.16.2 r11888 osx client (freshly installed) against a 0.16.4 r11617 (unknown changes) fedora 23 server... and it actually occurred there (maybe not a regression, just something I didn't notice with 0.16.x?).
I'll try doing some further network testing and see if maybe that's where the problem is.
comment:5 Changed 5 years ago by
- does the server see the connection attempt? does it reply to it? (see debug output from the point of the connection attempt)
- have you tried turning everything off? (password, encryption, etc)
- try running without any routers or firewalls between the client and server
- maybe collect some wireshark flows to see which end is guilty and how. (it's most definitely the osx client, but not sure if it is failing to send or failing to receive properly at this point)
comment:6 Changed 5 years ago by
- I see absolutely no output from the server, leading me to suspect that it doesn't see the connection attempt.
- I am using no passwords or encryption.
- Both client and server are running on the same network ... so I, again, suspect there are no routers or firewalls between the two.
- I'm attaching a tcpdump run from the server during the connection attempt. If that has nothing to elucidate the problem, I'll move on to wireshark.
Changed 5 years ago by
Attachment: | ticket1089_tcp-dump3.pcap added |
---|
tcpdump from server end during initial failed client connection attempt
comment:7 Changed 5 years ago by
Your tcpdump shows a "TCP ACKed unseen segment", which implies that this is NOT the first connection to the server and that some of the capture is missing or that the networking is seriously messed up. See TCP ACked Unseen Segment.
You should be able to see the state of the connections from the server with netstat -tan (this may need to be scripted, as this all happens quite quickly).
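One possible way to script this is sketched below: poll netstat -tan at a short interval and print only the TCP lines that involve the server's port. The function names, the polling interval and the example port are assumptions made for illustration, not something taken from this ticket.

```python
import subprocess
import time

def connections_for_port(netstat_output, port):
    """Return the TCP lines of `netstat -tan` output that involve `port`."""
    # Linux prints addresses as host:port, macOS/BSD as host.port
    suffixes = (":%d" % port, ".%d" % port)
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            if fields[3].endswith(suffixes) or fields[4].endswith(suffixes):
                matches.append(line)
    return matches

def poll(port, interval=0.2, duration=10.0):
    """Print matching connections repeatedly for `duration` seconds."""
    deadline = time.time() + duration
    while time.time() < deadline:
        result = subprocess.run(["netstat", "-tan"],
                                capture_output=True, text=True)
        for line in connections_for_port(result.stdout, port):
            print(time.strftime("%H:%M:%S"), line)
        time.sleep(interval)
```

Calling poll(14500) on the server just before starting the client would log every state the connection passes through; 14500 is only an assumed port here, so substitute whatever port the server is actually bound to.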
PS: created #1188 to help with wireshark.
comment:8 Changed 5 years ago by
Still investigating exactly what's going on, but I was able to narrow the issue a little bit.
The osx client that I'm using had both wired & wireless active. Disabling wired (unplugging the wire) I found that I have no issues with wireless connection. I get an "[error number 61] connection refused" when I try to connect with the wired connection (which, oddly, is on the same network as the vm I'm using as a server, while the wireless is actually on another network).
I tried the wired against a hardware fedora 23 server (on the same network as the wired connection and the vm), and it worked fine.
I think you're right and we probably just have a bit of weirdness happening with our network. It would make sense to reduce the priority of this ticket, and I'll add info and close it once I can manage to puzzle out what's going on.
comment:9 Changed 5 years ago by
Resolution: | → invalid |
---|---|
Status: | new → closed |
Edit: apparently a network issue due to having two routes to the server, one via wifi and the other via lan.
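When a client has two possible routes like this, a quick check of which local address the operating system picks for a given server can help. The sketch below uses a connected UDP socket for that; `local_address_for` is a hypothetical helper, and it only reports the routing decision, it does not test whether the server is reachable.

```python
import socket

def local_address_for(server_host, server_port):
    """Return the local IP the OS would use to reach the given server.

    Hypothetical helper: connecting a UDP socket sends no packets, it
    only asks the kernel to choose a route and a source address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((server_host, server_port))
        return sock.getsockname()[0]
    finally:
        sock.close()

print(local_address_for("127.0.0.1", 14500))
```

Run with the server's real address in place of the loopback placeholder, the result can be compared against the addresses of the wired and wireless interfaces to see which one is being preferred.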
comment:10 Changed 5 weeks ago by
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1089
Sure enough, after leaving OSX client disconnected "for a while", remembering the flags when re-connecting actually captured a lot of tracebacks (including all output from connection until the error message as one block, then from the error message on in a second block... hoping that makes it easier to sort through).