Opened 3 years ago
Closed 3 years ago
#1500 closed defect (fixed)
Firefox and Chrome causing hard client crash
Reported by: | J. Max Mena | Owned by: | J. Max Mena |
---|---|---|---|
Priority: | blocker | Milestone: | 2.1 |
Component: | client | Version: | trunk |
Keywords: | Cc: |
Description
Using the latest trunk r15617 (both client and server are on Fedora 25), I am getting hard client crashes after using Firefox or Chrome. Chrome seems to behave better: I can use it almost indiscriminately, but at some point something changes and it causes a hard client crash. Firefox is the easiest trigger: I just opened it up and watched a YouTube video and that was enough to cause the crash. Even just opening a text-only website like Reddit or Wikipedia and scrolling will also cause a crash. Once the client crashes, any subsequent connections will fail with the same error (if you were watching a video) until you kill Firefox or Chrome. Of note, just leaving Xterms running seems to behave: no crashes or errors. Running with -d gtk seems to turn up nothing of interest.
Here's the last bit of the client log before it crashes:
(Xpra:19270): Gdk-ERROR **: The program 'Xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadMatch (invalid parameter attributes)'.
  (Details: serial 79698 error_code 8 request_code 72 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
2017-04-14 10:30:54,287 sound output stopping
Trace/breakpoint trap (core dumped)
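For reading the "Details" part of that error, here is a minimal sketch of a helper that extracts the numeric codes and maps them to X11 core protocol names. It is not part of xpra: the function name and the two small lookup tables are assumptions made for illustration, and the tables only cover the codes seen in this ticket (error 8 is BadMatch, request 72 is PutImage in the core protocol).

```python
import re

# Tiny subsets of the X11 core protocol tables (illustrative, not exhaustive).
X_ERROR_NAMES = {8: "BadMatch"}
X_REQUEST_NAMES = {72: "PutImage"}

# Matches the "Details" fields printed by GDK's X error handler.
DETAILS_RE = re.compile(
    r"serial (\d+) error_code (\d+) request_code (\d+) minor_code (\d+)"
)


def parse_x_error(log_text):
    """Return the X error details found in a Gdk-ERROR message, or None."""
    match = DETAILS_RE.search(log_text)
    if match is None:
        return None
    serial, error_code, request_code, minor_code = map(int, match.groups())
    return {
        "serial": serial,
        "error": X_ERROR_NAMES.get(error_code, "unknown (%d)" % error_code),
        "request": X_REQUEST_NAMES.get(request_code, "unknown (%d)" % request_code),
        "minor_code": minor_code,
    }


# Hypothetical usage with the details line from the log above.
sample = "(Details: serial 79698 error_code 8 request_code 72 minor_code 0)"
print(parse_x_error(sample))
```

Read this way, the failing request is a core-protocol PutImage that the X server rejected with BadMatch, which is only a hint about where to look and not a diagnosis.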
One quick question: which -d flag would be most useful here?
Attachments (1)
Change History (15)
comment:1 Changed 3 years ago by
Owner: | changed from Antoine Martin to J. Max Mena |
---|
comment:2 Changed 3 years ago by
Looks like it was OpenGL: running with --opengl=no, I don't get the crash anymore.
Running with OpenGL enabled and -d all, here are the last few prints:
2017-04-18 10:08:01,913 check_server_echo(0) last=True, server_ok=True
2017-04-18 10:08:01,913 add_packet_to_queue(add_data ...)
2017-04-18 10:08:01,928 gtk2.GLWindowBacking(12, (460, 185), None).gl_show() swapping buffers now
2017-04-18 10:08:01,929 gl_show after 79ms took 0ms, 1 updates
2017-04-18 10:08:01,929 gtk2.GLWindowBacking(12, (460, 185), None).gl_frame_terminator()
2017-04-18 10:08:01,929 <OpenGL.platform.baseplatform.glBindFramebuffer object at 0x7fed61243fd0>(GL_FRAMEBUFFER (36160), c_uint(1L))
2017-04-18 10:08:01,929 gtk2.GLWindowBacking(12, (460, 185), None).do_present_fbo() done
(Xpra:13520): Gdk-ERROR **: The program 'Xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadMatch (invalid parameter attributes)'.
  (Details: serial 5782 error_code 8 request_code 72 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
2017-04-18 10:08:04,399 sound output stopping
Trace/breakpoint trap (core dumped)
Can you bisect it?
I'll get on that now since this morning seems a bit slow.
comment:3 Changed 3 years ago by
Okay, I found a very reliable trigger:
- Launch Firefox
- Click on the SSL Information button (the button between Back / Forwards and the URL bar that shows you site certificate info)
That seems to reliably cause a crash here. Now, on to bisecting.
comment:4 Changed 3 years ago by
First things first, rebuilding the client as 1.X causes the issue to go away, so it's limited to 2.X for now.
So I've rolled the client back to trunk r15500, and I'm still getting the crash.
I'll try rolling the server back as well, just to see if it makes a difference, but I'm 75% sure the issue is limited to the client.
comment:5 Changed 3 years ago by
Alright, so of interest, my client's OpenGL information lists that it's a VMWare, Inc GPU with OpenGL version 2.1 - same as the server without a GPU. This desktop has an Nvidia 745 dedicated GPU, so I'm a little confused as to why it's listed as VMWare.
Edit:
This would probably explain the issues I'm running into. They seem to occur only on this machine: afarr's laptop with an Intel iGPU works fine running Windows 10 with the 2.1 r15608 client against a 2.1 r15664 Fedora 25 server.
Changed 3 years ago by
Attachment: | 1500_gl_check.txt added |
---|
comment:6 Changed 3 years ago by
Milestone: | → 2.1 |
---|---|
Priority: | critical → blocker |
We need to figure out if this crash is caused by:
- the recent changes to the opengl backend: r15567 + r15568 + r15569 + r15570 + r15575 + r15576 + r15612 - testing with r15566 should tell us that (ideally singling out the problematic changeset)
- the changes in blacklisting / greylisting of drivers: r15007
- the jpeg decoder code path (#1423): turning off jpeg encoding should prevent the crash in this case
- the 30-bit paint code updates (#1309): unlikely
- something else - likely from the list of changes found here: log/xpra/trunk/src/xpra/client/gl/gl_window_backing_base.py
comment:7 Changed 3 years ago by
Starting with the low hanging fruit:
- Upped client and server to trunk r15684
- Disabling jpeg using --encodings=png,h264 still gets the crash, so it isn't the jpeg decoder
I perused #1309 and found out the fix was added in r15094. I rolled back to r15093, and the crash went away. However, upping to r15094 still doesn't cause the crash, so that isn't it.
Next up, walking through the versions you mentioned.
comment:8 Changed 3 years ago by
So I still get the same crash even with r15560 - so it's been around for a while now. Not sure why I didn't catch this earlier. The fact that I get it before those changes tells me that they aren't the cause. Probably.
comment:9 Changed 3 years ago by
Okay I've tried all the revisions mentioned in comment:6 and they all have the same crash.
comment:10 Changed 3 years ago by
Please try:
And bisect from there as needed. (bearing in mind you may have to force enable opengl on older revisions)
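For reference, the bisection itself can be sketched as a binary search over an ordered list of revisions. This is only an illustration: `crashes` is a hypothetical stand-in for the manual step of building a revision and checking whether the client crashes, and the revision numbers in the usage lines are made up, not taken from this ticket.

```python
def first_bad_revision(revisions, crashes):
    """Binary search for the first revision where crashes(rev) is True.

    Assumes `revisions` is sorted oldest to newest, the oldest one is good,
    the newest one is bad, and the crash never disappears once introduced.
    """
    lo, hi = 0, len(revisions) - 1  # revisions[lo] is good, revisions[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(revisions[mid]):
            hi = mid  # crash already present: look at older revisions
        else:
            lo = mid  # still good: look at newer revisions
    return revisions[hi]


# Made-up revision numbers; pretend the crash first appears in 1011.
candidates = list(range(1000, 1016))
print(first_bad_revision(candidates, lambda rev: rev >= 1011))
```

With 16 candidates this needs four build-and-test rounds instead of up to sixteen, which matters when every round means rebuilding the client.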
comment:11 Changed 3 years ago by
Quick update on this one:
After talking about it, the going theory right now is that the Nvidia driver isn't being loaded properly, so we are falling back to software OpenGL, which is causing the issue. That being said, the "VMWare" driver should work.
comment:12 Changed 3 years ago by
Update:
- Reinstalled the Nvidia driver today (thanks for finally supporting kernel 4.10 Nvidia, it's only been two/three months) and the issue went away.
Looks like it's the software OpenGL that's the problem.
comment:13 Changed 3 years ago by
Summary: | Firefox and Chrome causing hard client crash in 15617 Trunk → Firefox and Chrome causing hard client crash |
---|
r15811 moves "vmware" to the blacklist, so opengl won't be used by default on this buggy driver.
@maxmylyn: I think we can close this ticket?
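As a rough illustration of what blacklisting a driver means here, the check could look something like the sketch below. This is not the code from r15811: the function name, the tuple of keywords and the `forced` parameter are all assumptions, and the only entry taken from this ticket is "vmware".

```python
# Hypothetical blacklist; this ticket only confirms that "vmware" was added.
BLACKLISTED_VENDOR_KEYWORDS = ("vmware",)


def opengl_enabled_by_default(gl_vendor, forced=False):
    """Return True if OpenGL should be used for this driver vendor string."""
    if forced:
        # The user explicitly asked for OpenGL, so skip the blacklist.
        return True
    vendor = gl_vendor.lower()
    return not any(keyword in vendor for keyword in BLACKLISTED_VENDOR_KEYWORDS)


# Hypothetical vendor strings for illustration.
print(opengl_enabled_by_default("VMware, Inc."))
print(opengl_enabled_by_default("NVIDIA Corporation"))
```

The match is case-insensitive on purpose, since the vendor string reported by the driver may not use the same capitalisation as the blacklist entry.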
When did this regression start? Can you bisect it?
I'm not seeing that here and I'm always running trunk on F25.
Do you use mmap? Have you tried enabling / disabling opengl? clipboard? etc.
As for debug flags, when in doubt go for "-d all": better have too much than too little.