Opened 5 months ago

Closed 4 months ago

#1500 closed defect (fixed)

Firefox and Chrome causing hard client crash

Reported by: J. Max Mena
Owned by: J. Max Mena
Priority: blocker
Milestone: 2.1
Component: client
Version: trunk
Keywords:
Cc:

Description

Using the latest trunk r15617 (both client and server are on Fedora 25), I am getting hard client crashes after using Firefox or Chrome. Chrome behaves better: I can use it almost freely, but at some point something changes and it causes a hard client crash. Firefox is the easiest trigger: I just opened it and watched a YouTube video, and that was enough to cause the crash. Even opening a text-only website like Reddit or Wikipedia and scrolling will cause a crash. Once the client crashes, any subsequent connection fails with the same error (if you were watching a video) until you kill Firefox or Chrome. Of note, just leaving Xterms running seems fine: no crashes or errors. Running with -d gtk turns up nothing of interest.

Here's the last bit of the client log before it crashes:

(Xpra:19270): Gdk-ERROR **: The program 'Xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadMatch (invalid parameter attributes)'.
  (Details: serial 79698 error_code 8 request_code 72 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
2017-04-14 10:30:54,287 sound output stopping
Trace/breakpoint trap (core dumped)

One quick question:

What -d flag would be most useful here?
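For readers trying to interpret the "Details" line in the log above, the numeric codes can be mapped back to core X11 protocol names. The sketch below is only an illustration: the tables are a small subset of the core error codes and request opcodes from the X11 headers, and the function name `decode_x_error` is made up for this example. It narrows down which request failed, not why it failed.

```python
import re

# Subset of the core X11 error codes (as defined in X.h).
X_ERROR_NAMES = {
    1: "BadRequest", 2: "BadValue", 3: "BadWindow", 4: "BadPixmap",
    8: "BadMatch", 9: "BadDrawable", 10: "BadAccess", 11: "BadAlloc",
}

# Subset of the core X11 request opcodes (as defined in Xproto.h).
# Opcodes of 128 and above belong to extensions and are not covered here.
X_REQUEST_NAMES = {
    53: "CreatePixmap", 55: "CreateGC", 62: "CopyArea",
    72: "PutImage", 73: "GetImage",
}

DETAILS_RE = re.compile(
    r"serial (\d+) error_code (\d+) request_code (\d+) minor_code (\d+)"
)


def decode_x_error(line):
    """Parse a Gdk '(Details: ...)' line; return None if it does not match."""
    match = DETAILS_RE.search(line)
    if match is None:
        return None
    serial, error_code, request_code, minor_code = (int(g) for g in match.groups())
    return {
        "serial": serial,
        "error": X_ERROR_NAMES.get(error_code, "unknown (%d)" % error_code),
        "request": X_REQUEST_NAMES.get(request_code, "unknown (%d)" % request_code),
        "minor_code": minor_code,
    }


# The details line copied from the client log in this ticket.
details = decode_x_error(
    "  (Details: serial 79698 error_code 8 request_code 72 minor_code 0)"
)
print(details)
```

With the log line from this ticket, the error decodes to BadMatch on a core PutImage request, which suggests the failure happens while image data is being sent to a drawable.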

Attachments (1)

1500_gl_check.txt (7.2 KB) - added by J. Max Mena 5 months ago.


Change History (15)

comment:1 Changed 5 months ago by Antoine Martin

Owner: changed from Antoine Martin to J. Max Mena

When did this regression start? Can you bisect it?
I'm not seeing that here and I'm always running trunk on F25.
Do you use mmap? Have you tried enabling / disabling opengl? clipboard? etc..

As for debug flags, when in doubt go for "-d all": better have too much than too little.
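The enable / disable suggestions above amount to a small test matrix. A minimal sketch of how the client command lines could be generated follows. It only builds argument lists and does not run anything. The display `:100` is a placeholder, and the `--mmap` spelling is assumed by analogy with the `--opengl=no` and `-d all` flags used in this ticket, so check `xpra --help` on your build before relying on it.

```python
import itertools


def build_attach_commands(display, toggles, debug="all"):
    """Return one argv list per combination of the given yes/no toggles.

    `toggles` maps an option name (without the leading dashes) to the
    list of values to try, e.g. {"opengl": ["yes", "no"]}.
    """
    names = sorted(toggles)
    commands = []
    for values in itertools.product(*(toggles[name] for name in names)):
        argv = ["xpra", "attach", display, "-d", debug]
        argv += ["--%s=%s" % (name, value) for name, value in zip(names, values)]
        commands.append(argv)
    return commands


# Placeholder display; option names other than "opengl" are assumptions.
commands = build_attach_commands(
    ":100", {"opengl": ["yes", "no"], "mmap": ["yes", "no"]}
)
for argv in commands:
    print(" ".join(argv))
```

Running each variant against the same trigger makes it easier to see which single option changes the outcome.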

Last edited 5 months ago by Antoine Martin

comment:2 Changed 5 months ago by J. Max Mena

Looks like it was OpenGL - running with --opengl=no I don't get the crash anymore.

Running with OpenGL enabled and -d all, here's the last few prints:

2017-04-18 10:08:01,913 check_server_echo(0) last=True, server_ok=True
2017-04-18 10:08:01,913 add_packet_to_queue(add_data ...)
2017-04-18 10:08:01,928 gtk2.GLWindowBacking(12, (460, 185), None).gl_show() swapping buffers now
2017-04-18 10:08:01,929 gl_show after  79ms took  0ms,  1 updates
2017-04-18 10:08:01,929 gtk2.GLWindowBacking(12, (460, 185), None).gl_frame_terminator()
2017-04-18 10:08:01,929 <OpenGL.platform.baseplatform.glBindFramebuffer object at 0x7fed61243fd0>(GL_FRAMEBUFFER (36160), c_uint(1L))
2017-04-18 10:08:01,929 gtk2.GLWindowBacking(12, (460, 185), None).do_present_fbo() done

(Xpra:13520): Gdk-ERROR **: The program 'Xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadMatch (invalid parameter attributes)'.
  (Details: serial 5782 error_code 8 request_code 72 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
2017-04-18 10:08:04,399 sound output stopping
Trace/breakpoint trap (core dumped)


> Can you bisect it?


I'll get on that now since this morning seems a bit slow.

Last edited 5 months ago by J. Max Mena

comment:3 Changed 5 months ago by J. Max Mena

Okay, I found a very reliable trigger:

  • Launch Firefox
  • Click on the SSL Information button (the button between Back / Forwards and the URL bar that shows you site certificate info)

That seems to reliably cause a crash here. Now, on to bisecting.

Last edited 5 months ago by Antoine Martin

comment:4 Changed 5 months ago by J. Max Mena

First things first, rebuilding the client as 1.X causes the issue to go away, so it's limited to 2.X for now.

So I've rolled the client back to trunk r15500, and I'm still getting the crash.

I'll try rolling the server back as well, just to see if it makes a difference, but I'm 75% sure the issue is limited to the client.

comment:5 Changed 5 months ago by J. Max Mena

Of interest: my client's OpenGL information reports a "VMWare, Inc" GPU with OpenGL version 2.1, the same as the server, which has no GPU. This desktop has a dedicated Nvidia 745 GPU, so I'm a little confused as to why it's listed as VMWare.

Edit:

This would probably explain the issues I'm running into. They seem to occur only on this machine: afarr's laptop, which has an Intel iGPU and runs Windows 10, works fine using the 2.1 r15608 client against a 2.1 r15664 Fedora 25 server.

Last edited 5 months ago by J. Max Mena

Changed 5 months ago by J. Max Mena

Attachment: 1500_gl_check.txt added

comment:6 Changed 5 months ago by Antoine Martin

Milestone: 2.1
Priority: changed from critical to blocker

We need to figure out if this crash is caused by:

comment:7 Changed 5 months ago by J. Max Mena

Starting with the low hanging fruit:

  • Upped client and server to trunk r15684
  • Disabling jpeg using --encodings=png,h264 still gets the crash, so it isn't the jpeg decoder

I perused #1309 and found out the fix was added in r15094. I rolled back to r15093, and the crash went away. However, upping to r15094 still doesn't cause the crash, so that isn't it.

Next up, walking through the versions you mentioned.

comment:8 Changed 5 months ago by J. Max Mena

So I still get the same crash even with r15560 - so it's been around for a while now. Not sure why I didn't catch this earlier. The fact that I get it before those changes tells me that they aren't the cause. Probably.

comment:9 Changed 5 months ago by J. Max Mena

Okay, I've tried all the revisions mentioned in comment:6, and they all have the same crash.

comment:10 Changed 5 months ago by Antoine Martin

Please try:

And bisect from there as needed. (bearing in mind you may have to force enable opengl on older revisions)
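The bisection requested here is a binary search over revision numbers. A minimal sketch follows, assuming a hypothetical `crashes(rev)` callback that builds the client at that revision, runs the Firefox trigger from comment:3 with opengl force-enabled, and reports whether the client crashed. It also assumes the behaviour changes only once between the two endpoints, which may not hold if opengl was not enabled by default on older revisions.

```python
def first_bad_revision(good, bad, crashes):
    """Return the lowest revision in (good, bad] for which crashes(rev) is True.

    `good` must be a revision known not to crash and `bad` one known to
    crash; every revision after the first bad one is assumed to crash too.
    """
    if good >= bad:
        raise ValueError("'good' must be an earlier revision than 'bad'")
    while bad - good > 1:
        mid = (good + bad) // 2
        if crashes(mid):
            bad = mid    # the regression is at or before mid
        else:
            good = mid   # the regression is after mid
    return bad


# Demonstration only: the ticket reports r15094 as not crashing and r15500 as
# crashing. The "culprit" below is a made-up value so the sketch can run; it
# is NOT a finding from this ticket.
made_up_culprit = 15300
found = first_bad_revision(15094, 15500, lambda rev: rev >= made_up_culprit)
print("first bad revision:", found)
```

Each probe halves the remaining range, so even a span of a few hundred revisions needs fewer than ten builds.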

Last edited 5 months ago by Antoine Martin

comment:11 Changed 5 months ago by J. Max Mena

Quick update on this one:

After talking about it, the going theory right now is that the Nvidia driver isn't being loaded properly, so we are falling back to software OpenGL, which is what causes the issue. That being said, the "VMWare" software renderer should still work.

comment:12 Changed 4 months ago by J. Max Mena

Update:

  • Reinstalled the Nvidia driver today (thanks for finally supporting kernel 4.10 Nvidia, it's only been two/three months) and the issue went away.

Looks like it's the software OpenGL that's the problem.

comment:13 Changed 4 months ago by Antoine Martin

Summary: changed from "Firefox and Chrome causing hard client crash in 15617 Trunk" to "Firefox and Chrome causing hard client crash"

r15811 moves "vmware" to the blacklist, so opengl won't be used by default on this buggy driver.

@maxmylyn: I think we can close this ticket?
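To illustrate the idea behind the blacklist change, here is a rough sketch of a vendor check. It is not xpra's actual code: the `GL_BLACKLIST` structure, the function names, and the meaning given to the `yes` / `no` / `auto` values are assumptions made for this example, and the vendor strings are only sample inputs.

```python
# Hypothetical blacklist: GL property name -> substrings that disable OpenGL.
GL_BLACKLIST = {"vendor": ["vmware"]}


def is_blacklisted(gl_props, blacklist=GL_BLACKLIST):
    """Return True if any GL property contains a blacklisted substring."""
    for key, patterns in blacklist.items():
        value = str(gl_props.get(key, "")).lower()
        if any(pattern.lower() in value for pattern in patterns):
            return True
    return False


def use_opengl(option, gl_props):
    """Decide whether to enable OpenGL rendering.

    Assumed semantics: "yes" forces it on, "no" forces it off, and any
    other value ("auto") follows the blacklist.
    """
    if option == "yes":
        return True
    if option == "no":
        return False
    return not is_blacklisted(gl_props)


# Sample vendor strings, for illustration only.
software_gl = {"vendor": "VMware, Inc."}
hardware_gl = {"vendor": "NVIDIA Corporation"}
print(use_opengl("auto", software_gl))
print(use_opengl("auto", hardware_gl))
```

Matching on a lowercase substring keeps the check tolerant of spelling variants such as "VMWare" and "VMware", and an explicit "yes" still allows the driver to be tested on purpose.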

comment:14 Changed 4 months ago by J. Max Mena

Resolution: fixed
Status: changed from new to closed

Agreed, closing.
