xpra icon
Bug tracker and wiki

Opened 4 weeks ago

Closed 2 weeks ago

#2481 closed defect (fixed)

Laggy unless XPRA_OPENGL_DRAW_REFRESH=0

Reported by: aerusso Owned by: aerusso
Priority: major Milestone: 4.0
Component: client Version: 3.0.x
Keywords: lag regression Cc:

Description

I appreciate there is some difficulty with drivers on Python 3, and, for some systems, refreshing more frequently solves the problem.

However, I found the recent change (3.0.2) to set the default of XPRA_OPENGL_DRAW_REFRESH=1 makes scrolling uncomfortably laggy in many programs. This is all local.

I have not experienced any issues by setting this to 0 (yet). Should I expect any strangeness if I keep this set? Is this an issue that a whitelist or blacklist could help?

Attachments (1)

xpra-opengl.out (638 bytes) - added by aerusso 4 weeks ago.
Output of xpra opengl

Download all attachments as: .zip

Change History (17)

comment:1 Changed 4 weeks ago by Antoine Martin

Owner: changed from Antoine Martin to aerusso

Please specify the exact distro you are using.
Please also include the output of xpra opengl.

Changed 4 weeks ago by aerusso

Attachment: xpra-opengl.out added

Output of xpra opengl

comment:2 Changed 4 weeks ago by aerusso

This is Debian unstable, and I've attached the output.

comment:3 Changed 3 weeks ago by Antoine Martin

Owner: changed from aerusso to Antoine Martin
Status: newassigned

The change that enabled XPRA_OPENGL_DRAW_REFRESH=1 for v3 is in r24352.
More details in #2466.
According to ticket:2466#comment:5, this also affects some Linux drivers..
So r24433 switches back to XPRA_OPENGL_DRAW_REFRESH=0 in trunk, but I am very reluctant to apply this to v3 for now, as I don't want to spend days making new release after new release just to toggle this switch back and forth when people hit issues..

I think that the slowness is caused by the multiple calls to present_fbo.
When scrolling, we're sending the window updates using many chunks. We should bundle all of them together for the refresh code path, the same way that we already do for the non-refresh paint code path.

comment:4 Changed 3 weeks ago by Antoine Martin

Owner: changed from Antoine Martin to aerusso
Status: assignednew

@aerusso: does r24434 fix things?
It is a better solution than XPRA_OPENGL_DRAW_REFRESH=0.
The only other option worth a try if you have repaint issues with this patch is XPRA_REPAINT_ALL=1.

There are beta builds with this change here: https://xpra.org/beta.

comment:5 Changed 3 weeks ago by aerusso

I am confused: testing r24434 with XPRA_OPENGL_DRAW_REFRESH=0 (which is what r24433 sets, if I'm reading things right), is excellent (I didn't actually apply that patch, I only applied r24434). Should I be testing with development tip?

With XPRA_OPENGL_DRAW_REFRESH=1, it seems better, but not as good as XPRA_OPENGL_DRAW_REFRESH=0.

This is a relatively high end machine---Ryzen 7 3700X.

comment:6 Changed 3 weeks ago by Antoine Martin

Should I be testing with development tip?

Applying r24434 should be enough.

which is what r24433 sets, if I'm reading things right

Damn, I meant to revert r24433, that's now done in r24435.

With XPRA_OPENGL_DRAW_REFRESH=1, it seems better, but not as good as XPRA_OPENGL_DRAW_REFRESH=0.

Better than what?

I think r24434 is what I am going to backport to v3, it's the safest solution: we always request a repaint, but doing it efficiently.
I just need to re-test with wayland first, because GTK does weird things with coordinates because of CSD.

comment:7 Changed 3 weeks ago by aerusso

Sorry for the confusion. From best to worse, in terms of lag:

  1. XPRA_OPENGL_DRAW_REFRESH=0, with or without r24434 (I can't tell if there is any difference, though I don't really see how there could be)
  2. XPRA_OPENGL_DRAW_REFRESH=1, with r24434
  3. XPRA_OPENGL_DRAW_REFRESH=1, without r24434

Is this a bug in the Python 3 implementation of GTK3? It seems like, at least most of the time on X, a refresh is not needed.

By the way, coalescing all the refreshes into a single one helps the lag enough that I wouldn't have noticed (at least on my machine):

    def after_draw_refresh(self, success, message=""):
        plog("after_draw_refresh(%s, %s) pending_refresh=%s",
             success, message, self.pending_refresh)
        backing = self._backing
        if not backing:   
            return                  
        if REPAINT_ALL or self._client.xscale!=1 or self._client.yscale!=1 or is_Wayland():
            #easy: just repaint the whole window:
            rw, rh = self.get_size()
            self.idle_add(self.queue_draw_area, 0, 0, rw, rh)
            return           
        pr = self.pending_refresh
        self.pending_refresh = []
        bx, by, bw, bh = None, None, None, None
        for x, y, w, h in pr:
            if bx is None:
                bx, by, bw, bh = x, y, w, h
                continue
            if x < bx:
                bw += bx - x
                bx = x 
            if y < by:
                bh += by - y
                by = y
            if w > bw:
                bw = w
            if h > bh:                                             
                bh = h
 
        rx, ry, rw, rh = self._client.srect(bx, by, bw, bh)
        #if self.window_offset:
        #    rx -= self.window_offset[0]
        #    ry -= self.window_offset[1]
        self.idle_add(self.queue_draw_area, rx, ry, rw, rh)

I don't do very much graphics, so I don't know if this will break something somewhere else.

comment:8 Changed 3 weeks ago by Antoine Martin

XPRA_OPENGL_DRAW_REFRESH=0, with or without r24434 (I can't tell if there is any difference, though I don't really see how there could be)

With r24434, when draw refresh is enabled =1 and with opengl, we don't paint the fbo on-screen until all the screen updates in the same group have been received. This should be faster for when many screen updates arrive in the same group (ie: when scrolling).

By the way, coalescing all the refreshes into a single one helps the lag enough that I wouldn't have noticed (at least on my machine):

I assume you mean without r24434?
Or is it better than r24434 with draw refresh enabled? (which is the new default)

I don't do very much graphics, so I don't know if this will break something somewhere else.

I don't think the code is correct, we already have a cython accelerated rectangle merging function in xpra: browser/xpra/trunk/src/xpra/rectangle.pyx (with unit tests).
In some cases, it is best not to coalesce: imagine a window updating a single pixel in its 4 corners, we would ask the graphics system to repaint all of it. Seems like a contrived example but similar behaviour happens more than you would think because of the way applications decorate their windows.
It's something I have looked into in the past, and it can be costly, especially when opengl is disabled (high CPU cost).
We could still merge something like this, as long as it isn't the default.

comment:9 Changed 3 weeks ago by Antoine Martin

The backport to v3 is in: r24441.
Will test some more.

comment:10 Changed 3 weeks ago by aerusso

Or is it better than r24434 with draw refresh enabled? (which is the new default)

Yes, for the workload I tested, draw refresh enabled and merging all of the rectangles is best. It winds up essentially being equivalent to XPRA_REPAINT_ALL=1.

The workload is Google Chrome, when scrolling. For example, here is a typical pending_requests list

[(0, 0, 1500, 846), (0, 119, 1500, 4), (0, 564, 1500, 16), (0, 601, 1500, 28), (0, 648, 1500, 4), (0, 697, 1500, 16), (0, 740, 1500, 2), (0, 780, 1500, 10), (0, 798, 1500, 13), (0, 817, 1500, 10)]

It's really bizarre to me that updating 10 regions is actually something that I can "feel". Now that I'm digging into queue_draw_area, I suspect that there's some slow code path there, maybe getting and initializing the gl context? I didn't look very hard, but gl_init looks like it could be slow.

I think the ideal solution probably renames queue_draw_area to queue_draw_areas and makes it take a list of rectangular regions that all get updated using the same gl context.

I would guess that it only makes sense to consider merging rectangles after collecting the refreshes within a single gl context creation.

comment:11 Changed 3 weeks ago by Antoine Martin

Owner: changed from aerusso to Antoine Martin
Status: newassigned

I didn't look very hard, but gl_init looks like it could be slow.

Of course, you're right.

comment:12 Changed 3 weeks ago by Antoine Martin

The slowdown comes from calling gl_show which ends up calling swap_buffers().
The other thing I had forgotten about is that we already do a full window redraw when double-buffered (which is everywhere):

        if self.is_double_buffered() or bw!=ww or bh!=wh:
            #refresh the whole window:
            rectangles = ((0, 0, bw, bh), )
        else:
            #paint just the rectangles we have accumulated:
            rectangles = self.pending_fbo_paint

Now, we don't want to penalize the non-opengl case, so maybe the code just needs to be delegated to the backend class so opengl can take the shortcut.

comment:13 Changed 3 weeks ago by Antoine Martin

Owner: changed from Antoine Martin to aerusso
Status: assignednew

@aerusso: how about r24460? (this solution may help with #2373)

This does a single queue_draw_area for the full window when using the opengl backend whilst preserving the multiple-regions code for the non-opengl case. (tested briefly)

Last edited 3 weeks ago by Antoine Martin (previous) (diff)

comment:14 Changed 3 weeks ago by Antoine Martin

This may have caused a regression on macos: #2491.

comment:15 Changed 2 weeks ago by aerusso

I tested with Debian unstable's version and added r24434, r24435, and r24460. I cannot "feel" any lag with these patches.

Will double buffering always be enabled? If not, I think there may still be value in collecting the multiple calls to queue_draw_area into a single gl context.

Also: thanks for xpra! It's really a fantastic piece of software!

comment:16 Changed 2 weeks ago by Antoine Martin

Resolution: fixed
Status: newclosed

r24460 has been backported to the v3 branch in r24483.

Since this fixes the symptoms, I am going to close this ticket. We can deal with regressions and bugs in other tickets and link back here if needed.

Note: See TracTickets for help on using tickets.