Opened 10 years ago

Closed 9 years ago

Last modified 10 months ago

#132 closed defect (fixed)

idle_add is a latency bottleneck in damage_to_data

Reported by: Antoine Martin
Owned by: Antoine Martin
Priority: major
Milestone: 0.4
Component: server
Version: 0.3.x
Keywords:
Cc:

Description (last modified by Antoine Martin)

When we get many small areas in damage requests, we may end up scheduling hundreds of calls to idle_add from damage_to_data, which causes the latency to shoot right up.

Solutions:

  • Run the whole damage_to_data in one idle_add call - or even run it directly from damage (via damage_now or send_delayed). This would reduce the number of idle_add calls at the cost of making one longer idle_add call...
  • Keep track of how many idle_add calls are pending and use this measure to update the batching delay (though this may still not allow the many small regions to be coalesced) - see the sketch below.
  • Prefer full frame refreshes over many small regions by increasing packet_cost and/or decreasing pixels_threshold

Or a combination of the above? Or??
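
Here is a minimal sketch of the second option (counting pending idle_add calls and using that count to scale the batch delay), assuming a GLib main loop driven via PyGObject; the PendingCounter class, its methods and its default values are hypothetical and not part of the xpra code:

    from gi.repository import GLib


    class PendingCounter:
        """Counts idle_add callbacks that have been scheduled but not yet run."""

        def __init__(self, base_delay_ms=5, max_delay_ms=100):
            self.pending = 0
            self.base_delay_ms = base_delay_ms
            self.max_delay_ms = max_delay_ms

        def schedule(self, callback, *args):
            # wrap the callback so we know when it has actually run
            self.pending += 1
            def run():
                try:
                    callback(*args)
                finally:
                    self.pending -= 1
                return False        # one-shot: do not reschedule
            GLib.idle_add(run)

        def batch_delay_ms(self):
            # the more callbacks are still queued, the longer we batch damage events
            return min(self.max_delay_ms, self.base_delay_ms * (1 + self.pending))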

Once done, we can deal with #135

Attachments (3)

xpra-unthreadeddamage.patch (15.5 KB) - added by Antoine Martin 10 years ago.
removes threading from damage codepath
xpra-unthreadeddamage-addlatencymeasurements.patch (23.3 KB) - added by Antoine Martin 10 years ago.
updated patch which dumps the damage latency at various points in the pipeline
xpra-protocol-addlatencymeasurements2.patch (3.5 KB) - added by Antoine Martin 10 years ago.
patch for protocol.py so the latency reported is when the packet has been sent by the network layer


Change History (10)

comment:1 Changed 10 years ago by Antoine Martin

Description: modified (diff)
Status: new → accepted

comment:2 Changed 10 years ago by Antoine Martin

Using this simple patch:

Index: src/xpra/server.py
===================================================================
--- src/xpra/server.py	(revision 883)
+++ src/xpra/server.py	(working copy)
@@ -539,9 +540,10 @@
             except Exception, e:
                 log.error("damage_to_data: error processing region %s: %s", damage, e)
                 continue
-            gobject.idle_add(self._process_damage_regions, wid, window, ww, wh, regions, coding, sequence, options)
+            gobject.idle_add(self._process_damage_regions, time.time(), wid, window, ww, wh, regions, coding, sequence, options)
 
-    def _process_damage_regions(self, wid, window, ww, wh, regions, coding, sequence, options):
+    def _process_damage_regions(self, req_time, wid, window, ww, wh, regions, coding, sequence, options):
+        log.info("process_damage_regions: idle_add latency=%s", int(1000*1000*(time.time()-req_time)))
         if self._damage_cancelled.get(wid, 0)>=sequence:
             log("process_damage_regions: dropping damage request with sequence=%s", sequence)
             return

And running the xterm performance test, we can clearly see the idle_add latency shooting right up (the values are in microseconds):

process_damage_regions: idle_add latency=39
process_damage_regions: idle_add latency=601
process_damage_regions: idle_add latency=793
process_damage_regions: idle_add latency=1873
process_damage_regions: idle_add latency=8841
process_damage_regions: idle_add latency=2904
process_damage_regions: idle_add latency=3646
process_damage_regions: idle_add latency=4523
process_damage_regions: idle_add latency=4644
process_damage_regions: idle_add latency=5743
Uh-oh, our size doesn't fit window sizing constraints: 979x633 vs 979x628
Uh-oh, our size doesn't fit window sizing constraints: 979x633 vs 979x628
unknown modifier: meta
process_damage_regions: idle_add latency=135900
process_damage_regions: idle_add latency=135923
process_damage_regions: idle_add latency=135952
process_damage_regions: idle_add latency=106001
process_damage_regions: idle_add latency=289604
(...)
process_damage_regions: idle_add latency=1162372
process_damage_regions: idle_add latency=1150017
process_damage_regions: idle_add latency=124438
process_damage_regions: idle_add latency=93
process_damage_regions: idle_add latency=28
process_damage_regions: idle_add latency=41
process_damage_regions: idle_add latency=46
process_damage_regions: idle_add latency=59

Then it suddenly drops back down... now I need to figure out why!
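
For reference, the effect can be reproduced outside of xpra with a short standalone script - a hedged sketch using PyGObject's GLib main loop, with none of the names taken from the xpra code. Scheduling many idle_add callbacks in one go makes each of them wait longer before it gets to run:

    import time
    from gi.repository import GLib

    N = 500
    loop = GLib.MainLoop()
    completed = [0]

    def make_callback(req_time):
        def callback():
            latency_us = int(1000 * 1000 * (time.time() - req_time))
            print("idle_add latency=%i us" % latency_us)
            completed[0] += 1
            if completed[0] == N:
                loop.quit()
            return False            # one-shot callback
        return callback

    # schedule N callbacks before the main loop gets a chance to run any of them
    for _ in range(N):
        GLib.idle_add(make_callback(time.time()))

    loop.run()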

comment:3 Changed 10 years ago by Antoine Martin

A simple:

time.sleep(0)

added in r884 ensures we don't queue too many idle_add calls before giving them a chance to run.

A better solution is probably to remove the threading from damage_to_data and to keep track of the idle_add latency as a measure of how smoothly the system is running, batching more when it gets too high (see the sketch below).
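
A sketch of that feedback loop, again assuming a GLib main loop; the BatchDelay class, its thresholds and its defaults are all hypothetical:

    import time
    from gi.repository import GLib


    class BatchDelay:
        """Adjusts the damage batch delay based on measured idle_add latency."""

        def __init__(self, min_ms=5, max_ms=100, threshold_us=2000):
            self.delay_ms = min_ms
            self.min_ms = min_ms
            self.max_ms = max_ms
            self.threshold_us = threshold_us    # idle_add latency considered "too high"

        def schedule(self, work, *args):
            req_time = time.time()
            def run():
                latency_us = int(1000 * 1000 * (time.time() - req_time))
                if latency_us > self.threshold_us:
                    # the main loop is congested: batch more aggressively
                    self.delay_ms = min(self.max_ms, self.delay_ms * 2)
                else:
                    # the main loop is keeping up: relax back towards the minimum
                    self.delay_ms = max(self.min_ms, self.delay_ms - 1)
                work(*args)
                return False
            GLib.idle_add(run)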

Changed 10 years ago by Antoine Martin

Attachment: xpra-unthreadeddamage.patch added

removes threading from damage codepath

Changed 10 years ago by Antoine Martin

Attachment: xpra-unthreadeddamage-addlatencymeasurements.patch added

updated patch which dumps the damage latency at various points in the pipeline

comment:4 Changed 10 years ago by Antoine Martin

With this code in _get_rgb_rawdata:

    start = time.time()    # start of the timed section
    pixbuf = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, width, height)
    log.info("get_rgb_rawdata(..) creating pixbuf object took %s ms", int(1000*(time.time()-start)))
    pixbuf.get_from_drawable(pixmap, pixmap.get_colormap(), x, y, 0, 0, width, height)
    log.info("get_rgb_rawdata(..) pixbuf.get_from_drawable took %s ms", int(1000*(time.time()-start)))
    raw_data = pixbuf.get_pixels()
    log.info("get_rgb_rawdata(..) pixbuf.get_pixels took %s ms", int(1000*(time.time()-start)))
    rowstride = pixbuf.get_rowstride()
    log.info("get_rgb_rawdata(..) took %s ms", int(1000*(time.time()-start)))

We can see:

get_rgb_rawdata(..) creating pixbuf object took 0 ms
get_rgb_rawdata(..) pixbuf.get_from_drawable took 182 ms
get_rgb_rawdata(..) pixbuf.get_pixels took 183 ms
get_rgb_rawdata(..) took 184 ms

So the expensive call is pixbuf.get_from_drawable() in gtk.

Solutions:

  • use the X11 pixmap directly/unwrapped (tricky? - see the sketch below)
  • ???
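
A rough sketch of the first option - reading the pixels straight from the X server with python-xlib instead of going through pixbuf.get_from_drawable(). It grabs the root window purely for illustration; in xpra the drawable would be the window's backing pixmap, and all the variable names here are made up:

    from Xlib import X, display

    disp = display.Display()
    root = disp.screen().root
    geom = root.get_geometry()
    # ZPixmap returns the pixels packed as the server stores them;
    # the actual byte layout depends on the visual's depth and byte order
    ximage = root.get_image(0, 0, geom.width, geom.height, X.ZPixmap, 0xffffffff)
    raw_data = ximage.data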
Last edited 10 years ago by Antoine Martin

Changed 10 years ago by Antoine Martin

Attachment: xpra-protocol-addlatencymeasurements2.patch added

patch for protocol.py so the latency reported is when the packet has been sent by the network layer
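
The idea behind that patch, sketched in isolation (send_packet and damage_time are hypothetical names, not the actual protocol.py API): only compute the latency once the blocking socket write has returned, so the reported figure includes the time spent handing the packet to the network layer.

    import time

    def send_packet(sock, data, damage_time):
        # sendall() only returns once the kernel has accepted every byte
        sock.sendall(data)
        latency_ms = int(1000 * (time.time() - damage_time))
        print("damage latency including network send: %i ms" % latency_ms)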

comment:5 Changed 10 years ago by Antoine Martin

r886 commits the unthreaded damage code and adds the latency measurement via "xpra info"; see the changeset commit message for details.

comment:6 Changed 9 years ago by Antoine Martin

Resolution: fixed
Status: accepted → closed

Good enough in trunk with the latency minimization code from #153.

comment:7 Changed 10 months ago by migration script

This ticket has been moved to: https://github.com/Xpra-org/xpra/issues/132
