
Version 13 (modified by Antoine Martin, 7 years ago) (diff)


Testing

The current milestone can be found on the roadmap page.

Generic Regression Testing Issues

  • Python versions: we support Python 2.4 to 2.7 (ie: CentOS 5.x for the oldest versions of Python). Code that is not backwards compatible can easily make it into the repository and should be fixed before major releases. Syntax problems (ie: r2906, r2901, r2885, r2839, r2748, r2746 + r2747, r2710, r2703, r2704, r2705, r2616, r2615, r2608, r1280, r991, r990) can usually be spotted simply by compile-testing; other issues may affect packaging (ie: #116) or be more complex (r2683), which means testing beta package builds; yet others can be more difficult to identify and even more difficult to fix (ie: #251, #215). We also want to check that the Python 3.x version can be built via the python3-build script (compile-tested only), though actually using or testing it is not required at present since it isn't officially supported.
  • gtk/pygtk versions: as with Python, older versions (ie: 2.17 and older) can cause problems, see: r2863, r2706, r2705, r1498, r555, r554.
  • client applications: it is important to test a wide range of client applications, written with a wide variety of UI toolkits and languages: gtk, qt/kde, wx, Java (see #162), etc. Each can uncover subtle bugs. Then there are specific applications that are known to cause problems because of the way they interact with the X11 server: wine applications, VMWare (#199), Firefox (#220, #158, #96); gtkperf -a is very good at triggering races in window setup/cleanup. Also, newer versions of a given target application may change its behaviour in ways which have a significant impact on xpra compression/quality.
  • backwards compatibility with old versions: we try to keep backwards compatibility with older versions as much as possible, though some features may not be available. Occasionally we will drop compatibility (ie: #57) to let the code move on from old, crufty workarounds. At present, all versions from 0.3.11 onwards should be able to connect, both as client and server; looking forward, only v0.5 onwards will be supported.
  • unusual setups: although these may not be optimal, people still expect them to work - and they should! The errors they uncover may well help in other areas too. Examples: running xpra nested (#210), or running xpra from "ssh -X" / "ssh -Y" (#207, #3).
  • platform specific quirks: OSX problems (#249), and platforms with static builds of the x264 and vpx libraries or where the dynamic libraries are bundled in a binary image (#103): MS Windows, OSX, CentOS 5.x, CentOS 6.x, Debian Squeeze, Ubuntu Lucid.
  • desktop environments: each DE may handle things slightly differently and uncover bugs, especially when it comes to window placement, resizing, minimizing, etc. Obviously we want to test the major DEs (gnome/cinnamon, KDE, LXDE, XFCE) but it may be worth testing some of the more unusual window managers too (fluxbox, window maker, etc)
  • binary builds with library updates (OSX and MS Windows), in particular: gtk-osx updates and rebuilds, pycrypto, gstreamer, pywin32, etc.
  • fresh installations vs upgrades: this sometimes makes a difference if the OS decides to keep the old default configuration in place.
  • memory leaks (not spotted by automated tests as these do not run for long enough to trigger the problem)
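The syntax regressions mentioned in the first point can usually be caught by byte-compiling the whole tree under each supported interpreter. A minimal sketch (the directory argument and the helper name are illustrative, not part of the project):

```python
# Sketch: catch syntax errors that would break older interpreters
# by byte-compiling every .py file in a directory tree.
# Run this under each supported Python version in turn.
import compileall

def compile_check(source_dir):
    """Return a true value if every .py file under source_dir
    byte-compiles with the interpreter running this function."""
    # quiet=1 suppresses the per-file listing but still reports errors
    return compileall.compile_dir(source_dir, quiet=1)
```

Running the same check under each interpreter in turn (e.g. the oldest and newest supported Python) is often enough to spot the kind of syntax regressions listed above.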

Specific Testing Combinations

The release notes should help in figuring out what has changed and therefore what is likely to require more thorough testing. As the number of items listed above shows, testing every combination is simply impossible. Here are some of the most common setups, along with those most likely to uncover compatibility issues; ideally, all of them should be tested before major releases.

  • All MS Windows clients (from XP to 8) with CentOS/RedHat 5.x and 6.x servers
  • OSX clients
  • xpra 0.5.x client and servers
  • CentOS 5.x clients with both old servers (CentOS 5.x) and new ones (Fedora 18+ or Debian sid)
  • Debian Squeeze or Ubuntu Lucid packages.

Automated performance and regression testing

The xpra.test_measure_perf script can be used to run a variety of client applications within an xpra session (or optionally vnc) and gather statistics on how well the encoding performed. The data is printed out at the end in CSV format, which can then be imported into any tool to compare results - you can find some examples generated using sofastats here and here. It is also useful to redirect the test's output to a log file, then check it for exception/error messages to verify that none of the tests failed. At the moment the script does not have a command line interface: all the options have to be edited directly in the source file.
Note: to take advantage of iptables packet accounting (mostly for comparing with VNC, which does not provide this metric), follow the error message and set up iptables rules matching the port used in the tests, ie: by default:

iptables -I INPUT -p tcp --dport 10000 -j ACCEPT
iptables -I OUTPUT -p tcp --sport 10000 -j ACCEPT
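The log check suggested above can be automated with something along these lines; the patterns matched and the helper name are assumptions, not part of the script itself:

```python
# Sketch: scan a test_measure_perf log file for runs that failed
# with exceptions or errors.  The patterns below are assumptions.
import re

def find_failures(log_path, patterns=("Traceback", "Exception", "Error")):
    """Return the log lines that look like test failures."""
    regex = re.compile("|".join(patterns))
    failures = []
    for line in open(log_path):
        if regex.search(line):
            failures.append(line.rstrip("\n"))
    return failures
```

An empty result means no test run left an obvious error trace behind; a non-empty one points at the runs whose samples should be discarded.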

The obvious benchmarking caveats apply:

  • make sure there are no other applications running
  • disable cron jobs or any other scheduled work (systemd makes this a little harder)
  • etc.


Misleading Statistics

One has to be very careful when interpreting performance results. Here are some examples of misleading statistics:

  • higher CPU usage is not necessarily a bad thing if the framerate has increased
  • lower average latency is good, but not if the maximum latency has gone up
  • when averaging samples, all the test runs must have completed (a run that was cut short will skew the average)
  • averaging between throttled and non-throttled test samples is not particularly useful
  • averaging between many different test commands is not particularly useful, especially:
  • gtkperf tests really skew the results and should generally be excluded from sample data (or not run at all)
  • the automated tests may run the "vpx" and "x264" encodings many more times because of the extra options these encodings support (quality, opengl, etc.), producing more sample data, and more varied data; this cannot be compared directly with other encodings without some initial filtering
  • etc.
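The last two points can be illustrated numerically. The frame-rate figures below are made up, but they show how pooling samples across encodings skews a naive average towards whichever encoding ran the most tests:

```python
# Sketch: why averaging samples across different encodings misleads.
# The frame-rate samples below are invented for illustration only.
samples = {
    "png":  [20.0, 21.0],                        # few test runs
    "x264": [30.0, 32.0, 5.0, 6.0, 31.0, 7.0],   # many runs, varied options
}

def pooled_mean(data):
    """Naive mean over all runs: dominated by the encoding with most samples."""
    values = [v for runs in data.values() for v in runs]
    return sum(values) / len(values)

def per_encoding_means(data):
    """Average each encoding separately before comparing."""
    return dict((enc, sum(runs) / len(runs)) for enc, runs in data.items())
```

Here the pooled mean sits closer to the x264 figures simply because x264 contributed three times as many samples; averaging per encoding first (or filtering the samples down to comparable runs) avoids that bias.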