Bug tracker and wiki

This bug tracker and wiki are being discontinued;
please use https://github.com/Xpra-org/xpra instead.

Version 19 (modified by Nick Centanni, 7 years ago)



The current milestone can be found on the roadmap page.

Generic Regression Testing Issues

  • test every xpra sub-command (upgrade, info, proxy, etc)
  • Python versions: as of version 0.15, we only support Python 2.6 and 2.7 (i.e. CentOS 5.x for older versions of Python). Code that is not backwards compatible can often make it into the repository and should be fixed before major releases; simply compile-testing it is often enough to spot such problems. Other issues may affect packaging (e.g. #116) or be more complex (r2683), which means testing beta package builds; still other bugs can be more difficult to identify and even more difficult to fix (e.g. #251, #215). We also want to check that the Python 3.x version can be built via the python3-build script (compile-tested only), though actually using or testing it is not required at present since it is not officially supported.
  • gtk/pygtk versions: as with Python, older versions (e.g. 2.17 and older) can cause problems, see: r2863, r2706, r2705, r1498, r555, r554.
  • client applications: it is important to test a wide range of client applications, built with a wide variety of UI toolkits and languages: gtk, qt/kde, wx, Java (see #162), etc. Each can uncover subtle bugs. Then there are specific applications that are known to cause problems because of the way they interact with the X11 server: wine applications, VMWare (#199), Firefox (#220, #158, #96), gtkperf -a (very good at triggering races in window setup/cleanup), etc. Also, newer versions of specific target applications may change their behaviour in ways which have a significant impact on xpra compression/quality.
  • backwards compatibility with old versions: we try to keep backwards compatibility with older versions as much as possible, though some features may not be available. Occasionally we will drop compatibility (e.g. #57) to allow the code to move on from old crufty workarounds. At present, all versions from 0.7.8 onwards should be able to connect, both as client and server.
  • unusual setups: although these may not be optimal, people still expect them to work - and they should! Again, the errors this uncovers may well help in other areas. Examples: running xpra nested (#210), running xpra from "ssh -X" / "ssh -Y" (#207, #3).
  • platform specific quirks: OSX problems (#249); platforms with static builds of the x264 and vpx libraries, or those where the dynamic libraries are bundled in a binary image (#103): MS Windows, OSX, CentOS 5.x, CentOS 6.x, Debian Squeeze, Ubuntu Lucid.
  • desktop environments: each DE may handle things slightly differently and uncover bugs, especially when it comes to window placement, resizing, minimizing, etc. Obviously we want to test the major DEs (gnome/cinnamon, KDE, LXDE, XFCE) but it may be worth testing some of the more unusual window managers too (fluxbox, window maker, etc)
  • binary builds with library updates (OSX and MS Windows), in particular: gtk-osx updates and rebuilds, pycrypto, gstreamer, pywin32, etc..
  • installations vs upgrades: sometimes this makes a difference, e.g. if the OS decides to keep the old default configuration in place.
  • memory leaks (not spotted by automated tests, as these do not run for long enough to trigger the problem)

Specific Testing Combinations

The release notes should help in figuring out what has changed and therefore what is likely to require more thorough testing. As the number of items listed above shows, testing every combination is simply impossible. Here are some of the most common setups, and those that are most likely to uncover compatibility issues. Ideally, all of these should be tested before major releases.

  • All MS Windows clients (from XP to 8) with CentOS/RedHat 6.x and 7.x servers
  • OSX clients
  • xpra 0.7.x client and servers
  • CentOS 5.x/6.x clients with both old servers (CentOS 5.x/6.x) and new ones (Fedora 18+ or Debian sid)
  • Debian Squeeze or Ubuntu Lucid packages.

Automated performance and regression testing

The xpra.test_measure_perf script can be used to run a variety of client applications within an xpra session (or optionally vnc) and gather statistics on how well the encoding performed. The data is printed out at the end in CSV format, which you can then import into any tool to compare results (some example charts have been generated using sofastats). There is also a facility for generating charts directly from the CSV data, using a script that we now provide, described below.
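As a minimal sketch of importing the CSV output for comparison, the snippet below averages one numeric column from a results file. The column name "Frames/s" is an assumption for illustration; check the header row of your own CSV output for the actual names:

```python
import csv
from statistics import mean

def mean_metric(csv_path, metric_col):
    """Mean of one numeric column from a test_measure_perf CSV file.

    The column name is hypothetical -- inspect the CSV header
    produced by your version of the script for the real names.
    """
    with open(csv_path, newline="") as f:
        values = [float(r[metric_col]) for r in csv.DictReader(f)
                  if r.get(metric_col, "").strip()]
    return mean(values)

# Compare the same metric across two runs, e.g.:
# mean_metric("all_tests_40_14_1.csv", "Frames/s")
# mean_metric("all_tests_40_15_1.csv", "Frames/s")
```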

It can also be useful to redirect the test's output to a log file to verify that none of the tests failed with exceptions or errors (by looking for exception messages in the log afterwards).
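Scanning the log afterwards could be done along these lines; the failure patterns are just plausible defaults, adjust them to whatever your test runs actually emit:

```python
import re

def find_failures(log_path, patterns=("Traceback", "Exception", "Error")):
    """Return (line_number, line) pairs for log lines matching any
    failure pattern (case-insensitive). The default patterns are
    guesses -- tune them to the messages your tests produce."""
    rx = re.compile("|".join(patterns), re.IGNORECASE)
    hits = []
    with open(log_path) as f:
        for n, line in enumerate(f, 1):
            if rx.search(line):
                hits.append((n, line.rstrip()))
    return hits
```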

At the moment the script does not have a command line interface, and all the options have to be edited directly in the source. However, we have improved that process by splitting the configuration data out into a separate file: xpra.perf_config_default.py.

Note: to take advantage of iptables packet accounting (mostly for comparing with VNC, which does not provide this metric), follow the error message and set up iptables rules to match the port being used in the tests. By default:

iptables -I INPUT -p tcp --dport 10000 -j ACCEPT
iptables -I OUTPUT -p tcp --sport 10000 -j ACCEPT

The obvious benchmarking caveats apply:

  • make sure there are no other applications running
  • disable cron jobs or any other scheduled work (systemd makes this a little harder)
  • etc..

To create multiple output files which can be used to generate charts with xpra.test_measure_perf_charts:

  • Determine the values of the following variables: prefix (a string to identify the data set); id (a string to identify the variable that the data set is testing, for example '14' because we're testing xpra v14 in this data set); repetitions (the number of times you want to run the tests).
  • The data files you produce will then be named in the format: prefix_id_rep.csv.
  • With this information in hand you can now create a script that will run the tests.

For example:

./test_measure_perf.py all_tests_40 ./data/all_tests_40_14_1.csv 1 14 > ./data/all_tests_40_14_1.log
./test_measure_perf.py all_tests_40 ./data/all_tests_40_14_2.csv 2 14 > ./data/all_tests_40_14_2.log

In the above example, test_measure_perf is run twice, using a config class named "all_tests_40.py" and writing the results to data files with the prefix "all_tests_40", for version 14. The additional arguments "1 14" and "2 14" are custom parameters which will be written to the "Custom Params" column in the corresponding data files. The "1" and "2", in the file names and in the parameters, refer to the corresponding repetition of the tests.
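For more repetitions, the command lines above can be generated rather than written by hand. A small helper (hypothetical, not part of xpra) following the prefix_id_rep.csv naming scheme described above might look like:

```python
def perf_commands(config, prefix, version_id, repetitions, data_dir="./data"):
    """Build one test_measure_perf command line per repetition,
    naming the output files prefix_id_rep.csv as described above.

    This helper is illustrative only; it simply reproduces the
    command shape shown in the example.
    """
    cmds = []
    for rep in range(1, repetitions + 1):
        base = "%s/%s_%s_%s" % (data_dir, prefix, version_id, rep)
        cmds.append("./test_measure_perf.py %s %s.csv %s %s > %s.log"
                    % (config, base, rep, version_id, base))
    return cmds
```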


Misleading Statistics

One has to be very careful when interpreting performance results. Here are some examples of misleading statistics:

  • higher CPU usage is not necessarily a bad thing if the framerate has increased
  • lower average latency is good, but not if the maximum latency has gone up
  • when averaging samples, all the tests must have completed (if one test run was cut short, it may be skewed)
  • averaging between throttled and non-throttled test samples is not particularly useful
  • averaging between many different test commands is not particularly useful, especially:
  • gtkperf tests really skew the results and should generally be excluded from sample data (or not run at all)
  • the automated tests may run the "vpx" and "x264" encodings many more times because of the many extra options these encodings support (quality, opengl, etc.), giving more numerous and more varied sample data; this cannot be compared directly with other encodings without some initial filtering.
  • when measuring the performance of video encoders, it is important to take into account both the bandwidth used (mostly tied to the "speed" option) and the quality of the output (mostly tied to the "quality" option, but also heavily influenced by the amount of lossless regions sent)
  • etc..
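The filtering caveats above (exclude gtkperf samples, average within one encoding only) can be sketched as a small grouping step over the parsed CSV rows. The column names "Command", "Encoding" and "Regions/s" are assumptions for illustration; substitute the actual headers from your test_measure_perf output:

```python
import csv
from collections import defaultdict
from statistics import mean

def per_encoding_average(rows, metric_col="Regions/s",
                         encoding_col="Encoding", command_col="Command",
                         exclude=("gtkperf",)):
    """Average one metric separately per encoding, skipping rows whose
    command matches an excluded pattern (e.g. gtkperf, which skews
    results). Column names are hypothetical -- match them to the CSV
    header produced by your version of the script.

    rows: an iterable of dicts, e.g. from csv.DictReader."""
    groups = defaultdict(list)
    for row in rows:
        if any(x in row.get(command_col, "") for x in exclude):
            continue  # drop known result-skewing commands
        try:
            groups[row[encoding_col]].append(float(row[metric_col]))
        except (KeyError, ValueError):
            continue  # skip rows with missing or non-numeric data
    return {enc: mean(v) for enc, v in groups.items()}
```

Averaging within each encoding first, rather than across all samples, avoids the over-weighting of vpx/x264 described above.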