clang (http://clang.llvm.org/) is an alternative to GCC.
It is possible to build the Cython modules with the attached script (based on this filter). To use it:
CC=clang-wrapper setup.py build
It would be interesting to run the automated tests and see if we get any difference in performance. This should probably be tested with something that exercises lots of instantiations rather than x264 encoding, as x264 will still use the same binary code. That said, it would be worth trying to build x264 and other libs with clang too...
wrapper script which strips arguments that clang chokes on
test data output
one of the most telling graphs I generated
gcc beats clang in most cases. It looks like clang adds an overhead to even the most basic case (mmap). CPU utilization is about the same, but gcc encodes more pixels/s, sends more packets, etc.
I did find one case where clang does better than gcc: it seems that clang compresses png better, at least for the "glxgears" and "glxspheres" tests. No idea why, as we don't rebuild the libpng library with clang, so it should end up being more or less the same. Odd!
We may want to revisit this later, once clang matures a bit more and once projects start making more of an effort to optimize for it. At present, x264 is always faster with gcc it seems.
I've re-run the clang vs gcc tests on a more massive scale: 40 seconds per test, with the suite run 9 times for each variable and each cell averaged across the 9 reps.
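For anyone who wants to reproduce the per-cell averaging, here is a minimal sketch of the idea. It is not taken from the performance script: the function name, the (test, metric, value) row layout and the sample values are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_reps(rows):
    """Average repeated measurements per (test, metric) cell.

    rows is an iterable of (test_name, metric, value) tuples, one per
    repetition. This layout is an assumption made for this sketch.
    """
    cells = defaultdict(list)
    for test_name, metric, value in rows:
        cells[(test_name, metric)].append(value)
    # One averaged value per cell, whatever the number of repetitions.
    return {key: mean(values) for key, values in cells.items()}


# Placeholder values only, not real measurements.
sample = [
    ("glxgears", "pixels_per_s", 10.0),
    ("glxgears", "pixels_per_s", 20.0),
    ("glxgears", "pixels_per_s", 30.0),
]
print(average_reps(sample))
```

With 9 repetitions per cell the same function applies unchanged, since it groups by cell before averaging.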
I've attached the results as an archive. After extracting the archive, view the HTML file.
Alternatively, look at how the python script is used, and change which combinations of metrics, encodings, and apps are graphed.
After making changes, run the python script with no arguments and a new HTML file will be written.
I've reopened the tickets because I think the results should be examined to make sure the conclusions we reached last time are still reflected in the results.
Chart generator (beta) for examining performance results.
Thanks, that chart generator is great - much easier to use than sofastats which is a PITA to install. Can you submit it or commit it to svn? And maybe edit the wiki to refer to it?
As for clang vs gcc: the gcc win is confirmed.
I've added a section to the wiki at https://www.xpra.org/trac/wiki/Testing documenting the changes to the performance script, as well as instructions for generating charts from the data using test_measure_perf_charts.py.
updated wrapper - works with Fedora 23
Related:
Maybe we should just optimize for size! Or for Xeons? or something.
Or maybe experiment with the -falign-* flags. The discussion in that second link seemed to suggest that setting alignment flags produces the equivalent speed improvement that you get when optimizing for size.
Clang is getting more competitive, apparently: Intel Xeon Skylake Compilers: Clang Showing Strong Performance Against GCC. No x264 data, only LAME. Not quite there yet IMO.
On Fedora 26 (clang 4), the clang wrapper script is no longer needed and with Cython 0.26 we can now un-disable more warnings: r16464.
This is just one test, and with a specific CPU, but it seems that clang could give us better FPS with x264 by about 5%: Ryzen Compiler Performance: Clang 4/5 vs. GCC 6/7/8 Benchmarks.
updated wrapper for clang-6.0
Building with clang-6.0 (ie: Fedora 28) hits this error:
clang-6.0: error: unknown argument: '-mcet'
clang-6.0: error: unknown argument: '-fcf-protection'
To fix this, use r19079 and the updated wrapper.
With Fedora 29 and clang 7.0.1, add -fstack-clash-protection to the list.
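For reference, the filtering idea behind such a wrapper can be sketched roughly as below. This is not the script attached to this ticket: the names UNSUPPORTED, filter_args and run_clang are hypothetical, and the flag list contains only the three flags mentioned in the comments above.

```python
import os

# Flags reported in this ticket as rejected by clang 6.0 / 7.0.1.
UNSUPPORTED = ("-mcet", "-fcf-protection", "-fstack-clash-protection")


def filter_args(args):
    """Return args without the flags that clang is known to reject."""
    kept = []
    for arg in args:
        # Drop the bare flag as well as any "-flag=value" form.
        if any(arg == flag or arg.startswith(flag + "=") for flag in UNSUPPORTED):
            continue
        kept.append(arg)
    return kept


def run_clang(argv, clang="clang"):
    """Replace the current process with clang, using the filtered arguments.

    A real wrapper would call this with sys.argv as its last step.
    It is deliberately not called in this sketch.
    """
    os.execvp(clang, [clang] + filter_args(argv[1:]))


# Show what would be passed on to clang for a typical compile line.
print(filter_args(["-O2", "-mcet", "-fcf-protection", "-c", "foo.c"]))
# prints ['-O2', '-c', 'foo.c']
```

Saved as an executable file ending with a call to run_clang(sys.argv) (plus an import of sys), something like this could be used the same way as the attached wrapper, via CC.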
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/727