Xpra: Ticket #2055: compression dictionary

Some packet types are sent thousands of times, so lz4 (or zstandard) would likely compress them much better if we trained it with a dictionary first. We could bencode / rencode a set of common strings and use that as training data, then send the resulting dictionary to the other end as part of the handshake.
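To make the idea concrete, here is a minimal sketch of a preset dictionary shared by both ends. It uses the standard library's zlib `zdict` support as a stand-in, not lz4 or zstandard, only because it runs without extra dependencies. The string list, the JSON-encoded packet and all function names are hypothetical and are not taken from the xpra code.

```python
import json
import zlib

# Hypothetical sample of strings that recur in packets (not xpra's real list).
COMMON_STRINGS = [
    b"damage-sequence", b"window-metadata", b"cursor",
    b"ping_echo", b"encoding", b"rgb32",
]


def build_dictionary(samples):
    """Concatenate the samples into a preset dictionary.

    zlib recommends placing the most frequently used strings at the end.
    """
    return b"".join(samples)


def compress_packet(payload, zdict):
    # A fresh compressor per packet, primed with the shared dictionary.
    c = zlib.compressobj(level=6, zdict=zdict)
    return c.compress(payload) + c.flush()


def decompress_packet(data, zdict):
    # The receiver must hold the same dictionary (e.g. sent in the handshake).
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(data) + d.flush()


ZDICT = build_dictionary(COMMON_STRINGS)
# JSON stands in for bencode / rencode here, purely for illustration.
packet = json.dumps(["window-metadata", 1, {"encoding": "rgb32"}]).encode()

plain = zlib.compress(packet, 6)
with_dict = compress_packet(packet, ZDICT)
print(len(packet), len(plain), len(with_dict))
```

For a packet this small, the dictionary lets the compressor replace the known strings with back-references instead of literals. Whether the same gain holds for real packets with lz4 or zstandard would have to be measured.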

Another good training data set is pixel data: we use lz4 for very small areas (e.g. cursors in terminals), which don't compress well at the moment. Training might help here (e.g. repeated 0xffffff00 for white pixels).

Mon, 26 Nov 2018 05:30:09 GMT - Antoine Martin: status, description, summary changed

status changed from new to assigned
description modified (diff)
summary changed from lz4 dictionary to compression dictionary

There are Python bindings for zstandard, including dictionary access. See zstandard on PyPI, which notes: "When using dictionary data and compress() is called multiple times, the ZstdCompressionParameters derived from an integer compression level and the first compressed data’s size will be reused for all subsequent operations. This may not be desirable if source data size varies significantly."

So maybe use two different contexts? One for packet metadata and one for pixel data?
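A rough sketch of what two contexts could look like: one object per data class, each bound to its own dictionary, selected by the kind of payload. As an assumption, zlib's preset dictionaries are used as a stand-in for zstandard, so the example only illustrates the structure. The class, the dictionary contents and the "metadata" / "pixels" keys are hypothetical.

```python
import zlib


class DictContext:
    """One compression context bound to a single preset dictionary."""

    def __init__(self, zdict, level=1):
        self.zdict = zdict
        self.level = level

    def compress(self, data):
        c = zlib.compressobj(level=self.level, zdict=self.zdict)
        return c.compress(data) + c.flush()

    def decompress(self, data):
        d = zlib.decompressobj(zdict=self.zdict)
        return d.decompress(data) + d.flush()


# Hypothetical dictionaries: common strings for packet metadata,
# runs of white (0xffffff00) and black pixels for small pixel areas.
CONTEXTS = {
    "metadata": DictContext(b"window-metadata" b"encoding" b"cursor"),
    "pixels": DictContext(
        bytes.fromhex("000000ff") * 64 + bytes.fromhex("ffffff00") * 64
    ),
}


def compress(kind, data):
    # Each kind of payload only ever goes through its own context.
    return CONTEXTS[kind].compress(data)


def decompress(kind, data):
    return CONTEXTS[kind].decompress(data)


metadata = b'{"encoding": "rgb32", "cursor": 1}'
white_cursor = bytes.fromhex("ffffff00") * (2 * 16)  # 2x16 pixel area

packed_metadata = compress("metadata", metadata)
packed_pixels = compress("pixels", white_cursor)
```

Keeping the contexts separate means parameters and dictionaries tuned for short metadata packets never leak into the pixel path, and vice versa. Whether this actually improves ratios for tiny pixel areas is untested here.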

Wed, 20 Mar 2019 05:06:15 GMT - Antoine Martin: milestone changed

milestone changed from 3.1 to 4.0

Milestone renamed

Wed, 12 Feb 2020 05:17:59 GMT - Antoine Martin: milestone changed

milestone changed from 4.0 to 5.0

Sat, 23 Jan 2021 05:40:59 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/2055