

Changes between Initial Version and Version 1 of Ticket #520


Timestamp: 02/18/14 04:05:48
Author: Antoine Martin

  • Ticket #520

    • Property Owner changed from Antoine Martin to Antoine Martin
    • Property Status changed from new to assigned
    • Property Summary changed from cuda and nvenc load balancing to CUDA and NVENC load balancing
  • Ticket #520 – Description

    initial → v1

      Related to #504 and #466.

    - When we have multiple cards and/or multiple virtual cards (GRID K1, K2 and others) in the same system, we want to ensure that the load is fairly evenly distributed amongst all the (v){{{GPU}}}s.
    + When we have multiple cards and/or multiple virtual cards (GRID K1, K2 and others) in the same server, we want to ensure that the load is fairly evenly distributed amongst all the {{{(v)GPU}}}s.

    - With CUDA, this isn't a problem. But with NVENC, we have no way of knowing how many contexts are still free. What happens when we reach the limit is that creating a new context will fail...
    + With CUDA, this isn't a problem. But with NVENC, we have no way of knowing how many contexts are still free. What happens when we reach the limit is that creating a new context will just fail...
      We cannot assume that we are the only user of the device on the system, especially with proxy encoding (#504) where each proxy instance runs in its own process space.
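Since the driver offers no way to query how many NVENC contexts remain, the creation call failing is the only signal available. A minimal sketch of catching and recording such a failure; the names here are illustrative, not xpra's actual API:

{{{#!python
import time

# device_id -> timestamp of the last NVENC context creation failure
# (illustrative bookkeeping, not xpra's actual data structures):
context_failures = {}

def try_create_nvenc_context(device_id, create_context):
    """Attempt to create an NVENC context on the given device.

    `create_context` stands in for the real driver call: there is no
    API to ask how many contexts remain, so the call failing is the
    only indication that the device limit has been reached.
    """
    try:
        return create_context(device_id)
    except Exception:
        # remember when this device last refused a context, so the
        # codec score can be lowered for a while afterwards
        context_failures[device_id] = time.monotonic()
        return None
}}}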
    - The code added in r5488 moves the CUDA device selection to a utility module (amongst other things) and uses the percentage of free memory to choose the device to use. Since there are normally up to 32 contexts per GPU, this should work as a cheap load balancing solution (even with 4 {{{vGPU}}}s per slot, thing will even out before we reach 20% capacity). This won't take into account the size of the encoding contexts, but since we reserve large context buffers in all cases (see r5442 - done for #410) and since the sizes should be randomly distributed anyway, this should not be too much of a problem.
    - We also keep track of context failures and lower the NVENC codec runtime score when failures have been encountered recently. This should ensure that as we get closer to the limit, we become less likely to try to use NVENC.
    + The code added in r5488 moves the CUDA device selection (amongst other things) to a utility module and uses the percentage of free memory to choose the device to use. Since there are normally up to 32 contexts per GPU, this should work as a cheap load balancing solution: even with 4 {{{vGPU}}}s per PCIE slot, things will even out before we reach 20% capacity. This won't take into account the size of the encoding contexts, but since we reserve large context buffers in all cases (see r5442 - done for supporting #410) and since the sizes should be randomly distributed anyway, this should not be too much of a problem.
    + We lower the NVENC codec score as we create more contexts, and we also keep track of context failures to lower the score further (taking into account how recent the failure was). This should ensure that as we get closer to the limit, we become less likely to try to use NVENC, or that when we do hit the hard limit, we have a gradual grace period until we try again.
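A minimal sketch of the free-memory heuristic described above, assuming pycuda (which xpra's CUDA code is built on); the function name is illustrative:

{{{#!python
import pycuda.driver as cuda

def select_device_by_free_memory():
    """Pick the CUDA device with the highest percentage of free memory,
    as a cheap proxy for how many encoding contexts it already carries.
    """
    cuda.init()
    best_device, best_pct = None, -1
    for i in range(cuda.Device.count()):
        device = cuda.Device(i)
        # mem_get_info() reports on the device of the current context,
        # so we need a temporary context to query each device:
        context = device.make_context()
        try:
            free, total = cuda.mem_get_info()
        finally:
            context.pop()
            context.detach()
        pct = 100 * free // total
        if pct > best_pct:
            best_device, best_pct = device, pct
    return best_device
}}}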
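One way to express the score handling described above; the weights and the linear decay are entirely illustrative stand-ins for whatever xpra actually uses:

{{{#!python
import time

BASE_SCORE = 100       # nominal NVENC score when the device is idle
FAILURE_GRACE = 60.0   # seconds during which a failure depresses the score

def nvenc_score(context_count, last_failure_time=None):
    """Lower the codec score as contexts accumulate, and further still
    when a context creation failure happened recently."""
    # every live context costs a few points, so busier devices rank lower:
    score = BASE_SCORE - 2 * context_count
    if last_failure_time is not None:
        elapsed = time.monotonic() - last_failure_time
        if elapsed < FAILURE_GRACE:
            # a fresh failure weighs heavily and then decays away,
            # giving a gradual grace period before NVENC is retried:
            score -= int(50 * (1 - elapsed / FAILURE_GRACE))
    return max(0, score)
}}}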
      What remains to be done:
      * link the NVENC context failures to the CUDA context they occurred on: other devices may still have free contexts, we should try those first if asked to create a new NVENC context (see the sketch after this list)
      * maybe timeout the contexts: a context that has not been used for N seconds could probably be put to better use (may depend on current load - which is difficult to estimate in a proxy encoder context..)
    + * in the context of proxy encoding, as we lower the NVENC codec score, we will still receive RGB frames from the server being proxied and so we need to fall back to x264 or another encoding. At the moment, we fail hard if we cannot find a fallback video encoder..
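For the first item, a hypothetical shape for keying failures by device, so that devices without a recent failure are tried first; the names echo the earlier sketch and are not xpra's actual API:

{{{#!python
import time

# device_id -> timestamp of the last NVENC failure on that device
# (hypothetical bookkeeping, reusing the earlier sketch's record):
failures_per_device = {}

def devices_to_try(device_ids):
    """Order candidate devices so that those with no recorded NVENC
    failure, or the oldest one, are tried first."""
    now = time.monotonic()
    def failure_age(device_id):
        ts = failures_per_device.get(device_id)
        # devices that never failed get an infinite age, placing them first:
        return now - ts if ts is not None else float("inf")
    return sorted(device_ids, key=failure_age, reverse=True)
}}}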
      Notes: