Opened 16 months ago

Last modified 5 days ago

#1105 assigned defect

systemd multi seat support

Reported by: Antoine Martin
Owned by: Antoine Martin
Priority: blocker
Milestone: 2.1
Component: server
Version: trunk
Keywords:
Cc: rektide@…

Description (last modified by Antoine Martin)

See also #1129.

The seat with vt number may be challenging since we don't have a vt number...

Attachments (3)

systemd-run.patch (5.0 KB) - added by Antoine Martin 9 months ago.
wrap xpra server command with systemd-run automatically
polkit.patch (4.6 KB) - added by Antoine Martin 6 months ago.
start polkit automatically (requires session management)
pam-session-v2.patch (9.7 KB) - added by Antoine Martin 7 days ago.
ask the proxy server to call pam_open on our behalf (ends up moving the proxy server process into the new session scope, not what we want..)

Change History (20)

comment:1 Changed 12 months ago by Antoine Martin

Status: changed from new to assigned

Enlightening thread: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/ZNQW72UP36UAFMX53HPFFQTWTQDZVJ3M/: systemd-logind will now by default terminate user processes that are part of the user session scope unit. PITA for us.

Debian ticket: systemd kill background processes after user logs out

https://www.freedesktop.org/software/systemd/man/systemd-run.html#Examples, see Start screen as a user service.

PAM support in tmux uses pam_start + pam_open_session - would this be enough?

Last edited 7 months ago by Antoine Martin

comment:2 Changed 12 months ago by Antoine Martin

Description: modified

r12722 (+fixes in r12756 + r12761 + r12753 for osx) adds an "xpra" pam service so we can call pam_open_session early (before daemonizing) when starting a server.
The pam_systemd module should also ensure that the directories are present for #1129.
We'll see if this is enough to prevent us from getting killed.

See also: Is linux-PAM session same as linux process session?: The short answer is no, they're different things, but processes that handle login sessions should handle both of them.
We're not a login session per se, but as close as can be.

systemd-devel: The whole su/pkexec session debate: This way, screen will keep an "active" reference to the session and systemd-logind will not mark it as "closing". So the session that was initiated by sshd will be kept open by "screen". Note that pam_open_session() without pam_authenticate() will *not* create a new session but only attach to the current session.

Last edited 12 months ago by Antoine Martin

comment:3 Changed 12 months ago by Antoine Martin

Owner: changed from Antoine Martin to jonathan.underwood
Status: changed from assigned to new

Wait, as per https://lists.freedesktop.org/archives/systemd-devel/2013-December/014996.html: The session is still marked as "closing" but because processes still exist it never quite dies. And yes, the kill processes option (which is a nice thing to enable if possible) would indeed kill the screen.

@jonathan.underwood: How on earth are we supposed to fix this thing?
We don't want or need root, just tell logind to move the process into its own session.

comment:4 Changed 11 months ago by jonathan.underwood

Well, I am no expert here :) But this is a somewhat hot topic at the moment. I very much think xpra is in the same boat as Screen and tmux. In case you missed it, this is a nice summary of why it's a hot topic:

http://lwn.net/Articles/689732/

The best thing xpra could do, I think, is start in a new process tree. Quite what the right mechanism for that is remains unclear - I expect you don't want to do the dbus dance to talk to the systemd daemon to create a new session and control group (which would be the systemd maintainers' preferred route).

Something along the lines of this comment might be one way to go:

http://lwn.net/Articles/690795/

This also makes for interesting reading:

https://github.com/tmux/tmux/issues/428

ps. Sorry for the late reply and lack of packaging activity in recent weeks - have changed jobs. I should be getting back to packaging now though.

comment:5 Changed 11 months ago by jonathan.underwood

Actually, probably the "right" way to go on systems using systemd is to use systemd-run to launch xpra:

https://www.freedesktop.org/software/systemd/man/systemd-run.html

comment:6 Changed 11 months ago by Antoine Martin

Milestone: changed from 0.18 to 1.0

Milestone renamed

comment:7 Changed 10 months ago by Antoine Martin

the "right" way to go on systems using systemd is to use systemd-run to launch xpra


Users shouldn't really need to care about this low-level plumbing, so when they issue an "xpra start", they expect it to survive their current session (be it an ssh session, or even a full desktop environment). That's especially true of ssh sessions started with "xpra start ssh:HOST --start=xterm".

So we would need to do this from "xpra start ...":

  • find out if systemd is pid1 - how?
  • optionally, find out if KillUserProcesses=yes and skip the workaround if it isn't needed?
  • call systemd-run --scope --user xpra _start $@ (and make "xpra _start" the same as before)
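
The first two checks in the list above could be sketched as follows. This is only a sketch under stated assumptions: the directory test is the one documented for sd_booted(3), the function names are hypothetical, and only a single logind.conf is parsed (drop-in directories and the distribution's compiled-in default are ignored, so an unset value is reported as None rather than guessed).

```python
import configparser
import os


def systemd_is_running(marker="/run/systemd/system"):
    """Same test as sd_booted(3): the directory only exists when systemd is the init system."""
    return os.path.isdir(marker)


def kill_user_processes(conf_text):
    """Return True / False for an explicit KillUserProcesses= setting, None if unset.

    Simplification: conf_text is the content of one logind.conf file only.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the CamelCase option names
    parser.read_string(conf_text)
    value = parser.get("Login", "KillUserProcesses", fallback=None)
    if value is None:
        return None
    return value.strip().lower() in ("yes", "true", "on", "1")


def needs_workaround(conf_text, marker="/run/systemd/system"):
    """Hypothetical policy: wrap with systemd-run unless we know it is not needed."""
    if not systemd_is_running(marker):
        return False
    return kill_user_processes(conf_text) is not False
```

The policy errs on the side of wrapping: an unset KillUserProcesses is treated as "might be enabled".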

I tried to test this using a guest account:

  • set KillUserProcesses=yes
  • ran loginctl disable-linger
  • ssh guest@localhost
  • start xpra server
  • exit ssh

And the xpra server survived... Fedora 24 all up to date.
What am I missing?
@jonathan.underwood: see also ticket:1129#comment:21

Changed 9 months ago by Antoine Martin

Attachment: systemd-run.patch added

wrap xpra server command with systemd-run automatically

comment:8 Changed 9 months ago by Antoine Martin

Priority: changed from major to critical

Actually, probably the "right" way to go on systems using systemd is to use systemd-run to launch xpra:


As of r13378, we now run server commands via systemd-run:

$ xpra start --start=xterm --no-daemon --systemd-run-args="-p MemoryAccounting=true -p MemoryLimit=64M" 
using systemd-run to wrap 'start' server command
'systemd-run' '--scope' '--user' '-p' 'MemoryAccounting=true' '-p' 'MemoryLimit=64M' '/usr/bin/xpra' \
    'start' '--start=xterm' '--systemd-run-args=-p MemoryAccounting=true -p MemoryLimit=64M' '--daemon=no'
Running scope as unit run-rd905fbd12caf4ec8b400030991401a14.scope.
(...)
● run-rd905fbd12caf4ec8b400030991401a14.scope - /usr/bin/xpra start --start=xterm --systemd-run-args=-p MemoryAccounting=true -p MemoryLimit=64M --daemo
   Loaded: loaded
Transient: yes
  Drop-In: /run/user/1000/systemd/user/run-rd905fbd12caf4ec8b400030991401a14.scope.d
           └─50-Description.conf, 50-MemoryAccounting.conf, 50-MemoryLimit.conf
   Active: active (running) since Wed 2016-08-17 16:09:09 ICT; 51s ago
   CGroup: /user.slice/user-1000.slice/user@1000.service/run-rd905fbd12caf4ec8b400030991401a14.scope
           ├─25491 /bin/python /usr/bin/xpra start --start=xterm --systemd-run-args=-p MemoryAccounting=true -p MemoryLimit=64M --daemon=no
           ├─25502 /usr/libexec/Xorg -noreset -nolisten tcp +extension GLX +extension RANDR +extension RENDER -auth /run/user/1000/gdm/Xauthority -logfi
           ├─25509 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
           ├─25639 xterm
           └─25641 bash

Aug 17 16:09:09 desktop systemd[1417]: Started /usr/bin/xpra start --start=xterm --systemd-run-args=-p MemoryAccounting=true -p MemoryLimit=64M --daemon
Aug 17 16:09:09 desktop python[25491]: pam_systemd(xpra:session): pam-systemd initializing
Aug 17 16:09:09 desktop python[25491]: pam_systemd(xpra:session): Asking logind to create session: uid=1000 pid=25491 service=xpra type=x11 class=user d
Aug 17 16:09:09 desktop python[25491]: pam_systemd(xpra:session): Failed to create session: Access denied
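
The wrapping seen in the transcript above can be approximated in a few lines. This is an illustrative sketch, not the code from r13378: wrap_with_systemd_run is a hypothetical name, and the real command line also forwards the --systemd-run-args option to the wrapped process, which is left out here.

```python
import shlex


def wrap_with_systemd_run(argv, systemd_run_args=""):
    """Prefix a server command line with 'systemd-run --scope --user'.

    systemd_run_args is a single string, split with shell-like rules.
    """
    extra = shlex.split(systemd_run_args)
    return ["systemd-run", "--scope", "--user"] + extra + list(argv)


if __name__ == "__main__":
    cmd = wrap_with_systemd_run(
        ["/usr/bin/xpra", "start", "--start=xterm", "--daemon=no"],
        "-p MemoryAccounting=true -p MemoryLimit=64M",
    )
    # print a copy-pasteable version of the command
    print(" ".join(shlex.quote(arg) for arg in cmd))
```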

So we end up with a cgroup for the session, but there are problems:

  • the parent scope is still wrong (and likely to get killed on user logout which will clear the user slice):
    $ systemd-cgls
    Control group /:
    -.slice
    ├─init.scope
    (...)
    └─user.slice
      └─user-1000.slice
        ├─session-1.scope
        └─user@1000.service
          (...)
          ├─run-r450625a9be2343f0bfb2034b01db64ee.scope
          │ ├─13164 /bin/python /usr/bin/xpra start --start=xterm --bind-tcp=0.0....
          │ ├─13180 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 -...
          │ ├─13306 xterm
          │ └─13308 bash
    
  • the resource limits (example here: Using systemd-run to limit something's RAM consumption on the fly) aren't being enforced either (no way the server could run with 64M of memory) - looks like https://github.com/systemd/systemd/issues/3945, which means we need "cgroups v2" as per https://github.com/systemd/systemd/issues/3744
  • pam_systemd is not happy. According to the logind documentation: CreateSession() and ReleaseSession() may be used to open or close login sessions. These calls should never be invoked directly by clients. Creating/closing sessions is exclusively the job of PAM and its pam_systemd module. The pam_systemd man page states: If it does not exist yet, the user runtime directory /run/user/$USER is created and its ownership changed to the user that is logging in, which may fix XDG_RUNTIME_DIR compatibility with more distro versions (see #1129). Looking at the code, it looks like access is denied when calling the login1 manager's CreateSession via dbus (with r13379):
    python: pam_systemd(xpra:session): Asking logind to create session: uid=1000 pid=11564 service=xpra type=x11 class=user desktop=xpra seat= vtnr=0 tty= display=#001 remote=no remote_user= remote_host=
    python: pam_systemd(xpra:session): Failed to create session: Access denied
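
To check which scope a process ended up in without reading the systemd-cgls tree by eye, the cgroup path can be read from /proc/<pid>/cgroup and classified. A minimal sketch, assuming the path layouts shown in this ticket; the function names and category labels are made up for illustration.

```python
import re


def parse_proc_cgroup(text):
    """Return the systemd cgroup path from the content of /proc/<pid>/cgroup.

    Each line is 'hierarchy-ID:controller-list:path'. The unified hierarchy has an
    empty controller list, the legacy systemd hierarchy is named 'name=systemd'.
    """
    paths = {}
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3:
            paths[parts[1]] = parts[2]
    if "" in paths:
        return paths[""]
    return paths.get("name=systemd")


def classify(path):
    """Return (kind, uid): kind is 'session', 'user-manager' or 'other'."""
    match = re.match(r"/user\.slice/user-(\d+)\.slice/(.+)", path or "")
    if not match:
        return ("other", None)
    uid = int(match.group(1))
    first = match.group(2).split("/", 1)[0]
    if first.startswith("session-") and first.endswith(".scope"):
        return ("session", uid)
    if first == "user@%i.service" % uid:
        return ("user-manager", uid)
    return ("other", uid)
```

With the systemd-run wrapper shown above, the server's path classifies as 'user-manager', which is the "wrong parent scope" problem from the first bullet.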
    

(re-tested after the r13505 pam fix for xauth data)

See also #1335

Last edited 8 months ago by Antoine Martin

comment:9 Changed 8 months ago by Antoine Martin

r14062 disables pam_open for now because it causes the service (#1335) to run in a user-0 slice instead of the system slice.

comment:10 Changed 7 months ago by Antoine Martin

Owner: changed from jonathan.underwood to Antoine Martin
Status: changed from new to assigned

Instead of ensuring that the session survives, this seems to have the exact opposite effect (and worse - requiring a reboot to properly clear things), details in #1348.
I've tested both with KillUserProcesses=no and KillUserProcesses=yes with the same result.

xpra does get killed unceremoniously but, worst of all, this seems to affect ssh, making the next login attempt take forever (looks similar to systemd issue 2863).

I've asked for help on the systemd-devel mailing list: PAM session hooks for independent session

Alternatively, we could expand the proxy server to start new sessions on behalf of other users. The proxy server runs as root and should have sufficient privileges to invoke logind's CreateSession. Downsides: we don't currently require the proxy server to be running and this may slow down session startup.

Last edited 7 months ago by Antoine Martin

Changed 6 months ago by Antoine Martin

Attachment: polkit.patch added

start polkit automatically (requires session management)

comment:11 Changed 6 months ago by Antoine Martin

Milestone: changed from 1.0 to 2.0

The answer from the systemd mailing list is that we do need a suid binary to do the registration: https://lists.freedesktop.org/archives/systemd-devel/2016-November/037700.html.

Too late to start messing with the suid / socket activation approaches now.

comment:12 Changed 5 months ago by rektide

Cc: rektide@… added

comment:13 Changed 3 months ago by Antoine Martin

Milestone: changed from 2.0 to 2.1

comment:14 Changed 11 days ago by Antoine Martin

Priority: changed from critical to blocker

Some related changes:

  • r15819 allows running even when stdin / stdout / stderr are closed
  • r15882 can quieten systemd-run

r15810 added uid and gid support when running as root (added benefits: can listen to ports below 1024 without running as root or using iptables)
So theoretically we could ask the root proxy server to start sessions for us and do the pam / logind registration. (that bit seems to work?)
The permissions could be restricted using regular authentication or even SO_PEERCRED / SCM_CREDENTIALS (probably the former).
So far so good.
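
For reference, the SO_PEERCRED option mentioned above looks like this from Python. A Linux-only sketch: the kernel reports the pid, uid and gid of the process on the other end of a unix domain socket, so the root proxy does not have to trust anything the client sends. The helper names and the allow-list policy are hypothetical.

```python
import os
import socket
import struct

# struct ucred { pid_t pid; uid_t uid; gid_t gid; }
UCRED = struct.Struct("iII")


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer of a connected unix domain socket."""
    data = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, UCRED.size)
    return UCRED.unpack(data)


def may_start_session(sock, allowed_uids):
    """Hypothetical policy check: only listed uids may ask for a new session."""
    _pid, uid, _gid = peer_credentials(sock)
    return uid in allowed_uids


if __name__ == "__main__":
    server, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    print(peer_credentials(server), os.getuid())
    server.close()
    client.close()
```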

But then I found:

  • KillUserProcesses in logind.conf is broken in Fedora 26? xpra survives out of the box - at least for now:
    $ sudo loginctl disable-linger guest
    $ loginctl show-user | grep Kill
    KillUserProcesses=yes
    $ sudo loginctl show-user guest | grep Linger
    Linger=no
    $ loginctl user-status
    guest (1001)
               Since: Thu 2017-05-18 22:23:32 +07; 25min ago
               State: active
            Sessions: *10
              Linger: no
                Unit: user-1001.slice
                      ├─session-10.scope
                      │ ├─3122 sshd: guest [priv]
                      │ ├─3174 sshd: guest@pts/1
                      │ ├─3196 -bash
                      │ ├─4244 loginctl user-status
                      │ └─4245 less
                      ├─session-4.scope
                      │ ├─1658 /usr/bin/python2 /usr/bin/xpra --bind-tcp=0.0.0.0:10000 start :10 --start=xterm --systemd-ru
                      │ ├─1659 /usr/libexec/Xorg-for-Xpra-:10 -noreset -novtswitch -nolisten tcp +extension GLX +extension 
                      │ ├─1686 /usr/bin/dbus-daemon --syslog-only --fork --print-pid 5 --print-address 7 --session
                      │ ├─1932 pulseaudio --start -n --daemonize=false --system=false --exit-idle-time=-1 --load=module-sus
                      │ ├─2134 /usr/libexec/gvfsd
                      │ ├─2162 xterm
                      │ └─2164 bash
                      └─user@1001.service
                        └─init.scope
                          ├─3127 /usr/lib/systemd/systemd --user
                          └─3155 (sd-pam)
    

Despite the documentation (https://www.freedesktop.org/software/systemd/man/logind.conf.html) stating that: Note that setting KillUserProcesses=yes will break tools like screen(1) and tmux(1), unless they are moved out of the session scope. See example in systemd-run(1). - EDIT: seems to work on another system...

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Resource_Management_Guide/chap-Using_Control_Groups.html)
systemd-run either needs to be disabled by default (only on the distributions affected? could be kernel configuration related...), or changed to "auto" so we can check before trying (or even fallback after failing?)
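
One possible shape for the "auto" mode suggested above: try a trivial command through systemd-run first and only wrap the server if that works. This is a sketch of the idea, not existing xpra behaviour; the function names are hypothetical and the probe assumes that running "true" in a transient user scope is representative of what the server command would hit.

```python
import subprocess


def probe_systemd_run(timeout=5):
    """Return True if 'systemd-run --scope --user true' succeeds on this system."""
    try:
        proc = subprocess.run(
            ["systemd-run", "--scope", "--user", "true"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # systemd-run is missing, not executable, or hangs
        return False
    return proc.returncode == 0


def use_systemd_run(option, probe=probe_systemd_run):
    """Resolve a 'yes' / 'no' / 'auto' setting to a boolean."""
    value = option.strip().lower()
    if value in ("yes", "no"):
        return value == "yes"
    if value == "auto":
        return bool(probe())
    raise ValueError("invalid systemd-run option: %r" % option)
```

The probe is passed in as a parameter so that the decision logic can be exercised without systemd being present.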

Links:

Last edited 9 days ago by Antoine Martin

comment:15 Changed 7 days ago by Antoine Martin

Socket activation has been added (partial), see #1521.
Minor improvements to the system-wide proxy server in r15899, r15897, r15894.
Preparatory work in r15901, r15902, r15903, r15904.
Merged hidden "request-start" subcommand in r15906, ie:

xpra request-start --start=xterm :100

Will connect to the system-wide proxy server and make it start this session.


There are two ways of changing uid:

  • using xpra's "--uid=" and "--gid=" switches, this works but the session scope ends up belonging to root: user-0.slice / session-cNNN.scope - maybe we should use systemd-run without uid here?
  • (code commented out): using "systemd-run --uid= --gid=", but then the xpra process doesn't have the privileges required for doing the pam logind dance... on the plus side, the xpra server instance ends up in a system.slice / run-$UUID.scope. (not sure we could ask the system proxy to do it on our behalf)

Issues:

  • pam registration works, but the scope is still wrong..
  • pulseaudio doesn't seem to get killed? (and sometimes also dbus, etc)
  • we pass too many arguments to the new instances: ['/usr/bin/xpra', 'start', '--csc-modules=all', '--packet-encoders=rencode, bencode, yaml', '--video-decoders=all', '--encodings=all', '--compressors=lz4, lzo, zlib', '--start=xterm', '--video-encoders=all', '--env=XPRA_PROXY_START_UUID=3f2cc30518ea4e2498cd85c68c87f3ae', '--systemd-run=no', '--uid=1000', '--gid=1000'] (most of the values are equivalent to the defaults)
  • XDG_RUNTIME_DIR problems... again (#1129), ie: error: XDG_RUNTIME_DIR not set in the environment.
  • restarting the system proxy should not kill the sessions it has started!
  • the Xorg process ends up in the wrong scope! (it doesn't with the "systemd-run" option above..)
  • maybe the proxy instance process should also be in the new session scope (and it should have a better command description in the process list)
  • after the client requests a new session, we connect to it through the proxy which is unnecessary (and slower) - we should be able to tell the client where it needs to connect directly instead (and the client can then decide to continue or not, ie: it may have to continue if the direct connection to the server fails)
  • xauth errors: xauth: unable to generate an authority file name, ie: Error running "xauth add :2 MIT-MAGIC-COOKIE-1 8194181c038c4086bcb206ee7610e98d": non-zero exit code: 1
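
For the "too many arguments" item in the list above, one option is to forward only the values that differ from the defaults. A small sketch; the option names in the usage example come from the command line quoted above, but the defaults dictionary is a made-up stand-in for wherever the real defaults live.

```python
def non_default_args(options, defaults):
    """Build '--key=value' arguments only for options that differ from the defaults."""
    args = []
    for key, value in sorted(options.items()):
        if key in defaults and defaults[key] == value:
            continue  # the new instance would use this value anyway
        args.append("--%s=%s" % (key, value))
    return args


if __name__ == "__main__":
    defaults = {"encodings": "all", "video-encoders": "all"}  # hypothetical values
    options = {"encodings": "all", "video-encoders": "all", "start": "xterm", "uid": "1000"}
    print(non_default_args(options, defaults))
```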
Last edited 7 days ago by Antoine Martin

Changed 7 days ago by Antoine Martin

Attachment: pam-session-v2.patch added

ask the proxy server to call pam_open on our behalf (ends up moving the proxy server process into the new session scope, not what we want..)

comment:16 Changed 7 days ago by Antoine Martin

Mostly working as of r15907 + r15908 + r15909 via the new "request-start" subcommand, using "peercred" auth (#1524).
The xpra server process is started as root by the system proxy instance, it does the pam registration before changing uid, and updates the DISPLAY attribute once we have it.
We end up with a new session scope hanging off the user's slice:

Control group /:
-.slice
├─user.slice
│ ├─user-1000.slice
│ │ └─session-c32.scope
│ │   ├─31069 /bin/python /usr/bin/xpra start :100 --csc-modules=all ...
│ │   ├─31071 /usr/libexec/Xorg-for-Xpra-:100 -noreset -novtswitch -nolisten tcp ...
│ │   ├─31090 /usr/bin/dbus-daemon --syslog-only --fork --print-pid 5 --print-address 7 --session
│ │   ├─31318 pulseaudio --start -n --daemonize=false --system=false --exit-idle-time=-1 ...
│ │   ├─31457 xterm
│ │   ├─31459 bash
│ │   ├─31761 /usr/libexec/gvfsd
│ │   └─31767 /usr/libexec/gvfsd-fuse /home/antoine/.gvfs -f -o big_writes
...

And this is also shown as a session, without a seat or controlling TTY:

[antoine@desktop ~]$ loginctl list-sessions
   SESSION        UID USER             SEAT             TTY             
        c3         42 gdm              seat0            /dev/tty1       
       c32       1000 antoine                                           
        18       1000 antoine          seat0            /dev/tty2       

Exiting the xpra server terminates the whole session and all the processes get killed reliably.
Sessions started via ssh survive the logout too.
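
The ordering described in this comment (register the session while still root, then change uid, then run the server) can be sketched as below. The register_session and run_server callables are hypothetical placeholders for the PAM / logind registration and the server main loop; the os_module parameter only exists so the sequence can be checked without root.

```python
import os


def drop_privileges(uid, gid, username, os_module=os):
    """Switch to an unprivileged user: groups and gid must change before the uid,
    since the process can no longer change them once it has given up root."""
    os_module.initgroups(username, gid)
    os_module.setgid(gid)
    os_module.setuid(uid)


def start_as_user(uid, gid, username, register_session, run_server, os_module=os):
    """Run the privileged registration step first, then drop to the target user."""
    register_session()  # hypothetical hook: PAM / logind registration, needs root
    drop_privileges(uid, gid, username, os_module)
    return run_server()
```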

Still TODO:

  • #1521 selinux blocker
  • automatically use the system proxy with "xpra start" (and "start-desktop", "shadow"), fallback to systemd-run if unreachable / disabled (but not on F26 where "systemd-run --user --scope" is broken..)
  • this requires server "start" to use the client code... which is now separate (#1253)
  • maybe require "xpra" group membership to use the system proxy to start sessions?
  • too many places chown directories, dangerous
  • the proxy starts pretty quickly (50ms or so), but going via the system proxy is noticeably slower - maybe we can do better
  • if the session already exists, maybe we should connect to it rather than failing?
  • XDG_RUNTIME_DIR is not set? (try over TCP with a different auth module, instead of peercred + ssh)
  • better tools for using the system proxy instance? (listing all sessions, stopping them, etc)

Still as per comment:15 :

  • proxy instance process description
  • too many arguments passed to subprocess
  • connect directly to new server

Some good documentation on control groups: LWN: Control groups series by Neil Brown

Last edited 7 days ago by Antoine Martin

comment:17 Changed 5 days ago by Antoine Martin

Debian packaging of the systemd service: #1530