xpra icon
Bug tracker and wiki

Opened 8 years ago

Closed 8 years ago

#215 closed defect (fixed)

xpra server using 100% CPU when idle -- possible poll() error

Reported by: alis Owned by: Antoine Martin
Priority: major Milestone: 0.8
Component: core Version: trunk
Keywords: Cc:

Description

I am running the current trunk (rev 2131), and noticed that my xpra server process is constantly using 100% of CPU, even when doing nothing. The server seems to still function correctly.

I looked at the xpra process with strace, and found the following:

10:00:06.175242 poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=3, events=POLLIN}], 6, -1) = 1 ([{fd=11, revents=POLLIN}]) <0.000007>
10:00:06.175277 read(5, 0x9eead4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
10:00:06.175301 read(7, 0xad85b4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000006>
10:00:06.175324 poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=3, events=POLLIN}], 6, -1) = 1 ([{fd=11, revents=POLLIN}]) <0.000007>
10:00:06.175359 read(5, 0x9eead4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000006>
10:00:06.175382 read(7, 0xad85b4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000006>
10:00:06.175406 poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=3, events=POLLIN}], 6, -1) = 1 ([{fd=11, revents=POLLIN}]) <0.000007>
10:00:06.178013 read(5, 0x9eead4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>
10:00:06.178039 read(7, 0xad85b4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
10:00:06.178064 poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=3, events=POLLIN}], 6, -1) = 1 ([{fd=11, revents=POLLIN}]) <0.000007>
10:00:06.178099 read(5, 0x9eead4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000006>
10:00:06.178123 read(7, 0xad85b4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000006>
10:00:06.178147 poll([{fd=5, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=3, events=POLLIN}], 6, -1) = 1 ([{fd=11, revents=POLLIN}]) <0.000008>
10:00:06.178183 read(5, 0x9eead4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
10:00:06.178207 read(7, 0xad85b4, 4096) = -1 EAGAIN (Resource temporarily unavailable) <0.000006>

repeated ad infinitum.

It appears that xpra has data waiting on file descriptor 10, but refuses to ever read it, instead trying to read fd's 4 and 6, leaving it stuck in a poll-read loop. Presumably it should either read the data off of fd 10 or else not be poll()ing this fd at all.

FWIW, here are the fd's from /proc/<pid>/fd/ and /proc/<pid>/fdinfo/:

lrwx------ 1 64 Nov 15 09:57 0 -> /dev/null
lrwx------ 1 64 Nov 15 09:57 1 -> /home/ptuser/.xpra/pt-sas06-100.log
lrwx------ 1 64 Nov 15 09:57 2 -> /home/ptuser/.xpra/pt-sas06-100.log
l-wx------ 1 64 Nov 15 09:57 3 -> socket:[7165391]
lr-x------ 1 64 Nov 15 09:57 4 -> socket:[7165397]
lrwx------ 1 64 Nov 15 10:01 5 -> socket:[7165419]
lrwx------ 1 64 Nov 15 10:01 6 -> socket:[7165417]
lrwx------ 1 64 Nov 15 10:01 7 -> socket:[7165424]
lr-x------ 1 64 Nov 15 10:01 8 -> pipe:[7165421]
l-wx------ 1 64 Nov 15 10:01 9 -> pipe:[7165421]
lrwx------ 1 64 Nov 15 10:01 10 -> socket:[7165454]
lr-x------ 1 64 Nov 15 10:01 11 -> pipe:[7165458]
l-wx------ 1 64 Nov 15 10:01 12 -> pipe:[7165458]
lrwx------ 1 64 Nov 15 10:01 13 -> socket:[7165676]
0	pos:	0
	flags:	0100000
1	pos:	2528198
	flags:	0100001
2	pos:	2528198
	flags:	0100001
3	pos:	0
	flags:	02
4	pos:	0
	flags:	02
5	pos:	0
	flags:	04002
6	pos:	0
	flags:	04002
7	pos:	0
	flags:	04002
8	pos:	0
	flags:	00
9	pos:	0
	flags:	01
10	pos:	0
	flags:	04002
11	pos:	0
	flags:	00
12	pos:	0
	flags:	01
13	pos:	0
	flags:	02

With gdb, here's a couple backtraces from the running idle xpra:

#0  0x00007fd7dbf0e796 in poll () from /lib64/libc.so.6
#1  0x00007fd7d8ea7d59 in g_main_context_poll (n_fds=<optimized out>, fds=<optimized out>, priority=<optimized out>, timeout=<optimized out>, context=<optimized out>)
    at gmain.c:2904
#2  g_main_context_iterate (context=0x9eb070, block=1, dispatch=1, self=<optimized out>) at gmain.c:2586
#3  0x00007fd7d8ea835d in IA__g_main_loop_run (loop=0xb7ae40) at gmain.c:2799
#4  0x00007fd7d2b69a77 in gtk_main () from /usr/lib64/libgtk-x11-2.0.so.0
#5  0x00007fd7d31cd44a in _wrap_gtk_main (self=<optimized out>) at ./gtk.override:1241
#6  0x00007fd7dcb32b91 in call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3734
#7  PyEval_EvalFrameEx (f=0xb93600, throwflag=<optimized out>) at Python/ceval.c:2412
#8  0x00007fd7dcb328dc in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at Python/ceval.c:3836
#9  call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3771
#10 PyEval_EvalFrameEx (f=0x7f2fe0, throwflag=<optimized out>) at Python/ceval.c:2412
#11 0x00007fd7dcb37b07 in PyEval_EvalCodeEx (co=0x7fd7d42a1828, globals=<optimized out>, locals=<optimized out>, args=0x7eff38, argcount=0, kws=0x6a2fc8, kwcount=0, defs=0x0, 
    defcount=0, closure=0x0) at Python/ceval.c:3000
#12 0x00007fd7dcb31dd3 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at Python/ceval.c:3846
#13 call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3771
#14 PyEval_EvalFrameEx (f=0x6a2d10, throwflag=<optimized out>) at Python/ceval.c:2412
#15 0x00007fd7dcb37b07 in PyEval_EvalCodeEx (co=0x7fd7dcefc648, globals=<optimized out>, locals=<optimized out>, args=0x68fa10, argcount=0, kws=0x68e678, kwcount=0, defs=0x0, 
    defcount=0, closure=0x0) at Python/ceval.c:3000
#16 0x00007fd7dcb31dd3 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at Python/ceval.c:3846
#17 call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3771
#18 PyEval_EvalFrameEx (f=0x68e4e0, throwflag=<optimized out>) at Python/ceval.c:2412
#19 0x00007fd7dcb37b07 in PyEval_EvalCodeEx (co=0x7fd7dcef03f0, globals=<optimized out>, locals=<optimized out>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, 
    closure=0x0) at Python/ceval.c:3000
#20 0x00007fd7dcb30f22 in PyEval_EvalCode (co=0xb7a170, globals=0x6, locals=0xffffffffffffffff) at Python/ceval.c:541
#21 0x00007fd7dcb4c5d3 in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x63d120, locals=0x63d120, flags=<optimized out>, arena=<optimized out>)
    at Python/pythonrun.c:1358
#22 0x00007fd7dcb4c68a in PyRun_FileExFlags (fp=0x68da10, filename=0x7fff0761054a "/usr/bin/xpra", start=257, globals=0x63d120, locals=0x63d120, closeit=1, flags=0x7fff0760f890)
    at Python/pythonrun.c:1344
#23 0x00007fd7dcb4d094 in PyRun_SimpleFileExFlags (fp=0x68da10, filename=0x7fff0761054a "/usr/bin/xpra", closeit=1, flags=0x7fff0760f890) at Python/pythonrun.c:948
#24 0x00007fd7dcb5481b in Py_Main (argc=<optimized out>, argv=0x7fff0760f9a8) at Modules/main.c:617
#25 0x00007fd7dbe5ec36 in __libc_start_main () from /lib64/libc.so.6
#26 0x00000000004006e9 in _start () at ../sysdeps/x86_64/elf/start.S:113
#0  0x00007fd548342750 in __read_nocancel () from /lib64/libpthread.so.0
#1  0x00007fd53f1cf450 in read (__nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/bits/unistd.h:45
#2  _xcb_in_read (c=0x9eb4f0) at xcb_in.c:498
#3  0x00007fd53f1cf8e8 in xcb_poll_for_event (c=0x9eb4f0) at xcb_in.c:407
#4  0x00007fd53f6318b5 in process_responses (dpy=0x4, wait_for_first_event=0, current_error=0x0, current_request=0) at xcb_io.c:115
#5  0x00007fd53f631ff6 in _XEventsQueued (dpy=0x9e92c0, mode=<optimized out>) at xcb_io.c:197
#6  0x00007fd53f61a50d in XPending (dpy=0x9e92c0) at Pending.c:56
#7  0x00007fd53e2d9496 in gdk_check_xpending (display=<optimized out>) at gdkevents-x11.c:154
#8  gdk_event_prepare (source=<optimized out>, timeout=<optimized out>) at gdkevents-x11.c:2330
#9  0x00007fd5449a77e9 in IA__g_main_context_prepare (context=0x9e7a60, priority=0x7fffbd80a6f4) at gmain.c:2280
#10 0x00007fd5449a7b79 in g_main_context_iterate (context=0x9e7a60, block=1, dispatch=1, self=<optimized out>) at gmain.c:2571
#11 0x00007fd5449a835d in IA__g_main_loop_run (loop=0xb43f50) at gmain.c:2799
#12 0x00007fd53e669927 in IA__gtk_main () at gtkmain.c:1218
#13 0x00007fd53eccd44a in _wrap_gtk_main (self=<optimized out>) at ./gtk.override:1241
#14 0x00007fd548632b91 in call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3734
#15 PyEval_EvalFrameEx (f=0xc3dd80, throwflag=<optimized out>) at Python/ceval.c:2412
#16 0x00007fd5486328dc in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at Python/ceval.c:3836
#17 call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3771
#18 PyEval_EvalFrameEx (f=0x7f2fe0, throwflag=<optimized out>) at Python/ceval.c:2412
#19 0x00007fd548637b07 in PyEval_EvalCodeEx (co=0x7fd53fda1828, globals=<optimized out>, locals=<optimized out>, args=0x7eff38, argcount=10401032, kws=0x6a2fc8, kwcount=0, defs=
    0x0, defcount=0, closure=0x0) at Python/ceval.c:3000
#20 0x00007fd548631dd3 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at Python/ceval.c:3846
#21 call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3771
#22 PyEval_EvalFrameEx (f=0x6a2d10, throwflag=<optimized out>) at Python/ceval.c:2412
#23 0x00007fd548637b07 in PyEval_EvalCodeEx (co=0x7fd5489fc648, globals=<optimized out>, locals=<optimized out>, args=0x68fa10, argcount=10401032, kws=0x68e678, kwcount=0, defs=
    0x0, defcount=0, closure=0x0) at Python/ceval.c:3000
#24 0x00007fd548631dd3 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at Python/ceval.c:3846
#25 call_function (oparg=<optimized out>, pp_stack=<optimized out>) at Python/ceval.c:3771
#26 PyEval_EvalFrameEx (f=0x68e4e0, throwflag=<optimized out>) at Python/ceval.c:2412
#27 0x00007fd548637b07 in PyEval_EvalCodeEx (co=0x7fd5489f03f0, globals=<optimized out>, locals=<optimized out>, args=0x0, argcount=10401032, kws=0x0, kwcount=0, defs=0x0, 
    defcount=0, closure=0x0) at Python/ceval.c:3000
#28 0x00007fd548630f22 in PyEval_EvalCode (co=0x4, globals=0x9eb5a4, locals=0x1000) at Python/ceval.c:541
#29 0x00007fd54864c5d3 in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x63d120, locals=0x63d120, flags=<optimized out>, arena=<optimized out>)
    at Python/pythonrun.c:1358
#30 0x00007fd54864c68a in PyRun_FileExFlags (fp=0x68da10, filename=0x7fffbd80c548 "/usr/bin/xpra", start=257, globals=0x63d120, locals=0x63d120, closeit=1, flags=0x7fffbd80b2d0)
    at Python/pythonrun.c:1344
#31 0x00007fd54864d094 in PyRun_SimpleFileExFlags (fp=0x68da10, filename=0x7fffbd80c548 "/usr/bin/xpra", closeit=1, flags=0x7fffbd80b2d0) at Python/pythonrun.c:948
#32 0x00007fd54865481b in Py_Main (argc=<optimized out>, argv=0x7fffbd80b3e8) at Modules/main.c:617
#33 0x00007fd54795ec36 in __libc_start_main () from /lib64/libc.so.6
#34 0x00000000004006e9 in _start () at ../sysdeps/x86_64/elf/start.S:113

Change History (7)

comment:1 Changed 8 years ago by Antoine Martin

Please specify your distro and versions of py(gtk), glibc, etc..

How are you connecting? ssh? etc..

comment:2 Changed 8 years ago by alis

Oops! The text above should say that the server appears to have data waiting on file descriptor 11 that it's not reading (not 10.)

comment:3 Changed 8 years ago by Antoine Martin

Status: newaccepted

Also specify the exact command that you used for starting and connecting so I can try to reproduce.

ie: do you use --start-child? have you tried without?

do you use xdummy or xvfb? etc

comment:4 Changed 8 years ago by alis

Running on SLES11 SP2 (though I've kept some packages at SP1 versions so I will have debugging symbols.) I normally connect via "xpra attach ssh:<server>:100", but I get this behavior even when no client is connected at all.


Package versions:
glibc-2.11.3-17.39.1
gtk2-2.18.9-0.20.18.1
python-gtk-2.17.0-4.1
xorg-x11-libxcb-7.4-1.20.34
xorg-x11-7.4-9.53.2
xorg-x11-server-7.4-27.66.1
xorg-x11-libs-7.4-8.26.32.1


Command to start is:
XPRA_SYNCHRONIZE=1 XPRA_X11_DEBUG=1 GDK_SYNCHRONIZE=1 xpra start -d all :100

I use Xvfb: my xpra.conf has the default

xvfb=Xvfb +extension Composite -screen 0 3840x2560x24+32 -nolisten tcp -noreset -auth $XAUTHORITY
Last edited 8 years ago by alis (previous) (diff)

comment:5 Changed 8 years ago by Antoine Martin

Hmmm,

python-gtk-2.17.0-4.1

Sounds like your version of python is also going to be quite old.. and may have bugs.

Is this a regression? Does this also occur with 0.6? (there was a setsid change in 0.7 and trunk which may impact the subprocess calls)
Looks to me like python is not reading the subprocess.Popen stdout or stderr.

Last edited 8 years ago by Antoine Martin (previous) (diff)

comment:6 Changed 8 years ago by alis

Yup, it's a regression. It doesn't appear on 0.6.4, and does appear on both 0.7.2 & trunk.

The Python version on the system is 2.6 : package python-2.6.8-0.15.1 .

Last edited 8 years ago by alis (previous) (diff)

comment:7 Changed 8 years ago by Antoine Martin

Resolution: fixed
Status: acceptedclosed

fixed in r2133

Note: See TracTickets for help on using tickets.