Another deceptive little problem on Linux. If you have a machine that's dipping deep into its swap space, how do you find which processes consume the most? Wouldn't it be nice if ps had a column for this? Well, version 3.3.12 doesn't seem to. So we need to go to the source: /proc/.
If you like to Read the Fucking Manual, you'll find that the VmSwap line in /proc/$PID/status has been with us since Linux 2.6.34. If you have a really old machine, this won't work.
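To get a feel for that line before looping over every process, here's a minimal sketch that pulls the number out of a single status file. The helper name swap_kb is my own invention for illustration, not a standard tool, and it assumes the usual "VmSwap: <number> kB" layout.

```shell
# swap_kb FILE: print the VmSwap value (in kB) from a /proc/PID/status-style file.
# Hypothetical helper. Prints nothing when the file has no VmSwap line,
# which is the case for kernel threads and for kernels older than 2.6.34.
swap_kb() {
    awk '$1 == "VmSwap:" { print $2 }' "$1"
}
```

Something like swap_kb /proc/$$/status should print the figure for the current shell.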
Here's how I solved this the first time:
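for pid in /proc/[0-9]*; do
    pn=$(readlink "$pid/exe")
    # No exe link: a kernel thread, or a process we can't inspect
    [[ $pn ]] || continue
    # Quoted so that paths like "/usr/bin/foo (deleted)" don't confuse basename
    pn=$(basename "$pn")
    sw=$(grep VmSwap "$pid/status" | cut -f2)
    echo "${pid##*/} $sw $pn"
done | sort -nrk2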
This produces output like this:
3841 253140 kB stonithd
3844 12840 kB pengine
3840 2048 kB cib
3745 1972 kB snmpd
3791 1856 kB corosync
35028 452 kB nrpe
3843 416 kB attrd
37928 408 kB udevd
3842 388 kB lrmd
618 340 kB udevd
37919 328 kB udevd
37653 220 kB rpc.statd
33
It's not pretty, but it works. We'll improve on that. First, let's improve on the speed. The quick & dirty version spawns four processes per loop iteration: readlink, basename, grep and cut. Here's a faster version with the judicious application of some awk. It spawns just two external commands now (readlink and awk):
for pid in /proc/[0-9]*; do
    pn=$(readlink "$pid/exe")
    [[ $pn ]] || continue
    sw=$(awk <"$pid/status" '/VmSwap/ { if ($2) print $2 }')
    [[ $sw ]] && echo "${pid##*/} $sw ${pn##*/}"
done
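If two external commands per process still feels like too many, the loop can be pushed down to zero. What follows is a rough sketch rather than a drop-in replacement: it swaps in a different technique, taking the process name from the Name: line of the status file instead of the exe symlink, so readlink is no longer needed. The function name swap_report and its optional root-directory argument are hypothetical, added only so the thing can be pointed at a test directory.

```shell
# swap_report [ROOT]: print "PID KB NAME" for every process with swap in use,
# using only shell builtins (no readlink, awk, grep or cut).
# Assumption: the name comes from the Name: line of status, which the kernel
# truncates to 15 characters, rather than from the basename of the exe link.
swap_report() {
    local root dir key value rest name sw
    root=${1:-/proc}
    for dir in "$root"/[0-9]*; do
        [ -r "$dir/status" ] || continue
        name='' sw=''
        # Each status line looks like "Key:<tab>value ..."; split on whitespace.
        while read -r key value rest; do
            case $key in
                Name:)   name=$value${rest:+ $rest} ;;
                VmSwap:) sw=$value ;;
            esac
        done <"$dir/status"
        # Skip processes with no VmSwap line or with nothing swapped out.
        if [ -n "$sw" ] && [ "$sw" != 0 ]; then
            echo "${dir##*/} $sw $name"
        fi
    done
}
```

Called with no arguments it walks /proc, and the output has the same three columns as the loop above, so it can be piped to sort in the same way. The trade-off is the truncated name, which may or may not matter to you.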
You can, of course, sort the output:
for pid in /proc/[0-9]*; do
    pn=$(readlink "$pid/exe")
    [[ $pn ]] || continue
    sw=$(awk <"$pid/status" '/VmSwap/ { if ($2) print $2 }')
    [[ $sw ]] && echo "${pid##*/} $sw ${pn##*/}"
done | sort -nk2
Use sort -rnk2 to sort in descending order.
For some fancier output, pipe it all to printf, reverse the sort, and keep the first 10 processes to get a top 10:
for pid in /proc/[0-9]*; do
    pn=$(readlink "$pid/exe")
    [[ $pn ]] || continue
    sw=$(awk <"$pid/status" '/VmSwap/ { if ($2) print $2 }')
    [[ $sw ]] && echo "${pid##*/} $sw ${pn##*/}"
done | sort -rnk2 | head -n10 | xargs printf "%5d %8d %s\n"
This will produce something like this:
 3841   253140 stonithd
 3844    12840 pengine
 3840     2048 cib
 3745     1972 snmpd
 3791     1856 corosync
35028      452 nrpe
 3843      416 attrd
37928      408 udevd
 3842      388 lrmd
  618      340 udevd
And here's the culprit: it looks like stonithd is taking up the bulk of swap space. Shame on you, stonithd, you glutton. Still, nowhere near as bad as on another machine, where I found it hoarding 7GB of swap space for no discernible reason, the pig.
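One caveat about the xargs printf step in the top 10 pipeline: xargs splits its input on whitespace, so a name with a space in it (for instance an exe link that ends in " (deleted)") would throw the columns out of step. A possible workaround, sketched here with a hypothetical helper called format_swap, is to do the formatting in a read loop so that everything after the second field stays together as the name.

```shell
# format_swap: read "PID KB NAME" lines on stdin and print aligned columns.
# Hypothetical helper. Everything after the second field is treated as the
# name, so names containing spaces survive intact.
format_swap() {
    local pid sw name
    while read -r pid sw name; do
        printf '%5d %8d %s\n' "$pid" "$sw" "$name"
    done
}
```

It would take the place of the xargs stage, as in: ... | sort -rnk2 | head -n10 | format_swap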