Ack! What's using up all my disk space?

Finding what's using up your disk space goes more smoothly if you're ready to run a series of helpful commands


Disk space filling up can really mess your systems up, especially if the root partition gets to 100%. And that can happen when you least expect it. In fact, that's generally when it does happen. So, the question is, when you're under pressure to get a critical system or application back online and that "no space left on device" message keeps slapping you in the face, what do you do?

One of the best answers is to pull out some tools that can help you figure out what is happening. It helps considerably to know whether the shortage has been creeping up on you gradually or is a response to some serious problem that is plaguing your server right now. If you haven't been routinely monitoring disk space usage, that's not an easy thing to figure out. Having a series of commands set up to pinpoint the problem, on the other hand, can still help you identify it fairly quickly. Yeah, the Boy Scouts' "be prepared" motto works for system administrators too.

Finding the big files

If there's any question about where your disk space shortage is coming from, a quick df -h will highlight the file system that has run out of room. This is pretty obvious, but is always a good place to start.

$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p2      20G  544M   18G   3% /
/dev/cciss/c0d0p7      20G  2.5G   16G  14% /usr
/dev/cciss/c0d0p6      20G  520M   18G   3% /var
/dev/cciss/c0d0p5      20G  7.2G   12G  40% /apps
/dev/cciss/c0d0p1      99M   23M   72M  25% /boot
tmpfs                1014M     0 1014M   0% /dev/shm
fermion:/u            193G  5.0G  178G   3% /u
fermion:/var/spool/mail
                       19G  554M   18G   4% /var/spool/mail
/dev/cciss/c0d0p8      19G  9.8G  8.2G  55% /u/home

The output of the df command will show you how full each file system is, displaying the space used and available in the "human readable" format (e.g., 18G instead of 18710668). You can also get a little fancier with this command. Using -l (local), for example, you can omit file systems that are mounted from other systems.

$ df -hl
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p2      20G  544M   18G   3% /
/dev/cciss/c0d0p7      20G  2.5G   16G  14% /usr
/dev/cciss/c0d0p6      20G  520M   18G   3% /var
/dev/cciss/c0d0p5      20G  7.2G   12G  40% /apps
/dev/cciss/c0d0p1      99M   23M   72M  25% /boot
tmpfs                1014M     0 1014M   0% /dev/shm
/dev/cciss/c0d0p8      19G  9.8G  8.2G  55% /u/home

You can also pipe the df output through the sort command to list your file systems in percentage-full order. In the command below, we're doing a numeric sort on the fifth field of the df output.

$ df -hl | sort -n -k5
Filesystem            Size  Used Avail Use% Mounted on
tmpfs                1014M     0 1014M   0% /dev/shm
/dev/cciss/c0d0p2      20G  544M   18G   3% /
/dev/cciss/c0d0p6      20G  520M   18G   3% /var
/dev/cciss/c0d0p7      20G  2.5G   16G  14% /usr
/dev/cciss/c0d0p1      99M   23M   72M  25% /boot
/dev/cciss/c0d0p5      20G  7.2G   12G  40% /apps
/dev/cciss/c0d0p8      19G  9.8G  8.2G  55% /u/home
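If sorting is more than you need, a small filter can show only the file systems that are getting full. What follows is a rough sketch, not a polished tool. The function name over_threshold and the default of 80 percent are my own choices, and it assumes mount points without spaces in their names. It reads df output from standard input; feeding it df -P (the POSIX output format) keeps each file system on one line, which avoids the wrapped entries seen in the first listing above.

```shell
# Hypothetical helper: list file systems at or above a given Use% value.
# Reads "df -P" style output on standard input.
over_threshold () {
    # $1 = threshold percentage (assumed default: 80)
    awk -v limit="${1:-80}" 'NR > 1 {      # NR > 1 skips the header line
        pct = $5
        sub(/%/, "", pct)                  # strip the trailing % sign
        if (pct + 0 >= limit + 0) print $5, $6
    }'
}

# Example: df -Pl | over_threshold 90
```

Because the function only reads standard input, you can try it out on saved df output before trusting it on a live system.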

The du (disk usage) command can then help you to focus on where, within the file system, that space is going. In the command below, we are looking at the largest directories under a particular directory (hold).

$ du hold | sort -n | tail -5
96      hold/Lab10
160     hold/pr09
244     hold/fetch
616     hold/2010
2072    hold
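When the directory you start from sits on the file system that is filling up, it can help to keep du from wandering into other mounted file systems. Here is one way that might look. The name topdirs and its defaults are only assumptions for illustration; the -k option reports sizes in kilobytes and -x keeps du on a single file system.

```shell
# Hypothetical helper: show the largest directories under a starting point,
# in kilobytes, without crossing into other mounted file systems.
topdirs () {
    # $1 = starting directory (assumed default: .)
    # $2 = number of lines to show (assumed default: 5)
    du -kx "${1:-.}" 2>/dev/null | sort -n | tail -n "${2:-5}"
}

# Example: topdirs /var 10
```

The largest entry lands on the last line, so the worst offenders are still on screen when the output scrolls.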

What you probably will want to know, however, is not just which subdirectories are the largest, but which have recently grown. For this, you can add the --time option (a GNU du feature) so that du displays the last modification time for each directory, and then sort the output on that field.

$ du --time . | sort -k2 | tail -5
4       2015-03-12 16:20        ./src
60      2015-06-15 08:23        ./log
2708    2015-07-20 18:09        ./bin
45836   2015-08-15 16:43        .
160     2015-08-15 16:43        ./projects
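If your du doesn't offer --time, find can answer a similar question from the file side: which sizable files have changed lately? This is a sketch under my own assumptions. The name recentbig, the one-day window, and the 1024 KB minimum are all arbitrary starting points, not anything prescribed above.

```shell
# Hypothetical helper: list files modified within the last N days
# that are larger than a minimum size.
recentbig () {
    # $1 = starting directory (assumed default: .)
    # $2 = days to look back (assumed default: 1)
    # $3 = minimum size in KB (assumed default: 1024)
    find "${1:-.}" -xdev -type f -mtime "-${2:-1}" \
        -size "+$(( ${3:-1024} * 1024 ))c" -print 2>/dev/null
}

# Example: recentbig /var/log 2 10240
```

A log file that has ballooned overnight should show up here even when its directory looks unremarkable in a du listing.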

You can also search for the largest files under a particular directory using a find command like this one. The sort orders the listing on the seventh field, which is the file size in bytes:

$ find . -size +1M -ls | sort -n -k7
15089762 1264 -rw-r-----   1 shs      staff   1289365 Feb 24  2014 ./bin/235.log
12731834 1724 -rw-r-----   1 shs      staff   1761280 Oct 15  2014 ./bin.tar
13320206 2192 -rw-------   1 shs      staff   2239058 Dec  8  2014 ./mail/lab7
13320203 6308 -rw-------   1 shs      staff   6443348 Oct 26  2014 ./mail/lab6
12731744 19736 -rw-r-----   1 shs      staff  20183040 Jul 29  2014 ./backup.tar
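Sorting on a column of the -ls listing works, but it depends on that listing's layout. A variation that might travel a little better is to have find hand the matching files to du and sort on du's first column. Treat this as a sketch: the name bigfiles and its defaults are assumptions of mine, and the minimum size is given in bytes because the c suffix is the one size unit find accepts everywhere.

```shell
# Hypothetical helper: list the largest files under a starting point,
# smallest first, with disk usage in kilobytes.
bigfiles () {
    # $1 = starting directory (assumed default: .)
    # $2 = minimum size in bytes (assumed default: 1048576, i.e. 1 MB)
    # $3 = number of files to show (assumed default: 10)
    find "${1:-.}" -xdev -type f -size +"${2:-1048576}"c \
        -exec du -k {} + 2>/dev/null | sort -n | tail -n "${3:-10}"
}

# Example: bigfiles /home 104857600 20
```

Note that du reports the space a file occupies on disk, which can differ from the byte count that find -ls shows.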

Adjust the size parameter to something that works well on your system or you might have a lot more output than you care to comb through.

Note that looking both for large files and large subdirectories can be helpful because the problem plaguing your system could be that a single file (e.g., a log file) is growing more quickly than it should or that a series of files is being added to the system.

Setting up your most effective commands as aliases and functions

Once you've selected a series of commands that give you the view of your disk usage that works for you, committing them to aliases makes them easy to use even when you're under pressure. Add the aliases to your .bashrc or other startup file so that they're ready for use.

alias bigfs='df -hl | sort -n -k5'
alias findbig='find / -size +1G -ls'

Using a function instead of an alias, on the other hand, makes it easy to set up your command to take an argument. That makes your commands considerably more flexible and useful.

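function bigdirs () {
   # list the five largest directories under $1 (default: current directory)
   du "${1:-.}" | sort -n | tail -5
}

function findbig () {
   # list files larger than 1 GB under $1 (default: current directory)
   find "${1:-.}" -size +1G -ls
}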

Just don't do both (aliases and functions). Your aliases will take precedence over your functions if they have the same names.
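If you're not sure whether an old alias is still hanging around, one cautious approach is to remove it just before defining the function. This is only a sketch. The optional second argument (a size limit) and the -xdev option are additions of mine, and the G size suffix is an extension offered by GNU and BSD find rather than something every find supports.

```shell
# Remove any alias named findbig first (quietly, if there is none),
# so the function defined below is what actually runs.
unalias findbig 2>/dev/null || true

findbig () {
    # $1 = starting directory (assumed default: .)
    # $2 = size limit in find's -size syntax (assumed default: 1G)
    find "${1:-.}" -xdev -type f -size "+${2:-1G}" -print 2>/dev/null
}

# Example: findbig /apps 500M
```

The unalias line has to come before the function definition, since the shell expands aliases as it reads each command.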

This story, "Ack! What's using up all my disk space?" was originally published by ITworld.
