=^.^=

Find and Delete the Largest Files on a File System

karma

Using Find on the CLI

The fastest way to find large or runaway files on a whole filesystem or in a specific directory is to run:

find /path -xdev -follow -type f | xargs ls -lsh | sort -rhk 6,6 | head -20

where /path is the target and 20 is the number of results you would like to see (sparing yourself a flooded terminal buffer). sort's -h flag understands the human-readable sizes (K, M, G) that ls -h prints, and -k 6,6 sorts on the size column. The output looks something like:

16M  -rwxr-xr-x 1 user group  16M Jan  6 06:02 ./static/files/windows/bootdist.zip
2.0M -rw-r--r-- 1 user group 2.0M Jun 23  2022 ./static/files/2022/06/23/sonic-codebg.png
2.0M -rw-r--r-- 1 user group 2.0M Jan 21  2022 ./static/files/2022/01/21/laptopbags.jpg
1.6M -rw-r--r-- 1 user group 1.6M Jan 12  2020 ./static/files/2020/01/12/remainindoors.gif
1.4M -rw-r--r-- 1 user group 1.4M Jan 19  2022 ./static/files/2022/01/19/wunderland.mp4
1.4M -rw-r--r-- 1 user group 1.4M Dec 15  2021 ./static/files/2021/12/15/tradesecrets.png
988K -rw-r--r-- 1 user group 985K Jul 19  2022 ./static/files/2022/07/19/me_bitlockerreset.png
904K -rw-r--r-- 1 user group 904K Jun 13  2023 ./static/files/2023/06/13/img_20230612_114518.jpg
660K -rw-r--r-- 1 user group 657K Jun 20  2019 ./static/files/2019/06/20/foxpaws_2019.png
…

NOTE: The -xdev argument tells find not to descend into other filesystems mounted under the given path, i.e. it lets you search from / (root) while avoiding the loops and pitfalls of /proc, /sys, /dev, etc. If, for example, your /home folder is on a different partition and you use -xdev, it won't be searched; you will need to run the search again against that specific path.
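The xargs pipeline above can mangle filenames containing spaces or newlines. A more robust sketch, assuming GNU find (-printf is a GNU extension); it is written to run against the current directory, so cd to (or substitute) your target:

```shell
# List the 20 largest regular files under the current directory,
# one "size<TAB>path" pair per line; whitespace-safe because the
# names are never handed off to a second command.
find . -xdev -type f -printf '%s\t%p\n' | sort -rn | head -20

# If you prefer ls's human-readable columns, make the hand-off
# null-delimited so odd filenames survive intact:
find . -xdev -type f -print0 | xargs -0 ls -lsh | sort -rhk 6,6 | head -20
```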

Using du from the CLI

du recursively summarizes the disk usage of a given path; with -a it reports individual files as well as directories. To list the 20 largest entries under a given tree, run:

du -ah /path | sort -nr | head -n 20

1016K /nsm/repo/transmission-gtk-3.00-14.el9.x86_64.rpm
1016K /nsm/repo/lvm2-libs-2.03.21-3.el9.x86_64.rpm
1012K /nsm/repo/lvm2-libs-2.03.17-7.el9.x86_64.rpm
1012K /nsm/mysql/playbook/issues.ibd
1008K /nsm/repo/cockpit-ws-300.3-1.0.1.el9_3.x86_64.rpm
1008K /nsm/repo/cockpit-ws-300.1-1.0.1.el9_3.x86_64.rpm
1008K /nsm/repo/bluez-5.64-2.el9.x86_64.rpm
1008K /nsm/elasticsearch/indices/mRlZOMbzTuydO8jY_0qayQ/0/index/_kv.cfs
1000K /nsm/repo/python3-pillow-9.1.1-4.el9.x86_64.rpm
1000K /nsm/repo/ibus-typing-booster-2.11.0-5.el9.noarch.rpm
1000K /nsm/repo/gnome-keyring-40.0-3.el9.x86_64.rpm
1000K /nsm/repo/exiv2-0.27.5-2.el9.x86_64.rpm
992K /nsm/repo/stix-fonts-2.0.2-11.el9.noarch.rpm
988K /nsm/repo/urw-base35-p052-fonts-20200910-6.el9.noarch.rpm
988K /nsm/repo/btrfs-progs-5.15.1-0.el9.x86_64.rpm
988K /nsm/repo/annobin-12.12-1.el9.x86_64.rpm
980K /nsm/repo/glibc-langpack-en-2.34-60.0.2.el9.x86_64.rpm
972K /nsm/elasticsearch/indices/1Jf3pLWrTvOQ5E_Nx1VDcw/0/index/_29s.cfs
968K /nsm/repo/xorg-x11-server-Xwayland-22.1.9-2.el9.x86_64.rpm
968K /nsm/docker-registry/docker/registry/v2/blobs/sha256/7d

Note that sort -n compares only the leading digits, ignoring the K/M/G suffixes that du -h emits (which is why the sample output tops out at 1016K). Where GNU sort is available, prefer sort -rh, which compares human-readable sizes correctly.
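GNU sort grew the -h (human-numeric) flag precisely because plain numeric sort compares only the leading digits of sizes like 1016K or 56G. A quick sketch demonstrating the difference:

```shell
# With -n, 1016 beats 56 because the K/G suffixes are ignored;
# with -h, 56G correctly outranks 1016K.
printf '1016K\t/small\n56G\t/big\n' | sort -rn | head -1   # 1016K  /small
printf '1016K\t/small\n56G\t/big\n' | sort -rh | head -1   # 56G    /big
```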

To list the 20 largest directories (and files) by the total size of their contents, with sizes in whole megabytes so the suffix problem disappears (useful when seeking out profuse collections of small files):

du -aBm /nsm | sort -nr | head -n 20

56692M /nsm
34486M /nsm/elasticsearch
34483M /nsm/elasticsearch/indices
32647M /nsm/elasticsearch/indices/aTbHOx-WS6e5D1VrmooK9w
21549M /nsm/elasticsearch/indices/aTbHOx-WS6e5D1VrmooK9w/1/index
21549M /nsm/elasticsearch/indices/aTbHOx-WS6e5D1VrmooK9w/1
11098M /nsm/elasticsearch/indices/aTbHOx-WS6e5D1VrmooK9w/0/index
11098M /nsm/elasticsearch/indices/aTbHOx-WS6e5D1VrmooK9w/0
8310M /nsm/repo
6280M /nsm/docker-registry/docker
6280M /nsm/docker-registry
6256M /nsm/docker-registry/docker/registry/v2
6256M /nsm/docker-registry/docker/registry
6254M /nsm/docker-registry/docker/registry/v2/blobs/sha256
6254M /nsm/docker-registry/docker/registry/v2/blobs
5138M /nsm/elasticsearch/indices/aTbHOx-WS6e5D1VrmooK9w/1/index/_oqm.fdt
3660M /nsm/backup
2798M /nsm/elastic-fleet/artifacts
2798M /nsm/elastic-fleet
2748M /nsm/elasticsearch/indices/aTbHOx-WS6e5D1VrmooK9w/1/index/_oqm_Lucene90_0.dvd
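For a quick per-directory overview one level deep, rather than every file in the tree, GNU du's --max-depth option keeps the output short; a sketch, run against the current directory:

```shell
# Total each immediate subdirectory, stay on one filesystem (-x),
# and sort with -h so the K/M/G sizes compare correctly.
du -xh --max-depth=1 . | sort -rh
```

The grand total for the path itself sorts to the top; increase --max-depth to drill further down.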

Using Graphical Utilities in the GUI

[attachment-lDtttU]
Baobab (Gnome Disk Analyzer)

There is a much cooler, though less efficient, way: graphical file explorers that represent disk usage in a more immediate and visually intuitive fashion. Directories are inset within their parents and sized in proportion to the space they use. There are a number of options:

  • FSView (or File Size View) in KDE Plasma's Konqueror. You may need to install the konqueror-plugins or konqueror-plugin-fsview package, depending on your flavour, if it is not already available.
  • The same functionality is provided by standalone wrappers with a smaller dependency footprint, KDirStat and qfsview.
  • Baobab, also styled as Gnome Disk Analyzer on some distributions.
  • Filelight
  • qdirstat

opensource.com > 3 open source GUI disk usage analyzers for Linux takes a deep dive into the latter three options, while the preceding are covered (in French) at Coagul.org > Outils pour analyser l'espace disque et visualiser l'occupation d'un disque.

[attachment-WG6U6L]
KDE Plasma's Konqueror > FSView

FSView does not work over KIO abstractions (ssh/sftp/fish/ftp, etc.) but works fine over NFS, though it tends to perform excruciatingly slowly there.

[attachment-a5wWkL]
KDirStat (original source: Outils pour analyser l'espace disque… by coagul.org)

Open File Locks

Your woes may not be over. Even after files are deleted, they can stick around until every process using them has terminated or released its handle on the file's inode. Continue reading Find the Largest Open Files and Their Owner(s) on Linux with lsof if you are experiencing problems with ghost files and/or goblins.
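The effect is easy to reproduce; a sketch, assuming a Linux /proc filesystem:

```shell
# Hold a file open on descriptor 3, unlink it, and observe that the
# kernel keeps the inode alive until the descriptor is closed.
tmpfile=$(mktemp)
exec 3< "$tmpfile"        # open for reading on fd 3
rm "$tmpfile"             # unlink; the disk space is NOT reclaimed yet
ls -l /proc/$$/fd/3       # the symlink target now reads "... (deleted)"
exec 3<&-                 # close the fd; only now is the space freed
```

System-wide, `lsof +L1` lists open files whose link count has dropped below one, i.e. deleted-but-still-open files, which is usually the fastest way to find the process holding your space hostage.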

Comments

• litestar

the nice thing about -size is to use +/-, signifying greater than/less than:

find . -type f -size +30M

find . -type f -size 30M

find . -type f -size -30M

same goes for create/access time, &c. nice utility, all said.