Table of Contents#
- Basic Method: Using du with Wildcards (Simple but Limited)
- Robust Method: Using find and du (Handles Thousands of Files)
- Precise Method: Using find, printf, and awk (Bytes to Human-Readable)
- Advanced Customizations
- Troubleshooting Common Issues
- Conclusion
- References
Basic Method: Using du with Wildcards (Simple but Limited)#
The simplest way to calculate the total size of files matching a wildcard is to use the du (disk usage) command with the wildcard pattern. du is designed to estimate file and directory space usage, and with the right flags, it can sum sizes of specific files.
Command Syntax#
du -ch *.jpg | grep total
How It Works#
Let’s break down the command:
- du: The core command for disk usage.
- -c: Short for --total; adds a grand total line to the output.
- -h: Short for --human-readable; displays sizes with unit suffixes (K, M, G, etc.).
- *.jpg: The wildcard pattern to match files (here, all .jpg files in the current directory).
- | grep total: Filters the output to show only lines containing "total", which is normally just the grand total line.
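One caveat with the grep total filter: it keeps every line containing the word "total", including files whose names contain it. The following is a minimal sketch, assuming GNU du and a scratch directory with hypothetical file names, that takes the last line of the output instead:

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Two hypothetical files; one has "total" in its name.
printf 'aaaa' > photo1.jpg
printf 'bbbb' > total_recall.jpg

# grep matches the file name as well as the grand-total line.
grep_lines=$(du -ch -- *.jpg | grep -c total)

# du -c prints the grand total last, so tail -n 1 isolates it.
tail_line=$(du -ch -- *.jpg | tail -n 1)

echo "grep matched $grep_lines lines"
echo "tail kept: $tail_line"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```

Taking the last line is a small change, but it keeps the result to exactly one line regardless of how the files are named.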
Example Output#
Suppose you have three .jpg files in your current directory:
$ ls -lh *.jpg
-rw-r--r-- 1 user user 2.3M Jan 1 10:00 photo1.jpg
-rw-r--r-- 1 user user 1.8M Jan 1 10:01 photo2.jpg
-rw-r--r-- 1 user user 5.2M Jan 1 10:02 photo3.jpg
Running du -ch *.jpg | grep total gives:
9.3M total
Limitations#
This method works well for small numbers of files, but it has a critical flaw:
- "Argument list too long" error: If there are thousands of .jpg files, the expanded *.jpg list can exceed the OS's limit on the total size of arguments passed to a new process (ARG_MAX), so the shell cannot launch du. This results in an error like:
-bash: /usr/bin/du: Argument list too long
For large datasets, we need a more robust approach.
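To get a feel for how close a directory is to this limit, you can compare the system's ARG_MAX with the space the expanded pattern would occupy. This is a rough sketch, run in a scratch directory with hypothetical file names. It assumes a shell such as bash where printf is a builtin, which is why printf can receive the expanded glob without a new process being launched. The real limit also covers the environment, so treat the comparison as an estimate.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch photo1.jpg photo2.jpg photo3.jpg   # hypothetical sample files

# Maximum combined size (in bytes) of arguments plus environment for a new process.
arg_max=$(getconf ARG_MAX)

# Bytes the expanded glob would occupy as arguments (each name plus a NUL).
# printf is a builtin here, so this expansion is not itself subject to ARG_MAX.
glob_bytes=$(printf '%s\0' *.jpg | wc -c)

echo "ARG_MAX:    $arg_max"
echo "glob bytes: $glob_bytes"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```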
Robust Method: Using find and du (Handles Thousands of Files)#
To avoid the "argument list too long" error, use find to locate the files and have it run du on them through -exec. Because the pattern is quoted, the shell never expands it; find does the matching itself, which makes this approach suitable for large numbers of files.
Command Syntax#
find . -type f -name "*.jpg" -exec du -ch {} + | grep total
How It Works#
Let’s dissect each part:
- find .: Starts searching from the current directory (.). Replace . with a specific path (e.g., /home/user/photos) to search elsewhere.
- -type f: Ensures we only match regular files (not directories, symlinks, etc.). Critical to avoid summing directory contents!
- -name "*.jpg": Matches files with names ending in .jpg (case-sensitive).
- -exec du -ch {} +: Executes du -ch on the found files. The {} + syntax passes as many files as fit on one command line to each du invocation, so the argument limit is never exceeded. If the list is long enough to need several invocations, each one prints its own total line.
- | grep total: Extracts the total line (or lines, if du ran more than once).
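When find has to split the file list across several du invocations, the output contains more than one total line. One way to get a single grand total, assuming GNU find and GNU du (the --files0-from option and -b are GNU extensions), is to stream NUL-separated names into one du process. A sketch on a scratch tree with hypothetical names:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir old_photos
printf 'aaaa' > photo1.jpg
printf 'bbbb' > "old_photos/photo 4.jpg"   # a space in the name is handled safely
printf 'cccc' > notes.txt                  # does not match, must not be counted

# -print0 emits NUL-terminated paths; du reads them from stdin ("-"),
# so there is one du process and therefore one total line.
# -b reports apparent size in bytes, which keeps the result exact.
total_line=$(find . -type f -name '*.jpg' -print0 | du -cb --files0-from=- | tail -n 1)

# Keep only the number in front of the word "total".
total_bytes=${total_line%%[[:space:]]*}

echo "$total_bytes bytes in matching files"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```

Replacing -cb with -ch gives the familiar human-readable total, still as a single line.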
Example Output#
For the same three .jpg files (plus a subdirectory old_photos/ with a 3.5M photo4.jpg):
$ find . -type f -name "*.jpg" -exec du -ch {} + | grep total
13M total # 2.3M + 1.8M + 5.2M + 3.5M = 12.8M (du -h rounds up, so 13M)
Key Advantages#
- Handles thousands of files: No "argument list too long" errors.
- Searches subdirectories: By default, find recurses into subdirectories (add -maxdepth 1 to limit to the current directory).
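The difference between the recursive default and -maxdepth 1 can be checked on a small scratch tree. This sketch assumes GNU find (for -printf) and uses hypothetical file names; sum_jpg is an illustrative helper, and it sums bytes so the two totals are easy to compare:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir old_photos
printf '12345'      > photo1.jpg             # 5 bytes, top level
printf '1234567890' > old_photos/photo4.jpg  # 10 bytes, in a subdirectory

# Helper: sum the sizes (in bytes) of the *.jpg files that find reports.
# Extra arguments (such as -maxdepth 1) are placed before the tests.
sum_jpg() {
    find . "$@" -type f -name '*.jpg' -printf '%s\n' |
        awk '{ total += $1 } END { printf "%d\n", total }'
}

recursive=$(sum_jpg)              # descends into old_photos/
top_only=$(sum_jpg -maxdepth 1)   # current directory only

echo "recursive: $recursive bytes"
echo "top only:  $top_only bytes"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```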
Precise Method: Using find, printf, and awk (Bytes to Human-Readable)#
If you need precise byte-level accuracy (not rounded human-readable sizes), use find to print file sizes in bytes and awk to sum them. You can then convert bytes to human-readable format with numfmt.
Step 1: Sum Sizes in Bytes#
find . -type f -name "*.jpg" -printf "%s\n" | awk '{total += $1} END {print total}'
Explanation#
- -printf "%s\n": find prints the size of each file in bytes (%s) followed by a newline (\n).
- awk '{total += $1} END {print total}': awk sums all bytes ($1 is the first column, i.e., the size) and prints the total.
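Two edge cases are worth guarding against in the awk step. With no matching files, total is never assigned, so print total emits an empty line. In addition, some awk implementations may print very large sums in exponential notation. The following sketch formats the sum explicitly; it assumes GNU find, and sum_bytes and the file names are hypothetical:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'abc'   > a.jpg   # 3 bytes
printf 'defgh' > b.jpg   # 5 bytes

# "%.0f" prints the sum as a plain integer, and an unset total prints as 0.
sum_bytes() {
    find . -type f -name "$1" -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}

jpg_total=$(sum_bytes '*.jpg')   # two files match
png_total=$(sum_bytes '*.png')   # nothing matches

echo "jpg: $jpg_total bytes, png: $png_total bytes"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```

Printing 0 for an empty match also keeps a later numfmt step from receiving an empty line.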
Example Output#
For our earlier files (2.3M = 2,411,724 bytes; 1.8M = 1,889,568; 5.2M = 5,452,595; 3.5M = 3,670,016):
$ find . -type f -name "*.jpg" -printf "%s\n" | awk '{total += $1} END {print total}'
13423903 # Total bytes (2,411,724 + 1,889,568 + 5,452,595 + 3,670,016 = 13,423,903)
Step 2: Convert Bytes to Human-Readable Format#
Use numfmt to convert bytes to KB/MB/GB:
find . -type f -name "*.jpg" -printf "%s\n" | awk '{total += $1} END {print total}' | numfmt --to=iec
Example Output#
13M # 13,423,903 bytes ≈ 13 MiB (IEC standard, where 1 MiB = 1,048,576 bytes)
For SI units (1 MB = 1,000,000 bytes), use --to=si:
... | numfmt --to=si # Output: 14M (13.42 MB; numfmt rounds away from zero by default)
Advanced Customizations#
Case-Insensitive Search#
To match .jpg, .JPG, .Jpg, etc., use -iname instead of -name in find:
find . -type f -iname "*.jpg" -exec du -ch {} + | grep total
Excluding Directories#
To exclude specific directories (e.g., node_modules/ or backup/), use -path with -prune:
find . -type d -path "./backup" -prune -o -type f -name "*.jpg" -exec du -ch {} + | grep total
- -type d -path "./backup" -prune: Skips the ./backup directory.
- -o: Stands for "or"; continues searching other paths.
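To confirm that the pruned directory really is left out, the same expression can be run against a scratch tree. This sketch assumes GNU find, swaps du for a byte sum so the result is exact, and uses hypothetical file names:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir backup photos
printf '1234'   > photos/keep.jpg   # 4 bytes, should be counted
printf '123456' > backup/skip.jpg   # 6 bytes, should be ignored

# Left of -o: prune ./backup. Right of -o: print sizes of the remaining *.jpg files.
kept_bytes=$(find . -type d -path ./backup -prune -o \
                 -type f -name '*.jpg' -printf '%s\n' |
             awk '{ total += $1 } END { printf "%.0f\n", total }')

echo "counted: $kept_bytes bytes"

# Clean up the scratch directory.
cd / && rm -rf "$tmp"
```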
Searching Specific Directories#
To search only a specific directory (e.g., /home/user/photos), replace . with the path:
find /home/user/photos -type f -name "*.jpg" -exec du -ch {} + | grep total
Troubleshooting Common Issues#
"Permission Denied" Errors#
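If find encounters directories you can’t access, it prints "Permission denied" messages but continues with the rest of the tree. To hide the messages, redirect standard error to /dev/null (note that this also hides any errors from du):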
find . -type f -name "*.jpg" -exec du -ch {} + 2>/dev/null | grep total
No Files Found#
If no .jpg files exist, grep total will return nothing. To handle this, check the exit code or add a fallback message:
result=$(find . -type f -name "*.jpg" -exec du -ch {} + | grep total)
if [ -z "$result" ]; then echo "No .jpg files found."; else echo "$result"; fi
Incorrect Total (Summing Directories)#
Always include -type f in find to avoid summing directories. Without it, du will sum the contents of directories named *.jpg, leading to inflated totals.
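The troubleshooting points above can be folded into one small helper. This is only a sketch: wildcard_total is a hypothetical function name, the file names are made up, and it assumes GNU find and numfmt are available.

```shell
# wildcard_total DIR PATTERN
# Prints the combined size of regular files under DIR whose names match PATTERN.
wildcard_total() {
    dir=$1
    pattern=$2
    # -type f skips directories; 2>/dev/null hides "Permission denied" noise.
    # awk emits "<bytes> <file count>" on one line.
    bytes=$(find "$dir" -type f -name "$pattern" -printf '%s\n' 2>/dev/null |
            awk '{ total += $1; n++ } END { printf "%.0f %d\n", total, n }')
    count=${bytes#* }
    bytes=${bytes% *}
    if [ "$count" -eq 0 ]; then
        echo "No files matching $pattern found."
        return 1
    fi
    echo "$(numfmt --to=iec "$bytes") total ($count files)"
}

# Usage on a scratch directory with hypothetical files.
tmp=$(mktemp -d)
mkdir "$tmp/album.jpg"                    # a directory whose name matches the pattern
printf '12345' > "$tmp/album.jpg/a.jpg"   # 5 bytes
printf '123'   > "$tmp/b.jpg"             # 3 bytes

status=0
found=$(wildcard_total "$tmp" '*.jpg')
missing=$(wildcard_total "$tmp" '*.png') || status=$?

echo "$found"
echo "$missing (exit status $status)"

# Clean up the scratch directory.
rm -rf "$tmp"
```

The directory named album.jpg is not counted itself; only the two regular files are, and the non-zero exit status lets calling scripts detect the empty case.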
Conclusion#
Calculating the total size of files matching a wildcard in Linux is straightforward with the right tools:
- For small datasets: Use du -ch *.jpg | grep total (simple but limited by shell arguments).
- For large datasets: Use find . -type f -name "*.jpg" -exec du -ch {} + | grep total (robust and recursive).
- For precise byte counts: Use find + printf + awk + numfmt (accurate and customizable).
With these methods, you can efficiently manage disk space, audit file sizes, and streamline your workflow.