Table of Contents#
- Understanding the Problem: Why Disk Space Matters
- Prerequisites
- Step 1: Check Current Disk Space Usage
- Step 2: Identify Safe Files to Delete
- Step 3: Write the Shell Script
- Step 4: Test the Script
- Step 5: Automate with Cron
- Safety Considerations (Critical for Beginners!)
- Troubleshooting Common Issues
- Conclusion
- References
Understanding the Problem: Why Disk Space Matters#
Linux systems rely on free disk space for core operations: storing logs, caching data, installing updates, and even temporary files for running applications. When disk space runs low (e.g., >90% usage), you might experience:
- Slow system performance (due to heavy I/O operations).
- Failed software installations or updates.
- Crashes in applications that can’t write to disk (e.g., web servers, databases).
- Corrupted files if the system abruptly runs out of space mid-write.
Manually checking disk space with df -h and deleting files is a short-term fix, but it’s unsustainable. Automating this process ensures your system stays healthy without constant monitoring.
Prerequisites#
Before diving in, ensure you have:
- A Linux system (any distribution: Ubuntu, Debian, CentOS, Fedora, etc.).
- Basic familiarity with the terminal (e.g., navigating directories with `cd`, listing files with `ls`).
- `sudo` access (to modify system files and schedule cron jobs).
- A text editor (we’ll use `nano` for simplicity, but `vim` or `gedit` works too).
Step 1: Check Current Disk Space Usage#
First, let’s learn how to check disk space. The most common command is df -h, which displays disk usage in "human-readable" format (GB, MB instead of raw bytes).
Run the command:#
df -h
Sample Output:#
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 200G 150G 40G 79% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
/dev/sdb1 500G 450G 50G 90% /data
Key Columns Explained:#
- Filesystem: The storage device (e.g., /dev/sda1 is your main hard drive).
- Size: Total disk space.
- Used: Space currently in use.
- Avail: Free space available.
- Use%: Percentage of disk used (this is critical: we’ll use it to trigger our script).
- Mounted on: The directory where the filesystem is attached (e.g., / is the root directory).
For our script, we’ll focus on the Use% column. If a filesystem (e.g., /data in the example) exceeds a threshold (e.g., 90%), we’ll delete old/unnecessary files to free up space.
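Before writing the full script, you can try the threshold check as a one-liner. A minimal sketch, using / as a stand-in mount point (substitute your own, e.g. /data):

```shell
# Grab the Use% column for a mount point and strip the trailing "%".
# df -P forces one-line-per-filesystem POSIX output; awk picks row 2,
# column 5 (Use%); sed drops the "%" so the shell can compare it as a number.
usage=$(df -P / | awk 'NR==2 {print $5}' | sed 's/%//')
echo "Usage for /: ${usage}%"
```

If `usage` prints as a plain number (e.g. 42), the same pipeline will work inside the script’s `if` comparison later on.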
Step 2: Identify Safe Files to Delete#
Not all files are safe to delete! Deleting system files (e.g., /bin/, /etc/) or user data (e.g., Documents/, Pictures/) can break your system or cause data loss. Stick to non-critical, temporary, or log files that regenerate over time.
Safe Targets for Deletion:#
- Log files: Stored in /var/log/ (e.g., syslog, auth.log). Old logs are rarely needed.
- Cache files: Application caches (e.g., /var/cache/, browser caches in ~/.cache/).
- Old backups: Stale backups in /backup/ or on external drives.
- Temporary files: In /tmp/ (though Linux often cleans these automatically).
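To see which directories are actually eating space before you pick a target, `du` piped through `sort` is handy. Here is a sketch demoed on a throwaway directory so it is safe to run anywhere; point `du` at a real path like /var once you are comfortable:

```shell
# Build a disposable directory tree with one obviously large file
demo=$(mktemp -d)
mkdir -p "$demo/cache" "$demo/logs"
head -c 524288 /dev/zero > "$demo/logs/big.log"   # ~512 KB of zeros

# du -sk: per-entry sizes in KB; sort -n: smallest first; tail -n 1: biggest
biggest=$(du -sk "$demo"/* | sort -n | tail -n 1 | awk '{print $2}')
echo "Biggest entry: $biggest"

rm -rf "$demo"   # remove the demo tree
```

Against a real system you would run something like `du -sh /var/* 2>/dev/null | sort -h` (the `2>/dev/null` hides permission errors for directories you can’t read; `-h`/`sort -h` are GNU options).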
Avoid These:#
- System directories: /bin/, /sbin/, /lib/, /etc/.
- User data: ~/Documents/, ~/Downloads/ (unless you’re sure!).
- Database files (e.g., /var/lib/mysql/): back up first!
Use find to Locate Old/Large Files#
To identify candidates for deletion, use the find command to search for files by age, size, or type.
Example 1: Find files older than 30 days in /var/log/#
find /var/log/ -type f -mtime +30
- -type f: Search for files (not directories).
- -mtime +30: Modified more than 30 days ago (use -mmin +1440 for files older than 24 hours).
Example 2: Find files larger than 100MB in /data/backups/#
find /data/backups/ -type f -size +100M
- -size +100M: Larger than 100 megabytes (use +1G for gigabytes).
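You can also combine the two tests: files that are both old and large are usually the best candidates. A sketch using a throwaway directory (note: `touch -d` with a date string is GNU coreutils; BSD/macOS touch uses `-t` instead):

```shell
demo=$(mktemp -d)

# One big, old file and one big, recent file
head -c 200000 /dev/zero > "$demo/old_big.bak"
touch -d '40 days ago' "$demo/old_big.bak"     # backdate mtime (GNU touch)
head -c 200000 /dev/zero > "$demo/new_big.bak" # same size, recent mtime

# Only files older than 30 days AND larger than 100 KB should match
matches=$(find "$demo" -type f -mtime +30 -size +100k)
echo "$matches"

rm -rf "$demo"
```

Only old_big.bak is printed; new_big.bak is large but too recent to match `-mtime +30`.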
Step 3: Write the Shell Script#
Now, let’s build a script that:
- Checks disk usage for a target filesystem (e.g., /data).
- If usage exceeds a threshold (e.g., 90%), deletes old files in a safe directory (e.g., /data/backups/).
Step 3.1: Create the Script File#
Open a terminal and navigate to your home directory (or a scripts folder like ~/scripts/). Create a new file named cleanup_disk.sh:
mkdir -p ~/scripts # Create a scripts folder (if it doesn’t exist)
cd ~/scripts
nano cleanup_disk.sh # Open the file in nano
Step 3.2: Add Script Logic#
Paste the following code into nano, and we’ll break it down line by line:
#!/bin/bash
# --------------------------
# Disk Cleanup Script
# Deletes old files when disk usage exceeds a threshold.
# --------------------------
# --------------------------
# CONFIGURATION (EDIT THESE!)
# --------------------------
THRESHOLD=90 # Disk usage percentage to trigger cleanup (e.g., 90 = 90%)
TARGET_DIR="/data/backups" # Directory to clean (use absolute path!)
FILE_AGE=30 # Delete files older than X days (e.g., 30 = 30 days)
MOUNT_POINT="/data" # Filesystem to monitor (from `df -h` "Mounted on" column)
# --------------------------
# SCRIPT LOGIC
# --------------------------
# Get current disk usage percentage for the target mount point
# `df -P` = POSIX format (avoids line breaks), `awk` extracts the 5th column (Use%)
CURRENT_USAGE=$(df -P "$MOUNT_POINT" | awk 'NR==2 {print $5}' | sed 's/%//')
echo "Current disk usage for $MOUNT_POINT: $CURRENT_USAGE%"
# Check if current usage exceeds the threshold
if [ "$CURRENT_USAGE" -ge "$THRESHOLD" ]; then
echo "Disk usage exceeds $THRESHOLD%! Cleaning up old files in $TARGET_DIR..."
# Delete files older than FILE_AGE days in TARGET_DIR
# -maxdepth 1: Only delete files in TARGET_DIR (not subdirectories)
# -type f: Delete files (not directories)
# -mtime +FILE_AGE: Modified more than FILE_AGE days ago
# -delete: Delete the files (swap for -print to do a dry run!)
find "$TARGET_DIR" -maxdepth 1 -type f -mtime +"$FILE_AGE" -delete
echo "Cleanup complete! Freed space in $TARGET_DIR."
else
echo "Disk usage is below threshold ($CURRENT_USAGE% < $THRESHOLD%). No cleanup needed."
fi
Step 3.3: Customize the Script#
Edit the CONFIGURATION section to match your system:
- THRESHOLD: Set to 85-90 (adjust based on how aggressively you want to clean).
- TARGET_DIR: Use a safe directory (e.g., /var/log/ for logs, /data/backups/ for backups).
- FILE_AGE: Delete files older than X days (e.g., 7 for weekly cleanup).
- MOUNT_POINT: Use the Mounted on path from df -h (e.g., / for root, /data for a secondary drive).
Step 3.4: Save and Exit#
In nano, press Ctrl + O to save, then Ctrl + X to exit.
Step 4: Test the Script#
Before automating, test the script manually to avoid accidental data loss!
Step 4.1: Make the Script Executable#
Scripts need execution permissions to run. Run:
chmod +x ~/scripts/cleanup_disk.sh
Step 4.2: Dry Run (Critical!)#
A "dry run" lets you see what the script would delete without actually deleting anything. Modify the find line in the script to print matching files instead of deleting them.
Temporarily replace this line:
find "$TARGET_DIR" -maxdepth 1 -type f -mtime +"$FILE_AGE" -delete
With this (swap -delete for -print):
find "$TARGET_DIR" -maxdepth 1 -type f -mtime +"$FILE_AGE" -print
Now run the script:
~/scripts/cleanup_disk.sh
Sample Dry Run Output:#
Current disk usage for /data: 91%
Disk usage exceeds 90%! Cleaning up old files in /data/backups...
/data/backups/backup_20230101.tar.gz
/data/backups/backup_20230102.tar.gz
Cleanup complete! Freed space in /data/backups.
If the output shows files you’re comfortable deleting, revert the find line (swap -print back for -delete).
Step 4.3: Run the Script for Real#
After verifying the dry run, run the script to delete files:
~/scripts/cleanup_disk.sh
Check disk space again with df -h to confirm space was freed!
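If you want hard numbers rather than eyeballing df -h, capture the Avail column before and after the run. A minimal sketch, again using / as a stand-in mount point:

```shell
# Available kilobytes before cleanup (df -P column 4 = Avail)
before=$(df -P / | awk 'NR==2 {print $4}')

# ... run ~/scripts/cleanup_disk.sh here ...

after=$(df -P / | awk 'NR==2 {print $4}')
echo "Change in available space: $((after - before)) KB"
```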
Step 5: Automate with Cron#
Now that the script works manually, let’s automate it with cron—Linux’s built-in task scheduler. Cron runs scripts at fixed intervals (e.g., daily, weekly).
Step 5.1: Understand Cron Syntax#
Cron jobs are defined with a 5-part schedule:
* * * * * command-to-run
- - - - -
| | | | |
| | | | +-- Day of the week (0-6, 0=Sunday)
| | | +---- Month (1-12)
| | +------ Day of the month (1-31)
| +-------- Hour (0-23)
+---------- Minute (0-59)
Common Examples:#
- 0 2 * * *: Run daily at 2:00 AM.
- 30 3 * * 1: Run every Monday at 3:30 AM.
- */15 * * * *: Run every 15 minutes.
Step 5.2: Schedule the Script with Cron#
Open the cron table for your user:
crontab -e
If prompted to choose an editor, select nano (easiest for beginners).
Step 5.3: Add a Cron Job#
Add this line to the bottom of the file to run the script daily at 2:00 AM and log output:
0 2 * * * /home/your_username/scripts/cleanup_disk.sh >> /var/log/disk_cleanup.log 2>&1
Breakdown:#
- 0 2 * * *: Run daily at 2:00 AM.
- /home/your_username/scripts/cleanup_disk.sh: Path to your script (run echo $HOME to find your home directory).
- >> /var/log/disk_cleanup.log: Append output to a log file so you can debug later. Note that writing to /var/log/ usually requires root; for a regular user’s crontab, log to a path you own instead (e.g., $HOME/disk_cleanup.log).
- 2>&1: Redirect errors to the same log file (so you don’t miss issues).
Step 5.4: Verify the Cron Job#
List your cron jobs to confirm it was added:
crontab -l
You should see your new job!
Safety Considerations (Critical for Beginners!)#
Even with a working script, mistakes can happen. Follow these rules to avoid data loss:
- Test First: Always run a dry run before deleting files.
- Backup Files: Back up critical data (e.g., cp /data/backups/*.tar.gz /external_drive/) before automating.
- Limit Target Directories: Never run the script on / (root) or system directories like /etc/. Stick to specific safe folders.
- Use -maxdepth 1: In the find command, -maxdepth 1 ensures you only delete files directly in TARGET_DIR, not in subdirectories (avoids accidental recursion).
- Restrict Script Permissions: Make the script readable/writable only by you:
chmod 700 ~/scripts/cleanup_disk.sh # 700 = read/write/execute for owner only
- Monitor Logs: Check /var/log/disk_cleanup.log weekly to ensure the script runs and deletes only intended files.
Troubleshooting Common Issues#
Issue 1: Script Doesn’t Run#
- Cause: Missing execution permissions or incorrect path.
- Fix: Run chmod +x ~/scripts/cleanup_disk.sh and use absolute paths in the script (e.g., /home/user/scripts/ instead of ~/scripts/).
Issue 2: Cron Job Doesn’t Execute#
- Cause: Cron runs with a minimal environment (a bare-bones PATH, not your interactive shell’s).
- Fix: Use absolute paths for all commands in the script (e.g., /usr/bin/find instead of find), or set PATH explicitly at the top of the script.
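One common way to make the cron environment predictable is to pin PATH yourself at the top of the script. A minimal sketch (the exact directories you need may differ per distribution):

```shell
#!/bin/bash
# Cron strips most environment variables, so set PATH before any
# command lookups happen. Add directories your script actually needs.
PATH=/usr/sbin:/usr/bin:/sbin:/bin
export PATH

# Sanity check: every external command the script uses should resolve
command -v find
command -v awk
```

`command -v` prints the resolved path (e.g. /usr/bin/find) or nothing if the command is missing, which makes it a quick self-test for cron-safe scripts.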
Issue 3: Wrong Files Are Deleted#
- Cause: TARGET_DIR or MOUNT_POINT is misconfigured.
- Fix: Double-check the CONFIGURATION section in the script. Run df -h to confirm MOUNT_POINT matches the Mounted on column.
Issue 4: Disk Still Full After Cleanup#
- Cause: Not enough files are being deleted.
- Fix: Lower FILE_AGE (e.g., delete files older than 14 days instead of 30) or target larger files by adding -size +500M to the find command.
Conclusion#
You’ve now built a practical tool to automate disk cleanup! By combining a shell script with cron, you’re far less likely to face a "disk full" error again. Remember: test thoroughly, back up data, and monitor logs to keep your system safe.
As you gain confidence, expand the script to target multiple directories, send email alerts when cleanup runs, or exclude specific file types. The possibilities are endless!
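As a starting point for the multi-directory idea, here is a hedged sketch: each target carries its own age limit as a "directory:days" pair. It is demoed on throwaway directories so it is safe to run as-is; swap in your real paths (e.g. /data/backups:30) before use:

```shell
#!/bin/bash
# Throwaway directories stand in for real targets like /data/backups
dir_a=$(mktemp -d)
dir_b=$(mktemp -d)
TARGETS="$dir_a:30 $dir_b:7"    # "directory:age_in_days" pairs

cleaned=0
for entry in $TARGETS; do
    dir=${entry%%:*}            # part before the colon
    age=${entry##*:}            # part after the colon
    [ -d "$dir" ] || continue   # skip paths that don't exist
    echo "Would clean files older than $age days in $dir"
    # Real version: find "$dir" -maxdepth 1 -type f -mtime +"$age" -delete
    cleaned=$((cleaned + 1))
done

rm -rf "$dir_a" "$dir_b"
echo "Processed $cleaned target(s)"
```

The find line is left commented out so the sketch only reports what it would do; uncomment it (and remove the demo directories) once the paths and ages match your system.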