Table of Contents#
- Prerequisites
- Understanding the Requirements
- Step-by-Step Script Creation
- Full Script with Safety Features
- Customization Options
- Testing the Script
- Troubleshooting Common Issues
- Conclusion
- References
Prerequisites#
Before diving in, ensure you have:
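- A Unix-like environment (Linux, macOS, or WSL on Windows). The script relies on find -printf, a GNU extension that the stock BSD find on macOS does not support; on macOS, install GNU findutils (for example via Homebrew, where the command is named gfind).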
- Basic familiarity with shell scripting (variables, loops, conditionals).
- Read/write access to the directory containing your backups.
- Backup files with a consistent naming pattern (e.g., *.bak, backup_*.tar.gz).
Understanding the Requirements#
Let’s clarify what the script needs to do:
- Target Directory: Specify where backups are stored (e.g., /var/backups).
- File Pattern: Define what counts as a backup (e.g., *.bak or backup_*.tar).
- Count Files: List all files matching the pattern and count them.
- Cleanup Logic: If there are more than 10 backups, delete the oldest ones, keeping the newest 10.
- Safety: Avoid accidental deletion (e.g., handle spaces in filenames, add dry-run options).
Step-by-Step Script Creation#
Let’s build the script incrementally. Save it as cleanup_backups.sh and make it executable with chmod +x cleanup_backups.sh.
1. Define Variables#
Start by setting variables for reusability. This makes it easy to adjust the directory, file pattern, or number of files to keep later.
#!/bin/bash
# Configuration
BACKUP_DIR="/path/to/your/backups" # Replace with your backup directory
FILE_PATTERN="*.bak" # Replace with your backup file pattern (e.g., "backup_*.tar.gz")
KEEP=10 # Number of newest backups to keep
2. Check if the Backup Directory Exists#
If the target directory doesn’t exist, the script should exit with an error message to avoid accidental operations on the wrong path.
# Check if backup directory exists
if [ ! -d "$BACKUP_DIR" ]; then
    echo "Error: Backup directory $BACKUP_DIR does not exist."
    exit 1
fi
3. Count Backup Files#
Use find to list all files matching the pattern, then count them. find is robust for handling spaces in filenames and large directories.
# Count backup files matching the pattern
count=$(find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" | wc -l)
echo "Found $count backup files in $BACKUP_DIR."
- -maxdepth 1: Avoids searching subdirectories.
- -type f: Only counts files (not directories).
- -name "$FILE_PATTERN": Matches the backup file pattern.
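One caveat about the count above: wc -l counts newline characters, so a filename that itself contains a newline would be counted twice. The following is a minimal sketch of a count that avoids this, assuming GNU find (for -printf). The count_backups function name is hypothetical and not part of the original script.

```shell
# Hypothetical helper: count regular files in directory $1 matching pattern $2.
# find prints one dot per match and wc -c counts the dots, so unusual
# characters in filenames cannot skew the result.
count_backups() {
    find "$1" -maxdepth 1 -type f -name "$2" -printf '.' | wc -c
}

# Demo in a throwaway directory, including a name with an embedded newline
demo_dir=$(mktemp -d)
nl='
'
touch "$demo_dir/a.bak" "$demo_dir/b c.bak" "$demo_dir/d${nl}e.bak" "$demo_dir/notes.txt"

count=$(count_backups "$demo_dir" "*.bak")
echo "Found $count backup files."
rm -rf "$demo_dir"
```

If your backup names never contain newlines, the simpler wc -l version gives the same result.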
4. Determine if Cleanup is Needed#
If the count is less than or equal to KEEP, there is nothing to delete, so the script exits with status 0.
# Exit if fewer than or equal to KEEP files exist
if [ "$count" -le "$KEEP" ]; then
    echo "No cleanup needed. Keeping all $count files."
    exit 0
fi
5. Identify and Remove Oldest Backups#
To delete the oldest files, we need to:
- List files sorted by modification time (oldest first).
- Exclude directories and keep only files.
- Select the oldest files (all but the newest KEEP).
Safe File Handling#
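Files with spaces or special characters (e.g., backup Jan 1.bak) can break naive scripts that word-split the output of find. Reading one path per line with while read -r avoids this. The only names this approach cannot handle are those containing a newline, which require null-delimited tools (e.g., find -print0 with xargs -0).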
# Calculate how many files to delete
delete=$((count - KEEP))
echo "Need to delete $delete oldest files to keep the newest $KEEP."
# List oldest files (sorted by modification time, oldest first) and delete them
echo "Deleting the following files:"
find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" -printf "%T@ %p\n" | \
sort -n | \
head -n "$delete" | \
cut -d' ' -f2- | \
while IFS= read -r file; do
    echo "Deleting: $file"
    rm -f -- "$file" # -f suppresses prompts and errors for missing files; -- ends option parsing
done
Explanation of the Pipeline:#
- find ... -printf "%T@ %p\n": Prints the epoch timestamp (%T@) and file path (%p) for each backup file.
- sort -n: Sorts by timestamp (oldest first).
- head -n "$delete": Selects the first $delete files (the oldest).
- cut -d' ' -f2-: Removes the timestamp, leaving only the file path.
- while read -r file; do rm "$file"; done: Safely deletes each file, even with spaces.
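Because this pipeline passes one path per line, a filename containing a newline would be split in two. The following sketch shows a null-delimited variant, assuming bash, GNU find, and a GNU coreutils version in which sort, head, and cut accept -z. The list_oldest helper is a hypothetical name introduced for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the N oldest matching files, NUL-terminated.
# Usage: list_oldest DIR PATTERN N
list_oldest() {
    find "$1" -maxdepth 1 -type f -name "$2" -printf '%T@ %p\0' |
        sort -z -n |
        head -z -n "$3" |
        cut -z -d' ' -f2-
}

# Demo: three files with known modification times
demo_dir=$(mktemp -d)
touch -d '2020-01-01' "$demo_dir/old one.bak"
touch -d '2021-01-01' "$demo_dir/middle.bak"
touch -d '2022-01-01' "$demo_dir/new.bak"

# Delete the two oldest; read -d '' consumes one NUL-terminated path at a time
deleted=0
while IFS= read -r -d '' file; do
    echo "Deleting: $file"
    rm -f -- "$file"
    deleted=$((deleted + 1))
done < <(list_oldest "$demo_dir" "*.bak" 2)

remaining=$(ls "$demo_dir")
echo "Remaining: $remaining"
rm -rf "$demo_dir"
```

Feeding the loop through process substitution rather than a pipe keeps it in the current shell, so the deleted counter is still available afterwards.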
Full Script with Safety Features#
Add a dry-run option (-n) to preview deletions without actually removing files. This is critical for testing!
#!/bin/bash
# Configuration
BACKUP_DIR="/path/to/your/backups"
FILE_PATTERN="*.bak"
KEEP=10
DRY_RUN=0 # Set to 1 for dry run (no deletion)
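# Parse command-line arguments: -n (dry run), -d DIR and -p PATTERN override the defaults above
while getopts "nd:p:" opt; do
    case $opt in
        n) DRY_RUN=1 ;;
        d) BACKUP_DIR=$OPTARG ;;
        p) FILE_PATTERN=$OPTARG ;;
        *) echo "Usage: $0 [-n] [-d backup_dir] [-p file_pattern]" >&2; exit 1 ;;
    esac
done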
# Check if backup directory exists
if [ ! -d "$BACKUP_DIR" ]; then
    echo "Error: Backup directory $BACKUP_DIR does not exist." >&2
    exit 1
fi
# Count backup files
count=$(find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" | wc -l)
echo "Found $count backup files in $BACKUP_DIR."
# Exit if no cleanup needed
if [ "$count" -le "$KEEP" ]; then
    echo "No cleanup needed. Keeping all $count files."
    exit 0
fi
# Calculate deletions
delete=$((count - KEEP))
echo "Need to delete $delete oldest files to keep the newest $KEEP."
# Delete oldest files (or preview with dry run)
echo "Files to process:"
find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" -printf "%T@ %p\n" | \
sort -n | \
head -n "$delete" | \
cut -d' ' -f2- | \
while IFS= read -r file; do
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "[Dry Run] Would delete: $file"
    else
        echo "Deleting: $file"
        rm -f -- "$file"
    fi
done
echo "Cleanup complete."
Customization Options#
Tailor the script to your needs by modifying:
- BACKUP_DIR: Path to your backups (e.g., "$HOME/my_project/backups"; note that ~ is not expanded inside double quotes).
- FILE_PATTERN: Match your backup naming scheme (e.g., backup_*.tar.gz, *.sql).
- KEEP: Number of backups to retain (e.g., 5 for fewer, 20 for more).
- Add logging: Redirect output to a file (e.g., ./cleanup_backups.sh >> /var/log/backup_cleanup.log 2>&1).
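Since KEEP is the value most likely to be edited, a guard against typos can prevent a confusing error from the numeric comparison later in the script. The following is a minimal sketch. The is_valid_keep helper is hypothetical and not part of the original script.

```shell
# Hypothetical helper: succeed only if the argument is a non-negative integer.
is_valid_keep() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit character
        *) return 0 ;;
    esac
}

KEEP=10
if is_valid_keep "$KEEP"; then
    echo "KEEP=$KEEP is valid."
else
    echo "Error: KEEP must be a non-negative integer, got '$KEEP'." >&2
fi
```

In the real script, the else branch would also exit with a non-zero status before any files are touched.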
Testing the Script#
Always test before running on critical data!
Step 1: Create a Test Directory#
mkdir test_backups
cd test_backups
for i in {1..15}; do touch "backup_$i.bak"; sleep 1; done # Creates 15 files with unique timestamps
cd ..
Step 2: Run Dry Run#
./cleanup_backups.sh -n -d test_backups -p "*.bak" # Replace with your script path
You should see:
Found 15 backup files in test_backups.
Need to delete 5 oldest files to keep the newest 10.
Files to process:
[Dry Run] Would delete: test_backups/backup_1.bak
[Dry Run] Would delete: test_backups/backup_2.bak
...
Step 3: Run Actual Cleanup#
./cleanup_backups.sh -d test_backups -p "*.bak"
Verify only 10 files remain:
ls test_backups | wc -l # Should print 10 (ls -l would add a "total" line and print 11)
Troubleshooting Common Issues#
1. "No such file or directory"#
- Ensure BACKUP_DIR is correct (e.g., use absolute paths like /home/user/backups).
2. Files with Spaces Not Deleted#
- The script uses while read -r file to handle spaces. Avoid for file in $(find ...) (it splits on spaces).
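To see why the for loop is a problem, the following sketch runs both loop styles against a single file whose name contains spaces. The directory and filename are made up for the demonstration.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/backup Jan 1.bak"

# Unsafe: the unquoted $(find ...) output is split on whitespace into words
for_count=0
for file in $(find "$demo_dir" -type f -name "*.bak"); do
    for_count=$((for_count + 1))
done

# Safe: read consumes one whole line per iteration
read_count=0
while IFS= read -r file; do
    read_count=$((read_count + 1))
done <<EOF
$(find "$demo_dir" -type f -name "*.bak")
EOF

echo "for loop saw $for_count items; while-read loop saw $read_count."
rm -rf "$demo_dir"
```

As long as the temporary directory path itself contains no spaces, the for loop sees three fragments of the one filename, while the while-read loop sees exactly one path.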
3. "Permission denied"#
- Run the script with sudo if backups are in a restricted directory (e.g., /var/backups).
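Before reaching for sudo, it can help to confirm that permissions really are the cause. Deleting a file requires write and search (execute) permission on the containing directory, not on the file itself. The following is a rough sketch of such a check. The can_delete_in helper is a hypothetical name, and the check ignores sticky-bit and ACL subtleties.

```shell
# Hypothetical helper: succeed if the current user can remove entries from
# directory $1, i.e. it exists and is both writable and searchable.
can_delete_in() {
    [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

demo_dir=$(mktemp -d)
if can_delete_in "$demo_dir"; then
    echo "OK: files in $demo_dir can be deleted by the current user."
else
    echo "Cannot delete in $demo_dir: check ownership or run with sudo." >&2
fi
rmdir "$demo_dir"
```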
4. Incorrect File Count#
- Check FILE_PATTERN. Matching with -name is case-sensitive, so *.bak does not match files ending in .BAK; use -iname in find for case-insensitive matching.
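The difference between the two tests can be checked quickly in a scratch directory. The filenames below are invented for the demonstration, and -iname is assumed to be available (it is supported by both GNU and BSD find, though it is not part of POSIX).

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/monday.bak" "$demo_dir/TUESDAY.BAK"

# -name is case-sensitive: only monday.bak matches *.bak
sensitive=$(find "$demo_dir" -maxdepth 1 -type f -name "*.bak" | wc -l)

# -iname ignores case: both files match
insensitive=$(find "$demo_dir" -maxdepth 1 -type f -iname "*.bak" | wc -l)

echo "-name matched $sensitive file(s); -iname matched $insensitive file(s)."
rm -rf "$demo_dir"
```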
Conclusion#
This script automates backup cleanup so that old backups no longer pile up and fill the disk. By customizing the variables and testing thoroughly, you can adapt it to any backup workflow. To run it unattended, schedule it with cron, for example nightly at 03:00 (0 3 * * * /path/to/cleanup_backups.sh).