funwithlinux blog

How to Write a Shell Script to Count Backup Files and Remove the Oldest Ones (Keep Newest 10)

Backup files are critical for data safety, but they can quickly accumulate and consume valuable storage space. Whether you’re a system administrator managing server backups or a developer archiving project files, manually sorting through backups to delete old ones is time-consuming and error-prone. Automating this process ensures you keep only the most recent backups (e.g., the newest 10) while freeing up space.

In this post, we’ll walk through creating a shell script that will:

  • Count backup files in a specified directory.
  • Remove the oldest backups if there are more than 10.
  • Handle edge cases (e.g., missing directories, file names with spaces).
  • Include safety features (e.g., dry runs, error handling).
2026-01

Table of Contents#

  1. Prerequisites
  2. Understanding the Requirements
  3. Step-by-Step Script Creation
  4. Full Script with Safety Features
  5. Customization Options
  6. Testing the Script
  7. Troubleshooting Common Issues
  8. Conclusion
  9. References

Prerequisites#

Before diving in, ensure you have:

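  • A Unix-like environment (Linux, macOS, or WSL on Windows) with GNU find. The script relies on find -printf, which the BSD find shipped with macOS does not support; on macOS, install GNU findutils (Homebrew provides it as gfind).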
  • Basic familiarity with shell scripting (variables, loops, conditionals).
  • Read/write access to the directory containing your backups.
  • Backup files with a consistent naming pattern (e.g., *.bak, backup_*.tar.gz).

Understanding the Requirements#

Let’s clarify what the script needs to do:

  1. Target Directory: Specify where backups are stored (e.g., /var/backups).
  2. File Pattern: Define what counts as a backup (e.g., *.bak or backup_*.tar).
  3. Count Files: List all files matching the pattern and count them.
  4. Cleanup Logic: If there are more than 10 backups, delete the oldest ones, keeping the newest 10.
  5. Safety: Avoid accidental deletion (e.g., handle spaces in filenames, add dry-run options).

Step-by-Step Script Creation#

Let’s build the script incrementally. Save it as cleanup_backups.sh and make it executable with chmod +x cleanup_backups.sh.

1. Define Variables#

Start by setting variables for reusability. This makes it easy to adjust the directory, file pattern, or number of files to keep later.

#!/bin/bash
 
# Configuration
BACKUP_DIR="/path/to/your/backups"  # Replace with your backup directory
FILE_PATTERN="*.bak"                # Replace with your backup file pattern (e.g., "backup_*.tar.gz")
KEEP=10                             # Number of newest backups to keep
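
If you would rather not edit the script for every change, one option is to let environment variables override the defaults. The sketch below reuses the three variable names from above; driving the script from the environment is an addition for illustration, not part of the original script. It relies only on the standard ${VAR:-default} parameter expansion.

```shell
#!/bin/bash

# Use the caller's environment value when it is set and non-empty,
# otherwise fall back to the default after ":-".
BACKUP_DIR="${BACKUP_DIR:-/path/to/your/backups}"
FILE_PATTERN="${FILE_PATTERN:-*.bak}"   # quoted, so the * is not expanded here
KEEP="${KEEP:-10}"

echo "dir=$BACKUP_DIR pattern=$FILE_PATTERN keep=$KEEP"
```

With this in place, a one-off run such as KEEP=5 ./cleanup_backups.sh would keep five files without touching the script.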

2. Check if the Backup Directory Exists#

If the target directory doesn’t exist, the script should exit with an error message to avoid accidental operations on the wrong path.

# Check if backup directory exists
if [ ! -d "$BACKUP_DIR" ]; then
  echo "Error: Backup directory $BACKUP_DIR does not exist." >&2
  exit 1
fi
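
Existence is only part of the picture: deleting a file also requires write and search (execute) permission on the directory that contains it. A minimal sketch of a stricter check follows; the function name check_backup_dir is hypothetical, and note that for root the permission tests generally succeed regardless of mode bits.

```shell
#!/bin/bash

# check_backup_dir DIR: succeed only if DIR exists and the current user
# may remove entries from it (write + search permission on DIR).
check_backup_dir() {
  local dir=$1
  if [ ! -d "$dir" ]; then
    echo "Error: Backup directory $dir does not exist." >&2
    return 1
  fi
  if [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
    echo "Error: No permission to delete files in $dir." >&2
    return 1
  fi
}

# Demo with a throwaway directory created by mktemp.
demo_dir=$(mktemp -d)
check_backup_dir "$demo_dir" && echo "$demo_dir is usable."
rmdir "$demo_dir"
```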

3. Count Backup Files#

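Use find to list all files matching the pattern, then count them with wc -l. Because the pattern is passed to find as a quoted argument, spaces in file names cause no problems, and find copes well with large directories. (Counting lines assumes that no file name contains a newline.)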

# Count backup files matching the pattern
count=$(find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" | wc -l)
 
echo "Found $count backup files in $BACKUP_DIR."
  • -maxdepth 1: Avoids searching subdirectories.
  • -type f: Only counts files (not directories).
  • -name "$FILE_PATTERN": Matches the backup file pattern.
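
Counting lines over-counts in the rare case where a file name contains a newline. One way around that, sketched below, is to have find emit a NUL byte after each match and count the NUL bytes instead. The helper name count_backups and the demo file names are made up for illustration.

```shell
#!/bin/bash

# count_backups DIR PATTERN: count regular files directly inside DIR
# whose names match PATTERN, independent of odd characters in the names.
count_backups() {
  # -print0 ends each path with a NUL; tr keeps only the NULs; wc counts them.
  find "$1" -maxdepth 1 -type f -name "$2" -print0 | tr -cd '\0' | wc -c
}

# Demo in a throwaway directory; one name has a space, one has a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.bak" "$demo_dir/b c.bak" "$demo_dir/d"$'\n'"e.bak" "$demo_dir/notes.txt"

count=$(count_backups "$demo_dir" "*.bak")
echo "Found $count backup files."
rm -rf "$demo_dir"
```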

4. Determine if Cleanup is Needed#

If the count is less than or equal to KEEP, no cleanup is needed, so the script exits with status 0.

# Exit if fewer than or equal to KEEP files exist
if [ "$count" -le "$KEEP" ]; then
  echo "No cleanup needed. Keeping all $count files."
  exit 0
fi
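
The numeric comparison above fails with an error if KEEP is empty or not a number, which can happen once the value comes from outside the script. A small guard, sketched here under the assumption that KEEP should be a whole number of at least 1, can run before the comparison. The function name is_positive_int is hypothetical.

```shell
#!/bin/bash

# is_positive_int VALUE: succeed only for a plain decimal integer >= 1.
is_positive_int() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "$1" -ge 1 ]
}

KEEP=10
if ! is_positive_int "$KEEP"; then
  echo "Error: KEEP must be a positive integer, got '$KEEP'." >&2
  exit 1
fi
echo "KEEP=$KEEP looks valid."
```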

5. Identify and Remove Oldest Backups#

To delete the oldest files, we need to:

  • List files sorted by modification time (oldest first).
  • Exclude directories and keep only files.
  • Select the oldest files (all but the newest KEEP).

Safe File Handling#

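Files with spaces or special characters (e.g., backup Jan 1.bak) can break naive scripts. The pipeline below keeps one path per line and quotes every expansion, which handles spaces. File names containing newlines would need a NUL-delimited variant instead (find with -print0 or a \0 in -printf, combined with sort -z and xargs -0).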

# Calculate how many files to delete
delete=$((count - KEEP))
echo "Need to delete $delete oldest files to keep the newest $KEEP."
 
# List oldest files (sorted by modification time, oldest first) and delete them
echo "Deleting the following files:"
find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" -printf "%T@ %p\n" | \
  sort -n | \
  head -n "$delete" | \
  cut -d' ' -f2- | \
  while IFS= read -r file; do  # IFS= keeps leading/trailing whitespace in names
    echo "Deleting: $file"
    rm -f -- "$file"  # To preview only, comment out this line and keep the echo
  done

Explanation of the Pipeline:#

  1. find ... -printf "%T@ %p\n": Prints epoch timestamp (%T@) and file path (%p) for each backup file.
  2. sort -n: Sorts by timestamp (oldest first).
  3. head -n "$delete": Selects the first delete files (the oldest).
  4. cut -d' ' -f2-: Removes the timestamp, leaving only the file path.
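  5. The while read loop: Reads one path per line and passes it to rm in quotes, so names containing spaces stay intact. Names containing newlines are not supported by this line-based pipeline.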
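
The line-based pipeline above cannot represent file names that contain newlines. As an alternative, here is a sketch of a NUL-delimited variant. It assumes GNU find and GNU coreutils 8.25 or later (for the -z options of tail and cut), and the function name delete_oldest is hypothetical. Instead of counting first, it sorts newest first and skips the first KEEP records.

```shell
#!/bin/bash
# Assumes GNU find, GNU xargs, and GNU coreutils 8.25+ (tail -z, cut -z).

# delete_oldest DIR PATTERN KEEP: remove all but the KEEP newest matching files.
delete_oldest() {
  local dir=$1 pattern=$2 keep=$3
  find "$dir" -maxdepth 1 -type f -name "$pattern" -printf '%T@ %p\0' |
    sort -z -n -r |                  # newest first, NUL-terminated records
    tail -z -n +"$((keep + 1))" |    # skip the newest KEEP records
    cut -z -d' ' -f2- |              # drop the timestamp field
    xargs -0 -r rm -f --             # -r: run nothing if the list is empty
}

# Demo: five files with known modification times; names contain a space.
demo_dir=$(mktemp -d)
for i in 1 2 3 4 5; do
  touch -d "2024-01-0$i 12:00" "$demo_dir/backup $i.bak"   # GNU touch -d
done

delete_oldest "$demo_dir" "*.bak" 2
remaining=$(ls "$demo_dir")
echo "$remaining"
rm -rf "$demo_dir"
```

In the demo, only the two most recently modified files ("backup 4.bak" and "backup 5.bak") should survive.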

Full Script with Safety Features#

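Add a dry-run option (-n) to preview deletions without actually removing files, plus -d and -p options to override the backup directory and file pattern from the command line (the test commands later in this post use them). A dry run is critical for testing!

#!/bin/bash
 
# Configuration (defaults; -d and -p override the first two)
BACKUP_DIR="/path/to/your/backups"
FILE_PATTERN="*.bak"
KEEP=10
DRY_RUN=0  # Set to 1 for dry run (no deletion)
 
# Parse command-line arguments (e.g., ./cleanup_backups.sh -n -d /var/backups -p "*.bak")
while getopts "nd:p:" opt; do
  case $opt in
    n) DRY_RUN=1 ;;
    d) BACKUP_DIR=$OPTARG ;;
    p) FILE_PATTERN=$OPTARG ;;
    *) echo "Usage: $0 [-n] [-d backup_dir] [-p file_pattern]" >&2; exit 1 ;;
  esac
done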
 
# Check if backup directory exists
if [ ! -d "$BACKUP_DIR" ]; then
  echo "Error: Backup directory $BACKUP_DIR does not exist." >&2
  exit 1
fi
 
# Count backup files
count=$(find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" | wc -l)
echo "Found $count backup files in $BACKUP_DIR."
 
# Exit if no cleanup needed
if [ "$count" -le "$KEEP" ]; then
  echo "No cleanup needed. Keeping all $count files."
  exit 0
fi
 
# Calculate deletions
delete=$((count - KEEP))
echo "Need to delete $delete oldest files to keep the newest $KEEP."
 
# Delete oldest files (or preview with dry run)
echo "Files to process:"
find "$BACKUP_DIR" -maxdepth 1 -type f -name "$FILE_PATTERN" -printf "%T@ %p\n" | \
  sort -n | \
  head -n "$delete" | \
  cut -d' ' -f2- | \
  while IFS= read -r file; do
    if [ "$DRY_RUN" -eq 1 ]; then
      echo "[Dry Run] Would delete: $file"
    else
      echo "Deleting: $file"
      rm -f -- "$file"
    fi
  done
 
if [ "$DRY_RUN" -eq 1 ]; then
  echo "Dry run complete. No files were deleted."
else
  echo "Cleanup complete."
fi

Customization Options#

Tailor the script to your needs by modifying:

  • BACKUP_DIR: Path to your backups (e.g., "$HOME/my_project/backups"; a ~ inside double quotes is not expanded by the shell).
  • FILE_PATTERN: Match your backup naming scheme (e.g., backup_*.tar.gz, *.sql).
  • KEEP: Number of backups to retain (e.g., 5 for fewer, 20 for more).
  • Add logging: Redirect output to a file (e.g., ./cleanup_backups.sh >> /var/log/backup_cleanup.log 2>&1).
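
When output is appended to a log file, timestamps make it much easier to tell runs apart. A minimal sketch of a logging helper follows; the function name log and the sample message are illustrative, and the idea would be to call it in place of the script's plain echo statements.

```shell
#!/bin/bash

# log MESSAGE...: print the message prefixed with a local timestamp.
log() {
  printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
}

log "Found 15 backup files in /var/backups."
```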

Testing the Script#

Always test before running on critical data!

Step 1: Create a Test Directory#

mkdir test_backups
cd test_backups
for i in {1..15}; do touch "backup_$i.bak"; sleep 1; done  # Creates 15 files with unique timestamps
cd ..
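
The loop above takes about 15 seconds because of the sleep. If that is inconvenient, a possible alternative is to set the modification times explicitly. The sketch below assumes GNU touch (relative date strings with -d) and GNU find, and uses a throwaway mktemp directory rather than test_backups.

```shell
#!/bin/bash
# Assumes GNU touch (-d with a relative date) and GNU find (-printf).

test_dir=$(mktemp -d)   # throwaway directory for the demo
for i in {1..15}; do
  # File 1 becomes the oldest (15 days ago), file 15 the newest (1 day ago).
  touch -d "$((16 - i)) days ago" "$test_dir/backup_$i.bak"
done

# Confirm the ordering: print the name of the file with the oldest mtime.
oldest=$(find "$test_dir" -maxdepth 1 -type f -name "*.bak" -printf "%T@ %f\n" |
  sort -n | head -n 1 | cut -d' ' -f2-)
echo "Oldest file: $oldest"
rm -rf "$test_dir"
```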

Step 2: Run Dry Run#

./cleanup_backups.sh -n -d test_backups -p "*.bak"  # Replace with your script path

You should see:

Found 15 backup files in test_backups.
Need to delete 5 oldest files to keep the newest 10.
Files to process:
[Dry Run] Would delete: test_backups/backup_1.bak
[Dry Run] Would delete: test_backups/backup_2.bak
...

Step 3: Run Actual Cleanup#

./cleanup_backups.sh -d test_backups -p "*.bak"

Verify only 10 files remain:

ls test_backups | wc -l  # Should print 10 (ls -l would add a "total" line and print 11)

Troubleshooting Common Issues#

1. "No such file or directory"#

  • Ensure BACKUP_DIR is correct (e.g., use absolute paths like /home/user/backups).

2. Files with Spaces Not Deleted#

  • Read paths line by line with while IFS= read -r file, and quote "$file" wherever it is used. Avoid for file in $(find ...), which splits the output on whitespace and breaks names containing spaces.

3. "Permission denied"#

  • Run the script with sudo if backups are in a restricted directory (e.g., /var/backups).

4. Incorrect File Count#

  • Check FILE_PATTERN. Matching with -name is case-sensitive, so *.bak does not match FILE.BAK; use -iname in find for case-insensitive matching.
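
To see the difference between the two tests, the following sketch creates three files whose extensions differ only in case and counts the matches each way. The file names are made up for the demo.

```shell
#!/bin/bash

demo_dir=$(mktemp -d)
touch "$demo_dir/one.bak" "$demo_dir/TWO.BAK" "$demo_dir/three.Bak"

# -name compares case-sensitively; -iname ignores case.
sensitive=$(find "$demo_dir" -maxdepth 1 -type f -name "*.bak" | wc -l)
insensitive=$(find "$demo_dir" -maxdepth 1 -type f -iname "*.bak" | wc -l)

echo "-name matched $sensitive, -iname matched $insensitive"
rm -rf "$demo_dir"
```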

Conclusion#

This script automates backup cleanup so that old backups no longer pile up and fill the disk. By customizing the variables and testing thoroughly (start with a dry run), you can adapt it to most backup workflows. To run it unattended, schedule it with cron, for example nightly at 03:00 (0 3 * * * /path/to/cleanup_backups.sh).

References#