Table of Contents#
- Prerequisites
- Overview of Tools
- Step-by-Step Script Creation
- Customization Tips
- Testing the Script
- Advanced Enhancements
- Conclusion
- References
Prerequisites#
Before we start, ensure you have:
- A Unix-like operating system (Linux, macOS, or WSL).
- Basic familiarity with shell scripting (Bash).
- Read access to the log file you want to monitor.
- Permissions to execute the command triggered by errors (e.g., `sudo` access if restarting a service).
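A quick sanity check for these prerequisites can be scripted. A minimal sketch (the log path is just an example; substitute your own):

```bash
#!/bin/sh
# Verify the required tools are on PATH and the log file is readable.
LOG_FILE="/var/log/syslog"  # example path; substitute your own

for tool in tail grep; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done

if [ -r "$LOG_FILE" ]; then
  echo "readable: $LOG_FILE"
else
  echo "not readable: $LOG_FILE (check permissions, or whether you need sudo)"
fi
```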
Overview of Tools#
We’ll use the following command-line tools to build our script:
1. tail -f#
The tail command displays the last few lines of a file. The -f (follow) flag makes it "follow" the file in real time, outputting new lines as they are added. This is ideal for monitoring logs that are actively being written to (e.g., application logs, system logs).
Example:

```bash
tail -f /var/log/syslog  # Streams new lines from /var/log/syslog
```

2. grep#
grep (global regular expression print) searches for patterns in text. We’ll use it to detect error keywords in the log stream. The -i flag makes the search case-insensitive, and -q (quiet) suppresses output, returning only an exit code (0 if the pattern is found, 1 otherwise).
Example:

```bash
echo "ERROR: Database connection failed" | grep -i "error"  # Returns exit code 0 (match found)
```

3. Bash Loops and Conditionals#
We’ll use a while loop to read lines from the tail -f stream and if conditionals to check if lines contain errors.
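As a minimal, self-contained sketch of that pattern (fed from a finite pipe rather than a live `tail -f`, so it terminates):

```bash
#!/bin/sh
# Read a stream line by line and flag lines matching a keyword --
# the same shape the monitor script uses.
printf '%s\n' "INFO: started" "ERROR: disk full" "INFO: done" |
while read -r LINE; do
  if printf '%s\n' "$LINE" | grep -qi "error"; then
    echo "matched: $LINE"
  fi
done
# Prints: matched: ERROR: disk full
```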
Step-by-Step Script Creation#
Let’s build the script incrementally. We’ll start with a basic version and add features as we go.
Step 1: Define Core Variables#
First, we’ll define variables to make the script easy to customize. These include the log file path, error keywords, and the command to execute on error.
Create a new file (e.g., log_monitor.sh) and add the following:
```bash
#!/bin/bash

# Configuration Variables
LOG_FILE="/var/log/syslog"                              # Log file to monitor (update this!)
ERROR_KEYWORDS=("ERROR" "FAILED")                       # Error keywords to detect (case-insensitive)
ALERT_COMMAND="echo 'Error detected! Check $LOG_FILE'"  # Command to run on error
```

- `LOG_FILE`: Path to the log file (e.g., `/var/log/nginx/error.log`, `/home/user/app.log`).
- `ERROR_KEYWORDS`: Array of keywords to watch for (e.g., "CRITICAL", "WARNING").
- `ALERT_COMMAND`: Action to trigger (e.g., send an email, restart a service, or log to a file).
Step 2: Stream Logs and Detect Errors#
Use tail -f to stream the log file, then pipe the output to a while loop to process each line. For each line, check if it contains any of the error keywords using grep.
Add this to the script:
```bash
# Monitor log file in real time
echo "Monitoring log file: $LOG_FILE for keywords: ${ERROR_KEYWORDS[*]}"
tail -f "$LOG_FILE" | while read -r LINE; do
  # Check if the line contains any error keyword (case-insensitive)
  for KEYWORD in "${ERROR_KEYWORDS[@]}"; do
    if echo "$LINE" | grep -qi "$KEYWORD"; then
      echo "[$(date +'%Y-%m-%d %H:%M:%S')] Error detected: $LINE"  # Log error with timestamp
      eval "$ALERT_COMMAND"  # Execute the alert command
      break  # Exit loop after first match to avoid duplicate alerts
    fi
  done
done
```

Explanation:#
- `tail -f "$LOG_FILE"`: Streams new lines from the log file.
- `while read -r LINE`: Reads each new line into the variable `LINE`.
- `for KEYWORD in "${ERROR_KEYWORDS[@]}"`: Loops through each error keyword.
- `grep -qi "$KEYWORD"`: Checks if `LINE` contains the keyword (`-i` = case-insensitive, `-q` = quiet mode).
- `eval "$ALERT_COMMAND"`: Executes the command defined in `ALERT_COMMAND` (`eval` handles commands with spaces/arguments).
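One subtlety of the `tail -f | while` construction worth knowing: because of the pipe, the loop body runs in a subshell, so variables assigned inside it (counters, state flags) are not visible after the loop ends. A quick illustration:

```bash
#!/bin/sh
# The pipe runs the while loop in a subshell, so COUNT set inside it
# does not propagate back to the parent shell.
COUNT=0
printf 'a\nb\n' | while read -r LINE; do
  COUNT=$((COUNT + 1))
done
echo "count after loop: $COUNT"
# Prints: count after loop: 0
```

For this script the subshell is harmless, since all state lives inside the loop; if you ever need to carry state out, bash process substitution (`while read -r LINE; do ...; done < <(tail -f "$LOG_FILE")`) avoids it.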
Step 3: Make the Script Executable#
Save the file and make it executable with:

```bash
chmod +x log_monitor.sh
```

Full Basic Script#
Here’s the complete basic version:
```bash
#!/bin/bash

# Configuration Variables
LOG_FILE="/var/log/syslog"                                  # Update with your log file path
ERROR_KEYWORDS=("ERROR" "FAILED")                           # Add/remove keywords as needed
ALERT_COMMAND="echo 'ALERT: Error detected in $LOG_FILE!'"  # Update with your command

# Function to handle script exit (cleanup)
cleanup() {
  echo "Stopping log monitor..."
  exit 0
}

# Trap Ctrl+C to exit gracefully
trap cleanup SIGINT

# Start monitoring
echo "Starting log monitor for $LOG_FILE (press Ctrl+C to stop)..."
tail -f "$LOG_FILE" | while read -r LINE; do
  for KEYWORD in "${ERROR_KEYWORDS[@]}"; do
    if echo "$LINE" | grep -qi "$KEYWORD"; then
      TIMESTAMP=$(date +'%Y-%m-%d %H:%M:%S')
      echo "[$TIMESTAMP] ERROR DETECTED: $LINE"  # Log to console with timestamp
      eval "$ALERT_COMMAND"  # Trigger alert/action
      break  # Avoid duplicate alerts for the same line
    fi
  done
done
```

Customization Tips#
Tailor the script to your needs with these tweaks:
1. Monitor Multiple Log Files#
To monitor multiple logs, wrap the `tail -f` command in a loop, or pass several paths to a single `tail -f` (e.g., `tail -f /path/to/log1 /path/to/log2`). Update `LOG_FILE` to an array:

```bash
LOG_FILES=(/var/log/syslog /var/log/nginx/error.log)  # Multiple logs
tail -f "${LOG_FILES[@]}" | while read -r LINE; do ... done
```

2. Use Multiple Error Keywords#
Add more keywords to `ERROR_KEYWORDS` (e.g., "CRITICAL", "WARNING", "SEGMENTATION FAULT"):

```bash
ERROR_KEYWORDS=("ERROR" "FAILED" "CRITICAL" "WARNING")
```

3. Trigger Powerful Alerts#
Replace `ALERT_COMMAND` with actions like:

- Send an email (using `mail` or `sendmail`). Note the single quotes: they defer expansion of `$LINE` until `eval` runs inside the loop, where the variable actually holds the matched line:

```bash
ALERT_COMMAND='echo "Error detected: $LINE" | mail -s "Log Alert" [email protected]'
```

- Restart a service (requires `sudo` privileges):

```bash
ALERT_COMMAND="sudo systemctl restart nginx"  # Restart Nginx on error
```

- Log to a separate file (again single-quoted, so `$TIMESTAMP` and `$LINE` expand when the alert fires, not when the variable is defined):

```bash
ALERT_COMMAND='echo "[$TIMESTAMP] Error: $LINE" >> /var/log/error_alerts.log'
```
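The quoting choice matters more than it looks: with double quotes, `$LINE` expands when `ALERT_COMMAND` is defined (while it is still empty); with single quotes, expansion is deferred until `eval` runs inside the loop. A small demonstration:

```bash
#!/bin/sh
# Double quotes expand $LINE at definition time (empty here);
# single quotes keep it literal until eval expands it.
LINE=""
EARLY="echo early:$LINE"   # $LINE expands now, to nothing
LATE='echo late:$LINE'     # $LINE stays literal until eval

LINE="ERROR: disk full"
eval "$EARLY"   # prints: early:
eval "$LATE"    # prints: late:ERROR: disk full
```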
4. Avoid Duplicate Alerts#
To prevent spamming (e.g., if an error repeats every second), add a cooldown period. Use a temporary file to track the last alert time:

```bash
COOLDOWN=300  # 5 minutes (300 seconds)
LAST_ALERT_FILE="/tmp/last_alert_time.txt"

# Inside the error detection loop:
if [ ! -f "$LAST_ALERT_FILE" ] || [ $(date +%s) -gt $(( $(cat "$LAST_ALERT_FILE") + COOLDOWN )) ]; then
  eval "$ALERT_COMMAND"
  date +%s > "$LAST_ALERT_FILE"  # Update last alert time
fi
```

Testing the Script#
Follow these steps to test your script:

1. Create a test log file:

```bash
touch test_log.log
```

2. Update `LOG_FILE` in the script to `./test_log.log`.

3. Run the script:

```bash
./log_monitor.sh
```

4. Simulate an error in a new terminal:

```bash
echo "2024-01-01 12:00:00 ERROR: Database connection failed" >> test_log.log
```

5. Verify the script triggers the alert: you should see a timestamped error message in the script's output, followed by your `ALERT_COMMAND` (e.g., an email or echo).
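The manual steps above can also be scripted end to end. A sketch that reproduces the monitor's pipeline inline rather than invoking `log_monitor.sh` itself, so it is self-contained (assumes GNU `timeout` is available; the 2-second window and temp paths are arbitrary choices):

```bash
#!/bin/bash
# Scripted version of the manual test: watch a temp log for 2 seconds,
# append an error line while watching, then check an alert was recorded.
LOG=$(mktemp)
OUT=$(mktemp)

# Same tail|while|grep pipeline as the monitor script, time-limited
timeout 2 tail -f "$LOG" 2>/dev/null | while read -r LINE; do
  echo "$LINE" | grep -qi "error" && echo "ERROR DETECTED: $LINE" >> "$OUT"
done &

sleep 1
echo "2024-01-01 12:00:00 ERROR: Database connection failed" >> "$LOG"
wait

grep -q "ERROR DETECTED" "$OUT" && echo "test passed"
rm -f "$LOG" "$OUT"
```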
Advanced Enhancements#
Take your script to the next level with these features:
1. Log Error Details to a Database#
Use `curl` to send error data to an HTTP API (e.g., one backed by PostgreSQL or InfluxDB, or a monitoring tool like Prometheus). Single quotes again defer expansion of `$TIMESTAMP` and `$LINE` until the alert fires:

```bash
ALERT_COMMAND='curl -X POST -d "timestamp=$TIMESTAMP&error=$LINE" https://monitoring-api.example.com/log-error'
```

2. Use inotifywait for Efficient Monitoring#
On systems without kernel file-change notification, `tail -f` falls back to polling the file periodically, which is less efficient. Use `inotifywait` (from `inotify-tools`) to trigger only when the file actually changes:

```bash
# Install inotify-tools first: sudo apt install inotify-tools
while inotifywait -e modify "$LOG_FILE"; do
  tail -n 1 "$LOG_FILE" | while read -r LINE; do  # Process the newest line
    ... # Error detection logic
  done
done
```

3. Integrate with Monitoring Tools#
Pipe alerts to tools like Nagios, Zabbix, or Slack:
```bash
# Send Slack alert (replace with your webhook URL)
ALERT_COMMAND='curl -X POST -H "Content-Type: application/json" -d "{\"text\":\"Error detected: $LINE\"}" https://hooks.slack.com/services/XXX/XXX/XXX'
```

Conclusion#
Automated log monitoring with shell scripts is a powerful, low-cost way to keep tabs on your systems. By combining tail -f, grep, and Bash logic, you can detect errors in real time and trigger actions to resolve issues faster.
Start with the basic script, customize it to your environment, and gradually add advanced features like cooldown periods or Slack alerts. With this tool in your toolkit, you’ll spend less time debugging and more time building!