funwithlinux blog

How to Recursively Cat All JSON Files in Folders into a Single File

In today’s data-driven world, JSON (JavaScript Object Notation) has become the de facto standard for storing and exchanging structured data. Whether you’re dealing with log files, API responses, configuration data, or user-generated content, you’ll often find JSON files scattered across multiple directories and subdirectories.

A common task is to recursively concatenate (or "cat") all these JSON files into a single file for easier analysis, processing, or backup. However, simply "catting" (appending) JSON files with basic tools like cat can result in invalid JSON, as JSON requires strict syntax (e.g., objects/arrays can’t be directly concatenated).

This post walks through several methods to safely and efficiently combine all JSON files in a directory (and its subdirectories) into a single, valid JSON file. We’ll cover command-line tools (find and jq), a Python script, and a Node.js script, so you can choose the approach that best fits your workflow.

2025-11


Prerequisites

Before starting, ensure you have the following tools installed based on your chosen method:

| Method | Required Tools |
| --- | --- |
| Command-Line (find + jq) | find (preinstalled on Linux/macOS), jq (install via brew install jq or apt install jq) |
| Python Script | Python 3.x (download from python.org) |
| Node.js Script | Node.js (download from nodejs.org) |

Understanding the Task

"Recursively cat JSON files" means:

  1. Recursively search through a directory and all its subdirectories.
  2. Locate all .json files (e.g., data1.json, subdir/data2.json).
  3. Combine their contents into a single JSON file.

Critical note: JSON files cannot be naively appended with cat file1.json file2.json > combined.json. The result is usually invalid JSON: {"a":1}{"b":2} is two documents back to back, not one. Instead, we need to merge them into a structured format like an array (e.g., [{"a":1}, {"b":2}]) or a single object (e.g., {"file1": {"a":1}, "file2": {"b":2}}).
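To make the difference concrete, here is a minimal Python sketch that feeds both example strings from the note above to the standard json module. The helper name is_valid_json is just an illustration, not part of any library.

```python
import json

naive = '{"a":1}{"b":2}'          # what plain `cat` would produce
wrapped = '[{"a":1}, {"b":2}]'    # the same objects wrapped in an array

def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json(naive))    # False: a second value follows the first
print(is_valid_json(wrapped))  # True
```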

Method 1: Using Command-Line Tools (find + jq)

For quick, terminal-based workflows, find (to locate files) and jq (to process JSON) are powerful.

Step 1: Locate All JSON Files Recursively

Use find to list all .json files in the current directory and subdirectories:

find . -type f -name "*.json"  
  • .: Search starting from the current directory.
  • -type f: Only include files (not directories).
  • -name "*.json": Match files ending with .json.

Example Output:

./data/logs.json  
./subdir/config.json  
./old/backup.json  

Step 2: Combine JSON Files into an Array

To merge all JSON files into a single array (most common use case), pipe the find results into jq with "slurp" mode (-s), which reads all input into an array:

find . -type f -name "*.json" -print0 | xargs -0 jq -s . > combined.json  

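Breakdown:

  • -print0 + xargs -0: Separates filenames with NUL bytes, so names containing spaces or newlines reach jq intact.
  • jq -s .: "Slurp" mode reads every input document into one array; the filter . outputs that array unchanged.
  • > combined.json: Write the result to combined.json.

Two caveats. First, if the file list exceeds the system's argument-length limit, xargs splits it across several jq invocations, and the output then contains several arrays back to back, which is not a single valid JSON document. Second, combined.json is created inside the directory being searched, so find lists it as an input too; write the output elsewhere or add -not -name combined.json to the find command so jq never reads its own output file.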
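If "slurp" mode feels opaque, the following Python sketch imitates what it does: it reads a stream of JSON values written one after another and collects them into a single list. This is only an emulation for illustration (the function name slurp is made up here), not how jq is implemented.

```python
import json

def slurp(text):
    """Parse a stream of concatenated JSON values into a list, similar to `jq -s .`."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it manually
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values

# Two documents back to back, as jq would see them when given two files
stream = '{"name": "Alice", "age": 30}\n{"name": "Bob", "age": 25}\n'
print(json.dumps(slurp(stream)))
# [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
```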

Example Input Files:
data1.json: {"name": "Alice", "age": 30}
subdir/data2.json: {"name": "Bob", "age": 25}

Output (combined.json):

[  
  {"name": "Alice", "age": 30},  
  {"name": "Bob", "age": 25}  
]  

Step 3: Merge JSON Objects (Instead of Arrays)

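If each of your JSON files contains an object (not an array) and you want to merge them into a single object (e.g., combining configs), use jq -s add. Applied to an array of objects, add merges them from left to right. The merge is shallow: when two files define the same key, the value from the later file replaces the earlier one, even if both values are nested objects.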

find . -type f -name "*.json" -print0 | xargs -0 jq -s add > merged_objects.json  

Example Input Files:
config1.json: {"theme": "dark", "font": "Arial"}
config2.json: {"font_size": 14, "notifications": true}

Output (merged_objects.json):

{  
  "theme": "dark",  
  "font": "Arial",  
  "font_size": 14,  
  "notifications": true  
}  
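The difference between a shallow merge and a recursive one only shows up with nested keys. The Python sketch below illustrates both; the config values are hypothetical variations of the example above (with font turned into a nested object), and shallow_merge and deep_merge are illustrative helper names.

```python
import json

def shallow_merge(objects):
    """Merge dicts left to right; on duplicate keys the later value wins."""
    merged = {}
    for obj in objects:
        merged.update(obj)
    return merged

def deep_merge(base, extra):
    """Recursively merge extra into a copy of base, combining nested dicts."""
    result = dict(base)
    for key, value in extra.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return result

# Hypothetical configs that share the nested "font" key
config1 = {"theme": "dark", "font": {"family": "Arial"}}
config2 = {"font": {"size": 14}, "notifications": True}

print(json.dumps(shallow_merge([config1, config2])))
# {"theme": "dark", "font": {"size": 14}, "notifications": true}
print(json.dumps(deep_merge(config1, config2)))
# {"theme": "dark", "font": {"family": "Arial", "size": 14}, "notifications": true}
```

With the shallow merge the "family" setting is lost, so it is worth checking whether your files share top-level keys before choosing a strategy.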

Method 2: Python Script (For Readability & Control)

Python is ideal for users who prefer readable code or need custom logic (e.g., filtering files, handling errors).

Step 1: Recursively Find JSON Files

Use Python’s os.walk to traverse directories and collect .json files:

import os  
import json  
 
def find_json_files(root_dir):  
    json_files = []  
    for dirpath, _, filenames in os.walk(root_dir):  
        for filename in filenames:  
            if filename.endswith(".json"):  
                json_files.append(os.path.join(dirpath, filename))  
    return json_files  
 
# Usage: Find all JSON files starting from the current directory  
json_files = find_json_files(".")  
print(f"Found {len(json_files)} JSON files.")  
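As an alternative to os.walk, the same search can be sketched with pathlib, which is also in the standard library. The function name find_json_files_pathlib and the throwaway demo tree are assumptions made for this example; sorting is added so the result order is reproducible.

```python
import tempfile
from pathlib import Path

def find_json_files_pathlib(root_dir):
    """Return all *.json files under root_dir, sorted for a stable order."""
    return sorted(p for p in Path(root_dir).rglob("*.json") if p.is_file())

# Demo on a temporary directory tree
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "subdir").mkdir()
    (root / "data1.json").write_text("{}", encoding="utf-8")
    (root / "subdir" / "data2.json").write_text("{}", encoding="utf-8")
    (root / "notes.txt").write_text("not json", encoding="utf-8")
    found = [p.relative_to(root).as_posix() for p in find_json_files_pathlib(root)]

print(found)  # ['data1.json', 'subdir/data2.json']
```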

Step 2: Read and Validate JSON Data

Read each JSON file, load its contents, and handle errors (e.g., invalid JSON):

combined_data = []  # Store all JSON data in an array  
 
for file_path in json_files:  
    try:  
        with open(file_path, "r", encoding="utf-8") as f:  
            data = json.load(f)  # Parse JSON  
            combined_data.append(data)  
            print(f"Successfully read: {file_path}")  
    except json.JSONDecodeError:  
        print(f"⚠️  Skipping invalid JSON: {file_path}")  
    except Exception as e:  
        print(f"⚠️  Error reading {file_path}: {str(e)}")  

Step 3: Write Combined Data to Output File

Dump the collected data into a single JSON file:

output_file = "combined_python.json"  
with open(output_file, "w", encoding="utf-8") as f:  
    json.dump(combined_data, f, indent=2)  # indent=2 for pretty formatting  
 
print(f"✅ Combined JSON saved to: {output_file}")  

Full Python Script

import os  
import json  
 
def find_json_files(root_dir):  
    json_files = []  
    for dirpath, _, filenames in os.walk(root_dir):  
        for filename in filenames:  
            if filename.endswith(".json"):  
                json_files.append(os.path.join(dirpath, filename))  
    return json_files  
 
def main():
    root_dir = "."  # Start from current directory
    output_file = "combined_python.json"

    # Exclude the output file so a re-run does not merge the previous result;
    # sort the paths so the order of the array is reproducible
    output_abs = os.path.abspath(output_file)
    json_files = sorted(
        path for path in find_json_files(root_dir)
        if os.path.abspath(path) != output_abs
    )
    print(f"Found {len(json_files)} JSON files.")

    combined_data = []
    for file_path in json_files:
        try:
            with open(file_path, "r", encoding="utf-8") as f:
                data = json.load(f)
                combined_data.append(data)
                print(f"Read: {file_path}")
        except json.JSONDecodeError:
            print(f"⚠️  Skipping invalid JSON: {file_path}")
        except Exception as e:
            print(f"⚠️  Error reading {file_path}: {e}")

    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(combined_data, f, indent=2)

    print(f"\n✅ Done! Combined data saved to: {output_file}")

if __name__ == "__main__":
    main()
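The script above produces an array. If you would rather keep track of where each piece of data came from (the single-object layout mentioned in "Understanding the Task"), one possible variant keys every entry by its path relative to the root directory. This is a sketch under that assumption; combine_by_path is a hypothetical helper, and the demo uses a temporary directory with the two example files from Method 1.

```python
import json
import os
import tempfile

def combine_by_path(root_dir):
    """Map each JSON file's path (relative to root_dir) to its parsed content."""
    combined = {}
    for dirpath, _, filenames in os.walk(root_dir):
        for filename in sorted(filenames):
            if not filename.endswith(".json"):
                continue
            full_path = os.path.join(dirpath, filename)
            # Use forward slashes so keys look the same on every platform
            key = os.path.relpath(full_path, root_dir).replace(os.sep, "/")
            try:
                with open(full_path, "r", encoding="utf-8") as f:
                    combined[key] = json.load(f)
            except (OSError, ValueError) as e:  # unreadable, badly encoded, or invalid JSON
                print(f"Skipping {full_path}: {e}")
    return combined

# Demo on a temporary directory tree
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "subdir"))
    with open(os.path.join(tmp, "data1.json"), "w", encoding="utf-8") as f:
        f.write('{"name": "Alice", "age": 30}')
    with open(os.path.join(tmp, "subdir", "data2.json"), "w", encoding="utf-8") as f:
        f.write('{"name": "Bob", "age": 25}')
    result = combine_by_path(tmp)

print(json.dumps(result, indent=2, sort_keys=True))
```

Because the relative path is unique per file, no entry can silently overwrite another, which is a risk when merging objects by their own keys.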

Method 3: Node.js Script (JavaScript Users)

For JavaScript developers, use Node.js’s fs (file system) and path modules to replicate the Python logic:

const fs = require('fs');  
const path = require('path');  
 
// Recursively find all JSON files  
function findJsonFiles(rootDir) {  
    let jsonFiles = [];  
    function traverse(dir) {  
        const files = fs.readdirSync(dir);  
        for (const file of files) {  
            const fullPath = path.join(dir, file);  
            if (fs.statSync(fullPath).isDirectory()) {  
                traverse(fullPath);  // Recurse into subdirectories  
            } else if (path.extname(file) === '.json') {  
                jsonFiles.push(fullPath);  
            }  
        }  
    }  
    traverse(rootDir);  
    return jsonFiles;  
}  
 
// Main logic  
const rootDir = '.';  
const jsonFiles = findJsonFiles(rootDir);  
console.log(`Found ${jsonFiles.length} JSON files.`);  
 
const combinedData = [];  
for (const file of jsonFiles) {  
    try {  
        const data = JSON.parse(fs.readFileSync(file, 'utf8'));  
        combinedData.push(data);  
        console.log(`Read: ${file}`);  
    } catch (err) {  
        console.log(`⚠️  Skipping ${file}: ${err.message}`);  
    }  
}  
 
// Write output  
const outputFile = 'combined_node.json';  
fs.writeFileSync(outputFile, JSON.stringify(combinedData, null, 2));  
console.log(`✅ Saved to: ${outputFile}`);  

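Save the script as combine-json.js (any filename works) and run it with: node combine-json.js

Note that the script writes combined_node.json into the directory it scans, so move or delete the output before re-running; otherwise the previous result is picked up as one more input.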

Troubleshooting Common Issues

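| Issue | Solution |
| --- | --- |
| find: permission denied | Hide the messages with 2>/dev/null, run with sudo (if needed), or prune the unreadable directory so find never enters it: find . -path ./restricted -prune -o -type f -name "*.json" -print (./restricted is a placeholder). Note that -not -path only filters the results; find still descends into the directory. |
| jq: command not found | Install jq via brew install jq (macOS) or sudo apt install jq (Debian/Ubuntu). |
| Invalid JSON in output | Run jq . combined.json to validate the output, and check the input files for malformed JSON. |
| Filenames with spaces | Use find -print0 + xargs -0 (command line) or Python/Node.js (handles spaces natively). |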
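When the output is invalid or files are being skipped, it helps to know exactly which inputs are malformed and where. The sketch below is one way to do that in Python; find_invalid_json is a hypothetical helper, and the exact wording of the error message depends on your Python version.

```python
import json
import os
import tempfile

def find_invalid_json(root_dir):
    """Return (path, message) pairs for .json files that fail to parse."""
    problems = []
    for dirpath, _, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.endswith(".json"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "r", encoding="utf-8") as f:
                    json.load(f)
            except json.JSONDecodeError as e:
                # lineno/colno point at the position where parsing failed
                problems.append((path, f"line {e.lineno}, column {e.colno}: {e.msg}"))
            except (OSError, UnicodeDecodeError) as e:
                problems.append((path, str(e)))
    return problems

# Demo: one valid file and one with a trailing comma
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "good.json"), "w", encoding="utf-8") as f:
        f.write('{"a": 1}')
    with open(os.path.join(tmp, "bad.json"), "w", encoding="utf-8") as f:
        f.write('{"a": 1,}')
    problems = find_invalid_json(tmp)

for path, message in problems:
    print(f"{os.path.basename(path)}: {message}")
```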

Best Practices

  1. Backup First: Always back up original JSON files before merging.
  2. Validate Output: Use jq . combined.json or JSONLint to check validity.
  3. Handle Large Inputs: jq -s and the scripts above hold every parsed file in memory at once. For very large datasets, use streaming (e.g., Python’s ijson or Node.js streams) to avoid memory issues.
  4. Exclude Unneeded Files: Use find -not -path "./tmp/*" to skip directories like node_modules/ or tmp/.
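Points 3 and 4 can be combined in a small standard-library sketch: instead of collecting everything in a list, it writes the output array incrementally, so only one input file is held in memory at a time, and it skips the output file itself. It does not use ijson, so a single huge input file is still parsed in one go. The name combine_streaming and the demo files are assumptions for this example.

```python
import json
import os
import tempfile

def combine_streaming(root_dir, output_path):
    """Write a JSON array of all parsed files, one input file in memory at a time."""
    output_abs = os.path.abspath(output_path)
    count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        out.write("[")
        for dirpath, dirnames, filenames in os.walk(root_dir):
            dirnames.sort()  # make the traversal order deterministic
            for filename in sorted(filenames):
                path = os.path.join(dirpath, filename)
                # Skip non-JSON files and the output file itself
                if not filename.endswith(".json") or os.path.abspath(path) == output_abs:
                    continue
                try:
                    with open(path, "r", encoding="utf-8") as f:
                        data = json.load(f)
                except (OSError, ValueError) as e:
                    print(f"Skipping {path}: {e}")
                    continue
                out.write(",\n" if count else "\n")  # comma before every entry but the first
                json.dump(data, out)
                count += 1
        out.write("\n]\n")
    return count

# Demo: the output lives inside the scanned tree and is still excluded
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "subdir"))
    with open(os.path.join(tmp, "data1.json"), "w", encoding="utf-8") as f:
        f.write('{"name": "Alice", "age": 30}')
    with open(os.path.join(tmp, "subdir", "data2.json"), "w", encoding="utf-8") as f:
        f.write('{"name": "Bob", "age": 25}')
    out_path = os.path.join(tmp, "combined.json")
    count = combine_streaming(tmp, out_path)
    with open(out_path, "r", encoding="utf-8") as f:
        combined = json.load(f)

print(count, combined)
```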

Conclusion

Recursively combining JSON files is straightforward with the right tools. Choose:

  • Command Line for speed (if you know find/jq).
  • Python/Node.js for readability, error handling, or custom logic.

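All methods produce valid JSON output, whether as an array, merged object, or custom structure, provided the input files themselves parse and, for the command-line method, the file list fits into a single jq invocation.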
