funwithlinux blog

How to Redirect/Pipe wget Download Directly into gunzip: Fixing When gunzip Doesn't Start

In the world of command-line tools, efficiency and resource management are paramount. If you’ve ever needed to download a gzipped file (.gz) and immediately decompress it, you might have tried piping wget directly into gunzip—only to find gunzip refusing to start, hanging, or throwing errors. This frustration is common, but the solution lies in understanding how wget, pipes, and gunzip interact.

In this blog, we’ll demystify the process of piping wget output into gunzip, explore why gunzip often fails to start, and provide step-by-step solutions to streamline your workflow. By the end, you’ll be able to download and decompress files in a single command, saving disk space and time by avoiding intermediate files.

2025-12

Table of Contents#

  1. Understanding the Basics
    • What is wget?
    • What is gunzip?
    • How Unix Pipes Work
    • Why Pipe wget to gunzip?
  2. The Problem: Why gunzip Doesn’t Start Immediately
    • Common Mistake 1: Forgetting to Stream wget Output to Stdout
    • Common Mistake 2: wget Saving to a File (Not the Pipe)
    • Common Mistake 3: gunzip Receiving Non-Gzipped Data
  3. Solutions to Pipe wget into gunzip Successfully
    • Solution 1: Use wget -O - to Stream to Stdout
    • Solution 2: Handle Server-Sent Compression with --compression=none
    • Solution 3: Use curl as an Alternative
    • Solution 4: Fallback: Save Intermediate Files (When All Else Fails)
  4. Advanced Scenarios
    • Resuming Interrupted Downloads
    • Saving Both Compressed and Uncompressed Files
    • Verifying Download Integrity
    • Piping to tar for Archives
  5. Troubleshooting Common Issues
    • "gunzip: stdin: not in gzip format"
    • wget Hangs or gunzip Produces No Output
    • Permission Denied When Writing Output
  6. Conclusion
  7. References

Understanding the Basics#

Before diving into the problem, let’s ground ourselves in the tools and concepts involved.

What is wget?#

wget is a command-line utility for downloading files from the web. It supports HTTP, HTTPS, and FTP, and is known for its robustness (resuming downloads, handling slow connections) and scriptability. By default, wget saves downloaded files to disk with the filename derived from the URL.

What is gunzip?#

gunzip is the decompression counterpart to gzip, a popular compression tool. It decompresses files compressed with gzip (file extension .gz). Unlike some tools, gunzip can process data from standard input (stdin)—meaning it can accept data piped from another command (like wget) instead of requiring a pre-saved file.

How Unix Pipes Work#

A pipe (|) is a Unix shell feature that connects the standard output (stdout) of one command to the standard input (stdin) of another. For example:

command1 | command2

Here, command1 sends its output directly to command2 for processing, without saving intermediate data to disk. This is efficient for workflows like "download → decompress."
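As a quick local illustration that needs no network, the sketch below uses gzip -c as a stand-in for any command that writes compressed bytes to stdout. The sample text is made up for the demonstration.

```shell
# gzip -c writes compressed bytes to stdout; gunzip -c reads them from stdin.
# No intermediate .gz file is created at any point.
printf 'hello from a pipe\n' | gzip -c | gunzip -c
```

The text comes out unchanged, which shows that gunzip is happy to work on a stream as long as the bytes arriving on stdin are real gzip data.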

Why Pipe wget to gunzip?#

Piping wget into gunzip offers two key benefits:

  1. Saves Disk Space: Avoids storing the large, compressed .gz file temporarily.
  2. Saves Time: Decompresses while downloading, reducing total wait time.

The Problem: Why gunzip Doesn’t Start Immediately#

If gunzip isn’t starting when you pipe wget into it, the root cause is almost always a mismatch in how wget is configured to output data. Let’s break down the most common mistakes:

Common Mistake 1: Forgetting to Stream wget Output to Stdout#

By default, wget saves downloaded files to disk, not to stdout. If you run:

wget https://example.com/file.gz | gunzip  # ❌ Incorrect!

wget will save file.gz to disk and write nothing to stdout. Its progress messages (e.g., "10% [====>...]") go to stderr, which the pipe does not carry. gunzip therefore sits idle until wget exits, then sees an empty input and exits with an error instead of decompressing anything.
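The effect can be reproduced without a network connection. In this sketch, fake_wget is a hypothetical stand-in that behaves like wget without -O -: it logs to stderr and leaves stdout empty.

```shell
# fake_wget is a hypothetical stand-in: like wget without -O -, it logs to
# stderr and writes nothing to stdout.
fake_wget() {
    echo "file.gz  10%[=>      ]" >&2
}

# Count the bytes that actually travel through the pipe.
bytes=$(fake_wget 2>/dev/null | wc -c | tr -d ' ')
echo "bytes on stdout: $bytes"
```

Zero bytes reach the right-hand side of the pipe, so whatever command sits there has nothing to decompress.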

Common Mistake 2: wget Saving to a File (Not the Pipe)#

A closely related slip is passing -O with a filename (e.g., -O file.gz) or the lowercase -o option, which sets wget's log file. Both still leave stdout empty. Only -O - sends the downloaded bytes into the pipe.

Common Mistake 3: gunzip Receiving Non-Gzipped Data#

Some servers compress data "on the fly" using Content-Encoding: gzip (e.g., to speed up transfers). When wget is run with --compression=auto or --compression=gzip, it decompresses such data before writing it out. A URL that returns an HTML error page, or a file that was never gzipped in the first place, has the same effect. If you pipe any of this into gunzip, gunzip will complain:

gunzip: stdin: not in gzip format  # ❌ Error!

In each case gunzip receives plain data rather than a gzip stream.

Solutions to Pipe wget into gunzip Successfully#

Let’s fix these issues with targeted solutions.

Solution 1: Use wget -O - to Stream to Stdout#

The critical fix is telling wget to output the downloaded data to stdout (instead of a file) using the -O (or --output-document) flag with - as the filename (a Unix convention for "stdout").

Example Command:#

wget -O - https://example.com/large_file.gz | gunzip > decompressed_output.txt

Breakdown:#

  • -O -: Tells wget to send the downloaded file content to stdout.
  • | gunzip: Pipes stdout (the .gz data) to gunzip for decompression.
  • > decompressed_output.txt: Saves gunzip’s decompressed output to a file.

Now gunzip receives valid gzip data as it streams in, so it starts processing immediately.
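For scripts, the same pipeline can be wrapped in a small function. The sketch below is one possible shape, not a canonical one: fetch_gunzip is a hypothetical helper name, and -q is added only to silence wget's progress log.

```shell
# fetch_gunzip URL OUTFILE
# Hypothetical helper: stream URL through gunzip into OUTFILE.
fetch_gunzip() {
    url=$1
    out=$2
    # -q silences the progress log on stderr; -O - sends the body to stdout.
    if wget -q -O - "$url" | gunzip -c > "$out"; then
        echo "wrote $out"
    else
        # gunzip exits non-zero on empty or truncated input, so a failed
        # download surfaces here. Remove the partial output.
        rm -f "$out"
        echo "download or decompression failed: $url" >&2
        return 1
    fi
}
```

A call would look like fetch_gunzip https://example.com/large_file.gz decompressed_output.txt. The function relies on gunzip's exit status, which is the status of the pipeline in a POSIX shell, so it works without bash-specific options such as pipefail.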

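Solution 2: Handle Server-Sent Compression with --compression=none#

If the server sends data with Content-Encoding: gzip (e.g., a dynamic response, not a pre-compressed .gz file), whether wget decompresses it depends on the --compression option, available in wget 1.19.2 and later. With --compression=auto or --compression=gzip, wget asks for a compressed response and decompresses it. To preserve the original gzip data for gunzip, use --compression=none, which is also the default in recent releases.

Example:#

Suppose https://api.example.com/data returns gzipped data (via Content-Encoding: gzip), but isn’t named .gz. Servers usually compress only when the client asks for it, so the request below also sends an Accept-Encoding header. To pipe this into gunzip:

wget --compression=none --header='Accept-Encoding: gzip' -O - https://api.example.com/data | gunzip > api_response.txt

Why This Works:#

--compression=none tells wget not to decompress the response, ensuring the raw gzip data is sent to gunzip.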
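When it is uncertain whether a response will arrive compressed at all, one tolerant option is gzip -dcf in place of gunzip. This sketch assumes GNU gzip, where combining -c and -f makes it decompress gzip input and copy anything else through unchanged. The sample strings are invented.

```shell
# Same command, two kinds of input: both print the plain text.
printf 'already plain\n' | gzip -dcf
printf 'was compressed\n' | gzip -c | gzip -dcf
```

In a download pipeline this would read wget -O - URL | gzip -dcf > output.txt. The trade-off is that a non-gzip error page passes through silently instead of triggering an error, so it suits cases where plain responses are expected.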

Solution 3: Use curl as an Alternative#

If you’re more familiar with curl (another download tool), it streams to stdout by default, making it simpler to pipe into gunzip:

curl -L https://example.com/file.gz | gunzip > decompressed.txt

Notes:#

  • -L: Follows HTTP redirects (like wget’s default behavior).
  • No need for -O -: curl sends data to stdout unless told otherwise.
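In scripts it can help to add a few more curl flags so that HTTP errors do not end up inside gunzip. The helper name curl_gunzip below is hypothetical; the flags themselves are standard curl options.

```shell
# curl_gunzip URL OUTFILE (hypothetical helper)
curl_gunzip() {
    # -f: fail on HTTP errors instead of emitting the error page
    # -sS: hide the progress meter but still print errors
    # -L: follow redirects
    curl -fsSL "$1" | gunzip -c > "$2"
}
```

A call would look like curl_gunzip https://example.com/file.gz decompressed.txt. Note that --compressed is deliberately left out: it makes curl decompress Content-Encoding responses itself, which would leave gunzip with plain data.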

Solution 4: Fallback: Save Intermediate Files (When All Else Fails)#

If piping still fails (e.g., due to network instability or non-streamable data), you can save the .gz file first, then decompress:

# Step 1: Download the .gz file
wget https://example.com/file.gz
 
# Step 2: Decompress it
gunzip file.gz  # Creates "file" (no .gz extension)

This is less efficient but reliable for problematic downloads.
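If the intermediate file is only a means to an end, the two steps can be wrapped so the .gz copy is cleaned up afterwards. This is a sketch under assumptions: download_then_gunzip is a hypothetical helper, and it uses gunzip -t to test the archive before decompressing.

```shell
# download_then_gunzip URL OUTFILE (hypothetical helper)
download_then_gunzip() {
    url=$1
    out=$2
    tmp=$(mktemp) || return 1
    # Download to a temporary file, test the gzip stream, then decompress.
    if wget -q -O "$tmp" "$url" && gunzip -t < "$tmp" 2>/dev/null; then
        gunzip -c < "$tmp" > "$out"
        rc=$?
    else
        echo "download failed or file is not valid gzip: $url" >&2
        rc=1
    fi
    # Remove the temporary .gz copy in every case.
    rm -f "$tmp"
    return "$rc"
}
```

Because the output file is written only after the archive passes the test, a broken download never leaves a half-decompressed result behind.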

Advanced Scenarios#

Once you’ve mastered the basics, here are advanced workflows to handle edge cases.

Resuming Interrupted Downloads#

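wget -c (continue) resumes a partial download by appending to a file that is already on disk, so it has nothing to resume from when the output is a pipe, and gunzip cannot pick up a gzip stream partway through. For large or unreliable downloads, resume to a file first and decompress afterwards:

wget -c https://example.com/large_file.gz && gunzip -c large_file.gz > decompressed.txt
  • -c (wget): Resumes the download from where it left off, using the partial large_file.gz in the current directory.
  • -c (gunzip): Writes the decompressed data to stdout and leaves large_file.gz in place.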
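Since wget -c works against a file on disk, a resumable variant keeps the .gz file and decompresses in a second step. The sketch below assumes a simple URL whose last path component is the filename wget will use; resume_then_gunzip is a hypothetical helper name.

```shell
# resume_then_gunzip URL OUTFILE (hypothetical helper)
resume_then_gunzip() {
    url=$1
    out=$2
    gz=$(basename "$url")    # wget's default output name for a simple URL
    # -c continues a partial ./$gz if one exists; rerun after an interruption.
    wget -c -q "$url" || return 1
    # Decompress only after the download has completed; keep the .gz file.
    gunzip -c "$gz" > "$out"
}
```

Running the function again after an interrupted transfer picks up where the partial file ends, at the cost of the disk space the pure pipe would have saved.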

Saving Both Compressed and Uncompressed Files#

Use tee to save the .gz data to disk and pipe it to gunzip (useful for backups):

wget -O - https://example.com/file.gz | tee backup_file.gz | gunzip > decompressed.txt
  • tee backup_file.gz: Writes the .gz data to backup_file.gz and passes it to gunzip.
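The tee pattern can be tried locally before pointing it at a real URL. In this sketch gzip -c stands in for wget -O - URL, and the file names inside the temporary directory mirror the ones used above.

```shell
# gzip -c stands in for "wget -O - URL": it emits gzip bytes on stdout.
workdir=$(mktemp -d)
printf 'line one\nline two\n' | gzip -c \
    | tee "$workdir/backup_file.gz" \
    | gunzip -c > "$workdir/decompressed.txt"

# The saved copy is a complete gzip file in its own right.
gunzip -t "$workdir/backup_file.gz" && echo "backup is valid gzip"
cat "$workdir/decompressed.txt"
```

Both outputs come from a single pass over the data, so the download is not repeated.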

Verifying Download Integrity#

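To ensure the downloaded .gz file isn’t corrupted before decompressing, check it against a checksum published by the server. wget has no built-in checksum option, so this takes a separate tool such as sha256sum, and the file must be saved to disk first. The commands below assume the .sha256 file uses sha256sum's own "hash  filename" format:

# Download checksum file (e.g., SHA256)
wget https://example.com/file.gz.sha256
 
# Download the archive, then decompress only if the checksum matches
wget https://example.com/file.gz
sha256sum -c file.gz.sha256 && gunzip -c file.gz > decompressed.txt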
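A checksum file can be verified with the standalone sha256sum utility from GNU coreutils. The sketch below runs entirely on local stand-in files, so the checksum is generated on the spot rather than downloaded; in real use it would come from the server.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-ins for the two downloads: a .gz file and its published checksum.
printf 'important data\n' | gzip -c > file.gz
sha256sum file.gz > file.gz.sha256    # format: "<hash>  file.gz"

# Decompress only when the checksum matches.
if sha256sum -c file.gz.sha256 >/dev/null 2>&1; then
    gunzip -c file.gz > decompressed.txt
    echo "checksum OK"
else
    echo "checksum mismatch, not decompressing" >&2
fi
```

Independently of any published checksum, gunzip -t file.gz tests the archive's internal CRC, which catches truncated or damaged downloads.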

Piping to tar for Archives#

For .tar.gz (compressed tar archives), pipe gunzip output to tar to extract directly:

wget -O - https://example.com/archive.tar.gz | gunzip | tar xf -
  • tar xf -: Extracts (x) the tar archive from stdin (-).
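To rehearse the extraction step offline, the sketch below builds a tiny archive and extracts it through the same pipe. Here cat stands in for wget -O - URL, and the directory and file names are invented for the example.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/dest"
printf 'inside the archive\n' > "$workdir/src/note.txt"

# Build a small .tar.gz to stand in for the downloaded archive.
tar cf - -C "$workdir/src" note.txt | gzip -c > "$workdir/archive.tar.gz"

# cat plays the role of "wget -O - URL"; -C picks the extraction directory.
cat "$workdir/archive.tar.gz" | gunzip -c | tar xf - -C "$workdir/dest"
cat "$workdir/dest/note.txt"
```

With tar implementations that support the z option, tar xzf - performs the gunzip step itself, which shortens the pipeline by one command.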

Troubleshooting Common Issues#

Error: "gunzip: stdin: not in gzip format"#

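This means gunzip received non-gzip data. Fixes:

  • Check wget flags: Ensure you used -O - (including the trailing -).
  • Verify server compression: If wget is being run with --compression=auto or --compression=gzip, switch to --compression=none so the gzip data reaches gunzip intact.
  • Test the URL: Run wget -q -O - URL | file - to check if the data is gzipped:
    wget -q -O - https://example.com/file.gz | file -
    # Expected output starts with: "/dev/stdin: gzip compressed data"
  • Look at the content: If file reports HTML or plain text, the URL is probably returning an error or login page rather than the archive.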
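Where the file utility is not installed, the gzip signature can be checked by hand: every gzip stream begins with the bytes 1f 8b. The function name is_gzip below is a hypothetical helper for illustration.

```shell
# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

One way to use it is to capture the start of the response with wget -q -O - URL | head -c 512 > sample.bin and then run is_gzip sample.bin. If the check fails, opening sample.bin in a pager usually shows what the server really sent.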

wget Hangs or gunzip Produces No Output#

  • Check wget progress: wget shows a progress bar by default. If it’s stuck, the issue is with the download (e.g., slow server), not the pipe.
  • Test with a small file: Use a small .gz file (e.g., https://example.com/small_test.gz) to isolate the problem.

Permission Denied When Writing Output#

If gunzip > output.txt fails with "Permission denied":

  • Ensure the output directory is writable (e.g., use ~/output.txt instead of /root/output.txt).
  • Avoid sudo unless necessary (it can cause ownership issues with the output file).
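Because the shell opens the output file before wget or gunzip even start, it can be worth checking the destination up front in scripts. The helper name can_write_to is hypothetical, and the check is a simple sketch that looks only at the target directory.

```shell
# can_write_to OUTFILE: succeed if OUTFILE's directory exists and is writable.
can_write_to() {
    dir=$(dirname "$1")
    [ -d "$dir" ] && [ -w "$dir" ]
}
```

A script could call can_write_to ~/output.txt || exit 1 before starting a long download, so a permissions problem shows up immediately rather than after the transfer.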

Conclusion#

Piping wget into gunzip is a powerful way to download and decompress files efficiently—when done correctly. The key takeaways are:

  • Use wget -O - to stream downloaded data to stdout.
  • Add --compression=none if wget has been set up to decompress server-compressed responses.
  • Verify with file - if gunzip complains about invalid format.

With these tools, you’ll avoid disk bloat and speed up your workflow. For stubborn cases, curl or intermediate files are reliable fallbacks.

References#