Table of Contents#
- What Are Man Pages?
- The Problem: Double Letters When Redirecting Output
- Why Does This Happen? The Science of Terminal Formatting
- How Terminal Emulators Hide the Problem
- Solutions: Get Clean Man Page Text
- Conclusion
- References
What Are Man Pages?#
Man pages (short for “manual pages”) are the primary documentation system for Unix-like operating systems. They provide detailed information about commands, system calls, libraries, and more. When you run man [command] (e.g., man ls), the system retrieves the manual page from directories like /usr/share/man and displays it in your terminal.
Crucially, man pages are not plain text files. They are written in roff markup (processed traditionally by troff and nroff, and now usually by groff), which includes formatting directives for bold, underline, indentation, and section headers. This formatting is what makes man pages readable in the terminal, but it is also the source of the “double letters” problem when redirecting output.
The Problem: Double Letters When Redirecting Output#
Let’s demonstrate the issue with a simple example. Run this command to save the ls man page to a file:
man ls > ls_manual.txt
Now open ls_manual.txt in a text editor (e.g., nano, vim, or even Notepad). Depending on your man implementation, you may see oddities like:
- “lloongg ooppttiioonnss” instead of “long options”
- “lliisstt” instead of “list”
- Random backspace characters (^H) or escape sequences (like ^[[1m).
Why does this happen? The answer lies in how man pages generate formatted text for terminals—and how that text is handled when redirected to a file.
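To confirm that a saved file really contains these control characters, it helps to make them visible. The sketch below is an illustration only: it writes a small sample to a temporary file (standing in for ls_manual.txt) and uses cat -v, which prints a backspace as ^H and an escape character as ^[.

```shell
# Create a stand-in for a saved man page: "ls" in overstruck bold,
# then "NAME" wrapped in ANSI bold/reset sequences.
sample_file=$(mktemp)
printf 'l\bls\bs\n\033[1mNAME\033[0m\n' > "$sample_file"

# cat -v shows control characters in caret notation instead of sending
# them to the terminal: backspace becomes ^H, escape becomes ^[.
visible=$(cat -v "$sample_file")
printf '%s\n' "$visible"

rm -f "$sample_file"
```

Running cat -v on your own saved file in the same way shows which kind of formatting, if any, it contains.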
Why Does This Happen? The Science of Terminal Formatting#
To understand the double letters, we need to unpack two key concepts: escape sequences and overstriking.
Escape Sequences and Formatting Codes#
Man pages are rendered by groff (GNU troff), which converts the markup into output suitable for terminals. To communicate formatting (bold, underline, etc.) to the terminal, groff uses ANSI escape sequences—special character sequences that tell the terminal to change its behavior.
For example:
- \e[1m (where \e is the escape character, ASCII 27) enables bold text.
- \e[4m enables underline.
- \e[0m resets formatting to normal.
When you view the man page in a terminal emulator (e.g., GNOME Terminal, iTerm2), the emulator parses these escape sequences and renders bold/underline text correctly. You don’t see the raw escape codes—only the formatted result.
Overstriking: The Root of Double Letters#
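Here’s where the double letters come in: overstriking is a technique inherited from printing terminals, where striking a character twice in the same position made it darker on paper. Not all terminals historically supported escape sequences, so nroff-style output used overstriking to simulate bold text, and groff can still produce it today (for example, in legacy configurations).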
Overstriking works by printing a character, then a backspace (\b, ASCII 8), then the same character again. For example, to display “ls” in bold, groff might output:
l\bls\bs
Breaking this down:
- l (print “l”)
- \b (backspace: move cursor left by 1)
- l (print “l” again, overwriting the first “l”)
- s (print “s”)
- \b (backspace)
- s (print “s” again)
On a printing terminal, the overlapping characters create a darker, bolder “ls”. On screen, a pager such as less recognizes the pattern and renders it as bold. However, if you redirect this output to a file and open it in a text editor, the backspace characters (\b) are either dropped or displayed as ^H. With the backspaces dropped, what remains is:
llss
Similarly, “long options” in bold would become “lloongg ooppttiioonnss” when backspaces are stripped.
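The effect of dropping the backspaces can be reproduced without man at all. This sketch builds the overstruck “ls” from the example above with printf, then deletes only the backspace bytes, which is roughly what an editor that ignores them ends up showing.

```shell
# Build the overstruck bold "ls": each letter, a backspace, the letter again.
bold_ls=$(printf 'l\bls\bs')

# Show the raw bytes; od -c prints each backspace as \b.
printf '%s' "$bold_ls" | od -c

# Deleting only the backspaces leaves every letter doubled.
doubled=$(printf '%s' "$bold_ls" | tr -d '\b')
echo "$doubled"
```

The last line prints llss: both copies of each letter survive because nothing is left to move the cursor back over the first one.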
Modern Terminals vs. Raw Output#
Modern groff setups often emit escape sequences (\e[1m for bold) instead of overstriking, and modern terminal emulators (like iTerm2 or Alacritty) render them. Even so, when you redirect man output to a file, the raw escape sequences are saved, not the rendered text. For example, a bold section might look like this in the file:
^[[1mNAME^[[0m
ls - list directory contents
Here, ^[[1m is the escape sequence for bold, and ^[[0m resets formatting. When viewed in a terminal, these sequences are parsed, and “NAME” appears bold. But in a text editor, the sequences are treated as plain text, cluttering the output.
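To see the difference between the raw sequences and the text they decorate, the sketch below produces the same NAME header with printf and then removes the styling sequences with sed. The sed pattern is an illustration, not one of the methods discussed later: it only handles sequences of the form ESC [ ... m (the bold, underline, and reset codes mentioned here), not every escape sequence a terminal understands.

```shell
# ESC (ASCII 27) stored in a variable so the sed pattern stays portable.
esc=$(printf '\033')

# The raw bytes a redirected page might contain for a bold header.
raw_header=$(printf '\033[1mNAME\033[0m')

# Remove "ESC [ digits-or-semicolons m" sequences, leaving only the text.
plain_header=$(printf '%s\n' "$raw_header" | sed "s/$esc\[[0-9;]*m//g")
echo "$plain_header"
```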
How Terminal Emulators Hide the Problem#
When you run man ls directly in the terminal, the pager (usually less) and the terminal emulator interpret this formatting for you:
- Escape sequences like \e[1m trigger bold rendering.
- Overstriking (backspace + repeated character) is interpreted as a single bold character.
Thus, you see clean, formatted text. But when you redirect to a file, neither the pager nor the terminal emulator is involved, so the raw output (including escape sequences and overstriking) is saved directly. Text editors, which don’t parse terminal formatting, display the raw characters, leading to double letters.
Solutions: Get Clean Man Page Text#
Fortunately, there are simple ways to strip formatting from man pages and save clean, readable text. Here are three reliable methods:
Method 1: Use man -P cat to Bypass Formatting#
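The man command has a -P flag to specify a pager (a program used to display text, like less or more). By default, man uses less or more, which interpret the formatting for display. You can use cat as the pager instead, so that the page is written straight to standard output:
man -P cat ls > ls_manual_clean.txt
The -P cat flag tells man to send output directly to cat. Note that cat itself does not strip anything: it passes bytes through unchanged. Whether the result is free of double letters therefore depends on your man implementation. Some implementations (man-db, for example) discard formatting characters when output is not a terminal, while others keep them, in which case Method 2 is the more dependable choice.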
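Whichever pager you choose, it is worth checking that the saved file is actually clean. One way, sketched here with a hypothetical helper named count_control_bytes, is to count the backspace and escape bytes left in the file; a result of 0 means neither kind of formatting survived.

```shell
# Hypothetical helper: count backspace (\b) and escape (\033) bytes in a file.
# tr -cd deletes everything except the listed bytes; wc -c counts what is left.
count_control_bytes() {
    tr -cd '\b\033' < "$1" | wc -c | tr -d ' '
}

# Demonstrate on two temporary files rather than a real man page.
dirty=$(mktemp)
clean=$(mktemp)
printf 'l\bls\bs \033[1mNAME\033[0m\n' > "$dirty"
printf 'ls NAME\n' > "$clean"

dirty_count=$(count_control_bytes "$dirty")
clean_count=$(count_control_bytes "$clean")
echo "dirty: $dirty_count, clean: $clean_count"

rm -f "$dirty" "$clean"
```

On a real file the call would look like count_control_bytes ls_manual_clean.txt.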
Method 2: Pipe Through col -b to Strip Backspaces#
The col command filters reverse line feeds and, with the -b flag, does not output backspaces: it keeps only the last character written to each column position. That makes it ideal for cleaning up overstriking. Pipe man output through col -b before redirecting:
man ls | col -b > ls_manual_clean.txt
This works because col -b resolves each “character, backspace, character” sequence to a single character. Keep in mind that col -b targets overstriking; it is not a tool for stripping ANSI color or style sequences.
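The behavior of col -b is easy to check on a hand-made sample before trusting it with a full page. The sketch below assumes the usual nroff conventions (bold as a character struck twice, underline as an underscore, a backspace, then the character) and includes a sed fallback of my own for systems where col is not installed.

```shell
# Overstruck sample: bold "ls", a space, then underlined "dir"
# (underscore, backspace, letter for each underlined character).
sample=$(printf 'l\bls\bs _\bd_\bi_\br')

if command -v col >/dev/null 2>&1; then
    # col -b keeps only the last character written to each column.
    cleaned=$(printf '%s\n' "$sample" | col -b)
else
    # Fallback when col is missing: delete each "character + backspace" pair.
    bs=$(printf '\b')
    cleaned=$(printf '%s\n' "$sample" | sed "s/.$bs//g")
fi
echo "$cleaned"
```

Either branch should leave the plain text ls dir, with the doubled letters and underscores gone.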
Method 3: Render with groff for Plain Text#
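Man pages are ultimately rendered by groff, so you can call groff directly on the page’s roff source to generate plain text. Note that groff expects the source file, not the already-rendered output of man ls; man -w ls prints the path of that source file. Use the -Tascii flag to target ASCII output, -mandoc to process man page markup, and -P-c to make the output driver use overstriking rather than escape sequences, which col -b then removes:
groff -mandoc -Tascii -P-c "$(man -w ls)" | col -b > ls_manual_clean.txt
If the source file is compressed (a .gz suffix is common), decompress it first, for example with zcat, and pipe the result into groff. This method gives you the most control and is especially useful if you need to customize the output (e.g., adjust line length with -rLL=80n for 80-character lines).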
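As a self-contained illustration of groff producing plain text, the sketch below renders a tiny hand-written page source (a hypothetical DEMO page, not a real manual) read from standard input. It assumes groff is installed and skips the rendering otherwise. The -P-c option asks groff's terminal output driver for overstriking instead of escape sequences, and the final sed step, which is my own addition, deletes each character-plus-backspace pair.

```shell
# A tiny hand-written man page source (hypothetical DEMO page), used so the
# example does not depend on any installed manual page.
source_text=$(printf '.TH DEMO 1\n.SH NAME\ndemo \\- a placeholder page\n')

bs=$(printf '\b')
rendered=""
if command -v groff >/dev/null 2>&1; then
    # -mandoc: man page macros; -Tascii: plain ASCII output device;
    # -P-c: overstriking instead of escape sequences;
    # -rLL=60n: set the line length to 60 characters.
    # The sed step then removes each "character + backspace" pair.
    rendered=$(printf '%s\n' "$source_text" \
        | groff -mandoc -Tascii -P-c -rLL=60n 2>/dev/null \
        | sed "s/.$bs//g")
fi
printf '%s\n' "$rendered"
```

Changing the value passed to -rLL is a quick way to see how the line length setting affects the wrapped output.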
Conclusion#
The “double letters” issue when redirecting man pages stems from how terminal formatting is implemented: escape sequences and overstriking (backspace + repeated characters) create bold/underline text in terminals, but these artifacts are preserved when output is saved to a file. By using tools like man -P cat, col -b, or groff, you can strip this formatting and get clean, readable man page text.
Next time you need to save a man page, remember: the terminal hides the messy formatting, but redirecting raw output doesn’t. Use the solutions above to avoid double letters and keep your documentation clean!
References#
- man manual page: man(1)
- groff documentation: GNU Groff Manual
- ANSI escape sequences: ANSI Escape Code Wiki
- col command: col(1)
- Troff/Man Page Formatting: Troff User’s Manual