Understanding shell redirection
When I was first learning to use the shell, I taught myself things as I ran into problems. The first problem I ran into was how to take the output of one command and either write it to a file or pipe it to a second command.
So, like most people, I learned:
cmd1 | cmd2
to take the output of cmd1 and send it to cmd2, and
cmd1 > path/to/file
to take the output of cmd1 and write it to a file.
This got me pretty far for what I needed to do, without fully understanding how it worked or why. Eventually, I also learned that there were multiple types of output, STDOUT and STDERR, and that you could send STDERR to the same place as STDOUT by adding 2>&1 to the end of commands. I wish I had taken the time early on to understand how redirection works and how it relates to file descriptors.
To understand this a bit better, it’s helpful to first understand file descriptors. At a very high level, file descriptors are numbers assigned by the operating system that reference open files. By convention, every program the shell starts has these three already open:
- 0: STDIN - program input
- 1: STDOUT - program output
- 2: STDERR - program error
When using a pipe (|) or a redirection (< or >), it’s helpful to think about these three file descriptors and their corresponding numbers.
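To make the numbers concrete, here is a minimal sketch that sends each output descriptor to its own file. The emit function and the file names out.txt and err.txt are made up for illustration, and the scratch directory comes from mktemp so nothing real gets overwritten.

```shell
# Hypothetical helper: one line to STDOUT (fd 1), one line to STDERR (fd 2).
emit() {
  echo "to stdout"
  echo "to stderr" >&2   # >&2 sends this echo's output to fd 2
}

workdir=$(mktemp -d)     # scratch directory for the example files

# fd 1 is written to out.txt, fd 2 is written to err.txt
emit 1> "$workdir/out.txt" 2> "$workdir/err.txt"

out_content=$(cat "$workdir/out.txt")
err_content=$(cat "$workdir/err.txt")
echo "out.txt: $out_content"   # prints "out.txt: to stdout"
echo "err.txt: $err_content"   # prints "err.txt: to stderr"

rm -rf "$workdir"
```

Each stream lands in its own file because the two redirections name different descriptors.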
Going back to the example where we are redirecting STDOUT of a program to a file:
ls > path/to/file
This is the same as:
1> path/to/file ls
where 1 refers to STDOUT of ls.
Redirecting Output
Let’s break down exactly what > is doing. From the bash man page:
The general format for redirecting output is:
[n]>word
Here n is a file descriptor and word is the file to write to. But in the example ls > path/to/file there isn’t a file descriptor on the left side! Actually, there is: the brackets mean that n is optional, and if you don’t specify a file descriptor for output redirection, STDOUT (1) is the default. This means all of these redirections are exactly the same:
$ ls > path/to/file
$ > path/to/file ls
$ 1> path/to/file ls
Note that while it is more intuitive to put the > after the command, it’s not necessary. The shell picks the redirections out of the command line and sets them up before running the command, so they can appear before or after the command name. Pretty neat!
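A quick way to convince yourself is to run all three forms and compare the results. This is only a sketch: it uses echo instead of ls so the output is predictable, and the file names a.txt, b.txt and c.txt are invented for the example.

```shell
workdir=$(mktemp -d)               # scratch directory for the example files

echo hello > "$workdir/a.txt"      # redirection after the command
> "$workdir/b.txt" echo hello      # redirection before the command
1> "$workdir/c.txt" echo hello     # same again, with fd 1 written explicitly

a=$(cat "$workdir/a.txt")
b=$(cat "$workdir/b.txt")
c=$(cat "$workdir/c.txt")
echo "$a / $b / $c"                # prints "hello / hello / hello"

rm -rf "$workdir"
```

All three files end up with the same content, since each form redirects file descriptor 1.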
Redirecting Input
Redirecting input is almost exactly the same. From the bash man page:
The general format for redirecting input is:
[n]<word
Note that again there is a file descriptor on the left and a file on the right. The only difference is that when using <, the default file descriptor is 0 (STDIN) if one isn’t specified. Therefore, all of these commands are equivalent:
$ cat < path/to/file
$ < path/to/file cat
$ 0< path/to/file cat
Again, I think the first one is a bit more intuitive but it doesn’t matter whether the redirection is placed before or after the command.
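The same kind of check works for input. In this sketch the file input.txt and its two lines are made up for illustration; each form feeds the file to cat on file descriptor 0.

```shell
workdir=$(mktemp -d)                           # scratch directory
printf 'line one\nline two\n' > "$workdir/input.txt"

first=$(cat < "$workdir/input.txt")            # redirection after the command
second=$(< "$workdir/input.txt" cat)           # redirection before the command
third=$(0< "$workdir/input.txt" cat)           # fd 0 written explicitly

# Report whether all three reads produced identical text
if [ "$first" = "$second" ] && [ "$second" = "$third" ]; then
  echo "all three forms read the same input"
fi

rm -rf "$workdir"
```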
Redirecting stderr to stdout
So back to 2>&1: why is the & necessary, and what does it mean? Remember that for > there needs to be a file descriptor on the left and a file on the right. If we used 2>1, this would simply redirect STDERR to a file named 1. Adding the & tells the shell that 1 refers not to a file, but to a file descriptor!
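Here is a small sketch of the difference, run in a scratch directory because the version without & really does create a file called 1. The emit function is a made-up helper that writes one line to each output stream.

```shell
# Hypothetical helper: one line to STDOUT, one line to STDERR.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1      # the file named 1 will be created here

# Without &: STDERR is written to a file literally named 1
plain=$(emit 2>1)
file_named_1=$(cat 1)

# With &: STDERR is pointed at whatever fd 1 refers to, so both lines are captured
merged=$(emit 2>&1)

echo "plain:  $plain"              # prints "plain:  to stdout"
echo "file 1: $file_named_1"       # prints "file 1: to stderr"
echo "merged: $merged"             # prints both lines, STDOUT line first

cd / && rm -rf "$workdir"
```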
So the main things to remember when it comes to shell redirection:
- For redirecting input and redirecting output, there should always be a file descriptor on the left, and a file (for reading, or writing) on the right.
- If a file descriptor is not specified, the default for output > is 1 (STDOUT) and the default for input < is 0 (STDIN).
- For redirecting STDERR to STDOUT, the & is necessary on the right side, because you are redirecting to another file descriptor instead of a file.
- The best reference for all of this is the bash man page, which contains a lot more in-depth information on ways to redirect!