Input and Output Redirection in Linux

These are my notes on input and output redirection in Linux.

Standard Input and Output
Many of the programs we have used so far produce output of some kind. This
output consists of two types. First, there are the program's results, that is,
the data the program is designed to produce. Second, there are status and error
messages that tell us how the program is getting along.

If we look at a command like ls, we can see that it displays its results and its
error messages on screen. Programs such as ls send their results to a special
file called standard output and their status messages to another file called
standard error. By default, both standard output and standard error are linked
to the screen and not saved into a disk file. In addition, many programs take
input from a facility called the standard input, which by default is attached to
the keyboard.

Input and output redirection allows us to change where output goes and where
input comes from. Normally, output goes to the screen and input comes from the
keyboard, but with redirection, we can change that.

Redirecting Output
Redirection allows us to redefine where standard output goes. To redirect
standard output to another file instead of the screen, we use the redirection
operator ">" followed by the name of the file. 

ls -l /usr/bin > ls-output.txt

Here, we created a long listing of the /usr/bin directory and sent the results
to the ls-output.txt file instead of the screen. Because the file is long, we
can page through it with the less command:

less ls-output.txt

If we want to append output to the file instead of overwriting it, we use the
">>" redirection operator.

ls -l /usr/bin >> ls-output.txt

Using the >> operator will result in the output being appended to the file. If
the file does not exist, it is created.
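
As a small sketch of the difference between the two operators, the following
writes to a scratch file. The temporary directory and the name demo.txt are
made up for illustration and are not part of the examples above.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
out="$workdir/demo.txt"   # hypothetical file name

echo "first"  > "$out"    # creates the file (or empties it if it exists)
echo "second" > "$out"    # overwrites: the file now holds only "second"
echo "third" >> "$out"    # appends: the file now holds "second" and "third"

cat "$out"
```

Running the same ">" command twice leaves one copy of the output in the file,
while running the same ">>" command twice leaves two.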

Redirecting Standard Error
Redirecting standard error lacks the ease of a dedicated redirection operator.
To redirect standard error, we must refer to its file descriptor. A program can
produce output on any of several numbered file streams. While we have referred
to the first three of these file streams as standard input, output, and error,
the shell references them internally as file descriptors 0, 1, and 2. The shell
provides a notation for redirecting files using the file descriptor number.
Because standard error is number 2, we can redirect standard error like this:

ls -l /bin/usr 2> ls-error.txt

The file descriptor 2 is placed immediately before the redirection operator to
perform the redirection of standard error to the file ls-error.txt. There are
cases in which we may want to capture all of the output of a command to a single
file. To do this, we must redirect both standard output and standard error at
the same time. 

ls -l /usr/bin > ls-output.txt 2>&1

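Using this method, we perform two redirections. First we redirect standard
output to the file ls-output.txt, and then we redirect file descriptor 2
(standard error) to file descriptor 1 (standard output) using the notation
2>&1. The order matters: the redirection of standard error must come after the
redirection of standard output, or standard error will still go to the screen.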
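
The following is a minimal sketch of these redirections. To keep the output
predictable it uses a small shell function named emit instead of ls; the
function and all file names are hypothetical and exist only for this
illustration.

```shell
workdir=$(mktemp -d)

# Hypothetical helper: writes one line to standard output and one line
# to standard error.
emit() {
    echo "result line"
    echo "error line" >&2   # >&2 sends this echo to file descriptor 2
}

# Send each stream to its own file.
emit > "$workdir/out.txt" 2> "$workdir/err.txt"

# Send both streams to one file: standard output is redirected first,
# then standard error is pointed at the same place.
emit > "$workdir/both.txt" 2>&1

# Reversed order: 2>&1 copies the destination standard output has at that
# moment (normally the screen), so only "result line" reaches the file.
emit 2>&1 > "$workdir/wrong-order.txt"
```

After running this, both.txt holds two lines while wrong-order.txt holds only
the result line, which is the practical effect of getting the order wrong.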

Sometimes, we do not want output from a command. This usually applies to error
and status messages. The system provides a way to do this by redirecting output
to a special file called /dev/null. This file is a system device often referred
to as a bit bucket, which accepts input and does nothing with it. 

ls -l /usr/bin 2> /dev/null
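
A short sketch of the same idea with predictable output follows. The emit
function is a hypothetical stand-in for any command that writes to both
streams.

```shell
# Hypothetical helper: one line on standard output, one on standard error.
emit() {
    echo "result line"
    echo "error line" >&2
}

# Discard only the error stream; the normal output is still captured.
kept=$(emit 2> /dev/null)
echo "$kept"

# Discard everything the command prints.
emit > /dev/null 2>&1
```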

Redirecting Standard Input
Up to now, we have not encountered many commands that make use of standard
input. The "cat" command reads one or more files and copies them to standard
output.

cat filename

You can use it to display files without paging. 

cat ls-output.txt

It is often used to display short text files. Because "cat" can accept more than
one file as an argument, it can also be used to join files together.
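
As a sketch of both uses, the following joins two small files and then feeds
the result back to "cat" through standard input with the "<" operator. The
file names are hypothetical. When "cat" is given no filename argument it reads
standard input, so "<" lets us change where that input comes from.

```shell
workdir=$(mktemp -d)
printf 'alpha\n' > "$workdir/part1.txt"
printf 'beta\n'  > "$workdir/part2.txt"

# Join two files by listing both as arguments and redirecting the result.
cat "$workdir/part1.txt" "$workdir/part2.txt" > "$workdir/joined.txt"

# With no filename argument, cat reads standard input; the < operator
# makes that input come from a file instead of the keyboard.
cat < "$workdir/joined.txt"
```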

Pipelines
The capability of commands to read data from standard input and send to standard
output is utilized by a shell feature called pipelines. Using the pipe operator
|, the standard output of one command can be piped into the standard input of
another.

ls -l /usr/bin | less

Using this technique, we can conveniently examine the output of any command that
produces standard output.

Pipelines are often used to perform complex operations on data. It is possible
to put several commands together into a pipeline. Frequently, the commands used
in this way are referred to as filters. Filters take input, change it, then
output it. 

ls /bin /usr/bin | sort | less

Because we specified two directories, the output of ls would have consisted of
two sorted lists, one for each directory. By including sort in our pipeline, we
changed the data to produce a single sorted list.

The "uniq" command is often used in conjunction with "sort". It accepts a sorted
list of data from either standard input or a single filename argument and
removes any duplicates from the list.

ls /bin /usr/bin | sort | uniq | less

We use "uniq" to remove any duplicates from the output of the "sort" command. If
we want to see the list of duplicates, we add the "-d" option to "uniq".

ls /bin /usr/bin | sort | uniq -d | less
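
Because the contents of /bin and /usr/bin differ from system to system, here
is a sketch of the same pipelines on a fixed list. The names in the list are
made up for illustration, with "gzip" deliberately appearing twice.

```shell
# A hypothetical list of program names containing one duplicate.
names=$(printf 'zip\ngzip\nls\ngzip\ncat\n')

printf '%s\n' "$names" | sort             # cat, gzip, gzip, ls, zip (one per line)
printf '%s\n' "$names" | sort | uniq      # cat, gzip, ls, zip
printf '%s\n' "$names" | sort | uniq -d   # gzip
```

Sorting first matters because "uniq" only removes duplicate lines that are
adjacent to each other.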

The "wc" command is used to display the number of lines, words, and bytes
contained in files.

wc ls-output.txt

In this case, it prints out three numbers: lines, words, and bytes. Like our
previous commands, if executed without command line arguments, "wc" accepts
standard input. The "-l" option limits its output to report only lines. Adding
it to a pipeline is a handy way to count things. To see the number of items we
have in our sorted list we can do this:

ls /bin /usr/bin | sort | uniq | wc -l
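
The same counting pattern can be sketched on a fixed list so the result is
known in advance. The names below are hypothetical.

```shell
# Five lines, one of which is a duplicate.
names=$(printf 'zip\ngzip\nls\ngzip\ncat\n')

# Sort, drop duplicates, then count the remaining lines.
count=$(printf '%s\n' "$names" | sort | uniq | wc -l)
echo "unique names: $count"
```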

The command "grep" is a powerful program used to find text patterns within
files. It is used like this:

grep pattern filename

When "grep" finds the pattern in the file, it prints out the lines containing
it. The patterns that "grep" can match can be very complex. Suppose we wanted to
find all the files in our list of programs that had the word zip embedded in the
name. Such a search might give us an idea of some of the programs on our system
that had something to do with file compression.

ls /bin /usr/bin | sort | uniq | grep zip

There are a couple of handy options for "grep". The option "-i" causes "grep"
to ignore case when performing the search. The option "-v" tells "grep" to
print only those lines that do not match the pattern.
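
A brief sketch of these options on a fixed list follows. The names are
hypothetical, and "UnZip" is capitalized on purpose to show the effect of
"-i".

```shell
names=$(printf 'gzip\nUnZip\nls\nzipinfo\n')

printf '%s\n' "$names" | grep zip      # gzip, zipinfo (one per line)
printf '%s\n' "$names" | grep -i zip   # gzip, UnZip, zipinfo
printf '%s\n' "$names" | grep -v zip   # UnZip, ls
```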

Sometimes, you do not want all the output from a command. You might want only
the first few lines or the last few lines. The "head" command prints the first
lines of a file, and the "tail" command prints the last lines. By default,
both commands print 10 lines of text, but this can be adjusted with the "-n"
option.

head -n 5 ls-output.txt

The "tail" command operates the same way:

tail -n 5 ls-output.txt
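
To make the effect easy to see, here is a sketch that uses a file of numbered
lines. The file name is hypothetical, and "seq" is used only to generate the
sample data.

```shell
workdir=$(mktemp -d)
seq 1 20 > "$workdir/numbers.txt"   # one number per line, 1 through 20

head -n 3 "$workdir/numbers.txt"    # prints 1, 2, 3 (one per line)
tail -n 3 "$workdir/numbers.txt"    # prints 18, 19, 20

# Both commands also read standard input, so they fit in a pipeline.
# Keeping the first 12 lines and then the last 2 of those yields 11 and 12.
seq 1 20 | head -n 12 | tail -n 2
```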

The "tail" command also has an option to let you view files in real time. This
is useful for watching the progress of files as they are being written. 

tail -f /var/log/messages

Using the "-f" option, "tail" continues to monitor the file, and when new lines
are appended, they immediately appear on the display. This continues until you
press Ctrl-C.

The "tee" command reads standard input and copies it to both standard output and
to one or more files. This is useful for capturing a pipeline's contents at an
intermediate stage of processing. 

ls /usr/bin | tee ls.txt | grep zip
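
A sketch of the same pattern with fixed input shows what ends up where. The
input lines and the file name all.txt are hypothetical.

```shell
workdir=$(mktemp -d)

# tee saves the full, unfiltered list to all.txt and passes it on to grep.
matches=$(printf 'gzip\nls\nzipinfo\n' | tee "$workdir/all.txt" | grep zip)

echo "$matches"          # only the lines containing "zip"
cat "$workdir/all.txt"   # all three lines, as they were before grep ran
```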

As always, check out the documentation of each of the commands we have covered.
We have seen only their most basic usage, but they all have a number of
interesting options. You will see that the redirection feature of the command
line is very useful for solving specialized problems.