Learn Linux 06: Redirection

Introduction

In this chapter we are going to unleash what may be the coolest feature of the command line. It’s called I/O redirection. The “I/O” stands for input/output, and with this facility you can redirect the input and output of commands to and from files, as well as connect multiple commands together into powerful command pipelines. This chapter will introduce the following commands:

  • cat - Concatenate files
  • sort - Sort lines of text
  • uniq - Report or omit repeated lines
  • wc - Print newline, word, and byte counts for each file
  • grep - Print lines matching a pattern
  • head - Output the first part of a file
  • tail - Output the last part of a file
  • tee - Read from standard input and write to standard output and files

Standard Input, Output, And Error

Many of the programs that we have used so far produce output of some kind. This output often consists of two types.

  • The program’s results; that is, the data the program is designed to produce.
  • Status and error messages that tell us how the program is getting along.

If we look at a command like ls, we can see that it displays its results and its error messages on the screen.

Keeping with the Unix theme of “everything is a file,” programs such as ls actually send their results to a special file called standard output (often expressed as stdout) and their status messages to another file called standard error (stderr). By default, both standard output and standard error are linked to the screen and not saved into a disk file.

In addition, many programs take input from a facility called standard input (stdin), which is, by default, attached to the keyboard.

I/O redirection allows us to change where output goes and where input comes from. Normally, output goes to the screen and input comes from the keyboard, but with I/O redirection, we can change that.

Redirecting Standard Output

I/O redirection allows us to redefine where standard output goes. To redirect standard output to another file instead of the screen, we use the > redirection operator followed by the name of the file. Why would we want to do this? It’s often useful to store the output of a command in a file. For example, we could tell the shell to send the output of the ls command to the file ls-output.txt instead of the screen.

[user@linux ~]$ ls -l /usr/bin > ls-output.txt

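The > operator always overwrites the destination file. To append standard output to the end of a file instead, we use the >> redirection operator. If the file does not already exist, it is created.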

[user@linux ~]$ ls -l /usr/bin >> ls-output.txt
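
The difference between > and >> is easy to check on a small scratch file. The following is a minimal sketch; the temporary directory and the file name demo.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing important is overwritten.
workdir=$(mktemp -d)

echo "first"  > "$workdir/demo.txt"    # creates the file
echo "second" > "$workdir/demo.txt"    # overwrites it: "first" is gone
echo "third" >> "$workdir/demo.txt"    # appends after "second"

# The file now holds two lines: "second" and "third".
cat "$workdir/demo.txt"
```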

Redirecting Standard Error

Redirecting standard error lacks the ease of a dedicated redirection operator. To redirect standard error, we must refer to its file descriptor. A program can produce output on any of several numbered file streams. While we have referred to the first three of these file streams as standard input, output, and error, the shell references them internally as file descriptors 0, 1, and 2, respectively. The shell provides a notation for redirecting a stream by its file descriptor number: we place the number immediately before the redirection operator. Because standard error is file descriptor 2, we can redirect it with the notation 2>. In the example below we ask ls to list /bin/usr, a directory that normally does not exist, so that it produces an error message.

[user@linux ~]$ ls -l /bin/usr 2> ls-error.txt
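
To see the two streams being separated, we can use a small stand-in command that writes to both. This is only a sketch: the report function and the file names are hypothetical, and the >&2 inside it simply sends that one line to standard error.

```shell
workdir=$(mktemp -d)

# Hypothetical helper for illustration: one line to stdout, one to stderr.
report() {
    echo "result line"
    echo "error line" >&2
}

# File descriptor 1 (stdout) goes to one file, descriptor 2 (stderr) to another.
report > "$workdir/out.txt" 2> "$workdir/err.txt"

cat "$workdir/out.txt"    # holds the result line
cat "$workdir/err.txt"    # holds the error line
```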

Redirecting Standard Output And Standard Error To One File

There are two ways to do this. The first way is the traditional way, which works with old versions of the shell. Using this method, we perform two redirections. First we redirect standard output to the file ls-error.txt, and then we redirect file descriptor 2 (standard error) to file descriptor 1 (standard output) using the notation 2>&1.

[user@linux ~]$ ls -l /bin/usr > ls-error.txt 2>&1

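Recent versions of bash provide a second, more streamlined method: the single notation &> redirects both standard output and standard error to the file.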

[user@linux ~]$ ls -l /bin/usr &> ls-error.txt

To append both standard output and standard error to a single file instead of overwriting it, we use the &>> notation, which is also a bash feature.

[user@linux ~]$ ls -l /bin/usr &>> ls-output.txt
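
With the traditional form, the order of the two redirections matters, because 2>&1 makes standard error a copy of wherever standard output points at that moment. The sketch below compares both orders; the report function and file names are hypothetical stand-ins.

```shell
workdir=$(mktemp -d)

# Hypothetical helper for illustration: one line to stdout, one to stderr.
report() {
    echo "result line"
    echo "error line" >&2
}

# Correct order: stdout is pointed at the file first, then stderr copies it.
report > "$workdir/both.txt" 2>&1

# Reversed order: stderr copies stdout while stdout still points at its
# original destination, so only the result line lands in the file.
report 2>&1 > "$workdir/only-out.txt"

wc -l < "$workdir/both.txt"       # two lines were captured
wc -l < "$workdir/only-out.txt"   # one line was captured
```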

Disposing Of Unwanted Output

Sometimes “silence is golden” and we don’t want output from a command; we just want to throw it away. This applies particularly to error and status messages. The system provides a way to do this by redirecting output to a special file called /dev/null. This file is a system device often referred to as a bit bucket, which accepts input and does nothing with it. To suppress error messages from a command, we redirect standard error to it.

[user@linux ~]$ ls -l /bin/usr 2> /dev/null
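
The same technique can silence one stream or both. A minimal sketch, again using a hypothetical report function that writes one line to each stream:

```shell
# Hypothetical helper for illustration: one line to stdout, one to stderr.
report() {
    echo "result line"
    echo "error line" >&2
}

# Discard only the error stream; the result still reaches standard output.
report 2> /dev/null

# Discard both streams; nothing is displayed.
report > /dev/null 2>&1
```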

Redirecting Standard Input

Up to now, we haven’t encountered any commands that make use of standard input (actually we have, but we’ll reveal that surprise a little bit later), so we need to introduce one.

cat

The cat command reads one or more files and copies them to standard output. You can use it to display files without paging. For example, the following will display the contents of the file ls-output.txt.

[user@linux ~]$ cat ls-output.txt

This is all well and good, but what does this have to do with standard input? Nothing yet, but let’s try something else. What happens if we enter cat with no arguments?

[user@linux ~]$ cat

Nothing happens; it just sits there like it’s hung. It might seem that way, but it’s really doing exactly what it’s supposed to do. If cat is not given any arguments, it reads from standard input, and since standard input is, by default, attached to the keyboard, it’s waiting for us to type something! Try adding some text and pressing enter.

[user@linux ~]$ cat
Hello World

Next, type ctrl-D (i.e., hold down the ctrl key and press D) to tell cat that it has reached end of file (EOF) on standard input.

[user@linux ~]$ cat
Hello World
Hello World

In the absence of filename arguments, cat copies standard input to standard output, so we see our line of text repeated. We can use this behavior to create short text files. Let’s say we wanted to create a file called hello_world.txt containing the text in our example.

[user@linux ~]$ cat > hello_world.txt
Hello World

Type the command followed by the text we want to place in the file. Remember to type ctrl-D at the end. Using the command line, we have implemented the world’s dumbest word processor! To see our results, we can use cat to copy the file to stdout again.

[user@linux ~]$ cat hello_world.txt
Hello World

Now that we know how cat accepts standard input, in addition to filename arguments, let’s try redirecting standard input.

[user@linux ~]$ cat < hello_world.txt
Hello World

Using the < redirection operator, we change the source of standard input from the keyboard to the file hello_world.txt. We see that the result is the same as passing a single filename argument. This is not particularly useful compared to passing a filename argument, but it serves to demonstrate using a file as a source of standard input. Other commands make better use of standard input, as we will soon see.

Pipelines

The capability of commands to read data from standard input and send to standard output is utilized by a shell feature called pipelines. Using the pipe operator |, the standard output of one command can be piped into the standard input of another.

To fully demonstrate this, we are going to need some commands. Remember how we said there was one we already knew that accepts standard input? It’s less. We can use less to display, page by page, the output of any command that sends its results to standard output.

[user@linux ~]$ ls -l /usr/bin | less

Filters

Pipelines are often used to perform complex operations on data. It is possible to put several commands together into a pipeline. Frequently, the commands used this way are referred to as filters. Filters take input, change it somehow, and then output it. The first one we will try is sort. Imagine we wanted to make a combined list of all the executable programs in /bin and /usr/bin, put them in sorted order, and view the resulting list.

[user@linux ~]$ ls /bin /usr/bin | sort | less
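
The directory listing above is long, so the effect of sort is easier to see on a few lines of made-up input. In this sketch printf stands in for the command that produces the data.

```shell
# Three unsorted lines go in; sort writes them out in order:
# apple, mango, pear.
printf 'pear\napple\nmango\n' | sort

# The -r option reverses the result: pear, mango, apple.
printf 'pear\napple\nmango\n' | sort -r
```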

The Difference Between > And |

At first glance, it may be hard to see how the redirection performed by the pipeline operator | differs from that of the redirection operator >. Simply put, the redirection operator connects a command with a file, while the pipeline operator connects the output of one command with the input of a second command.

command1 > file1
command1 | command2
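
A common slip is to put a command name after >. The shell treats whatever follows > as a filename, so nothing is run. The sketch below shows this safely inside a scratch directory (the directory and sample data are made up).

```shell
workdir=$(mktemp -d)

# Redirection: the word after > names a file, so this creates a file
# called "sort" in the scratch directory. The sort command is never run.
printf 'b\na\n' > "$workdir/sort"
cat "$workdir/sort"          # still in the original order: b, a

# Pipeline: the word after | is run as a command that reads the data.
printf 'b\na\n' | sort       # sorted order: a, b
```

Done outside a scratch directory with the right permissions, the same mistake could overwrite a real file, which is why it is worth double-checking which operator we typed.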

uniq: Report Or Omit Repeated Lines

The uniq command accepts a list of data from either standard input or a single filename argument (see the uniq man page for details) and, by default, removes any duplicates from the list. Because uniq only compares adjacent lines, it is normally used on sorted input, which is why it follows sort in our pipeline. So, to make sure our list has no duplicates (that is, any programs of the same name that appear in both the /bin and /usr/bin directories), we will add uniq to our pipeline. If we want to see the list of duplicates instead, we add the -d option to uniq.

[user@linux ~]$ ls /bin /usr/bin | sort | uniq | less
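
A short experiment on made-up input shows why sort comes before uniq, and what -d does:

```shell
# Unsorted input: the two "zip" lines are not adjacent, so uniq keeps both
# and all three lines come out.
printf 'zip\ngzip\nzip\n' | uniq

# Sorted input: the duplicates sit next to each other and collapse into one,
# leaving gzip and zip.
printf 'zip\ngzip\nzip\n' | sort | uniq

# -d prints only the lines that were repeated, here just zip.
printf 'zip\ngzip\nzip\n' | sort | uniq -d
```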

wc: Print Line, Word, And Byte Counts

The wc (word count) command is used to display the number of lines, words, and bytes contained in files.

[user@linux ~]$ wc ls-output.txt
7902 64566 503634 ls-output.txt
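
Like the other commands in this chapter, wc reads standard input when it is given no filename, and its -l option limits the report to the line count. The sketch below uses a small made-up file to compare the three ways of feeding it data.

```shell
workdir=$(mktemp -d)
printf 'bunzip2\nbzip2\ngzip\n' > "$workdir/names.txt"

# Given a filename argument, wc prints the name after the count.
wc -l "$workdir/names.txt"

# Reading standard input, whether redirected or piped, wc has no name
# to print, so only the count appears.
wc -l < "$workdir/names.txt"
cat "$workdir/names.txt" | wc -l
```

Placed at the end of a pipeline, wc -l is a handy way to count how many items the earlier commands produced.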

grep: Print Lines Matching A Pattern

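The grep command is a powerful program used to find text patterns within files. When grep encounters the pattern in its input, it prints the lines containing it. Given no filename, it reads standard input, so here we use it as a filter to find all the programs in our list that have the word zip in their names.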

[user@linux ~]$ ls /bin /usr/bin | sort | uniq | grep zip
bunzip2
bzip2
gunzip
gzip
unzip
zip
zipcloak
zipgrep
zipinfo
zipnote
zipsplit
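
Two grep options are worth knowing early: -i ignores case, and -v inverts the match so that only non-matching lines are printed. A minimal sketch on a made-up list of names:

```shell
workdir=$(mktemp -d)
printf 'gzip\nZIPinfo\ntar\nunzip\n' > "$workdir/names.txt"   # made-up sample list

grep zip "$workdir/names.txt"      # gzip and unzip
grep -i zip "$workdir/names.txt"   # gzip, ZIPinfo, and unzip
grep -v zip "$workdir/names.txt"   # ZIPinfo and tar: the lines without "zip"
```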

head/tail: Print First/Last Part Of Files

Sometimes we don’t want all the output from a command, only the beginning or the end. The head command prints the first part of a file, and the tail command prints the last part. By default, both commands print 10 lines of text, but this can be adjusted with the -n option.

[user@linux ~]$ head -n 5 ls-output.txt
[user@linux ~]$ tail -n 5 ls-output.txt
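
Both commands also work as filters in a pipeline. In this sketch, seq is used only to generate the numbers 1 through 20, one per line, as sample input.

```shell
# Keep only the first three lines of the stream: 1, 2, 3.
seq 1 20 | head -n 3

# Keep only the last three lines of the stream: 18, 19, 20.
seq 1 20 | tail -n 3
```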

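The tail command has an option (-f) that allows you to view files in real time. This is useful for watching the progress of log files as they are being written. With -f, tail keeps monitoring the file and displays new lines as they appear, until we type ctrl-C.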

[user@linux ~]$ tail -f /var/log/messages
Jan 7 05:09:00 twin4 dhclient: DHCPACK from 192.168.1.1
Jan 7 05:09:00 twin4 dhclient: bound to 192.168.1.4 -- renewal in 1652 seconds.
Jan 7 05:09:00 twin4 mountd[3953]: /var/NFSv4/musicbox exported to both 192.168.1.0/24 and twin7.localdomain in 192.168.1.0/24,twin7.localdomain
Jan 7 05:09:00 twin4 dhclient: DHCPREQUEST on eth0 to 192.168.1.1 port 67
Jan 7 05:09:00 twin4 dhclient: DHCPACK from 192.168.1.1
Jan 7 05:09:00 twin4 dhclient: bound to 192.168.1.4 -- renewal in 1771 seconds.
Jan 7 05:09:00 twin4 smartd[3468]: Device: /dev/hda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 237 to 236
Jan 7 05:09:00 twin4 mountd[3953]: /var/NFSv4/musicbox exported to both 192.168.1.0/24 and twin7.localdomain in 192.168.1.0/24,twin7.localdomain
Jan 7 05:09:00 twin4 sshd(pam_unix)[29234]: session opened for user user by (uid=0)
Jan 7 05:09:00 twin4 su(pam_unix)[29279]: session opened for user root by user(uid=500)

tee: Read From Stdin And Output To Stdout And Files

In keeping with our plumbing metaphor, Linux provides a command called tee that creates a “tee” fitting on our pipe. The tee program reads standard input and copies it to both standard output (allowing the data to continue down the pipeline) and to one or more files. This is useful for capturing a pipeline’s contents at an intermediate stage of processing. Here we capture the entire directory listing to the file ls.txt before grep filters the pipeline’s contents.

[user@linux ~]$ ls /usr/bin | tee ls.txt | grep zip
bunzip2
bzip2
gunzip
gzip
unzip
zip
zipcloak
zipgrep
zipinfo
zipnote
zipsplit
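
We can confirm that tee saved everything, not just the lines grep let through, with a small made-up list (the file name all.txt is only for illustration):

```shell
workdir=$(mktemp -d)

# tee saves all four names to all.txt and also passes them on to grep,
# which displays only gzip and unzip.
printf 'gzip\ntar\nunzip\nls\n' | tee "$workdir/all.txt" | grep zip

# The saved copy still has every line that entered the pipeline.
wc -l < "$workdir/all.txt"
```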

Summary

In this chapter we covered a lot, from standard input, output, and error to pipelines and filters. We have seen only the most basic usage of these commands, so be sure to check out the documentation for each of them.