We learned a great deal about how bash and other processes work together in the terminal. Let's refocus on bash and start figuring out how exactly we get stuff done with it.
As mentioned earlier, bash waits for instructions from you and then executes them to the best of its abilities. To get the most out of bash, and especially to avoid damage due to bash misunderstanding your intentions, it's important that you pay close attention to these basics of the bash shell language. There are many people that consider themselves fluent in bash but fail to understand even these most basic concepts. As a result, they create programs that can inflict extensive damage to unsuspecting users and systems. Don't be that person.
At the core of the bash shell language are its commands. Your commands tell bash what you need it to do, step-by-step, command-by-command.
Bash generally takes one command from you at a time, executes the command, and when completed returns to you for the next command. We call this synchronous command execution. It is important to understand that while bash is busy with a command that you give it, you cannot interact with bash directly: you'll have to wait until it has finished executing the command and is ready for the next one. For most commands, you'll barely notice this: they get executed so fast that bash will be back for the next command before you realize.
Some commands can take a long time to complete, though. In particular, commands that start other programs with which you can interact. For instance, a command might start a file editor. While you're interacting with the file editor, bash takes a back-seat and waits for the file editor to end (which generally means you quit it). When the file editor program stops running, the command ends and bash resumes operation by asking you for the next thing to do. You'll notice that while your editor is running, you are no longer at the bash prompt. As soon as your editor exits, your bash prompt re-appears:
$ ex                          ← bash command to run the "ex" program.
: i                           ← ex command to "insert" some text.
Hello!
.                             ← A line with just a dot tells ex to stop inserting text.
: w greeting.txt              ← ex command to "write" the text to a file.
"greeting.txt" [New] 1L, 7C written
: q                           ← ex command to "quit" the program.
$ cat greeting.txt            ← And now we're back in bash!
Hello!                        ← The "cat" program shows the contents of the file.
$
Notice how in this session, we started out by giving bash the command to start the ex file editor. After issuing this command, our prompt changed: any text we enter now is sent to ex, not to bash. While ex is running, bash is asleep waiting for your ex session to end. When you quit ex using the q command, the ex bash command ends and bash is ready to receive a new command. To tell you this, it shows you its prompt again, allowing you to enter the next bash command. We finish the example with a cat greeting.txt bash command which tells bash to run the cat program. The cat program is great for outputting file contents (its name is short for concatenate, because its purpose is to output the contents of all the files you give it, one after the other, effectively concatenating the contents in its output). The cat command in the example is used to find out what is in our greeting.txt file after we're done editing it with the ex program.
We've been showing quite a few examples now of running commands in bash, so you probably already have a good idea about how one issues basic commands in bash at the prompt.
Bash is mostly a line-based language. Accordingly, when bash reads your commands, it does so line-by-line. Most commands will only constitute one line and, unless the syntax of your bash command explicitly indicates that your command is not yet complete, as soon as you end that line, bash will immediately consider that to be the end of the command. As a result, typing a line of text and hitting the ⏎ key will generally cause bash to start performing the command described by your line of text.
Some commands, however, span multiple lines. These are usually block commands or commands with quotes in them:
$ read -p "Your name? " name        ← This command is complete and can be started immediately.
Your name? Maarten Billemont
$ if [[ $name = $USER ]]; then      ← The "if" block started but wasn't finished.
>     echo "Hello, me."
> else
>     echo "Hello, $name."
> fi                                ← Now the "if" block ends and bash knows enough to start the command.
Hello, Maarten Billemont.
Logically, bash cannot execute a command until it has enough information to do its job. The first line of the if command in the example above (we'll cover what these commands do in more detail later on) doesn't contain enough information for bash to know what to do if the test succeeds or if it fails. As a result, bash shows a special prompt: >. This prompt essentially means: the command you gave me is not yet at an end. We keep on providing extra lines for the command, until we reach the fi construct. When we end that line, bash knows that you're done providing conditional result cases. It immediately begins running all the code in the entire block, from if to fi. We will soon see the different kinds of commands defined in bash's grammar, but the if command we just saw is called a Compound Command, because it compounds a bunch of basic commands into a larger logical block.
In each of these cases, we're passing our commands to an interactive bash session. As we explained before, bash can also run in non-interactive mode where it reads commands from a file or stream rather than asking you for them. In non-interactive mode, bash doesn't have a prompt. Aside from that, it operates pretty much the same. We could copy the bash code from the example above and put it in a text file instead:
read -p "Your name? " name
if [[ $name = $USER ]]; then
    echo "Hello, me."
else
    echo "Hello, $name."
fi
It doesn't matter much what you name the file in which you save the code. Let's say you saved it in a file called hello.txt. We can now run the commands from that file using bash, without it having to ask us for instructions:
$ bash hello.txt                ← This starts a new "bash" process.
Your name? Maarten Billemont
Hello, Maarten Billemont.       ← Our new "bash" process ends when there is no code left in the file.
$                               ← Now that the "bash" command is done, our interactive bash comes back.
Notice that two bash processes are involved in this example. The bash process we start off from is our regular interactive shell. We tell that bash process to run a command which will cause it to start a new bash process. This second bash process will execute all the commands it finds in the file hello.txt, non-interactively. When it's done (there are no commands left in the file), the non-interactive bash process ends and the interactive bash process is done with your bash hello.txt command; it shows a new prompt asking you for the next command to run.
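To see the two processes for yourself, here is a minimal sketch. The temporary file and the variable names are our own assumptions, not part of the example above. Each bash reports its process ID through the special parameter $$:

```bash
# Sketch: "bash FILE" runs the commands in FILE in a second, separate bash process.
script=$(mktemp)                        # hypothetical throw-away file holding one command
printf '%s\n' 'echo "$$"' > "$script"   # single quotes: $$ is expanded by the child, not here

parent_pid=$$                           # process ID of the bash running this sketch
child_pid=$(bash "$script")             # the new bash prints its own process ID

echo "parent bash: $parent_pid"
echo "child bash:  $child_pid"
rm -f "$script"
```

The two numbers should differ, which is exactly what "two bash processes" means.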
It's only a small step from a file with a list of commands in it to a veritable bash script. Open your hello.txt file again using your favourite text editor and add a hashbang to the top of it, as the first line of the script: #!/usr/bin/env bash
#!/usr/bin/env bash
read -p "Your name? " name
if [[ $name = $USER ]]; then
    echo "Hello, me."
else
    echo "Hello, $name."
fi
Congratulations! You've created your first bash script. What's a bash script? It's a file with bash code in it that can be executed by the kernel just like any other program on your computer. In essence, it is a program in itself, although it does need the bash interpreter to do the work of translating the bash language into instructions the kernel understands. That's where this "hashbang" line we've just added to the file comes in: it tells the kernel what interpreter it needs to use to understand the language in this file, and where to find it. We call it a "hashbang" because it always begins with a "hash" # followed by a "bang" !. Your hashbang must then specify an absolute pathname to any program that understands the language in your file and can take a single argument. Our hashbang is a little special, though: we reference the program /usr/bin/env, which isn't really a program that understands the bash language. It's a program that can find and start other programs. In our case, we use an argument to tell it to find the bash program and use that for interpreting the language in our script. Why do we use this "in-between" program called env? It has everything to do with what comes before the name: the path. We know with relative certainty that the env program lives in the /usr/bin path. Given the large variety of operating systems and configurations, however, we don't have any good certainty about where the bash program is installed. Which is why we use the env program to find it for us. That was a little complicated! But now, what's the difference between our file before and after adding the hashbang?
$ chmod +x hello.txt      ← Mark hello.txt as an executable program.
$ ./hello.txt             ← Tell bash to start the hello.txt program.
Most systems require you to mark a file as executable before the kernel is willing to allow you to run it as a program. Once we do that, we can start the hello.txt program like we would any other program. The kernel will look inside the file, find the hashbang, use that to track down the bash interpreter, and finally use the bash interpreter to start running the instructions in the file. You have your first real bash program!
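The same steps can be sketched without an editor. The scratch directory and the file name hello are assumptions made for this illustration, and the sketch assumes your temporary directory allows running programs:

```bash
# Sketch: turn a file of commands into a program the kernel can start directly.
dir=$(mktemp -d)                      # hypothetical scratch directory
cat > "$dir/hello" <<'EOF'
#!/usr/bin/env bash
echo "Hello from a script."
EOF

chmod +x "$dir/hello"                 # mark the file as executable
output=$("$dir/hello")                # start it by path, like any other program
echo "$output"
rm -rf "$dir"
```

Because of the hashbang, we never had to type bash ourselves: the kernel started the interpreter for us.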
If you've been paying close attention to the previous sections, you've got a pretty good introduction to what bash is, where and how it operates within the system and how you use it.
Time to start speaking "bash". We're going to get introduced to the bash shell language's grammar, and with that, this guide is going to start getting a bit more technical. Don't worry, focus and you won't get left behind. If you get a feeling of unease and uncertainty, re-read the section before moving on to avoid getting completely lost. We'll try and cover all the how's and why's of new concepts. If anything remains unclear, we encourage you to get in touch so that we can improve this guide for you and your fellow students. Our contact information is at the beginning of the guide.
The biggest difference with speaking to a computer as opposed to speaking to a human is that computer programs are generally terrible at placing your requests in context and figuring out what your intention is. Those that try are usually called "smart" for being able to take ambiguous input and going to lengths to figure out what the intended result was. In this context, "smart" is unfortunately not quite in line with what we'd expect from smart humans: the kinds of assumptions computer programs make based on our ambiguous input tend to be miles off and often lead to terrible or even disastrous results.
Sadly, we, as humans, are used to speaking in ambiguity: we rely on the receiver to understand the context of our requests and figure out what the most likely desirable action is. When we ask our partner to get the salt, we don't expect them to return with a handful of salt: we expect them to understand that our intention is to use the salt container to sprinkle some salt on our dish, and that we need them to bring the container to us, filled with at least a minimum amount of salt.
It is important to recognize the ambiguity in our language and requests before we start talking to computer programs, because we need to learn to get rid of that ambiguity in our language. If you have little experience with doing this, it will likely be your biggest challenge going forward. It takes practice to think in such literal terms. It helps to imagine we're talking to a three-year-old and showing them for the first time, each time, how to do the thing you need them to do. When "Bring the animals book" doesn't yet cut it, we need to teach them the steps: "Look around, do you see the books behind you?", "Great! Can you find the book with the lion and the cow on it?", "That's the one, grab it for me!", "Good boy, now bring it to daddy! Come here, you.", "Hi! Look at you. Give me the book and sit down, let's read it together." In a way, writing a bash script is similar to teaching your system how to do a task.
The difference is that your three-year-old will recognize previous experiences in new requests by himself; your system won't. You'll need to explicitly specify and run previously written job descriptions.
Some language interpreters (an interpreter is a program, like bash, that can understand a language) try to compensate for this problem by being extremely strict with their grammar and syntax. The idea is to weed out the ambiguity in your language so as to avoid accidentally doing the wrong thing. The interpreter enforces correctness to a certain degree: this tends to be a relatively successful strategy and generally results in the least buggy programs.
Sadly, bash is not a strict interpreter.
In fact, bash's latitude is largely at fault for the general ineptitude toward bash scripting among almost everyone introduced to the language, novice and professional alike. The result is not dissimilar to the state of the web around the turn of the century: many pages were written so badly that their ability to render properly on any kind of standards-compliant browser was sufficiently compromised to force these browsers to implement all sorts of "smart" hacks in an attempt to render the pages as they might have been intended to render, rather than what they were written to render. Similarly, the greater part of the scripts you are going to run into will be buggy. Sometimes subtly so, often to the point where simply using it with a file whose name is somewhat unexpected may cause irreversible damage to your system.
Don't be that person.
This guide exists to teach you to write good bash code. It will empower you to convey your true intentions and have a computer solve your problems. Since bash is a lax interpreter, the responsibility of discipline lies with you. If you're not up to honoring this prerequisite, I recommend you stop reading now and find a strict interpreter instead. There is too much bad bash code in the world, and this guide will not be responsible for empowering people to write more.
At the highest level, there are a few different kinds of commands. We'll explain each type, give a brief example and go more in depth on each command type in a later section. Don't worry too much about the syntax of these commands yet: that'll become clear when we focus on the different command types later on. What you should take away from this is a high-level understanding that bash commands come in different shapes and sizes, and a rough understanding of different syntaxes.
Simple commands are the most common kind of command. A simple command specifies the name of a command to execute, along with an optional set of arguments, environment variables and file descriptor redirections.
[ var=value ... ] name [ arg ... ] [ redirection ... ]
echo "Hello world."
IFS=, read -a fields < file
Before the command's name you can optionally put a few var assignments. These variable assignments apply to the environment of this one command only. We'll go more in depth on variables and environment later on.
The command's name is the first word (after the optional assignments). Bash finds the command with that name and starts it. We'll learn more about what kind of named commands there are and how bash finds them later on.
A command's name is optionally followed by a list of arg words, the command arguments. We'll soon learn what arguments and their syntax are.
Finally, a command can also have a set of redirection operations applied to it. If you recall our explanation of file descriptors in an earlier section, redirections are the operations that change what the file descriptor plugs point to. They change the streams that connect to our command processes. We'll learn about the power of redirections in a future section.
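As a hedged illustration of the parts of a simple command, the sketch below uses variable names of our own choosing and swaps the file from the example above for a here-string, so that it runs without any files on disk:

```bash
# Sketch: an assignment placed before a command name only affects that one command.
unset greeting
inside=$(greeting="hi" bash -c 'echo "$greeting"')   # the started command sees greeting=hi
after=${greeting-unset}                              # our own shell never got the variable

echo "inside the command: $inside"
echo "after the command:  $after"

# A redirection changes what a file descriptor points to, here standard input.
IFS=, read -r -a fields <<< "red,green,blue"
echo "second field: ${fields[1]}"
```

Here IFS=, is the assignment, read is the name, -r -a fields are the arguments and the here-string is the redirection.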
Bash comes with a lot of "syntax sugar" to make common tasks easier to perform than by using just the basic syntax. Pipelines are an example of sugar that you'll be using a lot. They are a convenient way of "connecting" two commands by way of linking the first process' standard output to the second process' standard input. This is the most common way for terminal commands to talk to one another and convey information.
[time [-p]] [ ! ] command [ [|||&] command2 ... ]
echo Hello | rev
! rm greeting.txt
We rarely use the time keyword, but it is convenient for finding out how long it takes to run our commands.
The ! keyword is a little odd at first, and just like the time keyword it doesn't have much to do with connecting commands. We'll learn about what it does when we discuss conditionals and testing the success of commands.
The first command and the second command2 can be any type of command from this section. Bash will create a subshell for each command and set up the first command's standard output file descriptor such that it points to the second command's standard input file descriptor. The two commands will run simultaneously and bash will wait for both of them to end. We'll explain what exactly these "subshells" are in a later chapter.
In between the two commands goes the | symbol. This is also called the "pipe" symbol, and it tells bash to connect the output of the first to the input of the second command. Alternatively, we can use the |& symbol in between the commands to indicate that we want not only the standard output of the first command, but also its standard error to be connected to the second command's input. This is usually undesirable since the standard error file descriptor is normally used to convey messages to the user. If we send those messages to the second command rather than the terminal display, we need to make sure the second command can handle receiving these messages.
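The difference between the two pipe symbols can be sketched as follows. The helper function noisy is hypothetical, and |& assumes bash 4 or newer:

```bash
# Sketch: "|" connects standard output only; "|&" also connects standard error.
noisy() {                        # hypothetical helper: writes one line to each stream
    echo "to stdout"
    echo "to stderr" >&2
}

only_stdout=$(noisy | wc -l)     # wc receives 1 line; "to stderr" still reaches the terminal
both=$(noisy |& wc -l)           # wc receives 2 lines

echo "lines through |:  $((only_stdout))"
echo "lines through |&: $((both))"

# "!" inverts the success of the pipeline that follows it.
if ! false; then inverted=yes; else inverted=no; fi
echo "inverted: $inverted"
```

The arithmetic expansion $(( )) is only used to strip the padding some wc implementations put around the number.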
A list is a sequence of other commands. In essence, a script is a command list: one command after another. Commands in lists are separated by a control operator which indicates to bash what to do when executing the command before it.
command control-operator [ command2 control-operator ... ]
cd music; mplayer *.mp3
rm hello.txt || echo "Couldn't delete hello.txt." >&2
The command can be any of the other types of commands from this section.
After the command comes the control operator which tells bash how the command should be executed. The simplest control operator is just starting a new line, which is equivalent to ; and tells bash to just run the command and wait for it to end before advancing to the next command in the list. The second example uses the || control operator which tells bash to run the command before it as it normally would, but after finishing that command move to the next command only if the command before it failed. If the command before it didn't fail, the || operator will make bash skip the command after it. This is useful for showing error messages when a command fails. We'll go more in depth on all the control operators in later sections.
Notice that since a script is essentially a list of commands on separate lines, it is effectively a command list that uses newlines as the control operators between all the commands.
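A small sketch of how control operators steer a list. It also uses &&, the counterpart of || that the text has not introduced yet, and the variable log is our own:

```bash
# Sketch: control operators decide what happens after each command in a list.
log=""
true  || log+="A"    # skipped: the command before || succeeded
false || log+="B"    # runs:    the command before || failed
true  && log+="C"    # runs:    && continues only after success
false && log+="D"    # skipped: the command before && failed
log+="E"; log+="F"   # ";" behaves like a newline: run one, wait, run the next

echo "$log"          # prints BCEF
```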
Compound commands are commands with special syntax inside them. They can do a lot of different things but behave as a single command in a command list. The most obvious example is a block of commands: The block itself behaves as a single big command but inside it are a bunch of "sub" commands. There are a lot of different kinds of compound commands and we will cover them all in-depth later.
if list [ ;|<newline> ] then list [ ;|<newline> ] fi
{ list ; }
if ! rm hello.txt; then echo "Couldn't delete hello.txt." >&2; exit 1; fi
rm hello.txt || { echo "Couldn't delete hello.txt." >&2; exit 1; }
Both examples perform the same operation. The first example is a compound command, the second is a compound command in a command list. We discussed the || operator briefly before: the command on the right side of it is skipped unless the command before it fails. This is a good example to illustrate an important property of compound commands: they behave as one command in a command list. The compound command in the second example begins at { and continues until the next }. As a result, everything inside the braces is considered a single command, meaning we have a command list of two commands: the rm command followed by the { ... } compound. If we were to forget the braces, we would get a command list of three commands: the rm command followed by the echo command, followed by the exit command. The difference is mainly important to the || operator in deciding what to do when the preceding rm command is successfully completed. If the rm succeeds, || will skip the command after it, which, if we leave out the braces, would be only the echo command. The braces combine the echo and exit commands into a single compound command, allowing || to skip both of them when rm succeeds.
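The effect of the braces can be sketched safely by replacing rm and exit with harmless commands. The two function names are hypothetical and exist only to capture the output:

```bash
# Sketch: braces decide how many commands "||" is allowed to skip.
without_braces() {
    # A list of three commands: only "echo first" belongs to the || operator.
    true || echo "first"; echo "second"
}
with_braces() {
    # A list of two commands: the whole { ... } group belongs to the || operator.
    true || { echo "first"; echo "second"; }
}

out_without=$(without_braces)   # "second": || skipped only the first echo
out_with=$(with_braces)         # empty:    || skipped the whole group
printf 'without braces: [%s]\n' "$out_without"
printf 'with braces:    [%s]\n' "$out_with"
```

Had the second command been exit 1 instead of echo, the version without braces would have ended the script even though the first command succeeded.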
A coprocess is some more bash syntax sugar: it allows you to easily run a command asynchronously (without making bash wait for it to end, also said to be "in the background") and also set up some new file descriptor plugs that connect directly to the new command's input and output. You won't be using coprocesses too often, but they're a nice convenience for those times you're doing advanced things.
coproc [ name ] command [ redirection ... ]
coproc auth { tail -n1 -f /var/log/auth.log; }
read latestAuth <&"${auth[0]}"
echo "Latest authentication attempt: $latestAuth"
The example starts an asynchronous tail command. While it runs in the background, the rest of the script continues. First the script reads a line of output from the coprocess called auth (which is the first line of the tail command output). Next, we write a message showing the latest authentication attempt we read from the coprocess. The script can continue and each time it reads from the coprocess pipe, it will get the next line from the tail command.
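Since /var/log/auth.log may not exist or be readable on your system, here is a self-contained sketch of the same idea. It assumes bash 4 or newer, and the coprocess name worker and its loop are our own invention:

```bash
# Sketch (assumes bash 4 or newer): talk to a background command through a coprocess.
# "worker" is a hypothetical name; its loop answers every line it receives.
coproc worker { while IFS= read -r line; do printf 'got: %s\n' "$line"; done; }
worker_pid=$worker_PID                   # remember the process ID for cleanup

printf '%s\n' "hello" >&"${worker[1]}"   # worker[1] is connected to the coprocess' input
IFS= read -r reply <&"${worker[0]}"      # worker[0] is connected to the coprocess' output
echo "$reply"

kill "$worker_pid"                       # stop the background loop
wait "$worker_pid" 2>/dev/null || true   # reap it; a non-zero status is expected after kill
```

The script keeps running while the coprocess waits in the background; it only pauses at the read, until a line of output is available.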
When you declare a function in bash, you're essentially creating a temporary new command which you can invoke later in the script. Functions are a great way to group a list of commands under a custom name for convenience when you repeat the same task more than once in your script.
name () compound-command [ redirection ]
exists() { [[ -x $(type -P "$1" 2>/dev/null) ]]; }
exists gpg || echo "Please install GPG." >&2
You begin by specifying a name for your function. This is the name of your new command: you'll be able to run it later on by writing a simple command with that name.
After the command name go the () parentheses. Some languages use these parentheses to declare the arguments the function accepts: bash does not. The parentheses should always be empty. They simply denote the fact that you're declaring a function.
Next comes the compound command that will be executed each time you run the function.
To change the file descriptors of the script for the duration of running the function, you can optionally specify the function's custom file redirections.
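A minimal sketch of a function that carries its own redirection. The function name warn and the message are assumptions chosen for illustration:

```bash
# Sketch: a function is a new command; its arguments arrive as $1, $2, ...
# The redirection after the closing brace applies every time the function runs.
warn() {
    printf 'warning: %s\n' "$1"
} >&2                                              # everything warn prints goes to standard error

on_stdout=$(warn "disk almost full" 2>/dev/null)   # nothing arrives on standard output
on_stderr=$(warn "disk almost full" 2>&1)          # the message was on standard error

printf 'stdout: [%s]\n' "$on_stdout"
printf 'stderr: [%s]\n' "$on_stderr"
```

Attaching the redirection to the definition means callers cannot forget it.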
Phew! That was a lot all at once. Most of this might have gone over your head - that's fine. We'll get back to the simple stuff and let you build up your knowledge with a thorough understanding. But it's important to take away that bash has different kinds of commands, and most of the syntax is actually pretty similar: Most commands take redirections, control operators and accept "subcommands" somehow. We'll explain these concepts, but first let's make sure we understand simple commands well.
It is vital that you understand simple commands well because they are the foundation of everything you will do in bash. You might have noticed in the previous section that all other bash commands are composed of at least one simple command: they merely take simple commands and perform special operations with them.
[ var=value ... ] name [ arg ... ] [ redirection ... ]
Let's have another look at the definition of a simple command. We're going to take it step by step, because although the definition seems short, there's a lot going on here.
We're first going to focus on the command's name. The name tells bash what the job is that you want this command to perform. To figure out what you want your command to do, bash performs a search to find out what to execute. In order, bash uses the name to try and find a function with that name, a builtin with that name, or a program with that name in one of the directories listed in PATH (which we'll explain in a moment).
If bash finds no way to execute a command by the name you gave it, your command will result in an error and bash will show an error message:
$ buy beer
bash: buy: command not found
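In a script, you would notice this failure through the command's exit status rather than by reading the message. A short sketch, where the command name is hypothetical and assumed not to exist on your system:

```bash
# Sketch: a name that bash cannot resolve makes the command fail with exit status 127.
# "no_such_command_xyzzy" is a made-up name assumed not to exist on your system.
no_such_command_xyzzy 2>/dev/null   # hide the "command not found" message
status=$?
echo "exit status: $status"
```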
I'll make only brief mention of aliases: before bash performs this search, it first checks if you've declared any aliases by the name of the command. If you did, it will replace the name by the value of the alias before proceeding. Aliases are only rarely useful, only work in interactive sessions and are almost completely superseded by functions. You should avoid using them in almost all cases.
PATH to a program
We have all sorts of programs installed on our computer. Different programs are installed in different places. Some programs shipped with our OS, others were added by our distribution and yet others were installed by us or our systems administrator. On a standard UNIX system, there are a few standardized locations for programs to go. Some programs will be installed in /bin, others in /usr/bin, yet others in /sbin and so on. It would be a real bother if we had to remember the exact location of our programs, especially since they may vary between systems. To the rescue comes the PATH environment variable. Your PATH variable contains a set of directories that should be searched for programs.
$ ping 127.0.0.1
PATH=/bin:/sbin:/usr/bin:/usr/sbin
      │    │
      │    ╰──▶ /sbin/ping ? found!
      ╰──▶ /bin/ping ? not found.
Bash honors this variable by looking through its listed directories whenever you try to start a program it doesn't yet know the location of. Say you're trying to start the ping program which is installed at /sbin/ping. If your PATH is set to /bin:/sbin:/usr/bin:/usr/sbin then bash will first try to start /bin/ping, which doesn't exist. Failing that, it will try /sbin/ping. It finds the ping program, records its location in case you need ping again in the future and goes ahead and runs the program for you.
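The search can be approximated in a few lines of bash. This is only a rough sketch of the idea, not how bash implements it internally, and find_in_path is a hypothetical helper name:

```bash
# Sketch: walk the PATH directories in order, roughly the way bash searches for a program.
find_in_path() {
    local name=$1 dir dirs
    IFS=: read -r -a dirs <<< "$PATH"          # split PATH on ":" into an array
    for dir in "${dirs[@]}"; do
        dir=${dir:-.}                          # an empty entry means the current directory
        if [[ -f $dir/$name && -x $dir/$name ]]; then
            printf '%s\n' "$dir/$name"         # the first match wins
            return 0
        fi
    done
    return 1                                   # not found in any directory
}

found=$(find_in_path sh) && echo "sh found at: $found"
find_in_path no_such_command_xyzzy || echo "no_such_command_xyzzy: not found"
```

Unlike bash, this sketch does not remember the location it found, so every call repeats the full search.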
If you're ever curious about exactly where bash finds the program to run for a command name, you can use the type built-in to find out:
$ type ping
ping is /sbin/ping
$ type -a echo              ← The -a switch tells type to show us all the possibilities
echo is a shell builtin     ← If we just run 'echo', bash will use the first possibility
echo is /bin/echo           ← We have an echo built-in but also a program called echo!
Remember from the previous section how bash has some functionality built into it? One of those is the functionality of the echo program. If you run the echo command in bash, even before bash tries a PATH search, it will notice there's a built-in by that name and use it. type is a great way to visualize this lookup process. Note that it's much faster to execute a command that's built-in than it is to start an extra program. But if you're ever in need of echo's functionality without being in bash, you'll be able to use the echo program instead.
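In a script, the -t and -P switches of type give the same information in a form that is easy to store in variables. A brief sketch, where the function greet is hypothetical:

```bash
# Sketch: ask bash what kind of thing each name resolves to.
greet() { echo "hi"; }              # made-up function for illustration

kind_function=$(type -t greet)      # "function"
kind_builtin=$(type -t cd)          # "builtin"
kind_echo=$(type -t echo)           # "builtin": the built-in wins over any echo program
program_path=$(type -P sh)          # -P forces a PATH search and prints the file it finds

printf '%s\n' "greet: $kind_function" "cd: $kind_builtin" "echo: $kind_echo" "sh: $program_path"
```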
Sometimes you'll need to run a program that isn't installed in any of the PATH directories. In that case, you'll have to manually specify the path to where bash can find the program, rather than just its name:
$ /sbin/ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.075 ms
--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.075/0.075/0.075/0.000 ms
$ ./hello.txt       ← Remember our hello.txt script?
Your name?          ← We use the path "." which means "our current directory"
You can add more directories to your PATH. A common practice is to have a /usr/local/bin and a ~/bin (where ~ represents your user's home directory). Remember that PATH is an environment variable: you can update it like this:
$ PATH=~/bin:/usr/local/bin:/bin:/usr/bin
$
This will change the variable in your current bash shell. As soon as you close the shell, the change will be lost, though. We'll go more in-depth on how environment variables work and how you should configure them in a later section.
Start the ls program.
$ ls
Find out where bash locates the ls program.
$ type ls           ← Both 'type' and 'command' are acceptable here. The 'which' program is not.
ls is /bin/ls
$ echo "$PATH"
/bin:/sbin:/usr/bin:/usr/sbin
$ ex                ← You can substitute your favorite editor here.
: i
#!/usr/bin/env bash
echo "Hello world."
.
: w myscript
"myscript" [New] 2L, 40C written
: q
$ chmod +x myscript
$ PATH=$PATH:~
$ myscript
Hello world.
[ var=value ... ] name [ arg ... ] [ redirection ... ]
Now that you understand how bash finds and runs your command, let's learn how to pass our instructions to those commands. These instructions tell our command what exactly it needs to do. We might run the rm command to delete a file or the cp command to copy a file; we might run the echo command to output a string or the read command to read a line of text. But these commands generally can't do much without more details, more context. We need to tell rm what file to delete, cp what file to copy and where to put the copy. echo wants to know what you want it to output and read can be told where to put the line of text it's read. We provide this kind of context using arguments.
As you can see from the command syntax, arguments come after the command's name. They are words separated by blank space. When we say words in the context of bash, we do NOT mean linguistic words. In bash, a word is defined as a sequence of characters considered as a single unit by the shell. A word is also known as a token. A bash word can contain many linguistic words, in fact it could contain prose. For sake of clarity, the rest of this guide will use the term arguments wherever applicable to avoid the ambiguity of the term words. What's important is that the word or argument is a single unit to the shell: it could be a filename, a variable name, the name of a program or the name of a person:
$ rm hello.txt
$ mplayer '05 Between Angels and Insects.ogg' '07 Wake Up.ogg'
In the above examples, the words are highlighted. Notice how they aren't linguistic words, but meaningful units. In this case, they all refer to file names. To separate multiple arguments we use blank space. That can be spaces, tabs, or a mix of both. Usually you will use a single space between arguments.
A problem now arises: we have a space after 05, separating it from Between. How should the shell know that you mean for your filename to be 05 Between Angels and Insects.ogg and not 05? How do we tell the shell that the blank space after 05 is literal and not intended as syntax for "split the word now"? Our intention is for the whole file name to remain "together". That is: the blank spaces in it should not split it into separate arguments. What we need is a way to tell the shell that it should treat something literally; meaning, use it as-is, ignoring any syntactical meaning. If we can make the spaces literal, they will no longer tell bash to split the 05 from the Between, and bash will use it as a normal ordinary space character.
There are two ways in bash to make characters literal: quoting and escaping. Quoting is the practice of wrapping " or ' characters around the text that we want to make literal. Escaping is the practice of placing a single \ character in front of the character that we want to make literal. The example above uses quotes to make the entire filename literal, but not the space in between the filenames. We strongly recommend you use quotes over escaping, since it results in much clearer and more readable code. More importantly: escaping makes it much more difficult to tell exactly which parts of your code are literal and which aren't. It also becomes more precarious to edit the literal text later on without introducing mistakes. Using escaping rather than quoting, our example would look like this:
$ mplayer 05\ Between\ Angels\ and\ Insects.ogg 07\ Wake\ Up.ogg
Quoting is one of the most important skills you will need to master as a bash user. Its importance cannot be overstated. The nice thing about quotes is that while they are sometimes unnecessary, it is rarely ever wrong to quote your data. These are all perfectly valid:
$ ls -l hello.txt
-rw-r--r-- 1 lhunath staff 131 29 Apr 17:07 hello.txt
$ ls -l 'hello.txt'
-rw-r--r-- 1 lhunath staff 131 29 Apr 17:07 hello.txt
$ ls -l '05 Between Angels and Insects.ogg' '07 Wake Up.ogg'
So, if in doubt: quote your data. And never remove quotes in an attempt to make something work.
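To see what the shell actually does with each form, it can help to count the arguments a command receives. The following is a minimal sketch; count_args is a hypothetical helper introduced only for this illustration, not a standard command.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

# Quoting and escaping both keep each file name together as one argument.
quoted=$(count_args '05 Between Angels and Insects.ogg' '07 Wake Up.ogg')
escaped=$(count_args 05\ Between\ Angels\ and\ Insects.ogg 07\ Wake\ Up.ogg)

# Without either, every blank space splits the text into a new argument.
unquoted=$(count_args 05 Between Angels and Insects.ogg 07 Wake Up.ogg)

echo "quoted:   $quoted arguments"
echo "escaped:  $escaped arguments"
echo "unquoted: $unquoted arguments"
```

The quoted and escaped forms both report 2 arguments, while the unquoted form is split into 8.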
You should use "double quotes" for any argument that contains expansions (such as $variable or $(command) expansions) and 'single quotes' for any other arguments. Single quotes make sure that everything in the quotes remains literal, while double quotes still allow some bash syntax such as expansions:
echo "Good morning, $USER."    # Double quotes allow bash to expand $USER
echo 'You have won SECOND PRIZE in a beauty contest.' \
     'Collect $10'    # Single quotes prevent even the $-syntax from triggering expansion
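As a small sketch of the difference, assuming a variable named prize that we invent just for this example:

```shell
prize='SECOND PRIZE'              # hypothetical variable, used only in this sketch

double="You have won $prize"      # double quotes: $prize is expanded
single='Collect $10'              # single quotes: $10 is kept exactly as typed

echo "$double"
echo "$single"
```

Only the double-quoted string has its $-syntax replaced; the single-quoted string reaches echo unchanged.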
Don't be caught off-guard! This is definitely not correct:
$ ls -l 05 Between Angels and Insects.ogg
ls: 05: No such file or directory
ls: Angels: No such file or directory
ls: Between: No such file or directory
ls: Insects.ogg: No such file or directory
ls: and: No such file or directory
Your shell will not mark the boundaries between arguments for you. Try to make a habit of mentally picturing them so that you keep yourself from making mistakes. You definitely won't be the first to have accidentally destroyed all the files in their home directory as a result of a stray or unquoted space character.
You'll find it good practice to develop a sense of pragmatism with regards to quoting: for any glance upon a block of bash code, unquoted arguments should immediately jump out at you, and you should feel a compulsion to resolve these before you can allow yourself to proceed with anything else. Quoting issues are at the core of at least nine out of ten bash problems, and the vast majority of causes for issues people seek help with. Seeing as quoting is actually very easy, a disciplined quoter has that much less to worry about.
[ var=value ... ] name [ arg ... ] [ redirection ... ]
We've already been briefly introduced to the concept of file descriptors and how they can be used to connect processes to each other. Let's find out how that's done in bash.
Recall that processes use file descriptors to connect to streams. Each process will generally have three standard file descriptors: standard input (FD 0), standard output (FD 1) and standard error (FD 2). When bash starts a program, it sets up a set of file descriptors for that program first. It does this by looking at its own file descriptors and setting up an identical set of descriptors for the new process: we say new processes "inherit" bash's file descriptors. When you open your terminal to a new bash shell, the terminal will have set bash up by connecting bash's input and output to the terminal. This is how the characters from your keyboard end up in bash and bash's messages end up in your terminal window. Each time bash starts a program of its own, it gives that program a set of file descriptors that match its own. This way, a bash command's messages end up on your terminal as well and your keyboard input ends up with the program (the command's output and input is connected to your terminal):
             ╭──────────╮
Keyboard ╾──╼┥0  bash  1┝╾─┬─╼ Display
             │         2┝╾─┘
             ╰──────────╯
$ ls -l a b    # Imagine we have a file called "a", but not a file called "b".
ls: b: No such file or directory    # Error messages are emitted on FD 2
-rw-r--r-- 1 lhunath staff 0 30 Apr 14:43 a    # Results are emitted on FD 1
             ╭──────────╮
Keyboard ╾┬─╼┥0  bash  1┝╾─┬─╼ Display
          │  │         2┝╾─┤
          │  ╰─────┬────╯  │
          │        ╎       │
          │  ╭─────┴────╮  │
          └─╼┥0  ls    1┝╾─┤
             │         2┝╾─┘
             ╰──────────╯
When bash starts an ls process, it first looks at its own file descriptors. It then creates file descriptors for the ls process, connected to the same streams as its own: FD 1 and FD 2 leading to the Display, FD 0 coming from the Keyboard. As a result, the error message from ls (emitted on FD 2) and its regular output (emitted on FD 1) both end up on your terminal display.
If we want to gain control over where our commands connect to, we need to employ redirection: it is the practice of changing the source or destination of a file descriptor. One thing we could do with redirection is write the result of ls to a file instead of to the terminal display:
             ╭──────────╮
Keyboard ╾──╼┥0  bash  1┝╾─┬─╼ Display
             │         2┝╾─┘
             ╰──────────╯
$ ls -l a b >myfiles.ls    # We redirect FD 1 to the file "myfiles.ls"
ls: b: No such file or directory    # Error messages are emitted on FD 2
             ╭──────────╮
Keyboard ╾┬─╼┥0  bash  1┝╾─┬─╼ Display
          │  │         2┝╾─┤
          │  ╰─────┬────╯  │
          │        ╎       │
          │  ╭─────┴────╮  │
          └─╼┥0  ls    1┝╾─╌─╼ myfiles.ls
             │         2┝╾─┘
             ╰──────────╯
$ cat myfiles.ls    # The cat command shows us the contents of a file
-rw-r--r-- 1 lhunath staff 0 30 Apr 14:43 a    # The result is now in myfiles.ls
You've just performed file redirection by redirecting the command's standard output to a file. Redirecting standard output is done using the > operator. Envision it as an arrow sending output from the command to the file. This is by far the most common and useful form of redirection.
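Here is a minimal sketch you can run without touching any of your own files. The scratch directory and the name greeting.txt are assumptions made up for this example.

```shell
workdir=$(mktemp -d)                      # scratch directory, so no real files are touched

echo 'Hello' >"$workdir/greeting.txt"     # FD 1 points at the file; nothing reaches the terminal
saved=$(cat "$workdir/greeting.txt")      # read the file back to see what was written

echo "The file contains: $saved"

rm -r "$workdir"                          # clean up the scratch directory
```

The first echo prints nothing on the terminal, because its standard output was sent to the file instead.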
Another common thing redirection is used for is hiding error messages. You'll notice that our redirected ls command is still displaying an error message. Usually this is a good thing. Sometimes, though, we might find that error messages produced by some commands in our scripts are unimportant to the user and should be hidden. To do this, we can use file redirection again, in much the same way that redirecting standard output made the result of ls disappear from the terminal:
             ╭──────────╮
Keyboard ╾──╼┥0  bash  1┝╾─┬─╼ Display
             │         2┝╾─┘
             ╰──────────╯
$ ls -l a b >myfiles.ls 2>/dev/null    # We redirect FD 1 to the file "myfiles.ls" and FD 2 to the file "/dev/null"
             ╭──────────╮
Keyboard ╾┬─╼┥0  bash  1┝╾─┬─╼ Display
          │  │         2┝╾─┘
          │  ╰─────┬────╯
          │        ╎
          │  ╭─────┴────╮
          └─╼┥0  ls    1┝╾───╼ myfiles.ls
             │         2┝╾───╼ /dev/null
             ╰──────────╯
$ cat myfiles.ls    # The cat command shows us the contents of a file
-rw-r--r-- 1 lhunath staff 0 30 Apr 14:43 a    # The result is now in myfiles.ls
$ cat /dev/null    # The /dev/null file is empty?
$
Notice how you can redirect any FD by prefixing the > operator with the number of the FD. We used 2> to redirect FD 2 to /dev/null while > still redirects FD 1 to myfiles.ls. If you omit the number, output redirections default to redirecting FD 1 (standard output).
Our ls command no longer showed an error message and the results were properly stored in myfiles.ls. Where has the error message gone? We've written it to the file /dev/null. But when we show the contents of that file, we don't see our error message. Did something go wrong?
The clue for this little mystery is in the directory name. The file null is in the /dev directory: this is a special directory for device files. Device files are special files that represent devices in our system. When we write to or read from them, we're communicating directly with those devices through the kernel. The null device is a special device that is always empty. Anything you write to it will be lost and nothing can be read from it. That makes it a very useful device for discarding information. We stream our unwanted error message to the null device and it disappears.
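The behaviour of the null device can be checked with a short sketch. The noisy function below is a hypothetical stand-in for any command that writes to both standard output and standard error.

```shell
# noisy is a hypothetical command that writes one line to each output stream.
noisy() {
    echo 'result'
    echo 'error message' >&2
}

visible=$(noisy 2>/dev/null)     # FD 2 is discarded; only FD 1 is captured
leftover=$(cat /dev/null)        # reading the null device yields nothing

echo "captured from FD 1: $visible"
echo "read back from /dev/null: '$leftover'"
```

Only the regular result survives; whatever was written to /dev/null cannot be read back.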
What if we wanted to save all the output that would normally appear on the terminal to our myfiles.ls file; both the results and error messages? Intuition might suggest:
$ ls -l a b >myfiles.ls 2>myfiles.ls    # Redirect both file descriptors to myfiles.ls?
             ╭──────────╮
Keyboard ╾┬─╼┥0  bash  1┝╾─┬─╼ Display
          │  │         2┝╾─┘
          │  ╰─────┬────╯
          │        ╎
          │  ╭─────┴────╮
          └─╼┥0  ls    1┝╾───╼ myfiles.ls
             │         2┝╾───╼ myfiles.ls
             ╰──────────╯
$ cat myfiles.ls    # Contents may be garbled depending on how streams were flushed
-rw-r--r-- 1 lhunath stls: b: No such file or directoryaff 0 30 Apr 14:43 a
But you'd be wrong! Why is this not correct? Upon inspection of myfiles.ls it may appear as though things worked out, but there is actually something very dangerous going on here. If you're lucky, you'll see that the contents of the file aren't exactly what you might expect: they might be a little garbled, out of order, or they might even be just right. The problem is, you can't predict, let alone guarantee, the result of this command.
What's going on here? The problem is that both file descriptors now have their own stream to the file. This is problematic because of the way streams work internally, a topic which is out-of-scope for this guide, but suffice it to say that when both streams are merged into the file, the results are an arbitrary mix-together of the streams.
To solve this problem, you need to send both your output and error bytes on the same stream. And to do that, you're going to need to know how to duplicate file descriptors:
$ ls -l a b >myfiles.ls 2>&1    # Make FD 2 write to where FD 1 is writing
             ╭──────────╮
Keyboard ╾┬─╼┥0  bash  1┝╾─┬─╼ Display
          │  │         2┝╾─┘
          │  ╰─────┬────╯
          │        ╎
          │  ╭─────┴────╮
          └─╼┥0  ls    1┝╾─┬─╼ myfiles.ls
             │         2┝╾─┘
             ╰──────────╯
$ cat myfiles.ls
ls: b: No such file or directory
-rw-r--r-- 1 lhunath staff 0 30 Apr 14:43 a
Duplicating file descriptors, otherwise referred to as "copying" file descriptors, is the act of copying one file descriptor's stream connection to another file descriptor. As a result, both file descriptors are connected to the same stream. We use the >& operator, prefixing it with the file descriptor we want to change and following it with the file descriptor whose stream we need to "copy". You will use this operator fairly frequently, and in most cases it'll be to copy FD 1 to FD 2 as is done above. You can translate the syntax 2>&1 as: Make FD 2 write (>) to where FD (&) 1 is currently writing.
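The difference between two separate streams and one shared stream can be observed with a small sketch. The both function and the log file names are hypothetical, chosen only for this illustration.

```shell
workdir=$(mktemp -d)

# both is a hypothetical stand-in for a command that writes to FD 1 and FD 2.
both() {
    echo 'AAAA'
    echo 'BB' >&2
}

both >"$workdir/separate.log" 2>"$workdir/separate.log"   # two independent streams to one file
both >"$workdir/shared.log" 2>&1                          # FD 2 shares FD 1's stream

separate=$(cat "$workdir/separate.log")
shared=$(cat "$workdir/shared.log")

echo 'separate streams:'; echo "$separate"
echo 'shared stream:';    echo "$shared"

rm -r "$workdir"
```

With the shared stream, both lines arrive intact and in order. With separate streams, each stream keeps its own position in the file, so on a typical system one message overwrites part of the other; do not rely on any particular outcome there.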
We've seen quite a few redirection operations now, and we've even combined them. Before you go wild, there is one more important rule you need to understand: redirections are evaluated from left to right, conveniently the same way as we read them. This might seem obvious, but neglect for this has caused many of your predecessors to make this mistake:
$ ls -l a b 2>&1 >myfiles.ls    # Make FD 2 go to FD 1 and FD 1 go to myfiles.ls?
Somebody who writes this code might assume that since FD 2's output is going to FD 1, and FD 1 is going to myfiles.ls, errors should end up in the file. The logical error in their reasoning is the assumption that 2>&1 sends FD 2's output to FD 1. It does NOT. It sends FD 2's output to the stream FD 1 is connected to, which at the time is probably the terminal and not the file, because FD 1 hasn't been redirected yet. The result of the above command might frustrate, because it will appear as though the redirection of standard error isn't taking effect, when in reality, you've merely redirected standard error to the terminal (standard output's target), which is where it was already pointing before.
If we fix the order of the redirections:
$ ls -l a b >myfiles.ls 2>&1    # Make FD 1 target myfiles.ls and FD 2 target FD 1's target
We now first change FD 1's target to stream to myfiles.ls. Then, we make FD 2 target the same stream FD 1 is currently using, which is the new stream to myfiles.ls. Both file descriptors are now targeting myfiles.ls and any output written by ls on either FD 1 or FD 2 will end up in the file.
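The effect of the ordering can be made visible with a sketch that captures whatever would have reached the terminal. The both function and the file names are hypothetical; inside $(...), FD 1 is a pipe back to bash, which here plays the role of the terminal.

```shell
workdir=$(mktemp -d)

# both is a hypothetical stand-in for a command that writes to FD 1 and FD 2.
both() {
    echo 'result'
    echo 'error' >&2
}

# Wrong order: FD 2 copies FD 1's current target (the "terminal"), then FD 1 moves to the file.
leaked_wrong=$(both 2>&1 >"$workdir/wrong.log")

# Right order: FD 1 moves to the file first, then FD 2 copies that new target.
leaked_right=$(both >"$workdir/right.log" 2>&1)

wrong_log=$(cat "$workdir/wrong.log")
right_log=$(cat "$workdir/right.log")

echo "wrong order leaked: '$leaked_wrong', file holds: '$wrong_log'"
echo "right order leaked: '$leaked_right'"
echo 'right order file holds:'; echo "$right_log"

rm -r "$workdir"
```

In the wrong order the error message escapes to the terminal and only the result lands in the file; in the right order nothing escapes and the file holds both lines.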
There are quite a few other redirection operators, but they aren't all as useful as the ones you've just learned. What has certainly proven useful is for people to learn to read their command redirections as plain English. I'm going to enumerate bash's redirection operators now, along with a short description and a sentence you can use to translate the operation into plain English.
[x]>file, [x]<file
echo Hello >~/world
rm file 2>/dev/null
read line <file
Make FD x write to / read from file.
A stream to file is opened for either writing or reading and connected to file descriptor x. When x is omitted, it defaults to FD 1 (standard output) when writing and FD 0 (standard input) when reading.
[x]>&y, [x]<&y
ping 127.0.0.1 >results 2>&1
exec 3>&1 >mylog; echo moo; exec 1>&3 3>&-
Make FD x write to / read from FD y's stream.
The connection to the stream used by FD y is copied to FD x. The second example is quite advanced: to understand it you need to know that exec can be used to change the file descriptors of bash itself (rather than those of a new command) and if you use an x that doesn't yet exist, bash will create a new file descriptor ("plug") for you with that number.
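The exec example can be walked through step by step. This is a sketch under the assumption that FD 3 is not already in use by your shell; the scratch directory is made up for the example.

```shell
workdir=$(mktemp -d)

exec 3>&1 >"$workdir/mylog"   # FD 3 keeps a copy of the original FD 1; FD 1 now goes to mylog
echo moo                      # written to mylog, not to the terminal
exec 1>&3 3>&-                # point FD 1 back at the original stream, then close FD 3

logged=$(cat "$workdir/mylog")
echo "mylog contains: $logged"   # this line reaches the terminal again

rm -r "$workdir"
```

FD 3 acts as a temporary parking spot for the original standard output, so that it can be restored once the logging is done.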
[x]>>file
echo Hello >~/world
echo World >>~/world
Make FD x append to the end of file.
A stream to file is opened for writing in append mode and is connected to file descriptor x. The regular file redirection operator > empties the file's contents when it opens the file so that only your bytes will be in the file. In append mode (>>), the file's existing contents are left in place and your stream's bytes are added to the end of them.
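A short sketch contrasting the two modes, using a scratch file instead of ~/world so nothing in your home directory is changed:

```shell
workdir=$(mktemp -d)

echo Hello >"$workdir/world"      # truncate mode: the file now holds only "Hello"
echo World >>"$workdir/world"     # append mode: "World" is added after "Hello"
appended=$(cat "$workdir/world")

echo Again >"$workdir/world"      # truncate mode again: earlier contents are emptied first
truncated=$(cat "$workdir/world")

echo 'after appending:';  echo "$appended"
echo 'after truncating:'; echo "$truncated"

rm -r "$workdir"
```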
&>file
ping 127.0.0.1 &>results
Make both FD 1 (standard output) and FD 2 (standard error) write to file.
This is a convenience operator which does the same thing as >file 2>&1 but is more concise. Again, you can append rather than truncate by doubling the arrow: &>>file
<<[-]delimiter
here-document
delimiter
cat <<.
Hello world.
Since I started learning bash, you suddenly seem so much bigger than you were before.
.
Make FD 0 (standard input) read from the string between the delimiters. In this example we choose . as the end delimiter; the line that holds only our previously chosen . ends the here-document.
Here documents are a great way to feed large blocks of text to a command's input. They begin on the line after your delimiter and end when bash encounters a line with just your delimiter on it. It is important to remember that your terminating delimiter cannot be indented, because then it is no longer just your delimiter on that line.
If you prefix your initial delimiter with -, this will tell bash to ignore any tabs you put in front of your heredoc. That way, you can indent the heredoc without the indenting showing in your input string. It also allows you to indent the terminating delimiter with tabs.
Finally, it is possible to put variable expansions within the here document's string. This allows you to inject variable data into the here document. We'll learn more about variables and expansions later on, but suffice it to say that if expansion is not desired, you need to put quotes around the initial declaration of your 'delimiter'.
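The effect of quoting the delimiter can be sketched as follows. The variable name and the END delimiter are arbitrary choices made for this example.

```shell
name='world'     # hypothetical variable, used only in this sketch

# Unquoted delimiter: expansions inside the here document are performed.
expanded=$(cat <<END
Hello $name.
END
)

# Quoted delimiter: the here document is passed along literally.
literal=$(cat <<'END'
Hello $name.
END
)

echo "$expanded"
echo "$literal"
```

Only the initial declaration of the delimiter is quoted; the terminating line stays a bare END in both cases.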
<<<string
cat <<<"Hello world. Since I started learning bash, you suddenly seem so much bigger than you were before."
Make FD 0 (standard input) read from the string.
Here strings are very similar to here documents but more concise. They are generally preferred over here documents.
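Here strings are handy for feeding a short value to a command that reads standard input. As a minimal sketch, the read builtin below splits a here string into two variables; the variable names are arbitrary.

```shell
# read takes the first word into "first" and the remainder of the line into "rest".
read -r first rest <<<'Hello world. Since I started learning bash.'

echo "first word: $first"
echo "remainder:  $rest"
```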
x>&-, x<&-
exec 3>&1 >mylog; echo moo; exec 1>&3 3>&-
Close FD x.
The stream is disconnected from file descriptor x and the file descriptor is removed from the process. It cannot be used again until it is recreated. When x is omitted, >&- defaults to closing standard output and <&- defaults to closing standard input. You will rarely use this operator.
[x]>&y-, [x]<&y-
exec 3>&1- >mylog; echo moo; exec >&3-
Replace FD x with FD y.
The file descriptor at y is copied to x and y is closed. Effectively, it replaces x with y. It is a convenience operator for [x]>&y y>&-. Again, you will rarely use this operator.
[x]<>file
exec 5<>/dev/tcp/ifconfig.me/80
echo "GET /ip HTTP/1.1
Host: ifconfig.me
" >&5
cat <&5
Open FD x for both reading and writing to file.
The file descriptor at x is opened with a stream to the file that can be used for writing as well as reading bytes. Usually you'll use two file descriptors for this. One of the rare cases where this is useful is when setting up a stream with a read/write device such as a network socket. The example above writes a few lines of HTTP to the ifconfig.me host at port 80 (the standard HTTP port) and subsequently reads the bytes coming back from the network, both using the same file descriptor 5 set up for this by exec.
As a final note about redirections, I'd like to point out that for simple commands the redirection operators can appear anywhere in the simple command. That is, they don't need to appear at the end of it. While it is a good idea to keep them at the end of your commands, if only for consistency and to avoid surprises or overlooking the operator in long commands, there are cases where some people make a habit of placing the redirection operator elsewhere. In particular, placing the redirection operator after the echo or printf command name is often done, especially when there is a sequence of them, in the interest of readability:
echo >&2 "Usage: exists name"
echo >&2 " Check to see if the program 'name' is installed."
echo >&2
echo >&2 "RETURN"
echo >&2 " Success if the program exists in the user's PATH and is executable. Failure otherwise."
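To confirm that the position of the operator does not change the result, here is a sketch with the redirection placed in the middle, at the front, and at the end of a simple command. The usage function and its messages are hypothetical.

```shell
# usage is a hypothetical help function for a script called "exists".
usage() {
    echo >&2 "Usage: exists name"                                   # operator after the command name
    >&2 echo "    Check to see if the program 'name' is installed." # operator before the command name
    echo "    Every line here goes to standard error." >&2          # operator at the end
}

# 2>&1 points FD 2 at the capture pipe, then FD 1 is discarded, so only FD 2 is captured.
captured=$(usage 2>&1 >/dev/null)

# With FD 2 discarded instead, nothing is left on FD 1.
plain=$(usage 2>/dev/null)

echo 'captured from FD 2:'; echo "$captured"
echo "captured from FD 1: '$plain'"
```

All three lines arrive on standard error regardless of where the operator sits in the command.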
$ ls /bin/bash
/bin/bash*
$ ls /bob/bash
ls: /bob/bash: No such file or directory
$ ls /bin/bash /bob/bash
ls: /bob/bash: No such file or directory
/bin/bash*
Redirect the standard error of the last command to the file errors.log. Then show the contents of errors.log on the terminal.
$ ls /bin/bash /bob/bash 2>errors.log
/bin/bash*
$ cat errors.log
ls: /bob/bash: No such file or directory
Append the standard output and standard error of the last command to the file errors.log. Then show the contents of errors.log on the terminal again.
$ ls /bin/bash /bob/bash >>errors.log 2>&1
$ cat errors.log
ls: /bob/bash: No such file or directory
ls: /bob/bash: No such file or directory
/bin/bash*
$ cat <<< 'Hello world.'
Hello world.
Fix this command so that the message is properly saved into the log file and such that FD 3 is properly closed afterwards: exec 3>&2 2>log; echo 'Hello!'; exec 2>&3
$ exec 3>&1 >log; echo 'Hello!'; exec 1>&3 3>&-