UNIX 101: A Hands-on How-to(SM)

from Brass Cannon Consulting

A little vague handwaving can often save hours of tedious explanation.
So you've got Linux, or FreeBSD -- or MacOS X -- and you're wondering how to talk to it. You've come to the right place.
In the case of Linux, if you chose to install X (also known as X11 or "X Windows") you probably have a console window open by default. If you didn't install X, your entire screen is a console.
On a Mac running OS X, you can open a Terminal window (Applications, scroll down, down, down... it's there, somewhere... aha, Utilities. Terminal lives in there.); or, to get that real nitty-gritty kernel hacker feeling, you can also get a command-line only "console" environment with just a little more effort. According to xfree86.org:
To get to the text console in Mac OS X you need to logout and type >console as the user name. This will shutdown Core Graphics and bring up the console login prompt. Login again as your user.
(I just tried this on Tiger. Had to reset login options to accept username and password rather than presenting the list of known users, but it worked like a charm! Type the "short name" and password for your account.)
Ah, peace at last. Now it's just you and a blinking cursor. No wallpaper, no stupid, cryptic icons... this is how computers are meant to be used.
Unix (or Linux) is very powerful, and very flexible. The best book of tips and tricks I've seen to date is the size of a phone book, so obviously we won't be getting into that kind of detail... but we can provide a little framework so that your other reference material will make sense to you, perhaps a little sooner than it would have otherwise.
The concept of "logging in" should, I hope, make some sense; you have to give your "user name" and a password to connect to your account. That brings up the first and most important difference between Unix and the "PC" operating system you may have used:
- UNIX IS A MULTI-USER SYSTEM
This has deeper significance than merely being asked to log in. Windows 98 might ask you to log in, but it doesn't really mean it. (If you don't believe me, next time you start up Win98 press the "Esc" key when it asks for your password.) Windows NT and 2000 are much more like Unix -- if you don't log in, you don't use the machine.
But why do you have to log in? Simply put, it's to protect you from the consequences of your own mistakes. In DOS you are the only user. You can say a file is "read-only," but you can always change your mind and undo that protection. (And so can any random virus...) In Unix, there is the concept of an administrator, the so-called "superuser" or "root," who can protect you from shooting yourself in the foot. Root decides WHO can read or write any file. That's why the system needs to know who you are, so it knows whether or not you are "root".
Here's another difference that isn't so surprising, but takes a little getting used to:
- UNIX IS CASE-SENSITIVE
Kevin is not the same as KEVIN or kevin or kEvin... get the picture? Maybe it took you a day or two to get used to the idea that your computer (actually the MS-DOS inside it) treats them all the same way. To me, it seemed natural that Kevin and KEVIN would be two different words. Well, in Unix, they are different.
A "shell script" is a text file that contains the same kind of Unix commands that you would key in at the command line. (If you know what a .BAT file is in DOS, you've got the idea.) It is interpreted by a command shell. In the DOS world, the shell is the program COMMAND.COM which is written to every bootable floppy. If you are a real old-time hacker, you may have encountered 4DOS, a third-party replacement shell for DOS. In the Unix world, you have even more choices, but they fall into two major families: the Bourne shell family and the C shell family. The default shell on most Linux distributions is the Bourne-Again Shell, or bash.
There are enough differences between shells that you need some way to make sure you're using the one you expect, especially if you are writing a script that may do a lot of important stuff. The way to do that is to start your shell script with a "hash bang" (#!) line like this one:
#!/bin/sh
This says that this script is to be run by the Bourne shell that resides in a specific directory (/bin in this case). This ensures the script will run as expected, even if your default shell is the C shell or even something else. We'll get to directories in just a minute.
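As a minimal sketch of what such a script might look like (the variable name and the message are made up for illustration), the hash-bang goes on the very first line and ordinary commands follow it:

```shell
#!/bin/sh
# A hypothetical first script. Everything below is run by /bin/sh,
# whatever your login shell happens to be.

greeting="Hello from the Bourne shell"
echo "$greeting"
```

The # at the start of a line marks a comment; only the #! on line one gets special treatment.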
In Unix, a file and a device are treated the same way -- they are both viewed as "a stream of bytes." Commands will accept input from EACH OTHER as easily as from your keyboard.
In order to get maximum flexibility, Unix commands try very hard to present "standard output" that can be piped into other commands as "standard input." You usually will not see headings on standard output, or any acknowledgement from a successful command, because both of these would interfere with this "pipeline" concept.
For instance, to display a text file, we would use the "cat" command:
$ cat file
To append "myfile" to the end of "anotherfile":
$ cat myfile >> anotherfile
We'll come back to this example in a moment.
We can create a new text file called myfile by typing it in:
$ cat > myfile
This is a line of text I want to go into myfile.
^D
Try this for yourself. See how the terminal waits for you to enter text? Of course, you need to be able to tell it when you're done. The Control-D character (^D) terminates an "input stream." (If you terminate your input stream at the command prompt, the default behavior of most Unix terminals is to log you off.)
Because 'cat > myfile' creates a new file, it replaces (destroys!) any old file by that name.
To append to a file, you can enter:
$ cat >> myfile
This is a line of text I want to ADD to myfile.
^D
Note the DOUBLE >> redirection symbol. Compare this to the example above, where we appended myfile to anotherfile. In this case, your keyboard entry takes the place of "myfile".
The moral is: Unix doesn't care where the stream of bytes comes from.
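The same redirections work when the text comes from somewhere other than the keyboard. In this sketch a "here-document" (the lines between << EOF and EOF) stands in for what you would type before pressing ^D. The scratch directory made by mktemp is an assumption of the example, used so that no real file gets clobbered.

```shell
#!/bin/sh
# Work in a throwaway directory (mktemp -d is assumed to be available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# A single > creates myfile, replacing any old file by that name.
cat > myfile << EOF
This is a line of text I want to go into myfile.
EOF

# A double >> adds to the end of myfile instead of replacing it.
cat >> myfile << EOF
This is a line of text I want to ADD to myfile.
EOF

# Append myfile to anotherfile (created here, since it does not exist yet).
cat myfile >> anotherfile

# Display the result: both lines, in the order they were entered.
cat anotherfile
```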
We've talked about files; where do those files live? On disks, in directories.
First, you are no longer limited to 26 "drive letters," A: through Z:. A Unix machine is built around a filesystem, which starts at a "root" from which it grows and branches like a tree or a corporate organization chart. (We tend to think of it as an upside-down tree, though. Perhaps a family tree is a better example, with children descending from parents.)
Animals
    Wild
        Birds
    Farm
        Cows
        Horses
    Pets
        Dogs
            Spot
            Rover
        Cats
            Fluffy
In this example, any of the directories (Pets, Dogs, Cats) could be a completely separate physical disk, but mounted and visible at the logical place within the overall "Animals" filesystem.
To add more drives, it's a matter of creating a "mount point" -- which is just a directory -- and using the mount command to make the new device part of the existing filesystem tree. That's getting into advanced territory; your version of Unix may provide "automount" to handle floppies or CD-ROMs automagically, but some old-timers prefer to do a manual mount of their removable media. Why? Because MS-DOS/Windows makes assumptions -- perhaps too many -- every time you stick a floppy into your floppy drive. Unix could do that too, but it would mean giving up control.
"man mount" will tell you more about this topic.
Here are some commands related to filesystems (the $ is the command prompt, and the # marks a comment):
$ pwd # ("Present Working Directory" - where am I?)
/home/kevin # You are here
$ cd # change directory ("go $HOME")
# Examples using filesystem commands:
$ cd / # go to the ROOT directory
$ cd /bin # do this, then do a "pwd" to see
$ pwd # ...our new location:
/bin # we are in "bin" just below root (/).
$ echo $HOME # HOME is defined as the directory you go
/home/kevin # to when you say "cd" with no value.
# $HOME gives us the value of the variable
# named HOME.
$ echo HOME # See the difference between HOME and $HOME?
HOME
$ cd # Unlike DOS, 'cd' with no parameter
$ pwd # ALWAYS takes you "HOME".
/home/kevin
$ cd bin # This time use "bin" with no leading "/":
$ pwd # Where are we now?
/home/kevin/bin
The root of the filesystem tree is identified by the single character "/" or "slash." Do not confuse it with the Microsoft backslash, "\". Note the difference between an absolute path, starting from the root of the filesystem, and a relative path, which does not start with "/". The cd command "moves" you from directory to directory, like moving between rooms in a house. A path that starts with a dot (period) is relative to the current directory, so "cd ." does exactly nothing. ("Move me from where I am to, uh, where I am.") A double-dot (..) refers to the directory above the current one. Thus:
$ pwd
/home/kevin/marsupial/wombat
$ cd ..
$ pwd
/home/kevin/marsupial
$ cd ..
$ pwd
/home/kevin
You do NOT have to cd into a directory in order to run commands that refer to it -- these two examples do exactly the same thing, but notice the wasted steps in the first one.
# Example one:
$ pwd
/home/silly
$ cd /var/tmp
$ ls
thisfile thatfile otherfile
$ cd
$
# Example two:
$ pwd
/home/silly
$ ls /var/tmp
thisfile thatfile otherfile
$
Here is an example using the pipe operator, "|":
$ ps -ef | grep kevin
In words: "List every active process on the system (process status, every, full) and pipe the results into the search utility grep"; i.e., tell me which processes, if any, have the string "kevin" associated with them. Or more simply, "find all of kevin's jobs."
If this doesn't work, try "ps aux" (the BSD-style spelling) instead of "ps -ef".
This brings up another important difference between Unix and DOS: Unix is not only multi-user, it is multi-process. A Unix machine can run many programs at the same time, unlike DOS, which only does one thing at a time well. (You can have "TSR" programs under DOS, but that is not nearly the same thing as having a true multiprocessing operating system.)
The Unix philosophy is to avoid overly complex programs that try to be all things to all people. Instead, Unix commands are very small, and very simple. Each one does just one thing, and they are all written so that they can be hooked together with the "plumbing" syntax we have just seen.
Every properly-written Unix program has one input and two outputs, called stdin (standard input), stdout (standard output), and stderr (standard error). When you connect two commands with a "pipe" you are connecting the standard output of the first program to the standard input of the second.
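The two outputs can be sent to different places. In this minimal sketch the file names are invented, and no-such-file is deliberately missing so that ls has something to complain about. A plain > redirects standard output, and 2> redirects standard error.

```shell
#!/bin/sh
# Scratch directory so the example touches nothing real (mktemp -d assumed).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo "some text" > realfile

# ls names realfile on stdout and reports the missing file on stderr.
ls realfile no-such-file > out.txt 2> err.txt

echo "stdout got:"
cat out.txt
echo "stderr got:"
cat err.txt
```

A pipe carries only standard output, which is why error messages still land on your screen in the middle of a pipeline.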
Run the command ls -l to get a "long" directory listing. (That's a lowercase letter L as in long. If you typed the number 1 instead, you would get a one-column listing.)
Note that the output displayed by this command does not include a header to tell you what each column means. This seems inconsiderate, but there is a reason for it: Displaying a header would get in the way of reusing that information as input to another command.
The command "more" passes through whatever it sees on its standard input, but pauses after each "n" lines, where "n" is the height of your display screen:
$ more file
So, if the output from ls -l is too long to fit on one screen, you might want to pipe it through the more command, like so:
$ ls -l | more
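Because there are no column headings, the output can be fed straight into another command. As a small illustration (the three file names are invented, and touch simply creates empty files), wc -l counts the lines that ls sends down the pipe:

```shell
#!/bin/sh
# Scratch directory with three empty files (mktemp -d assumed available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch thisfile thatfile otherfile

# ls writes one name per line when its output goes to a pipe;
# wc -l counts those lines.
count=$(ls | wc -l)
echo "Files found: $count"
```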
Unix          What they should have called it :-)
----          ------------------------------------------------
man           HELP (displays a section of the 'manual,' as
              in "Read The Fine Manual" or RTFM.)
mv f1 f2      RENAME (MoVe the name entry)
rm file       DELETE (ReMove)
rm -i file    DELETE with prompt to confirm (ReMove -with Inform)
who           SHOW USERS (See note *)
ps -e         SHOW SYSTEM
od -c file    DUMP and display ASCII (-c=char) values
ls            DIRECTORY/SIZE/OWNER (LiSt of files)
ls -l         DIR/PROT/SIZE/DATE/OWNER (LiSt -long format)
*These uppercase English-like commands were used on VMS, a multi-user
system that gave Unix a pretty good run for its money back in the 1980s.
Time to leave DOS behind. Unix is a multi-user system. The users on a Unix machine are divided into three groups. There is YOU, the "owner" of your account; there is your GROUP, other people who belong with you for some reason; and there is the rest of the WORLD. You'll hear a lot about setting "permissions" on files. What does that imply?
First, the idea of permission is tied up with ownership, and you can't really describe one without the other. You'll be using two tools, not one, to control who can see your files: chmod (change mode) and chown (change owner).
Everyone on a Unix machine is a member of one or more "groups." A group is just a way of saying "these people belong together in some way." By default, Red Hat Linux makes a new group for each user you create, so you are in a "group of one." (Other systems would place all regular users into a group called "users.") The root user can move other users into different groups, and there is a way to belong to more than one group at the same time.
What can you do with a file? You can read it, you can write it, and if it's a program you can execute it. And you can decide whether to allow members of your group or the "world" to do the same. ls -l displays a "protection string" in the format:
drwxrwxrwx
How do you read that?
A "d" in the first position indicates the file is a directory. The letters that follow indicate WHAT KIND of access has been granted; the position of an access letter indicates WHO has that access.
The positions are "dxxxyyyzzz" where:
xxx = owner
yyy = group
zzz = world
The types of access are:
"r" = read
"w" = write
"x" = execute
So a permission string of -rwxr-x--- means it's an ordinary file (because there is no leading 'd'); the owner can read, write, or execute it (rwx); members of the same group can read or execute it but not write to it (r-x), and the rest of the world cannot access it at all (---).
Oldtimers like to think of the "mode" in terms of a three-digit octal (or "Base 8") number. Each octal digit between 0 and 7 matches perfectly with the three bits that are turned on or turned off when you change a file's mode. For example, setting the "world" values to rwx can be represented as an octal 7 (binary 111), while setting them to rw- would be a 6 (binary 110). The command "chmod 750 filename" would set the file's protection to rwxr-x--- exactly as in our example above. [For more about binary numbers, check out Introduction to the TCP/IP LAN.]
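To see where the digits come from, give r the value 4, w the value 2, and x the value 1, and add up the ones that are turned on. The sketch below works in a scratch directory on a made-up file, and uses shell arithmetic only to spell out the sums:

```shell
#!/bin/sh
# Scratch directory and an empty example file (mktemp -d assumed available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch filename

owner=$((4 + 2 + 1))   # rwx = 7
group=$((4 + 0 + 1))   # r-x = 5
world=$((0 + 0 + 0))   # --- = 0

# Glue the three digits together: same as "chmod 750 filename".
chmod "$owner$group$world" filename

# The first ten characters of the long listing are the protection string.
mode=$(ls -l filename | cut -c1-10)
echo "$mode"           # prints -rwxr-x---
```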
See "man chmod" for more about setting permissions on a file. There is no permission bit to "hide" a file, though you can make files less obvious by giving them a leading ".dot". By default, the ls command will only show "dot files" if you ask for them with ls -a -- that's an example of a "switch," and there's more about those below.
The "execute bit" has a slightly different meaning if the file in question is a directory. (Yes, directories are files. In Unix, everything is a file... and a file is just a stream of characters.) You can't execute a directory, but you might want to search it. The execute bit is used to provide some control over that: it decides whether you can cd into the directory or reach the files inside it, while the read bit decides whether you can list the names it contains. So you cannot get a directory listing for a directory if its read bit is turned off, but as long as its execute bit is on and you have read access to a file within that directory, you can still read the file if you know its exact name. Turn the execute bit off, and nothing inside the directory can be reached at all. As a general rule, though, you want your directories to be "executable."
What about deleting files? Delete permission is controlled by the w (write) flag on the directory, not the file -- anyone who has write access to the directory can delete any file IN the directory. Think of it this way; the file exists only as long as the directory provides a "link" to it. If you can write into the directory, you can erase or write over that link, and when you do, the file becomes just another block of free space.
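A short sketch makes the point (the directory and file names are invented): the file below is read-only, yet it can still be removed because the directory that holds it is writable.

```shell
#!/bin/sh
# Scratch directory so nothing real is at risk (mktemp -d assumed available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir box
echo "precious" > box/keepsake
chmod 444 box/keepsake      # r--r--r-- : nobody may write to the file itself
chmod 755 box               # rwxr-xr-x : the owner may write to the directory

# -f stops rm from asking for confirmation about the read-only file.
rm -f box/keepsake

if [ -e box/keepsake ]; then result="still there"; else result="gone"; fi
echo "keepsake is $result"
```

To protect a file from deletion, then, it is the directory's write bit you need to turn off.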
To run a script, it has to be an executable file (chmod +x scriptname) and it has to be in a directory that is in your $PATH (mv scriptname $HOME/bin). If both of those conditions are met, then simply typing the name of the script at the command prompt will run it.
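Here is a minimal sketch of both conditions being met. It assumes a scratch directory standing in for $HOME/bin, and a script with the made-up name hello101, so that nothing in your real account is touched:

```shell
#!/bin/sh
# A throwaway directory plays the part of $HOME/bin (mktemp -d assumed).
bindir=$(mktemp -d)

# Write a two-line script. The quoted 'EOF' keeps the text exactly as typed.
cat > "$bindir/hello101" << 'EOF'
#!/bin/sh
echo "hello from a script"
EOF

chmod +x "$bindir/hello101"     # condition one: the file is executable
PATH=$PATH:$bindir              # condition two: its directory is in PATH

output=$(hello101)              # now the bare name is enough
echo "$output"
```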
For a practical example, see my writeup on scripting support for the Yarn offline news and mail reader.
cat is like TYPE, but also CREATE or APPEND, when used with output redirection (> and >>).
more displays the input stream one screenful at a time; press the space bar for next page, 'q' to quit; enter '/text' with no quotes to find the string 'text'.
'grep pattern file' is like 'FIND "pattern" file' in DOS. grep is short for Global Regular Expression Print.
"file filename" tells you what kind of file "filename" is. Trying to display a binary file can easily lock up your terminal; use 'file' first to find out whether it's a text file.
If you forget to do "file somefile" before you cat or edit somefile, and blow up your console by doing so, you may be able to recover by typing "reset" and hitting enter... if you're a touch typist, that is.
You'll note that some of the commands above had "switches" -- either a letter that is set off by a hyphen, or a word that is set off by a pair of them. Some commands even have positive (on) and negative (off) switches: ... but most just use the hyphen as a delimiter.
Note that Unix command switches do not use /. That is used to separate directories, not to identify the parameters of a command. Unix directories are never, ever separated by a backslash \. The purpose of the backslash is to change the meaning of the character that follows, usually to "quote" it.
In the command "grep -v pattern file", the -v stands for "inVert the test". This command will find all lines in "file" that DON'T have an instance of "pattern".
For instance, this command will list all the processes that are NOT being run by "root":
ps -ef | grep -v root
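To watch -v at work without depending on what happens to be running on your machine, this sketch searches three invented lines of text saved in a scratch file instead of real ps output:

```shell
#!/bin/sh
# Made-up sample data standing in for the output of "ps -ef".
scratch=$(mktemp -d)
printf '%s\n' "root   1 init" "kevin 42 vi notes" "root  77 cron" > "$scratch/procs"

# Plain grep: the lines that DO contain "root".
grep root "$scratch/procs"

# grep -v: the lines that do NOT contain "root".
others=$(grep -v root "$scratch/procs")
echo "$others"          # prints: kevin 42 vi notes
```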
The greatest weakness of Unix, in my opinion, is that there is no single authority or standard to determine what switches will be recognized by any single program. For example, the use of -v with grep (above) is unique to grep. How do we cope with this chaos?
The safest way to find out how to run a command on your machine is to look at the man page for that command (by typing "man command"). If you're not sure what command you want, the command "apropos" or "man -k" will list all the man pages that contain a certain string in their name or one-line description. man -k search would list all the commands that relate to "search"ing. If there are too many, don't forget you can pipe the output through more to scroll, or even use "grep -v" to remove unwanted words:
apropos search | grep -v Tcl
A riskier way to get information is to run the command with the switch --help. If it doesn't have a built-in help display, the fact that it doesn't understand "--help" should make it display an "invalid switch" error and a "usage" line... but that won't always save you from a destructive command like rm. Always have some idea what a command does before you run it!
To find executables, Unix uses a PATH, much like MS-DOS. You may not be familiar with this, because MS-DOS (and Microsoft Windows) always included the CURRENT directory by default. This meant you could run any program by "going" to the directory where it was installed. (In fact, this is one reason that the concept of using cd and "going to" a directory matters.)
Unix does not do this, unless you include "current directory" explicitly as the dot symbol ("."). There are good reasons not to do that, but they all boil down to "Don't run that! You don't know where it's been!" You don't want to run something that just happens to be lying around in any old directory.
To see your path, you can type:
$ echo $PATH # Note the lowercase command, UPPERCASE
# variable, and a leading $ to get its
# contents.
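The value is one long string of directories separated by colons, searched from left to right. As a reading aid, tr (a standard utility that translates one character into another) can swap each colon for a newline. This is only a convenience sketch:

```shell
#!/bin/sh
# Print each directory in PATH on its own line, in search order.
echo "$PATH" | tr ':' '\n'

# Count the entries by counting those lines.
entries=$(echo "$PATH" | tr ':' '\n' | wc -l)
echo "PATH has $entries entries"
```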
| To find out: | Run this command: |
| "What's your HOME directory?" | $ echo $HOME |
| "What's in your dot-profile?" | $ more .profile |
Those UPPERCASE variables with leading $ are called environment variables. By convention, they are written in UPPERCASE. Ones you define yourself will work in lowercase or mixed case, but certain ones used by Unix itself, such as PATH, have to be in uppercase. Following that model for yours is merely a good idea.
In DOS you'd create them with a SET command. In Unix (except in the C-shell) you just assign them with an = and no spaces. It would look something like this (The $ at the start of the line is the command prompt; the lines that have no prompt are the output of the echo command):
$ CRITTER=beast # no $ on the assignment
$ export CRITTER # it doesn't "count" until you export it
$ echo $CRITTER # leading $ means "The value of"
beast
$ echo CRITTER
CRITTER # see the difference?
$ echo $CRITTER # how about now?
beast
At any given time, there are lots of "processes" running on a Unix machine. Every command you run exists as a "child" of your own login process. The export command is important because it makes your environment variables visible to your child processes. The example above would have worked even without doing the export, because we are remaining within a single interactive process... but in almost every other case, you have to use export to get the desired effect. You especially need it if you want to set an environment variable at the command prompt and then use it inside a script.
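A minimal sketch of the difference, using sh -c to start a child process. PET is an invented second variable, added here only so there is something exported to compare against:

```shell
#!/bin/sh
unset CRITTER PET        # start clean in case either was already set

CRITTER=beast            # assigned but NOT exported
PET=wombat               # assigned...
export PET               # ...and exported

# sh -c starts a child shell. The single quotes keep the parent from
# filling in the values, so the child has to look them up for itself.
child_sees=$(sh -c 'echo "CRITTER=[$CRITTER] PET=[$PET]"')
echo "$child_sees"       # prints: CRITTER=[] PET=[wombat]
```

The child never hears about CRITTER, which is exactly what happens to a script that relies on a variable you forgot to export.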
By convention, system executables reside in a directory called /bin (the directory named bin located immediately under the root directory). Some other special directories are /usr/bin and /usr/local/bin. If you have private scripts or programs, they should go into a bin directory under your $HOME directory. You would then add $HOME/bin to your PATH by adding a line like this to your .profile file:
PATH=$PATH:$HOME/bin # Keep the current value of PATH and add my
# personal bin directory to it, as the last
# place to look for executables.
Files that begin with a period (like .profile) are used by various system utilities. They don't appear by default on directory lists, mostly for neatness' sake. You can see them by doing ls -a (think "a" as in "all").
A way to make Unix more friendly is to make up your own commands by using the "alias" command. For instance, if you have trouble remembering to type "ls -l" consistently, you can create your own "dir" command like so:
alias dir="ls -l"
This assumes you're using something with a bit more moxie than the old Bourne shell -- you need bash or the Korn shell to support aliases. Please resist the urge to define so many aliases that you've created your own language. You're better off learning the same Unix that (almost) everyone else uses. Dialects or flavors of Unix are more alike than they are different.
Power delights, and absolute power is absolutely delightful. Enjoy the power that understanding Unix gives you, and please use it only for Good!
| Unix for Dummies | IDG Books - I tease them, but this is a good book |
| Unix Power Tools | O'Reilly and Associates / Random House - The "phone book" I mentioned above |
| The UNIX Programming Environment | Kernighan & Pike / Prentice Hall - They helped build Unix. |
| Unix System Security | Rik Farrow/Addison Wesley (ISBN 0-201-57030-0) |
Write programs that do one thing and do it well.
Write programs that work together.
Write programs that handle text streams, because that is a universal interface.
You are invited to discuss this article with the author on the Brass Cannon webboard.