As part of the best-selling Pocket Primer series, this book introduces readers to useful command-line utilities for creating powerful shell scripts. It focuses on the “bash” command set, though many concepts apply to other command shells like sh, ksh, zsh, and csh. The book covers piping data between commands and using versatile sed and awk commands. Aimed at beginners, it also serves as a good reference for those with some experience in shell scripting.
The journey starts with an introduction to bash, covering files and directories, useful commands, and conditional logic with loops. Readers then learn to filter data with grep, transform data with sed, and work with awk. The book introduces shell scripts, showcasing their use with grep and awk for data manipulation. Various scripts are provided for data scientists and analysts needing shell-based solutions for text file cleaning.
Understanding these concepts is crucial for simplifying routine tasks and creating efficient shell scripts. This book transitions readers from novices to proficient scriptwriters, combining theoretical knowledge and practical skills. Companion files with source code examples enhance learning. By the end, readers will be equipped to implement shell scripts in real-world scenarios.
Page count: 325
Year of publication: 2024
BASH COMMAND LINE AND SHELL SCRIPTS
POCKET PRIMER
Oswald Campesato
MERCURY LEARNING AND INFORMATION
Dulles, Virginia
Boston, Massachusetts
New Delhi
Copyright © 2020 by MERCURY LEARNING AND INFORMATION LLC.
All rights reserved.
This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.
Publisher: David Pallai
MERCURY LEARNING AND INFORMATION
22841 Quicksilver Drive
Dulles, VA 20166
www.merclearning.com
(800) 232-0223
O. Campesato. Bash Command Line and Shell Scripts Pocket Primer.
ISBN: 978-1-68392-504-0
The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.
Library of Congress Control Number: 2020935567
202122321 Printed on acid-free paper in the United States of America.
Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional information, please contact the Customer Service Dept. at (800) 232-0223 (toll free).
Digital versions of our titles are available at: www.academiccourseware.com and other electronic vendors. Companion files are available from the publisher by writing to [email protected].
The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the book and/or disc, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.
I’d like to dedicate this book to my parents – may this bring joy and happiness into their lives.
Preface
Chapter 1: Introduction
What is Unix?
Available Shell Types
What is bash?
Getting help for bash Commands
Navigating Around Directories
The history Command
Listing Filenames with the ls Command
Displaying Contents of Files
The cat Command
The head and tail Commands
The Pipe Symbol
The fold Command
File Ownership: Owner, Group, and World
Hidden Files
Handling Problematic Filenames
Working with Environment Variables
The env Command
Useful Environment Variables
Setting the PATH Environment Variable
Specifying Aliases and Environment Variables
Finding Executable Files
The printf Command and the echo Command
The cut Command
The echo Command and Whitespaces
Command Substitution (“backtick”)
The “pipe” Symbol and Multiple Commands
Using a Semicolon to Separate Commands
The paste Command
Inserting Blank Lines with the paste Command
A Simple Use Case with the paste Command
A Simple Use Case with cut and paste Commands
What about zsh?
Switching between bash and zsh
Configuring zsh
Summary
Chapter 2: Files and Directories
Create, Copy, Remove, and Move Files
Creating Text Files
Copying Files
Copy Files with Command Substitution
Deleting Files
Moving Files
The ln Command
The basename, dirname, and file Commands
The wc Command
The cat Command
The more Command and the less Command
The head Command
The tail Command
Comparing File Contents
The Parts of a Filename
Working with File Permissions
The chmod Command
Changing owner, permissions, and groups
The umask and ulimit Commands
Working with Directories
Absolute and Relative Directories
Absolute/Relative Pathnames
Creating Directories
Removing Directories
Navigating to Directories
Moving Directories
Using Quote Characters
Streams and Redirection Commands
Working with Metacharacters
Working with Character Classes
MetaCharacters and Character Classes
Digits and Characters
Working with “^” and “\” and “!”
Filenames and Metacharacters
Summary
Chapter 3: Useful Commands
The join Command
The fold Command
The split Command
The sort Command
The uniq Command
How to Compare Files
The od Command
The tr Command
A Simple Use Case
The find Command
The tee Command
File Compression Commands
The tar command
The cpio Command
The gzip and gunzip Commands
The bunzip2 Command
The zip Command
Commands for zip Files and bz Files
Internal Field Separator (IFS)
Data From a Range of Columns in a Dataset
Working with Uneven Rows in Datasets
Summary
Chapter 4: Conditional Logic and Loops
Quick Overview of Operators in bash
Arithmetic Operations and Operators
The expr Command
Arithmetic Operators
Boolean and Numeric Operators
Compound Operators and Numeric Operators
Working with Variables
Assigning Values to Variables
The read Command for User Input
Boolean Operators and String Operators
Compound Operators and String Operators
File Test Operators
Compound Operators and File Operators
Conditional Logic with if/else/fi Statements
The case/esac Statement
Working with Strings in Shell Scripts
Working with Loops
Using a for loop
Checking Files in a Directory
Working with Nested Loops
Using a while Loop
The while, case, and if/elif/else/fi Statements
Using an until Loop
User-defined Functions
Creating a Simple Menu from Shell Commands
Arrays in bash
Working with Arrays
Summary
Chapter 5: Filtering Data with grep
What is the grep Command?
Metacharacters and the grep Command
Escaping Metacharacters with the grep Command
Useful Options for the grep Command
Character Classes and the grep Command
Working with the -c Option in grep
Matching a Range of Lines
Using Back References in the grep Command
Finding Empty Lines in Datasets
Using Keys to Search Datasets
The Backslash Character and the grep Command
Multiple Matches in the grep Command
The grep Command and the xargs Command
Searching zip Files for a String
Checking for a Unique Key Value
Redirecting Error Messages
The egrep Command and fgrep Command
Displaying “Pure” Words in a Dataset with egrep
The fgrep Command
A Simple Use Case
Summary
Chapter 6: Transforming Data with sed
What is the sed Command?
The sed Execution Cycle
Matching String Patterns Using sed
Substituting String Patterns Using sed
Replacing Vowels from a String or a File
Deleting Multiple Digits and Letters from a String
Search and Replace with sed
Datasets with Multiple Delimiters
Useful Switches in sed
Working with Datasets
Printing Lines
Character Classes and sed
Removing Control Characters
Counting Words in a Dataset
Back References in sed
Displaying Only “Pure” Words in a Dataset
One Line sed Commands
Summary
Chapter 7: Working with awk
The awk Command
Built-in Variables That Control awk
How Does the awk Command Work?
Aligning Text with the printf Command
Conditional Logic and Control Statements
The while Statement
A for loop in awk
A for loop with a break Statement
The next and continue Statements
Deleting Alternate Lines in Datasets
Merging Lines in Datasets
Printing File Contents as a Single Line
Joining Groups of Lines in a Text File
Joining Alternate Lines in a Text File
Matching with Metacharacters and Character Sets
Printing Lines Using Conditional Logic
Splitting Filenames with awk
Working with Postfix Arithmetic Operators
Numeric Functions in awk
One Line awk Commands
Useful Short awk Scripts
Printing the Words in a Text String in awk
Count Occurrences of a String in Specific Rows
Printing a String in a Fixed Number of Columns
Printing a Dataset in a Fixed Number of Columns
Aligning Columns in Datasets
Aligning Columns and Multiple Rows in Datasets
Removing a Column from a Text File
Subsets of Columns Aligned Rows in Datasets
Counting Word Frequency in Datasets
Displaying Only “Pure” Words in a Dataset
Working with Multiline Records in awk
A Simple Use Case
Another Use Case
Summary
Chapter 8: Intro to Shell Scripts
What are Shell Scripts?
A Simple Shell Script
Setting Environment Variables via Shell Scripts
Sourcing or “Dotting” a Shell Script
Working with Functions in Shell Scripts
Passing values to Functions in a Shell Script (1)
Passing values to Functions in a Shell Script (2)
Iterate through values passed to a Function
Positional Parameters in User-defined Functions
Shell Scripts, Functions, and User Input
Recursion and Shell Scripts
Iterative Solutions for Factorial Values
Calculating Fibonacci Numbers
Calculating the GCD of Two Positive Integers
Calculating the LCM of two Positive Integers
Calculating Prime Divisors
Summary
Chapter 9: Shell Scripts with grep and awk Command
The grep Command with zip Files
The grep Command with Multiple Files
Simulating Relational Data with the grep Command
Checking Updates in a Logfile
Processing Multiline Records
Adding the Contents of Records
Using the split Function in awk
Scanning Diagonal Elements in Datasets
Adding Values From Multiple Datasets (1)
Adding Values From Multiple Datasets (2)
Adding Values From Multiple Datasets (3)
Calculating Combinations of Field Values
Summary
Chapter 10: Miscellaneous Shell Scripts
Using rm and mv with Directories
Using the find Command with Directories
Creating a Directory of Directories
Cloning a set of Sub-directories
Executing Files in Multiple Directories
The case/esac Command
Compressing/uncompressing Files
The dd Command
The crontab Command
Uncompressing Files as a cron Job
Scheduled Commands and Background Processes
How to Schedule Tasks
The nohup Command
Executing Commands Remotely
How to Schedule Tasks in the Background
How to Terminate Processes
Terminating Multiple Processes
Process-Related Commands
How to Monitor Processes
Checking Execution Results
System Messages and Log Files
Disk Usage Commands
Trapping and Ignoring Signals
Arithmetic with the bc and dc Commands
Working with the date Command
Print-related Commands
Creating a Report with the printf() Command
Checking Updates in a Logfile
Listing Active Users on a Machine
Miscellaneous Commands
Summary
Index
The goal of this book is to introduce readers to an assortment of powerful command line utilities that can be combined to create simple, yet powerful shell scripts. While all examples and scripts use the “bash” command set, many of the concepts translate into other command shells (such as sh, ksh, zsh, and csh), including the concept of piping data between commands, regular expression substitution, and the sed and awk commands. Aimed at a reader relatively new to working in a bash environment, the book is comprehensive enough to be a good reference and teach a few new tricks to those who already have some experience with creating shell scripts.
This short book contains a variety of code fragments and shell scripts for data scientists, data analysts, and other people who want shell-based solutions to “clean” various types of text files. In addition, the concepts and code samples in this book are useful for people who want to simplify routine tasks.
This book takes introductory concepts and commands in bash, and then demonstrates their use in simple yet powerful shell scripts. This book does not cover “pure” system administration functionality for Unix or Linux.
This book is intended for general users, data scientists, data analysts, and other people who perform a variety of tasks from the command line, and who also have a limited knowledge of shell programming.
You will acquire an understanding of how to use various bash commands, often as part of short shell scripts. The chapters also contain simple use cases that illustrate how to perform various tasks involving text files, such as switching the order of a two-column text file, removing control characters in a text file, finding specific lines and merging them, reformatting a date field in a text file, and removing nested quotes.
This book saves you the time required to search for relevant code samples, adapting them to your specific needs, which is a potentially time-consuming process.
The code samples in this book were created and tested using bash on a Macbook Pro with OS X 10.12.6 (macOS Sierra). The code samples are derived primarily from scripts prepared by the author, and in some cases there are code samples that incorporate short sections of code from discussions in online forums. The key point to remember is that the code samples follow the “Four Cs”: they must be Clear, Concise, Complete, and Correct to the extent that it’s possible to do so, given the page length of this book.
You need some familiarity with working from the command line in a Unix-like environment. However, there are subjective prerequisites, such as a desire to learn shell programming, along with the motivation and discipline to read and understand the code samples. In any case, if you’re not sure whether or not you can absorb the material in this book, glance through the code samples to get a feel for the level of complexity.
The commands that do not meet any of the criteria listed in the previous section are not included in this Primer. Consequently, there is no coverage of commands for system administration (e.g., shutting down a machine, scheduling backups, and so forth). The purpose of the material in the chapters is to illustrate how to use bash commands for handling common data cleaning tasks with text files, after which you can do further reading to deepen your knowledge.
If you are a Mac user, there are three ways to open a command shell. The first method is to use Finder to navigate to Applications > Utilities and then double-click the Terminal application. Next, if you already have a command shell available, you can launch a new command shell by typing the following command:
open /Applications/Utilities/Terminal.app
A third method for Mac users is to open a new command shell on a MacBook from a command shell that is already visible simply by pressing command+n in that command shell, and your Mac will launch another command shell.
If you are a PC user, you can install Cygwin (open source, https://cygwin.com/), which simulates bash commands, or use another toolkit such as MKS (a commercial product). Please read the online documentation that describes the download and the installation processes.
The answer to this question varies widely, mainly because the answer depends heavily on your objectives. The best answer is to try a new tool or technique from the book out on a problem or task you care about, professionally or personally. Precisely what that might be depends on who you are, as the needs of a data scientist, manager, student or developer are all different. In addition, keep what you learned in mind as you tackle new data cleaning or manipulation challenges. Sometimes knowing that a particular technique is possible can make finding a solution easier, even if you have to re-read the section to remember exactly how the syntax works.
If you have reached the limits of what you have learned here and want to get further technical depth on these commands, there is a wide variety of literature published and online resources describing the bash shell, Unix programming, and the grep, sed, and awk commands.
O. Campesato
April 2020
This chapter contains a fast-paced introduction to basic commands in the bash shell, such as navigating around the file system, listing files, and displaying the contents of files. As you will soon see, this chapter is dense and contains a very eclectic mix of topics in order to prepare you for later chapters. If you already have some knowledge of bash commands, you can probably skim quickly through this introductory chapter and then proceed to Chapter 2. Incidentally, sometimes you will see “bash shell” instead of just bash (as in the first sentence of this paragraph), and although the former is actually redundant, there won’t be any confusion about its intended meaning.
The first part of this chapter starts with a brief introduction to some Unix shells, followed by a discussion about files, file permissions, and directories. You will also learn how to create files and directories and how to change their access permissions.
The second part of this chapter introduces simple shell scripts, along with commands for making them executable. Since shell scripts involve various bash commands (and can optionally contain user-defined functions), it’s a good idea to learn about bash commands before you create bash scripts.
The third portion of this chapter discusses two useful bash commands: the cut command (for cutting or extracting columns and/or fields from a dataset) and the paste command (for “pasting” text or datasets together vertically).
In addition, the final part of this chapter contains a use case involving the cut command and paste command that illustrates how to switch the order of two columns in a dataset. You can also perform this task using the awk command (discussed in Chapter 7 and Chapter 9).
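As a rough preview of that use case, the following sketch swaps the two columns of a small space-delimited file. The file names and sample data are invented for this illustration, and the awk one-liner at the end is just one possible alternative.

```shell
# Hypothetical two-column dataset in a scratch directory.
workdir=$(mktemp -d)
printf 'apple 10\nbanana 20\ncherry 30\n' > "$workdir/pairs.txt"

# Extract each column into its own file with cut ...
cut -d' ' -f1 "$workdir/pairs.txt" > "$workdir/col1.txt"
cut -d' ' -f2 "$workdir/pairs.txt" > "$workdir/col2.txt"

# ... and paste them back together in the opposite order.
swapped=$(paste -d' ' "$workdir/col2.txt" "$workdir/col1.txt")
echo "$swapped"

# One way to get the same result with awk.
swapped_awk=$(awk '{ print $2, $1 }' "$workdir/pairs.txt")
```

Both approaches should produce the same three lines, with the number first and the fruit name second.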
There are a few points to keep in mind before delving into the details of shell scripts. First, shell scripts can be executed from the command line after adding “execute” permissions to the text file containing the shell script. Second, you can use the crontab utility to schedule the execution of your shell scripts according to a schedule of your choice. Specifically, the crontab utility allows you to specify the execution of a shell script on an hourly, daily, weekly, or monthly basis. Tasks that are commonly scheduled via crontab include performing backups, removing unwanted files, and so forth. If you are completely new to Unix-based systems, just keep in mind that there is a way to run scripts both from the command line and in a “scheduled” manner. Setting file permissions to run the script from the command line will be discussed later.
Third, the contents of any shell script can be as simple as a single command or can comprise hundreds of lines of bash commands. In general, the more useful (and often more interesting) shell scripts involve a combination of several bash commands. A learning tip: since there are usually several ways to produce the desired result, it’s helpful to read other people’s shell scripts to learn how to combine commands in useful ways.
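To make the first two points concrete, here is a minimal sketch that creates a one-line script, adds execute permission, and runs it. The script name hello.sh and the scratch directory are assumptions made for this example, and the crontab line in the comment uses a placeholder path.

```shell
# Create a tiny script in a scratch directory (names are illustrative).
workdir=$(mktemp -d)
cat > "$workdir/hello.sh" <<'EOF'
#!/usr/bin/env bash
echo "hello from a shell script"
EOF

# Add "execute" permission, then run the script by its path.
chmod +x "$workdir/hello.sh"
result=$("$workdir/hello.sh")
echo "$result"

# A crontab entry that would run a script every day at 2:00 AM might look
# like this (fields: minute hour day-of-month month day-of-week command):
#   0 2 * * * /path/to/hello.sh
```

Without the chmod step, the attempt to run the script by its path would normally fail with a "Permission denied" error.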
Unix is an operating system created by Ken Thompson in the early 1970s, which eventually led to a number of variations, such as HP/UX for HP machines and AIX for IBM machines. Linus Torvalds developed the Linux operating system during the 1990s, and many Linux commands are the same as their Unix counterparts (but differences exist, often in the commands for system administrators). The Mac OS X operating system is also Unix-based, deriving largely from BSD Unix.
Unix has a rich and storied history, and if you are really interested in learning about its past, you can read online articles and also Wikipedia. This book foregoes those details and focuses on helping you quickly learn how to become productive with various commands.
The original Unix shell is the Bourne shell, which was written in the mid-1970s by Stephen R. Bourne. In addition, the Bourne shell was the first shell to appear on Unix systems, and you will sometimes hear “the shell” as a reference to the Bourne shell. The Bourne shell is a POSIX standard shell, usually installed as /bin/sh on most versions of Unix, whose default prompt is the $ character. Consequently, Bourne shell scripts will execute on almost every version of Unix. In essence, the AT&T branches of Unix support the Bourne shell (sh), bash, Korn shell (ksh), tsh, and zsh.
However, there is also the BSD branch of Unix that uses the “C” shell (csh), whose default prompt is the % character. In general, shell scripts written for csh will not execute on AT&T branches of Unix, unless the csh shell is also installed on those machines (and vice versa).
The Bourne shell is the most ‘unadorned’ in the sense that it lacks some commands that are available in the other shells, such as history, noclobber, and so forth. Some well-known variants for Bourne Shell are listed as follows:
Korn shell (ksh)
Bourne Again shell (bash)
POSIX shell (sh)
zsh (“Zee shell”)
The different C-type shells are as shown below:
C shell (csh)
TENEX/TOPS C shell (tcsh)
The commands and the shell scripts in this book are based on the bash shell, and many of the commands also work in other Bourne-related shells (and the remaining shells have a similar command to accomplish the same goal). When you are unable to perform a particular shell-related task, perform an Internet search for “how to use <bash command> in <shell name>” and you will often find an answer. Keep in mind that sometimes there are variations in syntax for a given command in a particular shell, and typing “man <command>” in a command shell can provide useful information.
Bash is an acronym for “Bourne Again Shell”, which has its roots in the Bourne shell created by Stephen R. Bourne. Shell scripts based on the Bourne shell will execute in bash, but the converse is not necessarily true. The bash shell provides additional features that are unavailable in the Bourne shell, such as support for arrays (discussed in Chapter 4).
On Mac OS X, the /bin directory contains the following executable shells:
-r-xr-xr-x 1 root wheel 1377872 Apr 28 2017 /bin/ksh
-r-xr-xr-x 1 root wheel  630464 Apr 28 2017 /bin/sh
-rwxr-xr-x 1 root wheel  375632 Apr 28 2017 /bin/csh
-rwxr-xr-x 1 root wheel  592656 Apr 28 2017 /bin/zsh
-r-xr-xr-x 1 root wheel  626272 Apr 28 2017 /bin/bash
In case you’re interested, a nice comparison matrix of the support for various features among the preceding shells is here:
https://stackoverflow.com/questions/5725296/difference-between-sh-and-bash
Something else that might surprise you: in some environments the Bourne shell sh is the bash shell, which you can check by typing the following command:
sh --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin16)
Copyright (C) 2007 Free Software Foundation, Inc.
If you are new to the command line (be it Mac, Linux, or PCs), please read the Preface that provides some useful guidelines for accessing command shells.
If you want to see the options for a specific bash command, invoke the man command to see a description of that bash command and its options:
man cat
Keep in mind that the man command produces terse explanations, and if those explanations are not clear enough, you can search for online code samples that provide more details.
In a command shell, you will often perform some common operations, such as displaying (or changing) the current directory, listing the contents of a directory, displaying the contents of a file, and so forth. The following set of commands show you how to perform these operations, and you can execute a subset of these commands in the sequence that is relevant to you. Options for some of the commands in this section (such as the ls command) are described in greater detail later in this chapter.
A frequently used Bash command is pwd (“print working directory”) that displays the current directory, as shown here:
pwd
The output of the preceding command might look something like this:
/Users/jsmith
Use the cd (“change directory”) command to go to a specific directory. For example, type the command cd /Users/jsmith/Mail to navigate to this directory (or some other existing directory). If you are currently in the /Users/jsmith directory, just type cd Mail.
You can navigate to your home directory with either of these commands:
$ cd $HOME
$ cd
One convenient way to return to the previous directory is the command cd - (a hyphen). Keep in mind that the cd command on Windows, when invoked without arguments, merely displays the current directory and does not change it (unlike the cd command in bash, which navigates to your home directory).
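The navigation commands in this section can be tried safely in a scratch directory. The sketch below assumes a throwaway directory created with mktemp, with a Mail subdirectory that merely echoes the example in the text.

```shell
# Build a scratch directory that contains a Mail subdirectory.
base=$(mktemp -d)
mkdir -p "$base/Mail"

cd "$base"            # absolute path
start=$(pwd)

cd Mail               # relative path from the current directory
inside=$(pwd)

cd - > /dev/null      # return to the previous directory (cd - also prints it)
back=$(pwd)

echo "start:  $start"
echo "inside: $inside"
echo "back:   $back"

# Typing cd with no argument (or cd "$HOME") would return to the home directory.
```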
The history command displays a list (i.e., the history) of commands that you executed in the current command shell, as shown here:
history
A sample output of the preceding command is given below:
1202 cat sample.txt > longfile2.txt
1203 vi longfile2.txt
1204 cat longfile2.txt |fold -40
1205 cat longfile2.txt |fold -30
1206 cat longfile2.txt |fold -50
1207 cat longfile2.txt |fold -45
1208 vi longfile2.txt
1209 history
1210 cd /Library/Developer/CommandLineTools/usr/include/c++/
1211 cd /tmp
1212 cd $HOME/Desktop
1213 history
If you want to navigate to the directory that is shown in line 1210, you can do so simply by typing the following command:
!1210
The command !cd searches backwards through the history of commands to find the most recent command that starts with cd; in this case, line 1212 is the first match. If there aren’t any intervening cd commands between the current command and the command in line 1210, then !1210 and !cd will have the same effect.
NOTE
Be careful with the “!” option with bash commands because the command that matches the “!” might not be the one you intended, so it’s safer to use the history command and then explicitly specify the correct number (in that history) when you invoke the “!” operator.
The ls command is for listing filenames, and there are many switches available that you can use, as shown in this section. For example, the ls command displays the following filenames (the actual display depends on the font size and the width of the command shell) on my Mac:
apple-care.txt        iphonemeetup.txt  outfile.txt  ssl-instructions.txt
checkin-commands.txt  kyrgyzstan.txt    output.txt
The command ls -1 (the digit “1”) displays a vertical listing of filenames:
apple-care.txt
checkin-commands.txt
iphonemeetup.txt
kyrgyzstan.txt
outfile.txt
output.txt
ssl-instructions.txt
The command ls -l (the letter “l”) displays a long listing of filenames:
total 56
-rwx------ 1 ocampesato staff  25 Apr 06 19:21 apple-care.txt
-rwx------ 1 ocampesato staff 146 Apr 06 19:21 checkin-commands.txt
-rwx------ 1 ocampesato staff 478 Apr 06 19:21 iphonemeetup.txt
-rwx------ 1 ocampesato staff  12 Apr 06 19:21 kyrgyzstan.txt
-rw-r--r-- 1 ocampesato staff  11 Apr 06 19:21 outfile.txt
-rw-r--r-- 1 ocampesato staff  12 Apr 06 19:21 output.txt
-rwx------ 1 ocampesato staff 176 Apr 06 19:21 ssl-instructions.txt
The command ls -lt (the letters “l” and “t”) displays a time-based long listing:
total 56
-rwx------ 1 ocampesato staff  25 Apr 06 19:21 apple-care.txt
-rwx------ 1 ocampesato staff 146 Apr 06 19:21 checkin-commands.txt
-rwx------ 1 ocampesato staff 478 Apr 06 19:21 iphonemeetup.txt
-rwx------ 1 ocampesato staff  12 Apr 06 19:21 kyrgyzstan.txt
-rw-r--r-- 1 ocampesato staff  11 Apr 06 19:21 outfile.txt
-rw-r--r-- 1 ocampesato staff  12 Apr 06 19:21 output.txt
-rwx------ 1 ocampesato staff 176 Apr 06 19:21 ssl-instructions.txt
The command ls -ltr (the letters “l”, “t”, and “r”) displays a reversed time-based long listing of filenames:
total 56
-rwx------ 1 ocampesato staff 176 Apr 06 19:21 ssl-instructions.txt
-rw-r--r-- 1 ocampesato staff  12 Apr 06 19:21 output.txt
-rw-r--r-- 1 ocampesato staff  11 Apr 06 19:21 outfile.txt
-rwx------ 1 ocampesato staff  12 Apr 06 19:21 kyrgyzstan.txt
-rwx------ 1 ocampesato staff 478 Apr 06 19:21 iphonemeetup.txt
-rwx------ 1 ocampesato staff 146 Apr 06 19:21 checkin-commands.txt
-rwx------ 1 ocampesato staff  25 Apr 06 19:21 apple-care.txt
Here is the description of all the listed columns in the preceding output:
Column #1: the file type and the permissions granted on the file (see below)
Column #2: the number of hard links to the file or directory
Column #3: the owner of the file
Column #4: the group associated with the file
Column #5: the file size in bytes
Column #6: the date and time when the file was last modified
Column #7: the file or directory name
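The effect of the -1, -t, and -r switches is easier to see when the files have distinct timestamps. The following sketch assumes three invented files whose modification times are set explicitly with touch -t.

```shell
# Three files with known modification times (names and dates are illustrative).
workdir=$(mktemp -d)
touch -t 202001010900 "$workdir/old.txt"
touch -t 202001020900 "$workdir/mid.txt"
touch -t 202001030900 "$workdir/new.txt"

one_per_line=$(ls -1 "$workdir")     # alphabetical, one name per line
newest_first=$(ls -1t "$workdir")    # sorted by time, newest first
oldest_first=$(ls -1tr "$workdir")   # sorted by time, reversed

# The long listing adds permissions, link count, owner, group, size, and date.
ls -ltr "$workdir"
```

Combining switches works the same way for the long listing, which is why ls -lt and ls -ltr differ only in the order of the rows.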
In the ls -l listing example, every file line began with a d, -, or l. These characters indicate the type of file that is listed. These (and other) initial values are described below:
-
Regular file (ASCII text file, binary executable, or hard link)
b
Block special file (such as a physical hard drive)
c
Character special file (such as a terminal)
d
Directory file that contains a listing of other files and directories.
l
Symbolic link file
p
Named pipe (a mechanism for interprocess communications)
s
Socket (for interprocess communication)
If you look back at the long listings displayed earlier in this section, you will see that the leftmost character of every line is a dash (“-”), which means that every entry in those listings is a regular file.
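To see some of the other type characters, you can create a directory and a symbolic link and inspect the first character of their long listings. This is only a sketch: the helper function type_of and the file names are made up for the example.

```shell
# Create one regular file, one directory, and one symbolic link.
workdir=$(mktemp -d)
touch "$workdir/regular.txt"
mkdir "$workdir/subdir"
ln -s "$workdir/regular.txt" "$workdir/link.txt"

# Hypothetical helper: print the first character of the long listing.
# The -d switch lists a directory itself rather than its contents.
type_of() { ls -ld "$1" | cut -c1; }

file_type=$(type_of "$workdir/regular.txt")
dir_type=$(type_of "$workdir/subdir")
link_type=$(type_of "$workdir/link.txt")

echo "regular file: $file_type, directory: $dir_type, symbolic link: $link_type"
```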
You can invoke the wc (word count) command to display the number of lines, words and characters in any text file, an example of which is shown here:
wc iphonemeetup.txt
10 5 478 iphonemeetup.txt
The preceding output shows that the file iphonemeetup.txt contains 10 lines, 5 words and 478 characters, which means that the file size is actually quite small.
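The wc command also accepts switches that report a single count. The sketch below uses a small invented file; the counts are read from standard input so that no filename appears in the output, and whitespace padding (which differs between systems) is stripped.

```shell
# A two-line file with five words (contents are illustrative).
workdir=$(mktemp -d)
printf 'one two\nthree four five\n' > "$workdir/words.txt"

wc "$workdir/words.txt"    # lines, words, and bytes, followed by the filename

lines=$(wc -l < "$workdir/words.txt" | tr -d '[:space:]')
words=$(wc -w < "$workdir/words.txt" | tr -d '[:space:]')
bytes=$(wc -c < "$workdir/words.txt" | tr -d '[:space:]')

echo "lines=$lines words=$words bytes=$bytes"
```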
Another point to keep in mind: this book works with files and directories, and occasionally with symbolic links; the other file types are primarily useful for programmers. Consult online documentation for more details regarding the ls command.
This section introduces you to several commands for displaying different lines of text in a text file. The commands that you will learn about are cat, head, tail, and fold, along with the pipe (“|”) symbol.
Invoke the cat command to display the entire contents of sample.txt:
cat sample.txt
The preceding command displays the following text:
the contents of this long file are too long to see in a single screen and each line contains one or more words and if you use the cat command the (other lines are omitted)
The cat command displays the entire contents of a file, which might be inconvenient when you want to see a small portion of a file. Fortunately, the head and tail commands are available, along with several commands that display only a portion of a file, such as less and more that are discussed later.
You can also display the contents of multiple files via the cat command and a metacharacter (discussed in more detail later), such as ? or *. For example, suppose that the file temp1 has the following contents:
this is line1 of temp1
this is line2 of temp1
this is line3 of temp1
Let’s also suppose that the file temp2 has these contents:
this is line1 of temp2
this is line2 of temp2
Now type the following command that contains the ? metacharacter:
cat temp?
The output from the preceding command is shown here:
this is line1 of temp1
this is line2 of temp1
this is line3 of temp1
this is line1 of temp2
this is line2 of temp2
If you type the command cat temp* then the output will be the contents of all the files whose name starts with temp in the current directory. If one of those files (let’s call it temp3) contains binary data, then you will probably see some strange-looking output on your screen!
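The difference between the two metacharacters shows up as soon as a filename has more than one character after temp. In the sketch below, temp1 and temp2 mirror the files above, while temp10 is an extra file invented for the comparison.

```shell
# Recreate temp1 and temp2, plus a hypothetical temp10.
workdir=$(mktemp -d)
printf 'this is line1 of temp1\nthis is line2 of temp1\nthis is line3 of temp1\n' > "$workdir/temp1"
printf 'this is line1 of temp2\nthis is line2 of temp2\n' > "$workdir/temp2"
printf 'this is line1 of temp10\n' > "$workdir/temp10"

# ? matches exactly one character, so temp10 is skipped.
one_char=$(cat "$workdir"/temp?)

# * matches any number of characters, so all three files are displayed.
any_chars=$(cat "$workdir"/temp*)

one_char_lines=$(printf '%s\n' "$one_char" | wc -l | tr -d '[:space:]')
any_chars_lines=$(printf '%s\n' "$any_chars" | wc -l | tr -d '[:space:]')
echo "temp? matched $one_char_lines lines; temp* matched $any_chars_lines lines"
```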
The head command displays the first ten lines of a text file (by default), an example of which is here:
head sample.txt
The preceding command displays the following text:
the contents of this long file are too long to see in a single screen and each line contains one or more words
The head command also provides an option to specify a different number of lines to display, as shown here:
head -4 sample.txt
The preceding command displays the following text:
the contents of this long file are too long
The tail command displays the last 10 lines (by default) of a text file:
tail sample.txt
The preceding command displays the following text:
is available in every shell including the bash shell csh zsh ksh and Bourne shell
NOTE
The last two lines in the preceding output are blank lines (not a typographical error in this page).
Similarly, the tail command allows you to specify a different number of lines to display: tail -4 sample.txt displays the last 4 lines of sample.txt.
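The numeric shorthand (such as -4) is widely accepted, but the -n option is the form that POSIX specifies for both head and tail. The sketch below assumes a hypothetical 20-line file named numbers.txt that is generated with the seq command, so that the selected lines are easy to recognize.

```shell
workdir=$(mktemp -d)
seq 1 20 > "$workdir/numbers.txt"    # hypothetical file: the lines 1, 2, ..., 20

# First four lines (equivalent to head -4).
first_four=$(head -n 4 "$workdir/numbers.txt")
echo "$first_four"

# Last four lines (equivalent to tail -4).
last_four=$(tail -n 4 "$workdir/numbers.txt")
echo "$last_four"
```

The first command prints the lines 1 through 4, and the second prints the lines 17 through 20.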
Use the more command to display a screenful of data, as shown here:
more sample.txt
Press the <spacebar> to view the next screenful of data, and press the <return> key to see the next line of text in a file. Incidentally, some people prefer the less command, which generates essentially the same output as the more command. (A geeky joke: “What’s less? It’s more.”)
A very useful feature of bash is its support for the pipe symbol (“ | ”) that enables you to “pipe” or redirect the output of one command to become the input of another command. The pipe command is very handy when you want to perform a sequence of operations involving various bash commands.
For example, the following code snippet combines the head command with the cat command and the pipe (“| ”) symbol:
cat sample.txt | head -2
A technical point: the preceding command creates two processes, one for cat and one for head (more about processes later), whereas the command head -2 sample.txt creates only a single process.
You can use the head and tail commands in more interesting ways. For example, the following command sequence displays lines 11 through 15 of sample.txt:
head -15 sample.txt | tail -5
The preceding command displays the following text:
and if you
use the cat
command the
file contents
scroll
Display the line numbers for the preceding output as follows:
cat -n sample.txt | head -15 | tail -5
The preceding command displays the following text:
11 and if you
12 use the cat
13 command the
14 file contents
15 scroll
You won’t see the “tab” character in the output, but it’s visible if you redirect the output of the previous command sequence to a file and then use the “-t” option with the cat command:
cat -n sample.txt | head -15 | tail -5 > 1
cat -t 1
11^Iand if you
12^Iuse the cat
13^Icommand the
14^Ifile contents
15^Iscroll
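The head-then-tail idiom generalizes to any range of lines: keep everything up to the last line that you want, then keep only as many trailing lines as the range contains. Here is a minimal sketch, where show_lines is a hypothetical helper name (not a standard command) and numbers.txt is a hypothetical 20-line file:

```shell
# show_lines FILE FIRST LAST: print lines FIRST through LAST of FILE.
# (show_lines is a hypothetical helper name, not a standard command.)
show_lines() {
    local file=$1 first=$2 last=$3
    local count=$((last - first + 1))    # number of lines in the range
    head -n "$last" "$file" | tail -n "$count"
}

workdir=$(mktemp -d)
seq 1 20 > "$workdir/numbers.txt"        # hypothetical file: the lines 1, 2, ..., 20

middle=$(show_lines "$workdir/numbers.txt" 11 15)
echo "$middle"

# The same range with line numbers; cat -n puts a tab after each number.
numbered=$(cat -n "$workdir/numbers.txt" | head -n 15 | tail -n 5)
echo "$numbered"
```

Keep in mind that this sketch assumes the file contains at least LAST lines; with a shorter file, tail simply returns the final lines that head produced, which is not the requested range.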
The fold command enables you to “fold” the lines in a text file, which is useful for text files that contain long lines of text that you want to split into shorter lines. For example, here are the contents of longfile2.txt:
the contents of this long file are too long to see in a single screen and each line contains one or more words and if you use the cat command the file contents scroll off the screen so you can use other commands such as the head or tail or more commands in conjunction with the pipe command that is very useful in Unix and is available in every shell including the bash shell csh zsh ksh and Bourne shell
You can “fold” the contents of longfile2.txt into lines whose length is 45 (just as an example) with this command:
cat longfile2.txt | fold -45
The output of the preceding command is here:
the contents of this long file are too long t
o see in a single screen and each line contai
ns one or more words and if you use the cat c
ommand the file contents scroll off the scree
n so you can use other commands such as the h
ead or tail or more commands in conjunction w
ith the pipe command that is very useful in U
nix and is available in every shell including
 the bash shell csh zsh ksh and Bourne shell
Notice that some words in the preceding output are split based on the line width, and not “newspaper style.”
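If you prefer lines that break between words, the fold command also provides the -s option, which breaks each line after the last blank that fits within the width. The following sketch compares the two behaviors on a shortened version of the sample text; the -w 45 option is the explicit spelling of the width.

```shell
text='the contents of this long file are too long to see in a single screen and each line contains one or more words'

# Hard wrap: break at exactly 45 characters, even in the middle of a word.
hard_wrapped=$(printf '%s\n' "$text" | fold -w 45)
echo "$hard_wrapped"

# Soft wrap: -s breaks after the last blank that fits, so words stay intact.
soft_wrapped=$(printf '%s\n' "$text" | fold -s -w 45)
echo "$soft_wrapped"
```

With -s, the folded lines can be shorter than 45 characters, and each one (except the last) ends with the blank at which it was broken.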
In Chapter 4, you will learn how to display the lines in a text file that match a string or a pattern, and in Chapter 5 you will learn how to replace a string with another string in a text file.
Use the chmod command to change permissions for files. For example, if you need to set the owner, group, and other permissions equal to rwx rw- r-- for a file, use the following command (the three clauses are separated by commas, with no spaces):
chmod u=rwx,g=rw,o=r filename
In the preceding command the options u, g, and o represent user permissions, group permissions, and others permissions, respectively.
Modify permissions on a file by specifying + to add permission to a user, group or others and specify - to remove permissions. For example, given a file with the permissions rwx rw- r--, add the executable permission to “others” as follows:
chmod o+x filename
Add the executable permission to all permission categories, that is, for the user, group, and others as follows:
chmod a+x filename
As you can surmise, the letter a in the preceding code snippet means “all categories” (user, group, and others). Conversely, specify a - in order to remove a permission from all categories, as shown here:
chmod a-x filename
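One way to check the effect of each chmod command is to look at the first ten characters of the ls -l output after every change. The sketch below uses a hypothetical file named demo.sh in a scratch directory; note that the clauses in the symbolic mode are separated by commas.

```shell
workdir=$(mktemp -d)
script="$workdir/demo.sh"            # hypothetical file name
printf '#!/bin/sh\necho hello\n' > "$script"

# Set all three categories explicitly.
chmod u=rwx,g=rw,o=r "$script"
perms_initial=$(ls -l "$script" | cut -c1-10)
echo "$perms_initial"                # -rwxrw-r--

# Add execute permission for user, group, and others.
chmod a+x "$script"
perms_added=$(ls -l "$script" | cut -c1-10)
echo "$perms_added"                  # -rwxrwxr-x

# Remove execute permission from user, group, and others.
chmod a-x "$script"
perms_removed=$(ls -l "$script" | cut -c1-10)
echo "$perms_removed"                # -rw-rw-r--
```

The leading dash in each result is the file type (a regular file), followed by three characters each for the user, the group, and others.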
A so-called “hidden” file is a file whose name starts with a period character (.). Programs (including the shell) use most of these files to store configuration information. Some common examples of hidden files include the following:
.profile: the Bourne shell (sh) initialization script
.bash_profile: the bash shell (bash) initialization script
.kshrc: the Korn shell (ksh) initialization script
.cshrc: the C shell (csh) initialization script
.rhosts: the remote shell configuration file
You can display a list of hidden files in a directory via the ls command and the -a option, as shown here:
ls -a
.       .profile   docs    lib      test_results
..      .rhosts    hosts   pub      users
.emacs  bin        hw1     res.01   work
.exrc   ch07       hw2     res.02
.kshrc  ch07.bak   hw3     res.03
Keep in mind that a single dot (“.”) represents the current directory and a double dot (“..”) represents the parent directory of the current directory.
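A short experiment shows what the -a option adds. The following sketch uses a scratch directory with two hypothetical files, and it also shows the related -A option, which lists hidden files but omits the “.” and “..” entries.

```shell
workdir=$(mktemp -d)
touch "$workdir/.hidden_config" "$workdir/visible.txt"    # hypothetical names

plain=$(ls "$workdir")           # hidden files are skipped
all=$(ls -a "$workdir")          # includes ".", "..", and hidden files
almost_all=$(ls -A "$workdir")   # hidden files, but not "." and ".."

echo "$plain"
echo "$all"
echo "$almost_all"
```

The plain listing shows only visible.txt, the -a listing shows four entries, and the -A listing shows the two files.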
A “problematic” filename is a filename that contains one or more whitespaces, hidden (non-printing) characters, or starts with a dash (“-”) character.
You can use double quotes to list filenames that contain whitespaces, or you can precede each whitespace with a backslash (“\”) character. For example, if you have a file named One Space.txt, you can use the ls command in either of the following ways:
ls -l "One Space.txt"
ls -l One\ Space.txt
Filenames that start with a dash (“-”) character are difficult to handle because the dash character is the prefix that specifies options for bash commands. Consequently, if you have a file whose name is -abc, then the command ls -abc will not work correctly, because ls interprets the characters after the dash as options instead of as a filename.
In most cases, the best solution to this type of filename is to rename the file. This can be done in your operating system if your client isn’t a bash shell, or you can use the following special syntax for the mv (“move”) command to rename the file. The two dashes (--) mark the end of the options, so mv treats everything that follows as a filename, even a name that starts with a dash. An example is here:
mv -- -abc.txt renamed-abc.txt
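The same techniques work with other commands. The sketch below creates both kinds of problematic filenames in a scratch directory (so the names are only illustrations) and shows two ways to refer to a file whose name starts with a dash: the -- marker, and a ./ prefix that keeps the name from starting with a dash at all.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A filename that contains a space: quote it or escape the space.
touch "One Space.txt"
quoted=$(ls "One Space.txt")
escaped=$(ls One\ Space.txt)

# A filename that starts with a dash: "--" ends option processing.
touch -- -abc.txt
with_marker=$(ls -- -abc.txt)

# Alternatively, prefix the name with ./ so it no longer starts with a dash.
with_prefix=$(ls ./-abc.txt)

# Rename the file so that later commands need no special handling.
mv -- -abc.txt renamed-abc.txt
ls
```

After the mv command, the directory contains One Space.txt and renamed-abc.txt.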
There are many built-in environment variables available, and the following subsections discuss the env command that displays the variables that have values in the environment, along with some common variables that are available in the environment of a command shell.
The env (“environment”) command displays the variables that are in your bash environment. An example of the output of the env command is here:
SHELL=/bin/bash
TERM=xterm-256color
TMPDIR=/var/folders/73/39lngcln4dj_scmgvsv53g_w0000gn/T/
OLDPWD=/tmp
TERM_SESSION_ID=63101060-9DF0-405E-84E1-EC56282F4803
USER=ocampesato
COMMAND_MODE=bash2003
PATH=/opt/local/bin:/Users/ocampesato/android-sdk-mac_86/platform-tools:/Users/ocampesato/android-sdk-mac_86/tools:/usr/local/bin:
PWD=/Users/ocampesato
JAVA_HOME=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
LANG=en_US.UTF-8
NODE_PATH=/usr/local/lib/node_modules
HOME=/Users/ocampesato
LOGNAME=ocampesato
DISPLAY=/tmp/launch-xnTgkE/org.macosforge.xquartz:0
SECURITYSESSIONID=186a4
_=/usr/bin/env
The common environment variables that are pre-defined for you include HOME, LOGNAME, PWD, SHELL, TERM, and TMPDIR. Use the echo command to see the value of a single environment variable. For example, if you want to see the value of the SHELL environment variable, type the following command (notice the “$” character):
echo $SHELL
Based on the output of the env command that you saw earlier in this section, the output of the preceding command is here:
/bin/bash
One other point: if you do not specify the $ character, you will not see the value of the environment variable. For example, if you type:
echo SHELL
Then you will see the following output:
SHELL
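The following sketch places the three forms side by side. It uses a hypothetical variable named GREETING instead of SHELL, so that the result does not depend on your system.

```shell
# GREETING is a hypothetical variable, created just for this illustration.
export GREETING=hello

with_dollar=$(echo "$GREETING")        # the value of the variable
without_dollar=$(echo GREETING)        # just the literal word
from_env=$(env | grep '^GREETING=')    # env prints NAME=value pairs

echo "$with_dollar"       # hello
echo "$without_dollar"    # GREETING
echo "$from_env"          # GREETING=hello
```

Notice that env displays the name together with the value, whereas echo with the “$” character displays only the value.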
Later you will learn how to change the value of a variable, and if you are feeling impatient, you can see some interesting examples of setting an environment variable:
https://stackoverflow.com/questions/13998075/setting-environment-variable-for-one-program-call-in-bash-using-env
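As a brief preview, here is one common way to set an environment variable for a single command only. This is a minimal sketch: DEMO_MODE is a hypothetical variable name, and sh -c simply starts a child shell that reports what it sees.

```shell
# Make sure the hypothetical variable is not already set in this shell.
unset DEMO_MODE

# Prefix form: the assignment applies only to this one command.
child_saw=$(DEMO_MODE=on sh -c 'echo "$DEMO_MODE"')

# The env command achieves the same thing.
child_saw_env=$(env DEMO_MODE=on sh -c 'echo "$DEMO_MODE"')

# The current shell is unaffected in both cases.
parent_sees=${DEMO_MODE:-unset}

echo "$child_saw"        # on
echo "$child_saw_env"    # on
echo "$parent_sees"      # unset
```

In both forms the child process sees the value on, whereas the variable remains unset in the shell that launched the command.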
This section discusses some important environment variables, most of which you probably will not need to modify, but it’s useful to be aware of the existence of these variables and their purpose.
The HOME variable contains the absolute path of the user’s home directory
The HOSTNAME variable specifies the Internet name of the host
The LOGNAME variable specifies the user’s login name
The PATH variable specifies the search path (see next subsection)