OS Lab 5 - Bash Scripting
Objectives
Upon completion of this lab, you will be able to:
- Explain how the kernel executes scripts via the shebang mechanism and how bash interprets script contents.
- Distinguish between shell builtins and external programs, understanding why certain commands must be builtins.
- Write scripts using proper variable quoting, recognizing that commands are fundamentally lists of strings.
- Use exit status to control script flow with conditionals and logical operators.
- Redirect file descriptors to control input and output streams.
- Implement loops with while and for, incorporating command substitution.
- Define functions with proper variable scoping and work with arrays.
- Apply defensive scripting practices using set -euo pipefail.
Introduction
In previous labs, we explored processes, job control, and the permission model. These concepts form the foundation for understanding how programs execute and interact with the system. We now turn to bash scripting, which allows us to automate tasks and build tools without writing compiled programs.
This lab emphasizes underlying mechanisms rather than syntax alone. We will examine how the kernel interprets shebangs, why certain commands must be shell builtins, and how bash's command model treats everything as lists of strings. Understanding these foundations prepares you for the next lab, where we'll explore signals and inter-process communication using bash scripts rather than C code.
Prerequisites
System Requirements
A running instance of the course-provided Linux virtual machine with SSH or direct terminal access.
Required Packages
All necessary utilities (bash, seq, grep, wc) are pre-installed. No additional packages are required.
Knowledge Prerequisites
You should be familiar with:
- Basic command-line navigation and file operations
- Process concepts from Lab 3 (PIDs, exit status)
- File permissions and the execute bit from Lab 4
Bash Scripting
Shebangs and Script Execution
The Kernel's Role
When you execute a file like ./script.sh, the kernel first checks the execute permission bit. If set, the kernel examines the file's first bytes. If they are #! (the "shebang"), the kernel reads the rest of that line as a path to an interpreter and re-executes the file using that interpreter.
For example, a script beginning with #!/bin/bash causes the kernel to execute /bin/bash ./script.sh. This mechanism is general-purpose: the kernel doesn't care whether the interpreter is bash, Python, or any other program.
Requirements for script execution:
- Execute bit must be set (chmod +x script.sh)
- File must begin with a valid shebang pointing to an interpreter
- The interpreter itself must be an executable binary
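Without the execute bit, the kernel's permission check fails. Without a shebang, the kernel cannot determine which interpreter to use: the exec call fails with an "exec format error", and what happens next depends on the calling program. Most shells fall back to interpreting the file themselves, so behavior becomes system-dependent.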
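To see that the kernel treats the shebang generically, the following sketch uses cat as the "interpreter". It assumes /bin/cat exists and that the temporary directory allows execution; the file name comes from mktemp and is purely illustrative.

```shell
# Create a throwaway script whose "interpreter" is cat (path assumed to exist).
demo=$(mktemp)
printf '#!/bin/cat\nThis text is printed by cat, not executed by a shell.\n' > "$demo"
chmod +x "$demo"

# The kernel effectively runs: /bin/cat <path-to-demo>
# so cat prints the whole file, shebang line included.
shebang_output=$("$demo")
echo "$shebang_output"

rm -f "$demo"
```

Because cat simply prints its argument, the output is the script's own text, which suggests the kernel did nothing bash-specific when launching it.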
The Shell's Role
Once the kernel launches bash with your script as an argument, bash reads the script line by line, parsing each line according to its syntax rules. This involves expanding variables, performing word splitting, executing commands, and capturing results.
The fundamental principle: every command is a list of strings. When bash executes echo Hello World, it constructs a list of three strings: "echo", "Hello", and "World". The first string identifies the command; remaining strings are arguments. This model applies universally to builtins, external programs, and functions.
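One way to make the list-of-strings model visible is a small helper that prints each argument it receives in brackets. The name show_args is hypothetical, introduced only for this illustration.

```shell
# Hypothetical helper: print the argument count, then each argument in brackets.
show_args() {
    printf 'argc=%d\n' "$#"
    printf '[%s]\n' "$@"
}

show_args Hello World        # two arguments
show_args "Hello World"      # one argument that contains a space
show_args Hello     World    # extra spaces between words are not preserved
```

The first and third calls should report the same two arguments, since whitespace between words only separates list entries and is not itself passed along.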
Builtins vs External Programs
External Programs and PATH
When bash encounters a command that isn't a builtin, it searches the directories in the PATH environment variable for an executable with that name. When found, bash forks a child process and executes the program. Commands like ls, grep, and wc run in separate processes with their own address space.
Use which to locate external programs:
which ls
# Output: /usr/bin/ls
Shell Builtins
Builtins are commands built directly into bash. Use type to identify them:
type cd
# Output: cd is a shell builtin
Builtins exist for two reasons:
1. Process State Modification
Some commands must modify the shell's own process state. Consider cd: if it were an external program, it would run in a child process, change that child's directory, then exit. The parent shell's directory would remain unchanged because child process state doesn't propagate to parents. For cd to work, it must execute within the shell's own process, making it a necessary builtin.
Other state-modifying builtins include ulimit (resource limits) and umask (permission mask). In the next lab, we'll see export (environment variables) and trap (signal handlers).
2. Performance
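Commands like true, false, and echo are builtins for performance, avoiding process creation overhead. They could be external programs (and external versions typically exist on disk as well), but the builtin versions are faster. The [[ test construct is a related case: type reports it as a shell keyword rather than a builtin, because bash parses its contents with special rules, but it likewise runs inside the shell's own process.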
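The two ideas above can be sketched briefly. The first loop classifies a few commands with type -t; the second part uses a subshell, which is a child process, as a stand-in for a hypothetical external cd. The variable name start_dir is an assumption for illustration.

```shell
# type -t prints a one-word classification: builtin, keyword, file, function, or alias.
for name in cd echo true '[[' ls grep; do
    printf '%-5s %s\n' "$name" "$(type -t "$name")"
done

# A subshell ( ... ) runs in a child process, so its cd cannot affect the parent.
start_dir=$PWD
( cd /tmp && echo "inside child: $PWD" )
echo "parent still in: $PWD"
```

The parent's working directory is unchanged after the subshell exits, which is the behavior an external cd program would be stuck with.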
Exercise A: Your First Script and the Shebang
This exercise demonstrates the kernel's shebang mechanism and the importance of the execute bit.
Steps:
- Create a script /tmp/lab5_hello.sh containing:
    #!/bin/bash
    echo "Hello from bash!"
- Check the file's permissions with ls -l. Note that the execute bit is not set.
- Attempt to execute the script directly: /tmp/lab5_hello.sh. This should fail with "Permission denied".
- Make the script executable with chmod +x and verify with ls -l.
- Execute the script directly. It should now succeed.
- Execute the script by passing it to bash: bash /tmp/lab5_hello.sh. Observe that this works even without the execute bit.
- Create a second script without a shebang, make it executable, and attempt to run it. Observe the unpredictable behavior.
- Clean up both test scripts.
Deliverable A: Provide ls -l output before and after setting the execute bit, the "Permission denied" error, and the successful execution output.
Exercise B: Builtins vs External Commands
This exercise demonstrates why certain commands must be builtins.
Steps:
- Use type to identify whether cd, echo, ls, [[, and grep are builtins, shell keywords, or external commands.
- Use which to locate external programs. Try which cd and note that it doesn't find the builtin.
- Create /tmp/lab5_cd_test.sh that prints the current directory, changes to /tmp, and prints the directory again.
- Make the script executable and run it. Observe that the script successfully changes to /tmp.
- Check your shell's current directory with pwd. Note that your shell is still in the original directory.
- Test cd directly in your shell to confirm it works when run as a builtin.
- Clean up the test script.
Deliverable B: Provide the type command outputs and the script output, followed by your shell's pwd output.
Variables and Quoting
The List-of-Strings Model
Variables in bash are expanded before commands execute. How they're expanded determines the final command's argument list.
Consider:
greeting="Hello World"
echo $greeting
Bash expands $greeting to "Hello World", then performs word splitting, breaking it into separate strings at whitespace. The resulting command is a list of three strings: "echo", "Hello", "World".
This works fine for echo, but consider:
filename="my document.txt"
ls $filename
Word splitting breaks "my document.txt" into "my" and "document.txt". The ls command receives two arguments and tries to list two separate files, which fails.
Quoting Fixes This
Double quotes prevent word splitting:
ls "$filename"
Now ls receives one argument: "my document.txt". The expansion happens, but the result stays as a single string.
Single quotes prevent all expansion:
echo '$name' # Outputs: $name (literal)
echo "$name" # Outputs: the value of $name
Best practice: Always quote variables unless you specifically need word splitting. Write "$variable" not $variable. This makes scripts robust when variables contain spaces or special characters.
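The effect of each quoting style can be measured by counting arguments. This is a minimal sketch assuming the default IFS; count_args is a hypothetical helper, not a standard command.

```shell
# Hypothetical helper: report how many arguments were received.
count_args() { echo "$#"; }

filename="my document.txt"

unquoted=$(count_args $filename)      # word splitting applies to the expansion
quoted=$(count_args "$filename")      # the expansion stays one string
literal=$(count_args '$filename')     # no expansion: the literal text $filename

echo "unquoted: $unquoted, quoted: $quoted, literal: $literal"
```

Only the unquoted form produces two arguments, which is why ls $filename looks for two files.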
Exercise C: Quoting and Word Splitting
This exercise demonstrates why quoting variables is essential.
Steps:
- Create a file named /tmp/my test file.txt (with spaces).
- Store the filename in a variable and try to list it without quotes: ls $myfile. This will fail.
- Try again with proper quoting: ls "$myfile". This succeeds.
- Test echo with a variable containing multiple spaces, both with and without quotes.
- Create /tmp/lab5_quote_test.sh that demonstrates variable quoting and array expansion both correctly and incorrectly. The script should show:
  - A variable with spaces, echoed with and without quotes
  - An array containing an element with spaces, iterated both ways
- Make the script executable and run it. Observe how the "wrong way" processes four items instead of three.
- Clean up the test files.
Deliverable C: Provide the error from unquoted ls, the success from quoted ls, and the complete script output.
Exit Status and Conditionals
Every command returns an exit status: a number from 0 to 255. By convention, 0 means success; non-zero means failure. Bash stores the last command's exit status in $?:
ls /tmp
echo $? # Outputs: 0 (success)
ls /nonexistent
echo $? # Outputs: non-zero (failure)
This exit status is bash's fundamental true/false mechanism. An if statement executes a command and checks if it succeeded:
if grep -q "ERROR" /var/log/syslog; then
echo "Errors found"
fi
If grep finds "ERROR", it returns 0, and the then block executes.
The [[ Command
The [[ construct (a shell keyword, as type [[ reports) performs tests and returns an exit status. Despite its unusual syntax, it behaves conceptually like any other command:
[[ -f "file.txt" ]] # Returns 0 if file.txt exists and is a regular file, 1 otherwise
Common operators:
- -f file: True if regular file exists
- -d dir: True if directory exists
- -e path: True if path exists
- string1 == string2: True if strings equal
- num1 -lt num2: True if num1 less than num2
Example:
if [[ -f "config.txt" ]]; then
echo "Config file found"
fi
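A common slip is mixing up numeric and string operators. The sketch below contrasts -lt with <, which inside [[ compares strings in lexicographic order; the variable names are illustrative only.

```shell
# -lt compares integers; < inside [[ compares strings lexicographically.
if [[ 10 -lt 9 ]]; then numeric="true"; else numeric="false"; fi
if [[ 10 < 9 ]]; then lexical="true"; else lexical="false"; fi
echo "10 -lt 9 is $numeric; 10 < 9 is $lexical"

# == with a quoted right-hand side is a literal string comparison.
name="report.txt"
if [[ "$name" == "report.txt" ]]; then
    echo "names match"
fi
```

The string comparison reports 10 as "less than" 9 because the character 1 sorts before 9, so numeric tests should use -lt, -le, -gt, -ge, -eq, or -ne.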
Logical Operators
&& (AND): Runs next command only if previous succeeded (returned 0):
mkdir /tmp/newdir && cd /tmp/newdir
|| (OR): Runs next command only if previous failed (returned non-zero):
cd /tmp/important || echo "Failed to change directory"
Explicit error ignoring:
rm -f /tmp/maybe-exists.txt || true
The || true pattern documents that you're intentionally ignoring potential errors.
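These pieces combine naturally. The following sketch runs them against a temporary file so it does not depend on /var/log/syslog being readable; the file contents and variable names are assumptions for illustration.

```shell
log=$(mktemp)
echo "2024-01-15 10:15:00 ERROR Database connection failed" > "$log"

# if runs the command and branches on its exit status.
if grep -q "ERROR" "$log"; then
    echo "Errors found"
fi

# Record a failing status without stopping: grep exits with 1 when nothing matches.
grep -q "FATAL" "$log" && fatal_status=0 || fatal_status=$?
echo "grep for FATAL exited with $fatal_status"

# && runs the right side on success; || runs it on failure.
[[ -f "$log" ]] && echo "log is a regular file"
[[ -d "$log" ]] || echo "log is not a directory"

rm -f "$log"
```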
Exercise D: Exit Status and Conditionals
This exercise explores exit status and conditional logic.
Steps:
- Execute ls /tmp and check $?. It should be 0 (success).
- Execute ls /nonexistent (redirect stderr to /dev/null) and check $?. It should be non-zero.
- Test -f /etc/passwd and -f /nonexistent with [[, checking $? after each.
- Create /tmp/lab5_conditionals.sh that demonstrates:
  - File existence tests with if
  - The && operator for success chaining
  - The || operator for failure handling
  - The || true pattern for explicit error ignoring
  - Using command exit status directly in if (e.g., with grep)
  - Numeric and string comparisons with [[
- Make the script executable and run it. Observe the output from each test.
- Clean up the script.
Deliverable D: Provide the exit status values from steps 1-3 and the complete script output.
Output Redirection and File Descriptors
Recall from Lab 3 that every process has file descriptors (FDs):
- FD 0: Standard input (stdin)
- FD 1: Standard output (stdout)
- FD 2: Standard error (stderr)
Redirection allows you to control where these streams go.
Basic Redirection
Redirect stdout to a file:
echo "output" > file.txt
Redirect stderr:
ls /nonexistent 2> error.txt
Redirect both:
command &> combined.txt
Or more explicitly:
command > output.txt 2>&1
Order matters! 2>&1 must come after > output.txt to redirect stderr to where stdout is pointing.
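The ordering rule is easier to believe after watching it fail. This sketch assumes a hypothetical helper, emit_both, that writes one line to each stream, and uses command substitution to catch whatever escapes the file.

```shell
# Hypothetical helper that writes one line to each stream.
emit_both() {
    echo "to stdout"
    echo "to stderr" >&2
}

capture=$(mktemp)

# Correct order: stdout goes to the file first, then stderr follows it there.
emit_both > "$capture" 2>&1
lines_both=$(wc -l < "$capture")

# Reversed order: stderr is pointed at the *current* stdout (here, the command
# substitution's pipe), and only afterwards is stdout sent to the file.
escaped=$(emit_both 2>&1 > "$capture")
lines_reversed=$(wc -l < "$capture")

echo "correct order captured $lines_both lines"
echo "reversed order captured $lines_reversed line; escaped text: $escaped"

rm -f "$capture"
```

Redirections are processed left to right, and 2>&1 copies wherever FD 1 points at that moment rather than linking the two descriptors permanently.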
Pipes
The pipe operator | connects one command's stdout to another's stdin:
cat /var/log/syslog | grep "ERROR" | wc -l
Data flows left to right. The kernel creates an anonymous pipe, connecting the write end to the first command's stdout and the read end to the second command's stdin.
Input Redirection
Redirect stdin from a file:
wc -l < file.txt
This differs from wc -l file.txt: with <, bash opens the file and connects it to stdin before running wc, so wc never sees the filename and prints only the count.
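The difference shows up directly in the output. A minimal sketch, using a temporary file with made-up contents:

```shell
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$sample"

# As an argument: wc opens the file itself and prints the name after the count.
wc -l "$sample"

# Via stdin: bash opens the file, so wc prints only the count.
count_only=$(wc -l < "$sample")
echo "count only: $count_only"

# A pipeline feeds each stage's stdout into the next stage's stdin.
matches=$(cat "$sample" | grep "eta" | wc -l)
echo "lines containing eta: $matches"

rm -f "$sample"
```

The stdin form is convenient in command substitution because there is no filename to strip from the result.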
Loops
The for Loop
Iterates over a list of values:
for item in value1 value2 value3; do
echo "$item"
done
Common sources for the list:
Filename expansion (globbing):
for file in *.txt; do
echo "Processing $file"
done
Command substitution:
for i in $(seq 1 10); do
echo "Number $i"
done
The while Loop
Executes while a command succeeds:
counter=1
while [[ $counter -le 5 ]]; do
echo "Iteration $counter"
counter=$((counter + 1))
done
Reading files line-by-line:
while read -r line; do
echo "Line: $line"
done < file.txt
The read command returns 0 while reading lines successfully, and non-zero at end-of-file.
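As a small variation on the loop above, the sketch below counts lines while reading. It adds IFS= in front of read, a common idiom (not used in the lab's examples) that keeps leading and trailing whitespace intact; the input file is a made-up sample.

```shell
input=$(mktemp)
printf '  indented line\nsecond line\n' > "$input"

line_count=0
# IFS= preserves surrounding whitespace; -r keeps backslashes literal.
while IFS= read -r line; do
    line_count=$((line_count + 1))
    printf '%d: [%s]\n' "$line_count" "$line"
done < "$input"

echo "read $line_count lines"
rm -f "$input"
```

Because the loop's stdin comes from a redirection rather than a pipe, it runs in the current shell and line_count keeps its value after done.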
Exercise E: File Descriptor Redirection
This exercise explores redirection, connecting to Lab 3's file descriptor concepts.
Steps:
- Redirect stdout to a file with >, then display the file contents.
- Redirect stderr to a file with 2> by trying to list a nonexistent directory.
- Redirect both stdout and stderr to the same file using &>.
- Demonstrate the 2>&1 pattern with explicit redirection order.
- Use a pipe to chain cat /etc/passwd, grep "root", and cut to extract the username.
- Create /tmp/lab5_redirect.sh that demonstrates:
  - Output redirection with > and >>
  - Stderr capture with 2>
  - Combined stream redirection with &>
  - A pipeline example
  - Input redirection with <
- Make the script executable and run it.
- Clean up all test files.
Deliverable E: Provide the captured stderr contents and complete script output.
Command Substitution
Command substitution captures a command's stdout for use elsewhere:
current_date=$(date +%Y-%m-%d)
echo "Today is $current_date"
Bash executes the command in $(), captures its output, and substitutes it in place.
Common in loops:
for i in $(seq 1 10); do
echo "$i"
done
The seq output (numbers 1-10, one per line) is captured, word-split, and becomes the loop's value list.
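A few details of command substitution are worth seeing in action. This sketch assumes the default IFS and uses illustrative variable names.

```shell
# Trailing newlines are stripped from the captured output.
greeting=$(printf 'hello\n\n')
echo "[$greeting]"

# Unquoted $(...) in a for list is word-split into separate values.
total=0
for i in $(seq 1 5); do
    total=$((total + i))
done
echo "sum of 1..5 is $total"

# Quoted, the captured output stays a single string, inner newline included.
listing="$(printf 'line one\nline two\n')"
printf '%s\n' "$listing"
```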
Functions
Functions group commands for reuse:
greet() {
echo "Hello, $1"
}
greet "Alice" # Outputs: Hello, Alice
Arguments: Access via $1, $2, etc. $@ expands to all arguments; $# gives the count.
Return values: Functions return exit status (0-255) via return. For other values, write to stdout and capture with command substitution:
get_uppercase() {
echo "$1" | tr '[:lower:]' '[:upper:]'
}
result=$(get_uppercase "hello")
Variable Scope
By default, variables are global. Use local for function-local variables:
calculate() {
local temp=$1
local result=$((temp * 2))
echo "$result"
}
Without local, these variables would affect the global scope.
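The difference between global and local assignment, and the use of return as an exit status, can be sketched as follows. All function and variable names here are hypothetical.

```shell
value="global"

# Without local, the assignment changes the caller's variable.
overwrite() { value="overwritten"; }

# With local, the assignment is confined to the function.
shadow() {
    local value="local copy"
    echo "inside shadow: $value"
}

# return sets the function's exit status, so it can be tested directly by if.
is_even() {
    if (( $1 % 2 == 0 )); then return 0; else return 1; fi
}

shadow
after_shadow=$value
overwrite
after_overwrite=$value
echo "after shadow: $after_shadow; after overwrite: $after_overwrite"

if is_even 4; then echo "4 is even"; fi
```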
Arrays
Arrays are collections of strings:
files=("file1.txt" "file 2.txt" "file3.txt")
Access elements:
echo "${files[0]}" # First element
echo "${files[@]}" # All elements
echo "${#files[@]}" # Number of elements
Critical syntax: Use "${array[@]}" (with quotes) to preserve each element as a separate string:
for file in "${files[@]}"; do
echo "Processing: $file"
done
This correctly processes three files, including "file 2.txt" as one item.
Without quotes:
for file in ${files[@]}; do # WRONG
echo "Processing: $file"
done
Word splitting breaks "file 2.txt" into "file" and "2.txt", processing four items instead of three.
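Counting iterations makes the difference concrete. This sketch reuses the files array from above, assumes the default IFS, and also shows appending with +=.

```shell
files=("file1.txt" "file 2.txt" "file3.txt")

quoted_count=0
for file in "${files[@]}"; do
    quoted_count=$((quoted_count + 1))
done

unquoted_count=0
for file in ${files[@]}; do     # deliberately wrong, for comparison
    unquoted_count=$((unquoted_count + 1))
done

echo "quoted: $quoted_count items, unquoted: $unquoted_count items"

# += appends an element; ${#files[@]} reflects the new length.
files+=("file4.txt")
echo "length after append: ${#files[@]}"
```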
Defensive Scripting: set -euo pipefail
Bash scripts, by default, continue after errors. This can cause silent failures that are hard to debug. These options make scripts safer:
set -e (Exit on Error)
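Exit immediately if a command returns non-zero, unless its status is being tested (for example, in an if or while condition, or on the left side of && or ||):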
set -e
cp important.txt backup.txt
process backup.txt # Won't run if cp failed
To allow specific failures:
rm -f /tmp/file.txt || true
set -u (Error on Undefined Variables)
Treat undefined variables as errors:
set -u
echo "$undefined_var" # Error: undefined_var: unbound variable
This catches typos and logic errors.
set -o pipefail (Pipeline Failure)
Normally, a pipeline's exit status is the last command's status:
grep "ERROR" missing.log | head -n 10
# Returns 0 (head succeeded) even though grep failed
With pipefail, the pipeline fails if any command fails:
set -o pipefail
grep "ERROR" missing.log | head -n 10
# Returns non-zero (grep failed)
Recommended Practice
Start every script with:
#!/bin/bash
set -euo pipefail
These options prevent silent errors and make debugging easier.
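Each option's effect can be observed safely by running the failing case in a child bash, so the failure does not end the demonstration itself. This is a sketch; the status variable names are illustrative, and undefined_var is assumed not to be set in the environment.

```shell
# set -e: the child stops at the first failing command.
e_status=0
bash -c 'set -e; false; echo "not reached"' || e_status=$?
echo "set -e: child exited with $e_status"

# set -u: expanding an unset variable is an error.
u_status=0
bash -c 'set -u; echo "$undefined_var"' 2>/dev/null || u_status=$?
echo "set -u: child exited with $u_status"

# pipefail: a failure early in the pipeline is no longer masked by the last command.
plain_status=0
bash -c 'false | true' || plain_status=$?
strict_status=0
bash -c 'set -o pipefail; false | true' || strict_status=$?
echo "false | true: $plain_status without pipefail, $strict_status with pipefail"
```

The "not reached" message never appears, and only the pipefail run reports the failure of the first pipeline stage.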
Scripting Challenges
Challenge 1: Log File Analyzer
Write /tmp/lab5_log_analyzer.sh that analyzes a log file.
Requirements:
- Accept one argument: log file path
- Print usage message if wrong number of arguments
- Print error if file doesn't exist or isn't readable
- Count lines containing "ERROR", "WARN", and "INFO"
- Print summary report
- Use: shebang, set -euo pipefail, proper quoting, if statements, grep, counting method
Test data:
cat > /tmp/lab5_test.log <<'EOF'
2024-01-15 10:00:00 INFO Application started
2024-01-15 10:05:23 INFO User login successful
2024-01-15 10:12:45 WARN Connection timeout, retrying
2024-01-15 10:15:00 ERROR Database connection failed
2024-01-15 10:15:30 INFO Connection restored
2024-01-15 10:20:00 ERROR Invalid configuration parameter
2024-01-15 10:25:00 WARN Disk space low
2024-01-15 10:30:00 INFO Backup completed successfully
EOF
Expected output:
Log Analysis Report for /tmp/lab5_test.log
==========================================
ERROR: 2
WARN: 2
INFO: 4
Test:
chmod +x /tmp/lab5_log_analyzer.sh
/tmp/lab5_log_analyzer.sh /tmp/lab5_test.log
/tmp/lab5_log_analyzer.sh
/tmp/lab5_log_analyzer.sh /nonexistent.log
Deliverable Challenge 1:
- Complete script with comments
- Output from test log
- Output with no arguments (usage)
- Output with nonexistent file (error)
Challenge 2: Batch File Processor
Write lab5_batch_processor.sh that processes multiple files.
Requirements:
- Accept two arguments: directory path and operation ("count", "uppercase", or "list")
- Print usage if wrong arguments or invalid operation
- Define three functions (one per operation):
  - count: Count lines in each .txt file
  - uppercase: Create .upper.txt files with uppercase content
  - list: List .txt files with sizes
- Store files in array, use loop to process
- Must use: shebang, set -euo pipefail, functions with local, array with "${array[@]}", loop, case/if for operation selection
Test data:
mkdir -p /tmp/lab5_batch_test
echo -e "line 1\nline 2\nline 3" > /tmp/lab5_batch_test/file1.txt
echo -e "hello world\ntest file" > /tmp/lab5_batch_test/file2.txt
echo -e "single line" > /tmp/lab5_batch_test/file3.txt
Expected output for "count":
Processing files in /tmp/lab5_batch_test
Operation: count
========================================
file1.txt: 3 lines
file2.txt: 2 lines
file3.txt: 1 lines
Test:
chmod +x lab5_batch_processor.sh
./lab5_batch_processor.sh /tmp/lab5_batch_test count
./lab5_batch_processor.sh /tmp/lab5_batch_test uppercase
cat /tmp/lab5_batch_test/file1.upper.txt
./lab5_batch_processor.sh /tmp/lab5_batch_test list
Deliverable Challenge 2:
- Complete script
- Output from "count" operation
- Output from "uppercase" operation + contents of one .upper.txt file
- Output from "list" operation
Reference: Common Patterns
Quick reference for common bash patterns:
Script header:
#!/bin/bash
set -euo pipefail
Check arguments:
if [[ $# -lt 1 ]]; then
echo "Usage: $0 <arg>" >&2
exit 1
fi
Test file types:
if [[ -f "$file" ]]; then
echo "Regular file"
elif [[ -d "$file" ]]; then
echo "Directory"
fi
Read file line-by-line:
while read -r line; do
echo "$line"
done < "$filename"
Iterate over files:
for file in *.txt; do
[[ -f "$file" ]] || continue
echo "$file"
done
Command substitution:
date=$(date +%Y-%m-%d)
count=$(wc -l < "$file")
Arrays:
arr=("item1" "item 2" "item3")
for item in "${arr[@]}"; do
echo "$item"
done
Functions:
process() {
local file="$1"
echo "Processing $file"
}
Here's a table of common commands for file usage and text manipulation:
Common Commands Reference
File Operations
| Command | Description | Example |
|---|---|---|
| ls | List directory contents | ls -la /tmp |
| cd | Change directory | cd /home/user |
| pwd | Print working directory | pwd |
| cp | Copy files/directories | cp file.txt backup.txt |
| mv | Move/rename files | mv old.txt new.txt |
| rm | Remove files | rm file.txt |
| mkdir | Create directory | mkdir -p /path/to/dir |
| rmdir | Remove empty directory | rmdir olddir |
| touch | Create empty file or update timestamp | touch newfile.txt |
| ln | Create links | ln -s target link |
| find | Search for files | find /home -name "*.txt" |
| chmod | Change file permissions | chmod +x script.sh |
| chown | Change file owner | chown user:group file.txt |
Text Viewing and Editing
| Command | Description | Example |
|---|---|---|
| cat | Concatenate and display files | cat file.txt |
| less | View file with pagination | less largefile.txt |
| more | View file page by page | more file.txt |
| head | Display first lines of file | head -n 10 file.txt |
| tail | Display last lines of file | tail -n 20 file.txt |
| nano | Simple text editor | nano file.txt |
| vim | Advanced text editor | vim file.txt |
Text Manipulation
| Command | Description | Example |
|---|---|---|
| grep | Search for patterns in text | grep "error" log.txt |
| sed | Stream editor for text transformation | sed 's/old/new/g' file.txt |
| awk | Pattern scanning and processing | awk '{print $1}' file.txt |
| cut | Extract sections from lines | cut -d: -f1 /etc/passwd |
| sort | Sort lines of text | sort file.txt |
| uniq | Remove duplicate lines | uniq sorted.txt |
| tr | Translate or delete characters | tr '[:lower:]' '[:upper:]' |
| wc | Count lines, words, characters | wc -l file.txt |
| diff | Compare files line by line | diff file1.txt file2.txt |
| paste | Merge lines of files | paste file1.txt file2.txt |
| join | Join lines based on common field | join file1.txt file2.txt |
Text Processing Utilities
| Command | Description | Example |
|---|---|---|
| echo | Display text | echo "Hello World" |
| printf | Formatted output | printf "%s\n" "text" |
| xargs | Build and execute commands from input | xargs rm |
| tee | Read from stdin, write to stdout and files | tee file.txt |
| column | Format output into columns | column -t |
| expand | Convert tabs to spaces | expand file.txt |
| unexpand | Convert spaces to tabs | unexpand file.txt |
| fold | Wrap lines to specified width | fold -w 80 file.txt |
File Information and Comparison
| Command | Description | Example |
|---|---|---|
| file | Determine file type | file document.pdf |
| stat | Display file status | stat file.txt |
| du | Disk usage | du -sh /home/user |
| df | Disk free space | df -h |
| md5sum | Calculate MD5 checksum | md5sum file.txt |
| sha256sum | Calculate SHA256 checksum | sha256sum file.txt |
| cmp | Compare two files byte by byte | cmp file1 file2 |
Compression and Archives
| Command | Description | Example |
|---|---|---|
| tar | Archive files | tar -czf archive.tar.gz dir/ |
| gzip | Compress files | gzip file.txt |
| gunzip | Decompress gzip files | gunzip file.txt.gz |
| zip | Create zip archives | zip archive.zip file1 file2 |
| unzip | Extract zip archives | unzip archive.zip |
| bzip2 | Compress with bzip2 | bzip2 file.txt |
| xz | Compress with xz | xz file.txt |
Text Search and Pattern Matching
| Command | Description | Example |
|---|---|---|
| grep | Search using basic regex | grep "pattern" file.txt |
| grep -E | Extended regex (same as egrep) | grep -E "…\|pat2" file.txt |
| grep -F | Fixed strings (same as fgrep) | grep -F "literal" file.txt |
| grep -r | Recursive search | grep -r "pattern" /path/ |
| locate | Find files by name (uses database) | locate filename |
| which | Show full path of commands | which python |
| whereis | Locate binary, source, and man pages | whereis bash |
Common Command Options
Frequently used flags across commands:
- -r or -R: Recursive operation
- -v: Verbose output or invert match (grep)
- -f: Force operation
- -i: Ignore case (grep) or interactive (rm, mv); note that sort uses -f, not -i, to ignore case
- -n: Show line numbers (grep, cat)
- -a: All/append
- -h: Human-readable output
- -l: Long format or list files only
Deliverables and Assessment
Submit a single document (PDF or similar) containing screenshots of:
Exercise Deliverables:
- Exercise A: Permission outputs, error message, successful output, explanation
- Exercise B: type outputs, script + pwd output, explanation
- Exercise C: Error/success outputs, script output, explanation
- Exercise D: Exit status values, script output, explanation
- Exercise E: Stderr contents, script output, FD explanations
Challenge Deliverables:
- Challenge 1: Script code, test outputs (success, no args, nonexistent file)
- Challenge 2: Script code, outputs for all three operations, .upper.txt contents
Additional Resources
This lab provides foundations for the next exercise on signals and inter-process communication. You'll use bash's trap builtin and named pipes without writing C code.
For further study:
- Bash manual: man bash
- ShellCheck: https://www.shellcheck.net/
- Advanced Bash-Scripting Guide: https://tldp.org/LDP/abs/html/