OS Lab 5 - Bash Scripting

Objectives

Upon completion of this lab, you will be able to:

Explain how the kernel executes scripts via the shebang mechanism and how bash interprets script contents.
Distinguish between shell builtins and external programs, understanding why certain commands must be builtins.
Write scripts using proper variable quoting, recognizing that commands are fundamentally lists of strings.
Use exit status to control script flow with conditionals and logical operators.
Redirect file descriptors to control input and output streams.
Implement loops with while and for, incorporating command substitution.
Define functions with proper variable scoping and work with arrays.
Apply defensive scripting practices using set -euo pipefail.

Introduction

In previous labs, we explored processes, job control, and the permission model. These concepts form the foundation for understanding how programs execute and interact with the system. We now turn to bash scripting, which allows us to automate tasks and build tools without writing compiled programs.

This lab emphasizes underlying mechanisms rather than syntax alone. We will examine how the kernel interprets shebangs, why certain commands must be shell builtins, and how bash's command model treats everything as lists of strings. Understanding these foundations prepares you for the next lab, where we'll explore signals and inter-process communication using bash scripts rather than C code.

Prerequisites

System Requirements

A running instance of the course-provided Linux virtual machine with SSH or direct terminal access.

Required Packages

All necessary utilities (bash, seq, grep, wc) are pre-installed. No additional packages are required.

Knowledge Prerequisites

You should be familiar with:

Basic command-line navigation and file operations
Process concepts from Lab 3 (PIDs, exit status)
File permissions and the execute bit from Lab 4

Bash Scripting

Shebangs and Script Execution

The Kernel's Role

When you execute a file like ./script.sh, the kernel first checks the execute permission bit. If it is set, the kernel examines the file's first two bytes. If they are #! (the "shebang"), the kernel reads the rest of that line as the path to an interpreter (plus an optional argument) and executes that interpreter, passing the script's path as an argument.

For example, a script beginning with #!/bin/bash causes the kernel to execute /bin/bash ./script.sh. This mechanism is general-purpose: the kernel doesn't care whether the interpreter is bash, Python, or any other program.
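
To see that the mechanism is not specific to bash, here is a minimal sketch. It assumes /bin/cat exists and that the temporary directory permits execution; the file name self_print is purely illustrative. Because the interpreter named in the shebang is cat, running the file should make the kernel execute /bin/cat with the script's path as its argument, so the "script" simply prints its own contents.

```bash
# Hypothetical demo: a "script" whose interpreter is cat rather than a shell.
dir=$(mktemp -d)
script="$dir/self_print"

printf '%s\n' '#!/bin/cat' 'This line is never executed, only printed.' > "$script"
chmod +x "$script"

# The kernel reads "#!/bin/cat" and effectively runs: /bin/cat "$script"
output=$("$script")
echo "$output"

rm -r "$dir"
```

Nothing in the file is interpreted as shell syntax here; the interpreter alone decides what the contents mean.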

Requirements for script execution:

Execute bit must be set (chmod +x script.sh)
File must begin with a valid shebang pointing to an interpreter
The interpreter itself must be an executable binary

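Without the execute bit, the kernel's permission check fails. Without a shebang, the kernel does not recognize the file as something it can execute, and the exec call fails. What happens next depends on the program that tried to launch the script: many shells, bash included, fall back to interpreting the file as a shell script themselves, so behavior becomes dependent on how and from where the script is started.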

The Shell's Role

Once the kernel launches bash with your script as an argument, bash reads the script line by line, parsing each line according to its syntax rules. This involves expanding variables, performing word splitting, executing commands, and capturing results.

The fundamental principle: every command is a list of strings. When bash executes echo Hello World, it constructs a list of three strings: "echo", "Hello", and "World". The first string identifies the command; remaining strings are arguments. This model applies universally to builtins, external programs, and functions.
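
The list-of-strings model can be made visible with a small helper. The sketch below defines show_args, a hypothetical function (not part of bash) that reports how many arguments it received and prints each one in angle brackets. The command name itself is the first string in the list; the count shown covers only the arguments that follow it.

```bash
# show_args is a hypothetical helper: it prints its argument count,
# then each argument on its own line inside angle brackets.
show_args() {
    echo "count: $#"
    local arg
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

show_args Hello World      # two arguments: <Hello> and <World>
show_args "Hello World"    # one argument:  <Hello World>
```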

Builtins vs External Programs

External Programs and PATH

When bash encounters a command that isn't a builtin, it searches the directories in the PATH environment variable for an executable with that name. When found, bash forks a child process and executes the program. Commands like ls, grep, and wc run in separate processes with their own address space.

Use which to locate external programs:

which ls
# Output: /usr/bin/ls

Shell Builtins

Builtins are commands built directly into bash. Use type to identify them:

type cd
# Output: cd is a shell builtin

Builtins exist for two reasons:

1. Process State Modification

Some commands must modify the shell's own process state. Consider cd: if it were an external program, it would run in a child process, change that child's directory, then exit. The parent shell's directory would remain unchanged because child process state doesn't propagate to parents. For cd to work, it must execute within the shell's own process, making it a necessary builtin.

Other state-modifying builtins include ulimit (resource limits) and umask (permission mask). In the next lab, we'll see export (environment variables) and trap (signal handlers).
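
The parent/child isolation described above can be observed without writing a separate script. This is a minimal sketch that relies on the fact that command substitution runs its contents in a subshell, which is a child process; the variable names are illustrative.

```bash
before=$PWD

# $( ... ) runs in a subshell (a child process), so this cd cannot
# affect the directory of the shell that is running this script.
in_child=$(cd / && pwd)

after=$PWD

echo "child reported: $in_child"
echo "parent before:  $before"
echo "parent after:   $after"
```

The child really does change directory, but the parent's working directory is the same before and after.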

2. Performance

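Commands like true, false, echo, and [ are builtins for performance, avoiding process creation overhead. They also exist as external programs on most systems, but the builtin versions are faster. Note that [[ is slightly different: type reports it as a shell keyword rather than a builtin, because bash parses its contents specially (for example, no word splitting happens inside it), which an external program could not do.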
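
As a quick way to classify names, bash's type accepts a -t option that prints a single word such as builtin, keyword, or file. The sketch below is only an illustration; the result for ls depends on your system and is typically file when no alias or function shadows it.

```bash
# type -t prints one word describing how bash would resolve each name.
for name in cd echo '[[' ls; do
    printf '%-4s -> %s\n' "$name" "$(type -t "$name")"
done
```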

Exercise A: Your First Script and the Shebang

This exercise demonstrates the kernel's shebang mechanism and the importance of the execute bit.

Steps:

Create a script /tmp/lab5_hello.sh containing:
```
#!/bin/bash
echo "Hello from bash!"
```
Check the file's permissions with ls -l. Note that the execute bit is not set.
Attempt to execute the script directly: /tmp/lab5_hello.sh. This should fail with "Permission denied".
Make the script executable with chmod +x and verify with ls -l.
Execute the script directly. It should now succeed.
Execute the script by passing it to bash: bash /tmp/lab5_hello.sh. Observe that this works even without the execute bit.
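Create a second script without a shebang, make it executable, and attempt to run it. Observe what happens and note that the result depends on the program launching it (bash typically falls back to running the file as a shell script).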
Clean up both test scripts.

Deliverable A: Provide ls -l output before and after setting the execute bit, the "Permission denied" error, and the successful execution output.

Exercise B: Builtins vs External Commands

This exercise demonstrates why certain commands must be builtins.

Steps:

Use type to identify whether cd, echo, ls, [[, and grep are builtins or external commands.
Use which to locate external programs. Try which cd and note that it doesn't find the builtin.
Create /tmp/lab5_cd_test.sh that prints the current directory, changes to /tmp, and prints the directory again.
Make the script executable and run it. Observe that the script successfully changes to /tmp.
Check your shell's current directory with pwd. Note that your shell is still in the original directory.
Test cd directly in your shell to confirm it works when run as a builtin.
Clean up the test script.

Deliverable B: Provide the type command outputs, and the script output followed by your shell's pwd output.

Variables and Quoting

The List-of-Strings Model

Variables in bash are expanded before commands execute. How they're expanded determines the final command's argument list.

Consider:

greeting="Hello World"
echo $greeting

Bash expands $greeting to "Hello World", then performs word splitting, breaking it into separate strings at whitespace. The result: three strings passed to echo: "echo", "Hello", "World".

This works fine for echo, but consider:

filename="my document.txt"
ls $filename

Word splitting breaks "my document.txt" into "my" and "document.txt". The ls command receives two arguments and tries to list two separate files, which fails.

Quoting Fixes This

Double quotes prevent word splitting:

ls "$filename"

Now ls receives one argument: "my document.txt". The expansion happens, but the result stays as a single string.

Single quotes prevent all expansion:

echo '$name'    # Outputs: $name (literal)
echo "$name"    # Outputs: the value of $name

Best practice: Always quote variables unless you specifically need word splitting. Write "$variable" not $variable. This makes scripts robust when variables contain spaces or special characters.
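
The three quoting behaviors can be compared by counting how many arguments a command actually receives. The sketch below uses count_args, a hypothetical one-line helper introduced only for this illustration.

```bash
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

filename="my document.txt"

unquoted=$(count_args $filename)     # expanded, then word-split
quoted=$(count_args "$filename")     # expanded, kept as one string
single=$(count_args '$filename')     # not expanded: literal text, one string

echo "unquoted: $unquoted, quoted: $quoted, single-quoted: $single"
```

Only the unquoted form changes the length of the argument list, which is exactly why ls $filename looks for two files.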

Exercise C: Quoting and Word Splitting

This exercise demonstrates why quoting variables is essential.

Steps:

Create a file named /tmp/my test file.txt (with spaces).
Store the filename in a variable and try to list it without quotes: ls $myfile. This will fail.
Try again with proper quoting: ls "$myfile". This succeeds.
Test echo with a variable containing multiple spaces, both with and without quotes.
Create /tmp/lab5_quote_test.sh that demonstrates variable quoting and array expansion both correctly and incorrectly. The script should show:
- A variable with spaces, echoed with and without quotes
- An array containing an element with spaces, iterated both ways
Make the script executable and run it. Observe how the "wrong way" processes four items instead of three.
Clean up the test files.

Deliverable C: Provide the error from unquoted ls, the success from quoted ls, and the complete script output.

Exit Status and Conditionals

Every command returns an exit status: a number from 0 to 255. By convention, 0 means success; non-zero means failure. Bash stores the last command's exit status in $?:

ls /tmp
echo $?    # Outputs: 0 (success)

ls /nonexistent
echo $?    # Outputs: non-zero (failure)

This exit status is bash's fundamental true/false mechanism. An if statement executes a command and checks if it succeeded:

if grep -q "ERROR" /var/log/syslog; then
    echo "Errors found"
fi

If grep finds "ERROR", it returns 0, and the then block executes.
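
A self-contained variant that does not depend on /var/log/syslog is sketched below. It writes two sample lines to a temporary file (the log contents are made up for the example) and then uses grep's exit status both inside an if and directly through $?.

```bash
log=$(mktemp)
printf '%s\n' 'INFO service started' 'ERROR disk full' > "$log"

# grep -q prints nothing; only its exit status matters here.
if grep -q "ERROR" "$log"; then
    result="errors found"
else
    result="clean"
fi

# No line contains FATAL, so grep reports failure through its exit status.
grep -q "FATAL" "$log"
fatal_status=$?

echo "$result (grep for FATAL exited with $fatal_status)"
rm -f "$log"
```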

The `[[` Command

The [[ ... ]] construct performs tests and returns an exit status. Bash treats it as a keyword rather than an ordinary builtin, but despite the unusual syntax it is conceptually just another command:

[[ -f "file.txt" ]]    # Returns 0 if file exists, 1 otherwise

Common operators:

-f file: True if regular file exists
-d dir: True if directory exists
-e path: True if path exists
string1 == string2: True if strings equal
num1 -lt num2: True if num1 less than num2

Example:

if [[ -f "config.txt" ]]; then
    echo "Config file found"
fi
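
Because [[ only produces an exit status, each operator can be checked on its own by looking at $?. The sketch below uses the root directory / as a path that is assumed to exist on any Linux system.

```bash
[[ -d / ]];            dir_status=$?    # / is a directory
[[ -f / ]];            file_status=$?   # ...but not a regular file
[[ "abc" == "abc" ]];  str_status=$?    # string comparison
[[ 3 -lt 10 ]];        num_status=$?    # numeric comparison

printf '%s\n' "-d: $dir_status, -f: $file_status, ==: $str_status, -lt: $num_status"
```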

Logical Operators

&& (AND): Runs next command only if previous succeeded (returned 0):

mkdir /tmp/newdir && cd /tmp/newdir

|| (OR): Runs next command only if previous failed (returned non-zero):

cd /tmp/important || echo "Failed to change directory"

Explicit error ignoring:

rm -f /tmp/maybe-exists.txt || true

The || true pattern documents that you're intentionally ignoring potential errors.
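
The short-circuit rules can be traced with the true and false builtins. In this sketch, the variable trace is an illustrative name that collects a letter each time the right-hand side actually runs.

```bash
trace=""
true  && trace="${trace}A"    # runs: left side succeeded
false && trace="${trace}B"    # skipped: left side failed
false || trace="${trace}C"    # runs: left side failed
true  || trace="${trace}D"    # skipped: left side succeeded

false || true                 # the "ignore this failure" pattern
ignored_status=$?

echo "trace=$trace ignored_status=$ignored_status"
```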

Exercise D: Exit Status and Conditionals

This exercise explores exit status and conditional logic.

Steps:

Execute ls /tmp and check $?. It should be 0 (success).
Execute ls /nonexistent (redirect stderr to /dev/null) and check $?. It should be non-zero.
Test [[ -f /etc/passwd ]] and [[ -f /nonexistent ]], checking $? after each.
Create /tmp/lab5_conditionals.sh that demonstrates:
- File existence tests with if
- The && operator for success chaining
- The || operator for failure handling
- The || true pattern for explicit error ignoring
- Using command exit status directly in if (e.g., with grep)
- Numeric and string comparisons with [[
Make the script executable and run it. Observe the output from each test.
Clean up the script.

Deliverable D: Provide the exit status values from steps 1-3 and the complete script output.

Output Redirection and File Descriptors

Recall from Lab 3 that every process has file descriptors (FDs):

FD 0: Standard input (stdin)
FD 1: Standard output (stdout)
FD 2: Standard error (stderr)

Redirection allows you to control where these streams go.

Basic Redirection

Redirect stdout to a file:

echo "output" > file.txt

Redirect stderr:

ls /nonexistent 2> error.txt

Redirect both:

command &> combined.txt

Or more explicitly:

command > output.txt 2>&1

Order matters! 2>&1 must come after > output.txt to redirect stderr to where stdout is pointing.
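
The effect of the ordering can be checked with a small experiment. The sketch below defines emit, a hypothetical function that writes one line to each stream, and counts how many lines reach the file in each case. Redirections are processed left to right, so in the reversed form stderr is pointed at the original stdout before stdout is moved to the file.

```bash
# emit is a hypothetical helper that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

out=$(mktemp)

# Correct order: stdout -> file, then stderr -> where stdout now points.
emit > "$out" 2>&1
both=$(grep -c '' "$out")        # number of lines captured in the file

# Reversed order: stderr -> the original stdout, then stdout -> file.
leaked=$(emit 2>&1 > "$out")     # text that escaped the file
only=$(grep -c '' "$out")

echo "correct order captured $both lines"
echo "reversed order captured $only line and leaked: $leaked"
rm -f "$out"
```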

Pipes

The pipe operator | connects one command's stdout to another's stdin:

cat /var/log/syslog | grep "ERROR" | wc -l

Data flows left to right. The kernel creates an anonymous pipe, connecting the write end to the first command's stdout and the read end to the second command's stdin.

Input Redirection

Redirect stdin from a file:

wc -l < file.txt

This differs from wc -l file.txt: with <, bash opens the file and connects it to stdin before running wc.
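
One visible consequence of that difference: when wc opens the file itself it knows the name and prints it, while with < it sees only an anonymous stdin. A minimal sketch, using a temporary file with three made-up lines:

```bash
f=$(mktemp)
printf 'a\nb\nc\n' > "$f"

by_name=$(wc -l "$f")      # wc opens the file, so the name appears in its output
by_stdin=$(wc -l < "$f")   # bash opens the file; wc only sees stdin

echo "by name:  $by_name"
echo "by stdin: $by_stdin"
rm -f "$f"
```

The stdin form is convenient when only the number is wanted, for example when storing a line count in a variable.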

Loops

The `for` Loop

Iterates over a list of values:

for item in value1 value2 value3; do
    echo "$item"
done

Common sources for the list:

Filename expansion (globbing):

for file in *.txt; do
    echo "Processing $file"
done

Command substitution:

for i in $(seq 1 10); do
    echo "Number $i"
done

The `while` Loop

Executes while a command succeeds:

counter=1
while [[ $counter -le 5 ]]; do
    echo "Iteration $counter"
    counter=$((counter + 1))
done

Reading files line-by-line:

while read -r line; do
    echo "Line: $line"
done < file.txt

The read command returns 0 while reading lines successfully, and non-zero at end-of-file.
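
As a small worked example of the read loop, the sketch below sums the numbers in a temporary file. The file contents and variable names are invented for the illustration.

```bash
data=$(mktemp)
printf '10\n20\n30\n' > "$data"

total=0
lines=0
while read -r n; do
    total=$((total + n))
    lines=$((lines + 1))
done < "$data"      # the loop ends when read hits end-of-file

echo "read $lines lines, total $total"
rm -f "$data"
```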

Exercise E: File Descriptor Redirection

This exercise explores redirection, connecting to Lab 3's file descriptor concepts.

Steps:

Redirect stdout to a file with >, then display the file contents.
Redirect stderr to a file with 2> by trying to list a nonexistent directory.
Redirect both stdout and stderr to the same file using &>.
Demonstrate the 2>&1 pattern with explicit redirection order.
Use a pipe to chain cat /etc/passwd, grep "root", and cut to extract the username.
Create /tmp/lab5_redirect.sh that demonstrates:
- Output redirection with > and >>
- Stderr capture with 2>
- Combined stream redirection with &>
- A pipeline example
- Input redirection with <
Make the script executable and run it.
Clean up all test files.

Deliverable E: Provide the captured stderr contents and complete script output.

Command Substitution

Command substitution captures a command's stdout for use elsewhere:

current_date=$(date +%Y-%m-%d)
echo "Today is $current_date"

Bash executes the command in $(), captures its output, and substitutes it in place.

Common in loops:

for i in $(seq 1 10); do
    echo "$i"
done

The seq output (numbers 1-10, one per line) is captured, word-split, and becomes the loop's value list.
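
Whether the captured output is word-split depends on quoting, just as with variables. The sketch below reuses the idea of a hypothetical count_args helper to show the difference.

```bash
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

unquoted=$(count_args $(seq 1 3))     # output is word-split into separate strings
quoted=$(count_args "$(seq 1 3)")     # one string that contains newlines

echo "unquoted substitution: $unquoted arguments"
echo "quoted substitution:   $quoted argument"
```

The for loop above relies on the unquoted behavior; when a command's output should stay a single value, quote the substitution.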

Functions

Functions group commands for reuse:

greet() {
    echo "Hello, $1"
}

greet "Alice"    # Outputs: Hello, Alice

Arguments: Access via $1, $2, etc. $@ expands to all arguments; $# gives the count.

Return values: Functions return exit status (0-255) via return. For other values, write to stdout and capture with command substitution:

get_uppercase() {
    echo "$1" | tr '[:lower:]' '[:upper:]'
}

result=$(get_uppercase "hello")
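
A function can use both channels at once: stdout for data and the exit status for success or failure. The sketch below defines file_kind, a hypothetical helper, and assumes /no/such/path does not exist on your system.

```bash
# file_kind is a hypothetical helper: the description goes to stdout,
# and the exit status is non-zero when the path is neither a directory
# nor a regular file.
file_kind() {
    if [[ -d "$1" ]]; then
        echo "directory"
    elif [[ -f "$1" ]]; then
        echo "regular file"
    else
        echo "missing or other"
        return 1
    fi
}

kind=$(file_kind /)
status=$?            # exit status of the command substitution
echo "/ is: $kind (status $status)"

missing_kind=$(file_kind /no/such/path)
missing_status=$?
echo "/no/such/path is: $missing_kind (status $missing_status)"
```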

Variable Scope

By default, variables are global. Use local for function-local variables:

calculate() {
    local temp=$1
    local result=$((temp * 2))
    echo "$result"
}

Without local, these variables would affect the global scope.
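
The difference is easy to observe. In this sketch, two hypothetical functions assign to a variable named result, one with local and one without.

```bash
result="untouched"

with_local()    { local result="inner"; }   # confined to the function
without_local() { result="overwritten"; }   # modifies the global variable

with_local
after_local=$result

without_local
after_global=$result

echo "after with_local:    $after_local"
echo "after without_local: $after_global"
```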

Arrays

Arrays are collections of strings:

files=("file1.txt" "file 2.txt" "file3.txt")

Access elements:

echo "${files[0]}"           # First element
echo "${files[@]}"           # All elements
echo "${#files[@]}"          # Number of elements

Critical syntax: Use "${array[@]}" (with quotes) to preserve each element as a separate string:

for file in "${files[@]}"; do
    echo "Processing: $file"
done

This correctly processes three files, including "file 2.txt" as one item.

Without quotes:

for file in ${files[@]}; do    # WRONG
    echo "Processing: $file"
done

Word splitting breaks "file 2.txt" into "file" and "2.txt", processing four items instead of three.
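
The two loops can be compared side by side by counting iterations. A minimal sketch, using the same files array as above:

```bash
files=("file1.txt" "file 2.txt" "file3.txt")

quoted=0
for file in "${files[@]}"; do
    quoted=$((quoted + 1))
done

unquoted=0
for file in ${files[@]}; do      # deliberately wrong, for comparison
    unquoted=$((unquoted + 1))
done

echo "array length: ${#files[@]}, quoted loop: $quoted, unquoted loop: $unquoted"
```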

Defensive Scripting: `set -euo pipefail`

Bash scripts, by default, continue after errors. This can cause silent failures that are hard to debug. These options make scripts safer:

`set -e` (Exit on Error)

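Exit immediately if a command returns non-zero. Commands whose status is being tested (an if or while condition, or the left side of && or ||) do not trigger the exit: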

set -e

cp important.txt backup.txt
process backup.txt    # Won't run if cp failed

To allow specific failures:

rm -f /tmp/file.txt || true
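
To watch set -e act without ending your own shell, the sketch below runs each case in a separate bash -c process and records what it printed and how it exited. It assumes bash is on your PATH.

```bash
# Each bash -c is a separate shell, so set -e inside it cannot end this one.
status=0
output=$(bash -c 'set -e; echo before; false; echo after') || status=$?
echo "plain false:   output='$output' status=$status"

status_ok=0
output_ok=$(bash -c 'set -e; echo before; false || true; echo after') || status_ok=$?
echo "false || true: last line='${output_ok##*$'\n'}' status=$status_ok"
```

In the first case the inner shell stops at false, so "after" is never printed; in the second the failure is explicitly tolerated and the script runs to completion.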

`set -u` (Error on Undefined Variables)

Treat undefined variables as errors:

set -u

echo "$undefined_var"    # Error: undefined_var: unbound variable

This catches typos and logic errors.
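
The same isolation trick shows set -u stopping a script. In this sketch, lab5_undefined_var is a deliberately unset, illustrative name, and the exact wording of the error message may vary between bash versions and locales.

```bash
status=0
message=$(bash -c 'set -u; echo "value: $lab5_undefined_var"; echo "still running"' 2>&1) || status=$?

echo "status:  $status"
echo "message: $message"
```

The inner shell exits with a non-zero status at the first use of the unset variable, so "still running" is never printed.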

`set -o pipefail` (Pipeline Failure)

Normally, a pipeline's exit status is the last command's status:

grep "ERROR" missing.log | head -n 10
# Returns 0 (head succeeded) even though grep failed

With pipefail, the pipeline fails if any command fails:

set -o pipefail

grep "ERROR" missing.log | head -n 10
# Returns non-zero (grep failed)
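
A deterministic version of this comparison uses false | true, where the first command always fails and the last always succeeds. The sketch runs each pipeline in its own bash -c process so the option does not leak into your shell.

```bash
default_status=0
bash -c 'false | true' || default_status=$?

pipefail_status=0
bash -c 'set -o pipefail; false | true' || pipefail_status=$?

echo "default: $default_status, with pipefail: $pipefail_status"
```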

Recommended Practice

Start every script with:

#!/bin/bash
set -euo pipefail

These options prevent silent errors and make debugging easier.

Scripting Challenges

Challenge 1: Log File Analyzer

Write /tmp/lab5_log_analyzer.sh that analyzes a log file.

Requirements:

Accept one argument: log file path
Print usage message if wrong number of arguments
Print error if file doesn't exist or isn't readable
Count lines containing "ERROR", "WARN", and "INFO"
Print summary report
Use: shebang, set -euo pipefail, proper quoting, if statements, grep, counting method

Test data:

cat > /tmp/lab5_test.log <<'EOF'
2024-01-15 10:00:00 INFO Application started
2024-01-15 10:05:23 INFO User login successful
2024-01-15 10:12:45 WARN Connection timeout, retrying
2024-01-15 10:15:00 ERROR Database connection failed
2024-01-15 10:15:30 INFO Connection restored
2024-01-15 10:20:00 ERROR Invalid configuration parameter
2024-01-15 10:25:00 WARN Disk space low
2024-01-15 10:30:00 INFO Backup completed successfully
EOF

Expected output:

Log Analysis Report for /tmp/lab5_test.log
==========================================
ERROR: 2
WARN:  2
INFO:  4

Test:

chmod +x /tmp/lab5_log_analyzer.sh
/tmp/lab5_log_analyzer.sh /tmp/lab5_test.log
/tmp/lab5_log_analyzer.sh
/tmp/lab5_log_analyzer.sh /nonexistent.log

Deliverable Challenge 1:

Complete script with comments
Output from test log
Output with no arguments (usage)
Output with nonexistent file (error)

Challenge 2: Batch File Processor

Write lab5_batch_processor.sh that processes multiple files.

Requirements:

Accept two arguments: directory path and operation ("count", "uppercase", or "list")
Print usage if wrong arguments or invalid operation
Define three functions (one per operation):
- count: Count lines in each .txt file
- uppercase: Create .upper.txt files with uppercase content
- list: List .txt files with sizes
Store files in array, use loop to process
Must use: shebang, set -euo pipefail, functions with local, array with "${array[@]}", loop, case/if for operation selection

Test data:

mkdir -p /tmp/lab5_batch_test
echo -e "line 1\nline 2\nline 3" > /tmp/lab5_batch_test/file1.txt
echo -e "hello world\ntest file" > /tmp/lab5_batch_test/file2.txt
echo -e "single line" > /tmp/lab5_batch_test/file3.txt

Expected output for "count":

Processing files in /tmp/lab5_batch_test
Operation: count
========================================
file1.txt: 3 lines
file2.txt: 2 lines
file3.txt: 1 lines

Test:

chmod +x lab5_batch_processor.sh
./lab5_batch_processor.sh /tmp/lab5_batch_test count
./lab5_batch_processor.sh /tmp/lab5_batch_test uppercase
cat /tmp/lab5_batch_test/file1.upper.txt
./lab5_batch_processor.sh /tmp/lab5_batch_test list

Deliverable Challenge 2:

Complete script
Output from "count" operation
Output from "uppercase" operation + contents of one .upper.txt file
Output from "list" operation

Reference: Common Patterns

Quick reference for common bash patterns:

Script header:

#!/bin/bash
set -euo pipefail

Check arguments:

if [[ $# -lt 1 ]]; then
    echo "Usage: $0 <arg>" >&2
    exit 1
fi

Test file types:

if [[ -f "$file" ]]; then
    echo "Regular file"
elif [[ -d "$file" ]]; then
    echo "Directory"
fi

Read file line-by-line:

while read -r line; do
    echo "$line"
done < "$filename"

Iterate over files:

for file in *.txt; do
    [[ -f "$file" ]] || continue
    echo "$file"
done

Command substitution:

date=$(date +%Y-%m-%d)
count=$(wc -l < "$file")

Arrays:

arr=("item1" "item 2" "item3")
for item in "${arr[@]}"; do
    echo "$item"
done

Functions:

process() {
    local file="$1"
    echo "Processing $file"
}

The tables below list common commands for file handling and text manipulation.

Common Commands Reference

File Operations

Command	Description	Example
`ls`	List directory contents	`ls -la /tmp`
`cd`	Change directory	`cd /home/user`
`pwd`	Print working directory	`pwd`
`cp`	Copy files/directories	`cp file.txt backup.txt`
`mv`	Move/rename files	`mv old.txt new.txt`
`rm`	Remove files	`rm file.txt`
`mkdir`	Create directory	`mkdir -p /path/to/dir`
`rmdir`	Remove empty directory	`rmdir olddir`
`touch`	Create empty file or update timestamp	`touch newfile.txt`
`ln`	Create links	`ln -s target link`
`find`	Search for files	`find /home -name "*.txt"`
`chmod`	Change file permissions	`chmod +x script.sh`
`chown`	Change file owner	`chown user:group file.txt`

Text Viewing and Editing

Command	Description	Example
`cat`	Concatenate and display files	`cat file.txt`
`less`	View file with pagination	`less largefile.txt`
`more`	View file page by page	`more file.txt`
`head`	Display first lines of file	`head -n 10 file.txt`
`tail`	Display last lines of file	`tail -n 20 file.txt`
`nano`	Simple text editor	`nano file.txt`
`vim`	Advanced text editor	`vim file.txt`

Text Manipulation

Command	Description	Example
`grep`	Search for patterns in text	`grep "error" log.txt`
`sed`	Stream editor for text transformation	`sed 's/old/new/g' file.txt`
`awk`	Pattern scanning and processing	`awk '{print $1}' file.txt`
`cut`	Extract sections from lines	`cut -d: -f1 /etc/passwd`
`sort`	Sort lines of text	`sort file.txt`
`uniq`	Remove duplicate lines	`uniq sorted.txt`
`tr`	Translate or delete characters	`tr '[:lower:]' '[:upper:]'`
`wc`	Count lines, words, characters	`wc -l file.txt`
`diff`	Compare files line by line	`diff file1.txt file2.txt`
`paste`	Merge lines of files	`paste file1.txt file2.txt`
`join`	Join lines based on common field	`join file1.txt file2.txt`

Text Processing Utilities

Command	Description	Example
`echo`	Display text	`echo "Hello World"`
`printf`	Formatted output	`printf "%s\n" "text"`
`xargs`	Build and execute commands from input	`xargs rm`
`tee`	Read from stdin, write to stdout and files	`tee file.txt`
`column`	Format output into columns	`column -t`
`expand`	Convert tabs to spaces	`expand file.txt`
`unexpand`	Convert spaces to tabs	`unexpand file.txt`
`fold`	Wrap lines to specified width	`fold -w 80 file.txt`

File Information and Comparison

Command	Description	Example
`file`	Determine file type	`file document.pdf`
`stat`	Display file status	`stat file.txt`
`du`	Disk usage	`du -sh /home/user`
`df`	Disk free space	`df -h`
`md5sum`	Calculate MD5 checksum	`md5sum file.txt`
`sha256sum`	Calculate SHA256 checksum	`sha256sum file.txt`
`cmp`	Compare two files byte by byte	`cmp file1 file2`

Compression and Archives

Command	Description	Example
`tar`	Archive files	`tar -czf archive.tar.gz dir/`
`gzip`	Compress files	`gzip file.txt`
`gunzip`	Decompress gzip files	`gunzip file.txt.gz`
`zip`	Create zip archives	`zip archive.zip file1 file2`
`unzip`	Extract zip archives	`unzip archive.zip`
`bzip2`	Compress with bzip2	`bzip2 file.txt`
`xz`	Compress with xz	`xz file.txt`

Text Search and Pattern Matching

Command	Description	Example
`grep`	Search using basic regex	`grep "pattern" file.txt`
`grep -E`	Extended regex (same as `egrep`)	`grep -E "pat1|pat2" file.txt`
`grep -F`	Fixed strings (same as `fgrep`)	`grep -F "literal" file.txt`
`grep -r`	Recursive search	`grep -r "pattern" /path/`
`locate`	Find files by name (uses database)	`locate filename`
`which`	Show full path of commands	`which python`
`whereis`	Locate binary, source, and man pages	`whereis bash`

Common Command Options

Frequently used flags across commands:

-r or -R: Recursive operation
-v: Verbose output or invert match (grep)
-f: Force operation
-i: Ignore case (grep, sort) or interactive (rm, mv)
-n: Show line numbers (grep, cat)
-a: All/append
-h: Human-readable output
-l: Long format or list files only

Deliverables and Assessment

Submit a single document (PDF or similar) containing screenshots of:

Exercise Deliverables:

Exercise A: Permission outputs, error message, successful output, explanation
Exercise B: type outputs, script + pwd output, explanation
Exercise C: Error/success outputs, script output, explanation
Exercise D: Exit status values, script output, explanation
Exercise E: Stderr contents, script output, FD explanations

Challenge Deliverables:

Challenge 1: Script code, test outputs (success, no args, nonexistent file)
Challenge 2: Script code, outputs for all three operations, .upper.txt contents

Additional Resources

This lab provides foundations for the next exercise on signals and inter-process communication. You'll use bash's trap builtin and named pipes without writing C code.

For further study:

Bash manual: man bash
ShellCheck: https://www.shellcheck.net/
Advanced Bash-Scripting Guide: https://tldp.org/LDP/abs/html/
