OS Lab 9 - Application Protocols: DNS, SSH, HTTP(S)

Objectives

Upon completion of this lab, you will be able to:

  • Understand core internet infrastructure: well-known ports, DNS hierarchy, and the roles of authoritative vs. recursive DNS servers.
  • Use DNS tools (dig, nslookup, host) to query records and analyze responses, including TTLs and caching behavior.
  • Configure secure SSH access using Ed25519 keys and explain SSH’s TOFU model compared with certificate-based PKI.
  • Build and route HTTP(S) services using a reverse proxy, and understand why this pattern underlies modern cloud architectures.
  • Diagnose common application-layer connectivity issues using protocol-specific debugging techniques.

Introduction

The Journey Through the Network Stack

In Labs 7 and 8, we systematically constructed the entire networking stack from the physical layer upward:

  • Lab 7 (Network Infrastructure): We built the foundational plumbing—network interfaces, MAC addresses, IP addressing, routing tables, network bridges, and the mechanisms that allow one machine to physically reach another across networks. We demonstrated how the kernel routes packets from source to destination based on Ethernet and Layer 3 IP addressing.
  • Lab 8 (Transport and Security): We ascended to the transport layer, introducing the critical concept of ports to enable process-to-process communication. We explored the fundamental trade-offs between UDP's connectionless simplicity and TCP's reliable, connection-oriented delivery. Most importantly, we implemented Transport Layer Security (TLS) using Public Key Infrastructure (PKI) to protect data in transit from eavesdropping and tampering.

We now reach the pinnacle of the network stack: the Application Layer of the TCP/IP model. This is where protocols transition from answering "how do we reliably deliver data between processes?" to addressing "what should we do with this data, and how should applications interact?"

The Application Layer

Consider what you actually accomplish with computers in daily practice. You do not consciously think about TCP sequence numbers, IP routing algorithms, or TLS cipher suites. Instead, you engage in high-level activities:

  • You type a domain name (e.g., "example.com") and a website appears instantaneously (DNS + HTTP)
  • You connect to a remote server to execute commands securely (SSH)
  • You upload files, stream videos, send API requests, interact with cloud services (predominantly HTTP/HTTPS)

The application layer is where networking infrastructure becomes visible as useful services. These are the protocols that system administrators, DevOps engineers, and software developers interact with constantly. While hundreds of application protocols exist, three protocols dominate modern networking:

  1. DNS (Port 53): The distributed directory service that answers "How do I find the IP address for this domain name?"
  2. SSH (Port 22): The secure remote access protocol that answers "How do I securely administer remote systems?"
  3. HTTP/HTTPS (Ports 80/443): The hypertext transfer protocol that has evolved far beyond web browsing to become the answer to "How do I communicate application data?" for nearly everything.

The Complete Network Request Journey

By the end of this lab, you will understand the complete, end-to-end journey of a modern web request. When you type https://api.example.com/users into your browser, the following sequence occurs:

  1. DNS Resolution (This lab): Your system queries DNS to resolve api.example.com to an IP address (e.g., 203.0.113.50)
  2. Routing (Lab 7): The kernel consults routing tables to determine the next hop toward that IP address
  3. TCP Connection (Lab 8): Your system establishes a three-way handshake to create a reliable TCP connection to port 443 at that IP address
  4. TLS Handshake (Lab 8): The client and server perform a TLS handshake, establishing an encrypted tunnel using certificates to verify the server's identity
  5. HTTP Request (This lab): Your browser sends an HTTP GET request to the /users path over the encrypted connection
  6. Reverse Proxy Routing (This lab): A reverse proxy on the server side examines the request path and routes it to the appropriate backend service
  7. Response Journey: The response flows back through all these layers—HTTP response, TLS encryption, TCP segments, IP packets, Ethernet frames—until it reaches your browser

Prerequisites

System Requirements

A running instance of a Linux virtual machine with root privileges (via sudo). You will need terminal access, either via SSH or directly through the VM console.

Network Topology: This lab builds upon the network namespace topology established in Lab 8. You should have:

  • Red namespace: Simulated host with IP address 10.0.0.1/24
  • Blue namespace: Simulated host with IP address 10.0.0.2/24
  • Host system: Bridge interface (br0) connecting the namespaces
  • Full bidirectional connectivity verified in Lab 8

If you did not complete Lab 8 or your topology has been reset, you will need to recreate it following Lab 8's setup instructions before proceeding.

Required Packages

The following packages must be installed. Run these commands on your host system:

sudo apt update
sudo apt install -y openssh-server openssh-client dnsutils bind9-dnsutils \
                    caddy curl net-tools

Package descriptions:

  • openssh-server: OpenSSH server daemon for accepting SSH connections
  • openssh-client: OpenSSH client tools (ssh, scp, sftp) for initiating connections
  • dnsutils: DNS client utilities including dig and nslookup
  • bind9-dnsutils: Additional DNS diagnostic tools including host
  • caddy: Modern web server with automatic HTTPS and simple configuration syntax
  • curl: Command-line HTTP client for testing and debugging
  • net-tools: Legacy networking tools (netstat, ifconfig) for compatibility

Knowledge Prerequisites

You should be familiar with:

From Lab 7 (Network Fundamentals):

  • Network interfaces, IP addresses, subnet masks, and CIDR notation
  • Routing tables and default gateways
  • Network namespaces and virtual ethernet pairs
  • Bridge devices and how they connect network segments

From Lab 8 (Transport and Security):

  • TCP and UDP transport protocols
  • Port numbers and socket addresses (IP:PORT)
  • TLS encryption and certificate-based authentication
  • The Public Key Infrastructure (PKI) trust model
  • Basic hostname resolution using /etc/hosts

General Linux Skills:

  • Command-line text manipulation (grep, awk, sed, cut)
  • Process management (starting, stopping, viewing processes)
  • File editing with text editors (nano, vim, or your preference)
  • Basic Bash scripting (variables, loops, conditionals)

You should also understand:

  • The client-server model of network communication
  • What an IP address represents and how packets are routed between networks
  • The concept of a socket as the combination of IP address and port number
  • Basic cryptographic concepts (public/private keys, certificates)

Theoretical Background

The Transport Layer Recap

Before diving into application protocols, let us briefly revisit why the transport layer exists and what problem it solves. This context is essential for understanding well-known ports and application protocol design.

The Fundamental Problem: An IP address identifies a machine on the network, but machines do not communicate with machines. Processes (running programs) communicate with processes. When a packet arrives at your machine's IP address, the kernel must determine which of potentially hundreds of running processes should receive that packet.

The Solution: The transport layer introduces port numbers—16-bit unsigned integers (range 0-65535) that identify specific communication endpoints. When combined with an IP address, a port creates a unique socket address (written as IP:PORT, e.g., 192.168.1.50:443) that identifies a specific process on a specific machine.

A complete connection is identified by a five-tuple:

  1. Source IP address
  2. Source port
  3. Destination IP address
  4. Destination port
  5. Protocol (TCP or UDP)

This five-tuple uniquely identifies a communication channel. For example:

  • Client: 192.168.1.50:54321 → Server: 203.0.113.10:443 (HTTPS connection)
  • Client: 10.0.0.1:35678 → Server: 10.0.0.2:22 (SSH connection)

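For example, you can observe live five-tuples on your own machine. A quick check, assuming the iproute2 ss utility is installed (it is on most modern distributions):

# Each output line is one TCP connection: local IP:port and peer IP:port
# (the protocol, TCP, is implied by the -t flag)
ss -tn state established
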
With this foundation established, we can now explore how standardized port numbers enable service discovery at internet scale.

Well-Known Ports

The Problem: Service Discovery Without Coordination

Consider a scenario where you want to connect to a web server running at IP address 10.0.0.3. You know the machine's network address, but you face a critical problem: Which port is the web server listening on?

The server could be bound to any of 65,535 possible port numbers:

  • Port 80? (Traditional HTTP)
  • Port 8080? (Alternative HTTP)
  • Port 3000? (Common development port)
  • Port 9000? (Arbitrary choice)
  • Some completely random port like 47281?

Without knowing the correct port, you cannot establish a connection. It is analogous to knowing someone's street address but not their apartment number in a massive building with 65,535 apartments.

You could attempt to connect to all 65,535 ports sequentially (this is precisely what port scanning tools like nmap do), but this approach is:

  • Extremely slow (trying all ports could take minutes or hours)
  • Inefficient (wastes network bandwidth and computational resources)
  • Potentially suspicious (may trigger intrusion detection systems)
  • Impractical for everyday use (users expect instant connections)

There must be a superior solution.

The Solution: Standardized Port Assignments

The early architects of the internet solved this problem through an elegantly simple mechanism: social convention. The Internet Assigned Numbers Authority (IANA) maintains an official registry of port number assignments, and the community agrees to follow these standards.

This is not a technical enforcement mechanism—there is no kernel code that prevents you from running an SSH server on port 8080 or an HTTP server on port 22. Instead, it is a social contract: we collectively agree that certain services will use certain ports, making the internet predictable and functional.

Port Number Space Division

The IANA divides the port number space into three categories:

1. Well-Known Ports (0-1023)

These ports are reserved for standard, widely-used services. Key characteristics:

  • Officially registered with IANA for specific protocols
  • Binding to these ports requires elevated privileges (root/administrator) on Unix-like systems
  • This privilege requirement is a security feature: it prevents normal users from impersonating critical services
  • Examples: HTTP (80), HTTPS (443), SSH (22), DNS (53), SMTP (25)

The privilege requirement exists because if any user could bind to port 80, they could impersonate a web server and potentially trick other programs or users into trusting them.
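
You can observe this restriction directly. A small experiment, assuming python3 is available and you are running as a normal user (without special capabilities such as CAP_NET_BIND_SERVICE):

# Binding a well-known port as an unprivileged user fails with "Permission denied"
python3 -m http.server 80

# Binding a registered/high port succeeds without elevated privileges
python3 -m http.server 8080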

2. Registered Ports (1024-49151)

These ports can be registered with IANA for specific services, but enforcement is less strict. Characteristics:

  • No special privileges required to bind
  • Software vendors often register ports for their applications
  • Examples: MySQL (3306), PostgreSQL (5432), MongoDB (27017), Redis (6379)
  • Applications can use these ports without requiring root access

3. Dynamic/Private/Ephemeral Ports (49152-65535)

These ports are used for temporary client-side connections. Characteristics:

  • Never registered for permanent services
  • Assigned automatically by the operating system when a client initiates an outbound connection
  • The client doesn't choose these ports explicitly
  • Used briefly for the duration of a connection, then released back to the pool

When your web browser connects to a server, it might use local address 192.168.1.50:52341 (your machine + random ephemeral port) connecting to remote address 203.0.113.10:443 (server + standard HTTPS port).
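
Note that the exact ephemeral range is an operating system choice. On Linux you can inspect it directly; the kernel default (32768-60999) is narrower than the IANA-defined range:

# Show the port range the kernel uses for outgoing (ephemeral) connections
cat /proc/sys/net/ipv4/ip_local_port_range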

The Social Contract in Practice

When a server administrator configures a service, they listen on the well-known port for that service type. Clients, in turn, assume servers use these standard ports. This mutual agreement eliminates the need for explicit coordination.

Example 1: SSH Connection

When you execute:

ssh admin@server.example.com

The SSH client automatically attempts to connect to port 22. You do not need to specify the port. The client assumes the server follows the convention. Under the hood, this command is equivalent to:

ssh -p 22 admin@server.example.com

The -p 22 is implicit because port 22 is the well-known port for SSH.

Example 2: Web Browsing

When you enter a URL in your browser:

http://example.com

Your browser automatically connects to port 80. Similarly, for:

https://example.com

Your browser connects to port 443. These behaviors are hardcoded into HTTP client software based on the URL scheme.

Breaking the Convention

You are not technically required to follow these conventions. You can run an SSH server on port 8022, a web server on port 3000, or DNS on port 5353. However, breaking conventions creates friction—clients need explicit instructions:

# Non-standard SSH port requires explicit specification
ssh -p 8022 admin@server.example.com

# Non-standard HTTP port must be included in URL
curl http://example.com:3000

Analogy: Well-known ports function like standardized electrical outlets. In Europe, you expect 220V AC outlets with a specific two-prong configuration. You can plug in any device without checking each outlet—you trust the convention. Similarly, you can connect to any SSH server without checking its port—you trust it will be on port 22.

Critical Ports

As a system administrator or developer, you will usually interact with these port numbers:

  • Port 20/21: FTP (File Transfer Protocol) - Legacy, rarely used today
  • Port 22: SSH (Secure Shell) - Remote system access
  • Port 23: Telnet - Legacy insecure remote access (NEVER USE)
  • Port 25: SMTP (Simple Mail Transfer Protocol) - Email sending
  • Port 53: DNS (Domain Name System) - Name resolution
  • Port 80: HTTP (Hypertext Transfer Protocol) - Web traffic unencrypted
  • Port 110: POP3 (Post Office Protocol) - Email retrieval (legacy)
  • Port 143: IMAP (Internet Message Access Protocol) - Modern email retrieval
  • Port 443: HTTPS (HTTP Secure) - Encrypted web traffic
  • Port 3306: MySQL database server
  • Port 5432: PostgreSQL database server
  • Port 6379: Redis in-memory database
  • Port 8080: Alternative HTTP port (often used for development)

This social contract—standardized port numbers—is a foundational element that makes the internet function at global scale. It exemplifies how simple conventions can solve complex coordination problems.
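
Most of these assignments also ship with your system in the /etc/services database, which you can query from the command line:

# Look up which service is registered for a given port number
grep -w 443 /etc/services

# Resolve a service name to its port and protocol
getent services ssh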

Domain Name System (DNS)

The human-computer impedance mismatch is a fundamental challenge in computing: humans prefer memorable, meaningful names, while computers require numerical addresses. The Domain Name System (DNS) bridges this gap by providing a distributed, hierarchical database that maps human-readable domain names to machine-usable IP addresses (and other information).

Human Memory vs. Computer Addressing

IP addresses are the actual addressing mechanism for network communication. When your computer needs to send packets to a destination, it must know that destination's IP address. However, IP addresses present several problems for human users:

1. Memorization Difficulty

IPv4 addresses like 172.217.14.206 are difficult for humans to remember and type accurately. IPv6 addresses like 2607:f8b0:4004:c07::64 are even worse. While you might remember your own phone number, remembering hundreds of IP addresses for services you use is impractical.

2. Instability and Change

IP addresses can change due to:

  • Infrastructure migrations (moving servers between data centers)
  • Load balancing requirements (distributing traffic across multiple servers)
  • Failover scenarios (switching to backup servers during outages)
  • Dynamic allocation (DHCP assigns different IPs over time)

If users memorized IP addresses, every infrastructure change would break their connections.

3. Lack of Semantic Meaning

An IP address like 198.51.100.42 conveys no information about what service it provides or which organization operates it. The name store.example.com immediately suggests it's a commercial store operated by Example Corporation.

4. Service Multiplexing

A single IP address can host multiple services (using different ports) or multiple websites (using HTTP virtual hosting). The domain name helps identify which specific service the user wants.

The Solution: Domain Name System

DNS solves these problems by creating a globally distributed, hierarchical naming system that maps domain names to IP addresses and other resource records.

Historical Context: Before DNS, the internet used a centralized hosts file (/etc/hosts on Unix systems) maintained by the Network Information Center (NIC) at Stanford Research Institute. Administrators would request IP-to-hostname mappings, and NIC would periodically distribute an updated hosts file to all connected systems. This approach did not scale beyond a few thousand hosts.

In 1983, Paul Mockapetris designed the Domain Name System (RFC 882 and RFC 883, later superseded by RFC 1034 and RFC 1035), which has remained the foundation of internet naming for over 40 years.

DNS Hierarchy: Organizing the Global Namespace

DNS organizes domain names in a hierarchical tree structure, reading from right to left:

www.engineering.example.com.
 |       |         |     | |
 |       |         |     | └── Root (empty label)
 |       |         |     └──── Top-Level Domain (TLD)
 |       |         └────────── Second-Level Domain (SLD)
 └───────┴──────────────────── Subdomain(s)

The Root Domain (".")

At the top of the hierarchy is the root domain, represented by an empty label. Fully qualified domain names (FQDNs) technically end with a dot, as in "example.com." with a trailing dot. The dot is usually omitted in practice, but it represents the root.

There are 13 root name server systems (labeled A through M: a.root-servers.net through m.root-servers.net) operated by different organizations worldwide. These are the authoritative source for information about top-level domains.

Top-Level Domains (TLDs)

TLDs are the highest level of the DNS hierarchy below the root. Categories include:

  • Generic TLDs (gTLDs): .com, .org, .net, .edu, .gov, .mil, .int
  • Country-Code TLDs (ccTLDs): .us (United States), .uk (United Kingdom), .jp (Japan), .de (Germany)
  • New gTLDs: .app, .dev, .tech, .cloud, .blog, etc. (thousands added since 2013)

Each TLD is managed by a registry organization. For example, VeriSign operates the .com and .net TLDs, while Educause operates .edu.

Second-Level Domains (SLDs)

These are the domains that organizations register: example.com, github.com, mit.edu. You register these through domain registrars, who interface with the TLD registries.

Subdomains

Domain owners can create arbitrary subdomains: www.example.com, mail.example.com, api.v2.example.com. Subdomains do not require separate registration—the owner of the parent domain controls all subdomains.

DNS Servers: Authoritative vs. Recursive

Two fundamentally different types of DNS servers perform distinct roles in the DNS infrastructure:

Authoritative DNS Servers

Authoritative servers are the definitive source of truth for a specific zone (portion of the DNS namespace). Characteristics:

  • Responsibility: Provide answers for domains they have explicit authority over
  • Data Source: Return information directly from their configured zone files
  • Behavior: Only answer queries for domains in their zones; refer queries elsewhere for other domains
  • Operation: Typically operated by domain owners or their DNS providers
  • Caching: Do not cache responses for other domains
  • Example: If you own example.com, your authoritative servers know the IP addresses for www.example.com, mail.example.com, etc.

When you configure DNS records for your domain through your registrar or DNS provider, you are updating authoritative servers.

Recursive DNS Resolvers (Recursive Resolvers)

Recursive resolvers perform the DNS resolution process on behalf of clients. Characteristics:

  • Responsibility: Answer any DNS query by recursively querying authoritative servers
  • Data Source: Cache responses and query authoritative servers when cache misses occur
  • Behavior: Accept any query and traverse the DNS hierarchy to find answers
  • Operation: Typically provided by ISPs, corporations, or public DNS services (Google Public DNS 8.8.8.8, Cloudflare 1.1.1.1)
  • Caching: Extensively cache responses to improve performance and reduce load
  • Client-Facing: This is the type of server that end-user devices query

When your computer performs a DNS lookup, it queries a recursive resolver, which then does the work of finding the answer.

Critical Distinction

This distinction is fundamental:

  • Authoritative servers: "I have authority over example.com, and I can definitively tell you that www.example.com is 203.0.113.50"
  • Recursive resolvers: "I don't know the answer to your query, but I'll traverse the DNS hierarchy to find an authoritative server that does know, then give you the answer"

Analogy: An authoritative DNS server is like a property owner who knows everything about their own house. A recursive resolver is like a GPS navigation system that can find directions to any address by consulting various maps and data sources.
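
You can see the difference with dig by choosing which server to ask. The sketch below uses Google's public servers as an example; the +norecurse flag tells the server not to resolve the name on your behalf:

# Ask a recursive resolver: it traverses the hierarchy for you
dig @1.1.1.1 www.google.com +short

# Ask an authoritative server directly, with recursion disabled:
# it answers only because it has authority over google.com
dig @ns1.google.com www.google.com +norecurse +short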

DNS Resolution Process: Following the Hierarchy

When you type www.example.com in your browser and press Enter, a complex resolution process occurs behind the scenes. Let us trace this process step by step.

Scenario: Your computer needs to resolve www.example.com to an IP address. Your system is configured to use the recursive resolver at 1.1.1.1 (Cloudflare's public DNS).

Step 1: Check Local Cache

Your operating system maintains a local DNS cache. If www.example.com was recently resolved and the TTL hasn't expired, the cached answer is returned immediately (typically in microseconds). No network query is needed.

If the cache contains no valid entry, proceed to Step 2.

Step 2: Query Recursive Resolver

Your computer sends a DNS query to its configured recursive resolver (1.1.1.1):

Query: What is the IP address for www.example.com?

The recursive resolver checks its own cache. If it has a recent answer, it returns immediately. If not, it begins the recursive resolution process.

Step 3: Query Root Servers

The recursive resolver queries one of the 13 root name servers:

Resolver → Root Server: What is the IP address for www.example.com?

The root server does not know the specific answer but knows which servers are authoritative for the .com TLD:

Root → Resolver: I don't know about www.example.com specifically, 
                 but you can ask the .com TLD servers. Here are their addresses:
                 - a.gtld-servers.net (192.5.6.30)
                 - b.gtld-servers.net (192.33.14.30)
                 [and others...]

This is called a referral—the root server refers the resolver to servers that are more specific.

Step 4: Query TLD Servers

The resolver queries one of the .com TLD servers:

Resolver → .com TLD Server: What is the IP address for www.example.com?

The TLD server knows which name servers are authoritative for example.com:

.com TLD → Resolver: I don't know about www.example.com specifically,
                     but the authoritative servers for example.com are:
                     - ns1.example.com (198.51.100.1)
                     - ns2.example.com (198.51.100.2)

Another referral, this time to the authoritative servers for the domain.

Step 5: Query Authoritative Servers

The resolver queries example.com's authoritative name server:

Resolver → ns1.example.com: What is the IP address for www.example.com?

This server has definitive authority over example.com and provides the answer:

ns1.example.com → Resolver: www.example.com is 203.0.113.50
                           (TTL: 3600 seconds)

Step 6: Return Answer to Client

The recursive resolver returns the answer to your computer:

Resolver → Your Computer: www.example.com is 203.0.113.50

Step 7: Cache at Multiple Levels

The answer is cached:

  • Your operating system caches the result (OS-level cache)
  • Your browser may have its own cache (application-level cache)
  • The recursive resolver caches the result (resolver cache)

Future queries for www.example.com within the TTL period (3600 seconds = 1 hour in this example) will be answered from cache without repeating the full resolution process.
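
On Ubuntu systems that use systemd-resolved, you can watch the OS-level cache at work. A quick check, assuming the resolvectl utility is present:

# Resolve a name through the local stub resolver (this populates the cache)
resolvectl query example.com

# Show resolver statistics, including current cache size and cache hits
resolvectl statistics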

DNS Record Types

DNS stores more than just IP addresses. Different record types serve different purposes:

A Record (Address Record)

Maps a domain name to an IPv4 address.

www.example.com.    IN  A   203.0.113.50

This is the most common record type. When you browse to a website, your browser requests the A record to find the server's IPv4 address.

AAAA Record (IPv6 Address Record)

Maps a domain name to an IPv6 address.

www.example.com.    IN  AAAA    2001:db8::1

Pronounced "quad-A record." As IPv6 adoption increases, these records are becoming more common.

CNAME Record (Canonical Name Record)

Creates an alias from one domain name to another. The resolver follows the chain to find the final IP address.

www.example.com.        IN  CNAME   webserver.example.com.
webserver.example.com.  IN  A       203.0.113.50

Use cases:

  • Simplifying infrastructure changes (update one A record instead of many)
  • Content delivery networks (CDNs): alias your domain to the CDN's domain
  • Load balancing and failover scenarios

MX Record (Mail Exchanger Record)

Specifies the mail servers responsible for receiving email for a domain. Includes a priority number (lower numbers have higher priority).

example.com.    IN  MX  10  mail1.example.com.
example.com.    IN  MX  20  mail2.example.com.

If mail1 is unavailable, the sending server tries mail2. Email delivery depends critically on properly configured MX records.

NS Record (Name Server Record)

Specifies the authoritative name servers for a domain or zone.

example.com.    IN  NS  ns1.example.com.
example.com.    IN  NS  ns2.example.com.

These records delegate authority. When you register a domain, you specify NS records pointing to your DNS provider's servers.

TXT Record (Text Record)

Stores arbitrary text data. Modern uses include:

  • SPF records for email authentication: "v=spf1 include:_spf.example.com ~all"
  • DKIM keys for email signing
  • Domain verification for services (Google, Microsoft, etc.): "google-site-verification=abc123xyz789"
  • DMARC policies for email protection

Example:

example.com.    IN  TXT "v=spf1 include:_spf.google.com ~all"

PTR Record (Pointer Record)

Provides reverse DNS lookup—mapping an IP address to a domain name. Used primarily for email server validation and logging.

50.113.0.203.in-addr.arpa.  IN  PTR mail.example.com.

Mail servers often check PTR records to verify sender legitimacy.
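
dig performs reverse lookups with the -x option, constructing the in-addr.arpa name for you:

# Reverse-resolve an IP address to its PTR record
dig -x 8.8.8.8 +short
# Typical output: dns.google.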

SOA Record (Start of Authority Record)

Defines authoritative information about a DNS zone, including the primary name server, administrator's email, serial number, and timing parameters for zone transfers and caching.

example.com.  IN  SOA ns1.example.com. admin.example.com. (
                      2024010101  ; Serial
                      3600        ; Refresh (1 hour)
                      1800        ; Retry (30 minutes)
                      1209600     ; Expire (2 weeks)
                      86400       ; Minimum TTL (1 day)
                      )

Time To Live (TTL)

Every DNS record includes a TTL (Time To Live) value specified in seconds. This value tells caching servers how long they can store the record before re-querying the authoritative server.

Short TTL (e.g., 60-300 seconds)

Advantages:

  • Changes propagate quickly
  • Useful before planned infrastructure changes
  • Allows rapid failover

Disadvantages:

  • Increases DNS query load
  • Slightly slower for end users (more frequent resolution)
  • Higher bandwidth consumption

Long TTL (e.g., 3600-86400 seconds)

Advantages:

  • Reduces DNS query load significantly
  • Faster for end users (more cache hits)
  • Lower bandwidth consumption
  • Better resilience if authoritative servers become unavailable

Disadvantages:

  • Changes propagate slowly (users may see old data until TTL expires)
  • Failover is slower
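
You can watch a TTL count down in a resolver's cache by repeating the same query a few seconds apart; the second answer should show a smaller TTL because it is served from cache:

# +noall +answer prints only the answer section
dig +noall +answer google.com
sleep 10
dig +noall +answer google.com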

Secure Shell (SSH): Encrypted Remote Access

SSH (Secure Shell) is the standard protocol for secure remote access to systems. It replaced the insecure Telnet protocol in the 1990s and has become ubiquitous in system administration, DevOps, and software development.

SSH provides three critical security properties:

  1. Confidentiality: All data (including credentials) is encrypted using symmetric encryption (AES, ChaCha20)
  2. Integrity: Cryptographic MACs (Message Authentication Codes) prevent tampering
  3. Authentication: Public key cryptography verifies server and optionally client identity

SSH Architecture

SSH is not a single protocol but a protocol suite:

  1. SSH-TRANS (Transport Layer Protocol, RFC 4253): Establishes encrypted channel, handles server authentication
  2. SSH-AUTH (Authentication Protocol, RFC 4252): Handles client authentication (password, public key, etc.)
  3. SSH-CONN (Connection Protocol, RFC 4254): Multiplexes multiple channels (terminal session, port forwarding, etc.) over one TCP connection

Standard Port: TCP port 22 (well-known port)

Connection Establishment Sequence:

1. TCP three-way handshake establishes connection
2. SSH version exchange (both sides advertise SSH protocol version)
3. Algorithm negotiation (agree on encryption, MAC, compression algorithms)
4. Diffie-Hellman key exchange generates session keys
5. Server authenticates itself using its host key
6. Client authenticates itself (password or public key)
7. Encrypted session established, commands can be executed
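
You can observe most of this sequence by running the client in verbose mode (replace the user and host with ones you can actually reach):

# -v prints the version exchange, algorithm negotiation, key exchange,
# host key verification, and authentication attempts
ssh -v admin@server.example.com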

SSH Authentication Methods

SSH supports multiple authentication methods, which can be combined or used exclusively:

1. Password Authentication

The client sends the password over the encrypted SSH connection (not plaintext like Telnet). While the password is protected in transit, password authentication has weaknesses:

  • Vulnerable to brute-force attacks if weak passwords are used
  • Requires users to memorize or manage passwords
  • Passwords can be stolen through phishing, keyloggers, or social engineering
  • No way to distinguish different clients (all use the same password)

2. Public Key Authentication (Recommended)

Uses asymmetric cryptography:

  • Client generates a public/private key pair
  • Public key is placed on the server in ~/.ssh/authorized_keys
  • Private key remains on the client and never leaves the system
  • Server challenges client to prove possession of the private key without transmitting it

Advantages:

  • Strong cryptographic security (typically 2048-4096 bit RSA or 256 bit Ed25519)
  • No password to steal or forget
  • Can use different keys for different purposes
  • Keys can be protected with passphrases
  • Supports automation (keys without passphrases for scripts)

Modern Best Practice: Disable password authentication entirely, allowing only public key authentication.

Public Key Cryptography Review

Since public key authentication is critical to SSH security, let us review the cryptographic foundation:

Asymmetric Cryptography Properties:

  1. Two mathematically related keys: public key and private key
  2. Public key can be shared openly
  3. Private key must be kept secret
  4. Data encrypted with public key can only be decrypted with private key
  5. Data signed with private key can be verified with public key

SSH Authentication Protocol:

  1. Client sends the username and the public key it wants to use to the server
  2. Server checks whether that public key is listed in ~/.ssh/authorized_keys for that user
  3. Client signs data derived from the session (the session identifier plus the authentication request) using its private key and sends the signature
  4. Server verifies the signature using the stored public key
  5. If the signature verifies: authentication succeeds (the client proved possession of the private key)

The private key never travels across the network. The server cannot impersonate the client because it only has the public key.

SSH Key Types

Modern SSH supports several key types with different security and performance characteristics:

RSA (Rivest-Shamir-Adleman)

  • Traditional and widely supported
  • Key size: 2048 or 4096 bits (3072 bits is middle ground)
  • Slower than newer algorithms
  • Still secure if key size is adequate (≥2048 bits)
  • Command: ssh-keygen -t rsa -b 4096

Ed25519 (Edwards-curve Digital Signature Algorithm)

  • Modern elliptic curve algorithm (introduced ~2013)
  • Key size: 256 bits (much smaller than RSA)
  • Faster than RSA
  • Strong security properties
  • Command: ssh-keygen -t ed25519

ECDSA (Elliptic Curve Digital Signature Algorithm)

  • Older elliptic curve algorithm
  • Key sizes: 256, 384, or 521 bits
  • Concerns about NSA influence on NIST curves
  • Ed25519 generally preferred over ECDSA for new keys
  • Command: ssh-keygen -t ecdsa -b 256

Best Practice: Use Ed25519 for new SSH keys. It offers excellent security with small key sizes and fast performance. Fall back to RSA 4096 only if connecting to older systems that don't support Ed25519.
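
A typical key-based setup, assuming you can still reach the server with a password (user and hostname below are illustrative):

# Generate an Ed25519 key pair with a descriptive comment
ssh-keygen -t ed25519 -C "alice@laptop"

# Append the public key to ~/.ssh/authorized_keys on the server
ssh-copy-id -i ~/.ssh/id_ed25519.pub admin@server.example.com

# Subsequent logins authenticate with the key instead of the password
ssh admin@server.example.com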

SSH Host Key Verification: Trust On First Use (TOFU)

One of SSH's critical security features is host key verification, which protects against man-in-the-middle (MITM) attacks. This mechanism uses a "Trust On First Use" (TOFU) model.

The MITM Threat

Consider this attack scenario:

[Your Computer] ←→ [Attacker's Computer] ←→ [Real Server]

The attacker intercepts your connection, relays your commands to the real server, and relays responses back to you. From your perspective, everything appears normal, but the attacker can:

  • Log all your commands and responses
  • Steal credentials if you authenticate with passwords
  • Modify commands before forwarding them
  • Inject malicious commands

This is called a man-in-the-middle (MITM) attack.

SSH's Defense: Host Keys

Every SSH server has a unique host key (a public/private key pair) that identifies the server. The server's public host key is its cryptographic identity.

When you first connect to a server, SSH shows:

The authenticity of host 'server.example.com (203.0.113.50)' can't be established.
ED25519 key fingerprint is SHA256:abcd1234efgh5678ijkl9012mnop3456qrst7890uvwx.
Are you sure you want to continue connecting (yes/no/[fingerprint])?

This is the TOFU moment. SSH is asking: "I've never seen this server before. Do you trust this host key?"

If you type "yes":

SSH stores the server's host key in ~/.ssh/known_hosts:

server.example.com,203.0.113.50 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAbc123...

On subsequent connections, SSH verifies the server presents the same host key. If the host key changes, SSH displays an alarming warning:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!

Why the Warning?

A changed host key indicates one of several scenarios:

  1. Legitimate: Server was reinstalled, host key regenerated
  2. Legitimate: Server moved to new hardware
  3. Malicious: MITM attack in progress
  4. Configuration Error: Connecting to wrong server

SSH conservatively assumes the worst-case scenario and refuses the connection.

The TOFU Model

"Trust On First Use" means:

  • First connection: You must manually verify the host key (typically by checking a fingerprint through an out-of-band channel)
  • Subsequent connections: SSH automatically verifies the host key matches the stored value

Comparison to PKI (Lab 8)

Recall from Lab 8 that TLS uses a Public Key Infrastructure (PKI) with Certificate Authorities (CAs). Let us contrast the two models:

PKI/CA Model (HTTPS/TLS):

  • Server presents a certificate signed by a trusted CA
  • Your browser has a list of trusted CA public keys
  • Browser verifies certificate signature cryptographically
  • No manual verification needed on first connection
  • Scales well: one CA can sign millions of certificates

TOFU Model (SSH):

  • Server presents its public host key
  • No centrally trusted authority
  • You must manually verify on first connection (or accept risk)
  • Subsequent connections are automatically verified
  • Simpler infrastructure, but first-use verification is user's responsibility

Why SSH Uses TOFU Instead of PKI:

  1. SSH predates widespread PKI adoption
  2. No consensus on which CAs to trust for SSH
  3. PKI adds complexity and cost (certificate purchase/management)
  4. TOFU works well for SSH's primary use case (administrators connecting to their own servers)
  5. SSH certificates exist but are less common than HTTPS certificates

Best Practice: On first connection, verify the host key fingerprint through a trusted channel:

  • Check the fingerprint displayed in server documentation
  • View the fingerprint on the server directly: ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub
  • Compare with fingerprint shown by SSH client

For production systems, always verify host keys. For test/lab environments, the risk is lower.

SSH Known Hosts Format

The ~/.ssh/known_hosts file stores server host keys. Format:

hostname,ip algorithm public_key

Example:

server.example.com,203.0.113.50 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAbc123...

Hashed Format: Some systems hash the hostname for privacy:

|1|base64hash|base64hash ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAbc123...

This prevents someone with access to your known_hosts file from learning which servers you connect to.

Wildcard Entries: You can use wildcards:

*.example.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAbc123...

Management: Remove entries when servers are legitimately reinstalled:

ssh-keygen -R server.example.com

SSH Authorized Keys

The ~/.ssh/authorized_keys file on the server determines which public keys can authenticate as a specific user.

Format: One public key per line:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIdef456... user@laptop
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCghi789... user@desktop

The trailing comment (user@laptop) is optional but helps identify keys.

Permissions: SSH is strict about file permissions for security:

chmod 700 ~/.ssh              # Directory: rwx------
chmod 600 ~/.ssh/authorized_keys  # File: rw-------

If permissions are too permissive, SSH may refuse to use the keys.

Restricting Keys: You can add restrictions before keys:

command="backup.sh" ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIdef456... backup-key
from="203.0.113.0/24" ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIghi789... admin-key
  • command="backup.sh": This key can only run backup.sh (useful for automation)
  • from="203.0.113.0/24": This key only works from specified IP range

SSH Agent: For convenience, use ssh-agent to cache decrypted private keys:

eval $(ssh-agent)
ssh-add ~/.ssh/id_ed25519

The agent holds the unlocked private key in memory, allowing multiple connections without re-entering the passphrase.

HyperText Transfer Protocol (HTTP): The Universal Application Protocol

HTTP (HyperText Transfer Protocol) was initially designed in 1991 by Tim Berners-Lee to transfer hypertext documents (HTML web pages). Over three decades, HTTP has evolved far beyond its original purpose to become the universal protocol for nearly all application-level communication on the internet.

Why HTTP is Everywhere

1. Simplicity

HTTP uses human-readable text commands:

GET /api/users HTTP/1.1
Host: api.example.com

This simplicity makes HTTP easy to:

  • Debug (you can read the protocol)
  • Implement (libraries exist for every language)
  • Extend (add custom headers easily)

2. Statelessness

Each HTTP request is independent—the server doesn't need to maintain state between requests. This design decision has profound implications:

  • Servers can handle millions of clients without per-client memory
  • Horizontal scaling is trivial (add more servers, any server can handle any request)
  • Failures don't cascade (if a request fails, retry without state recovery)
  • Caching is straightforward (each request is independent)

State management, when needed, is handled at the application layer (cookies, sessions, tokens).

3. Flexible Content

HTTP can transfer any type of data:

  • HTML documents
  • JSON API responses
  • XML data
  • Binary files (images, videos, executables)
  • Streaming data

The Content-Type header specifies what the body contains:

Content-Type: application/json
Content-Type: text/html
Content-Type: image/png

4. Firewall Friendliness

HTTP (port 80) and HTTPS (port 443) are almost universally allowed through firewalls. Companies may block many ports, but they cannot block web traffic without breaking internet access. Applications using HTTP/HTTPS thus avoid firewall issues that plague custom protocols.

5. Tooling and Infrastructure

Decades of investment have created rich ecosystems:

  • Web servers: Apache, Nginx, Caddy, IIS
  • Reverse proxies and load balancers
  • CDNs (Content Delivery Networks)
  • Caching proxies
  • API gateways
  • Monitoring and debugging tools

The Result: HTTP has become the default choice for application communication. APIs that might have used custom TCP protocols now use HTTP-based REST or GraphQL. Real-time protocols that might have used custom UDP protocols now use WebSockets (which upgrade HTTP connections). Even protocols that don't naturally map to HTTP often use it anyway for operational simplicity.

HTTP Request-Response Model

HTTP uses a simple request-response pattern:

Client sends HTTP Request:

METHOD PATH VERSION
Headers
(blank line)
Body (optional)

Server sends HTTP Response:

VERSION STATUS_CODE STATUS_MESSAGE
Headers
(blank line)
Body (optional)

Example Exchange:

Request:

GET /api/users/123 HTTP/1.1
Host: api.example.com
User-Agent: curl/7.68.0
Accept: application/json

Response:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 58

{"id":123,"name":"Alice","email":"alice@example.com"}

HTTP Methods (Verbs)

HTTP defines several methods that indicate the desired action:

GET: Retrieve a resource

  • Should not modify server state (idempotent and safe)
  • Can be cached
  • Examples: Load web page, fetch API data, download file

POST: Submit data to create a new resource

  • May modify server state
  • Not idempotent (repeating creates multiple resources)
  • Examples: Submit form, create new user via API

PUT: Update an existing resource (or create if doesn't exist)

  • Idempotent (repeating has same effect as doing once)
  • Replaces entire resource
  • Examples: Update user profile, upload file

PATCH: Partially update a resource

  • Apply partial modifications
  • Not necessarily idempotent (depends on implementation)
  • Examples: Update single field of a record

DELETE: Remove a resource

  • Idempotent (deleting twice has same effect as deleting once)
  • Examples: Delete user account, remove file

HEAD: Same as GET but returns only headers (no body)

  • Check if resource exists
  • Check content size before downloading
  • Verify cache freshness

OPTIONS: Query supported methods for a resource

  • Used for CORS (Cross-Origin Resource Sharing) preflight requests
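
curl can exercise each of these methods. A few illustrative examples against a hypothetical API endpoint:

# GET is the default method
curl http://api.example.com/users/123

# POST with a JSON body
curl -X POST -H "Content-Type: application/json" \
     -d '{"name":"Alice"}' http://api.example.com/users

# PUT, PATCH, and DELETE use the same -X switch
curl -X DELETE http://api.example.com/users/123

# HEAD: -I fetches headers only
curl -I http://api.example.com/users/123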

HTTP Status Codes

The status code indicates the result of the request. Codes are grouped:

1xx: Informational

  • 100 Continue: Client should continue with request
  • 101 Switching Protocols: Server switching protocols (e.g., WebSocket upgrade)

2xx: Success

  • 200 OK: Request succeeded
  • 201 Created: Resource created successfully
  • 204 No Content: Success but no response body

3xx: Redirection

  • 301 Moved Permanently: Resource moved, update bookmarks
  • 302 Found: Temporary redirect
  • 304 Not Modified: Cached version is still valid

4xx: Client Error

  • 400 Bad Request: Malformed request
  • 401 Unauthorized: Authentication required
  • 403 Forbidden: Authenticated but not authorized
  • 404 Not Found: Resource doesn't exist
  • 429 Too Many Requests: Rate limit exceeded

5xx: Server Error

  • 500 Internal Server Error: Server encountered error
  • 502 Bad Gateway: Proxy received invalid response from upstream
  • 503 Service Unavailable: Server temporarily overloaded or down
  • 504 Gateway Timeout: Proxy timeout waiting for upstream
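
When scripting, it is often enough to look at the status code alone; curl's -w (write-out) option can print just that:

# Print only the HTTP status code of the response
curl -s -o /dev/null -w "%{http_code}\n" http://example.com/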

HTTP Headers

Headers provide metadata about the request or response. Some critical headers:

Request Headers:

  • Host: api.example.com — Required in HTTP/1.1, specifies target hostname
  • User-Agent: curl/7.68.0 — Identifies client software
  • Accept: application/json — Content types client understands
  • Authorization: Bearer token123 — Authentication credentials
  • Content-Type: application/json — Type of request body
  • Content-Length: 58 — Size of request body in bytes

Response Headers:

  • Content-Type: application/json — Type of response body
  • Content-Length: 1234 — Size of response body
  • Cache-Control: max-age=3600 — Caching directives
  • Set-Cookie: session=abc123; HttpOnly — Set cookie in browser
  • Location: /api/users/456 — Redirect target (with 3xx status)
  • Server: nginx/1.18.0 — Server software (often hidden for security)

Security Headers (modern best practices):

  • Strict-Transport-Security: max-age=31536000 — Force HTTPS
  • Content-Security-Policy: default-src 'self' — Mitigate XSS attacks
  • X-Frame-Options: DENY — Prevent clickjacking

Virtual Hosting: Multiple Sites on One IP

HTTP's Host header enables virtual hosting—running multiple websites on a single IP address. The web server examines the Host header to determine which site the request targets:

GET / HTTP/1.1
Host: site1.example.com

vs.

GET / HTTP/1.1
Host: site2.example.com

Both requests go to the same IP address (e.g., 203.0.113.50) and same port (80 or 443), but the Host header tells the server which site to serve. This is how shared hosting providers can host thousands of websites on relatively few servers.

HTTPS Complication: Virtual hosting works seamlessly for HTTP, but HTTPS initially had issues because the TLS handshake occurs before the HTTP request (so the server doesn't know which certificate to present). Server Name Indication (SNI), added to TLS in 2003, solved this by including the hostname in the TLS handshake.
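
You can demonstrate virtual hosting with curl by sending different Host headers to the same address (the IP and hostnames below are illustrative):

# Two requests to the same IP and port, distinguished only by the Host header
curl -H "Host: site1.example.com" http://203.0.113.50/
curl -H "Host: site2.example.com" http://203.0.113.50/

# For HTTPS, --resolve keeps SNI and certificate validation consistent
curl --resolve site1.example.com:443:203.0.113.50 https://site1.example.com/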

HTTP/1.1 vs. HTTP/2 vs. HTTP/3

HTTP/1.1 (1997, RFC 2068; revised by RFC 2616 in 1999 and later by RFC 7230-7235):

  • Text-based protocol
  • One request per connection (or serial requests with keep-alive)
  • Head-of-line blocking (one slow request delays all subsequent requests)
  • Still widely used

HTTP/2 (2015, RFC 7540):

  • Binary protocol (more efficient parsing)
  • Multiplexing (multiple concurrent requests on one connection)
  • Server push (server can send resources before client requests them)
  • Header compression (reduces overhead)
  • Requires HTTPS in practice (browsers only support HTTP/2 over TLS)
  • Growing adoption

HTTP/3 (2022, RFC 9114):

  • Uses QUIC instead of TCP (UDP-based transport with built-in TLS)
  • Reduces connection establishment latency
  • Better handling of packet loss
  • Connection migration (survive IP address changes)
  • Newest, increasing adoption

For this lab, we use HTTP/1.1 for simplicity, but modern production systems increasingly deploy HTTP/2 and HTTP/3.

Reverse Proxies: The Modern Web Architecture Pattern

A reverse proxy is a server that sits in front of backend servers and forwards client requests to them. The client believes it is talking directly to the origin server, unaware of the proxy's existence.

Terminology Clarification:

  • Forward Proxy: Client knows about proxy, uses it to access external servers (e.g., corporate proxy for internet access)
  • Reverse Proxy: Client unaware of proxy, thinks it's talking to the origin server directly

Reverse Proxy Benefits

1. Load Balancing

Distribute requests across multiple backend servers:

Client → Reverse Proxy → [Backend 1]
                      → [Backend 2]
                      → [Backend 3]

Strategies:

  • Round-robin: Cycle through backends in order
  • Least connections: Send to backend with fewest active connections
  • IP hash: Always send same client to same backend (session affinity)

2. TLS Termination

The reverse proxy handles TLS encryption/decryption:

Client ←(HTTPS)→ Reverse Proxy ←(HTTP)→ Backend Servers

Benefits:

  • Backend servers don't need TLS configuration or certificates
  • Centralized certificate management
  • Reduces computational load on backends (TLS crypto is expensive)
  • Simplifies backend server configuration

3. Caching

Cache responses to reduce backend load:

Client → Reverse Proxy (check cache) → Backend (if cache miss)

Subsequent requests for the same resource are served from cache without hitting backends. Can dramatically improve performance and reduce costs.

4. Compression

The proxy can compress responses (gzip, brotli) before sending to clients, reducing bandwidth:

Backend → Proxy (compress) → Client

5. Security

  • Hide backend server details (IP addresses, software versions)
  • Web Application Firewall (WAF) functionality
  • Rate limiting and DDoS protection
  • Centralized logging and monitoring

6. Path-Based Routing (Critical for Microservices)

Route requests to different backends based on URL path:

client request /api/users    → Reverse Proxy → User Service
client request /api/orders   → Reverse Proxy → Order Service
client request /api/products → Reverse Proxy → Product Service

From the client's perspective, everything is at one domain (e.g., api.example.com). The reverse proxy routes to appropriate microservices based on path. This is the architectural pattern that enables modern microservices.

Path-Based Routing in Depth

Path-based routing is the killer feature that made HTTP the universal protocol. Let us examine why:

The Problem Without Path-Based Routing:

Suppose you have three backend services:

  • User service (manages user accounts)
  • Order service (manages orders)
  • Product service (manages product catalog)

Without path-based routing, clients must know the specific addresses of each service:

user-service.example.com/users
order-service.example.com/orders
product-service.example.com/products

Problems:

  1. Clients must maintain knowledge of all services
  2. Adding/removing/renaming services breaks clients
  3. Difficult to manage CORS (Cross-Origin Resource Sharing) with multiple domains
  4. More complex DNS management
  5. More complex TLS certificate management (separate cert for each service)

The Solution With Path-Based Routing:

One unified API endpoint:

api.example.com/users     → routes to User Service
api.example.com/orders    → routes to Order Service
api.example.com/products  → routes to Product Service

Reverse proxy configuration:

if path starts with /users    → forward to user-service:3001
if path starts with /orders   → forward to order-service:3002
if path starts with /products → forward to product-service:3003

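Since this lab uses Caddy, a minimal sketch of such a configuration could look like the following (backend ports are illustrative; the exercises provide the configuration you will actually use):

# Write a Caddyfile that routes two path prefixes to two hypothetical local backends
cat > Caddyfile <<'EOF'
:8080 {
    reverse_proxy /users*  127.0.0.1:3001
    reverse_proxy /orders* 127.0.0.1:3002
}
EOF

# Run Caddy in the foreground with this configuration
caddy run --config Caddyfile
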
Benefits:

  1. Clients see one consistent API domain
  2. Backend services can change without client awareness
  3. One TLS certificate for all services
  4. Simplified CORS configuration
  5. Centralized authentication and rate limiting

This is How Modern Systems Work:

  • Kubernetes: Ingress controllers route paths to services
  • AWS: Application Load Balancers route paths to target groups
  • Microservices: API Gateway routes paths to microservices
  • Netflix, Uber, Airbnb: All use path-based routing at scale

You will implement this pattern in the exercises.

Evolution from Monoliths to Microservices

Understanding path-based routing requires understanding the architectural evolution:

Monolithic Architecture (Traditional):

Single Application
├── User Management
├── Order Processing
├── Product Catalog
├── Payment Processing
└── Notification System

All functionality in one codebase, runs as one process. Advantages: simple deployment, shared database, no network calls between components. Disadvantages: hard to scale specific components, one bug can crash entire system, difficult to update independently.

Microservices Architecture (Modern):

User Service      ──┐
Order Service     ──┤
Product Service   ──├── Reverse Proxy ── Clients
Payment Service   ──┤
Notification Svc  ──┘

Each service is independent: separate codebase, separate process, separate deployment, separate scaling. Advantages: scale components independently, update services without affecting others, use different technologies per service, isolated failures. Disadvantages: increased operational complexity, network latency between services, distributed system challenges.

The Reverse Proxy's Role: Makes microservices look like a monolith to clients. Clients don't know or care that backend is distributed—they just make requests to one API endpoint.

Observability and Debugging

Modern reverse proxies provide rich observability:

Access Logs: Record every request:

203.0.113.42 - - [05/Dec/2024:14:23:45 +0000] "GET /api/users HTTP/1.1" 200 1234 "-" "curl/7.68.0"

Fields: client IP, timestamp, method, path, protocol version, status code, response size, referer, user-agent

Error Logs: Record issues:

[error] 2024/12/05 14:23:50 upstream timed out (110: Connection timed out) while connecting to upstream

Metrics: Request rate, error rate, latency percentiles (p50, p95, p99), upstream health

Distributed Tracing: Track requests across services using trace IDs in headers:

X-Request-ID: 7f2a3b4c-8d9e-4f1a-b2c3-d4e5f6a7b8c9

When troubleshooting issues, these logs and metrics are invaluable.
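
For example, with an access log in the combined format shown above, a one-liner can summarize responses (field 9 is the status code in that format):

# Count responses per HTTP status code
awk '{print $9}' access.log | sort | uniq -c | sort -rn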

Laboratory Exercises

Exercise A: DNS Resolution and Analysis

Objective: Use DNS client tools to query various record types and observe the DNS resolution process. Understand the difference between authoritative and recursive queries, analyze TTL values, and measure resolution performance.

Required Topology: You must have the Red and Blue namespaces from Lab 8 with IP addresses 10.0.0.1 and 10.0.0.2 respectively, connected via bridge br0. Verify connectivity before proceeding:

sudo ip netns exec red ping -c 2 10.0.0.2

If this fails, recreate your Lab 8 topology before continuing.

Basic DNS Queries with dig

The dig (Domain Information Groper) tool is the primary DNS debugging utility. It provides detailed information about DNS queries and responses.

Step 1: Verify dig is installed:

dig -v

Expected output:

DiG 9.18.1-1ubuntu1.3-Ubuntu

If not installed: sudo apt install dnsutils

Step 2: Perform a basic A record query for google.com:

dig google.com

Expected output (abbreviated):

; <<>> DiG 9.18.1-1ubuntu1.3-Ubuntu <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             163     IN      A       142.250.185.46

;; Query time: 23 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Fri Dec 05 14:30:22 UTC 2024
;; MSG SIZE  rcvd: 55

Analysis:

Let us dissect this output:

Header Section:

  • status: NOERROR — Query succeeded
  • id: 12345 — Random query ID for matching requests/responses
  • flags: qr rd ra
    • qr: Query Response (this is a response, not a query)
    • rd: Recursion Desired (client requested recursive resolution)
    • ra: Recursion Available (server supports recursive queries)

Question Section:

;google.com.                    IN      A

This shows what we asked for:

  • google.com. — Domain name (with trailing dot, the FQDN)
  • IN — Internet class (as opposed to other historical DNS classes)
  • A — Record type (IPv4 address)

Answer Section:

google.com.             163     IN      A       142.250.185.46

This is the answer:

  • google.com. — Domain queried
  • 163 — TTL (Time To Live) in seconds, countdown starts when received
  • IN — Internet class
  • A — Record type
  • 142.250.185.46 — The IPv4 address

Metadata:

  • Query time: 23 msec — Time to get response (includes network latency and resolution time)
  • SERVER: 127.0.0.53#53 — DNS server queried (systemd-resolved on Ubuntu)
  • MSG SIZE rcvd: 55 — Response packet size in bytes
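
A simple way to watch the TTL counting down (and to produce Deliverable A item 3) is to repeat the same query against a caching resolver. A sketch, assuming your default resolver (e.g. systemd-resolved) caches the answer:

dig +noall +answer google.com
sleep 10
dig +noall +answer google.com

The second answer should show a TTL roughly 10 seconds smaller, because it is served from the resolver's cache.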

Step 3: Query a specific DNS server directly (bypass systemd-resolved):

dig @8.8.8.8 google.com

The @8.8.8.8 specifies Google Public DNS. Compare the query time to the previous result.

Step 4: Query an IPv6 address (AAAA record):

dig google.com AAAA

Expected output (answer section):

;; ANSWER SECTION:
google.com.             299     IN      AAAA    2607:f8b0:4004:c07::66

Note: You might see multiple AAAA records if Google has multiple IPv6 addresses for redundancy.

Step 5: Query mail server records (MX records):

dig google.com MX

Expected output (answer section):

;; ANSWER SECTION:
google.com.             3599    IN      MX      10 smtp.google.com.

The number (10) is the priority. Lower numbers have higher priority. If multiple MX records exist, mail servers try the lowest priority number first.

Step 6: Query name server records (NS records):

dig google.com NS

Expected output (answer section):

;; ANSWER SECTION:
google.com.             21599   IN      NS      ns1.google.com.
google.com.             21599   IN      NS      ns2.google.com.
google.com.             21599   IN      NS      ns3.google.com.
google.com.             21599   IN      NS      ns4.google.com.

These are Google's authoritative name servers; they hold the definitive records for the google.com zone.
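
You can query one of these servers directly and compare the response flags with the recursive answer from Step 2. A sketch (any of the listed name servers will do):

dig @ns1.google.com google.com +norecurse

In the header you should see the aa (authoritative answer) flag set and ra absent, because ns1.google.com answers authoritatively for the zone and does not offer recursion.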

Step 7: Query text records (TXT records):

dig google.com TXT

Expected output (answer section):

;; ANSWER SECTION:
google.com.             3599    IN      TXT     "v=spf1 include:_spf.google.com ~all"
google.com.             3599    IN      TXT     "facebook-domain-verification=22rm551cu4k0ab0bxsw536tlds4h95"
google.com.             3599    IN      TXT     "google-site-verification=TV9-DBe4R80X4v0M4U_bd_J9cpOJM0nikft0jAgjmsQ"

TXT records store arbitrary text. Common uses: SPF records for email authentication, domain verification for services, DKIM public keys.

Step 8: Request short output format:

dig +short google.com

Expected output:

142.250.185.46

Just the IP address, no verbose information. Useful for scripts.
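
For instance, a small shell sketch that resolves a name and acts on the result (note that +short may print several lines when there are multiple records or a CNAME chain, hence the head -n 1):

ip=$(dig +short google.com | head -n 1)
if [ -n "$ip" ]; then
    echo "google.com resolves to $ip"
else
    echo "resolution failed" >&2
fi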

Step 9: Trace the DNS resolution path:

dig +trace google.com

Expected output (abbreviated):

.                       518400  IN      NS      a.root-servers.net.
.                       518400  IN      NS      b.root-servers.net.
[... other root servers ...]

;; Received 239 bytes from 8.8.8.8#53 in 23 ms

com.                    172800  IN      NS      a.gtld-servers.net.
com.                    172800  IN      NS      b.gtld-servers.net.
[... other .com TLD servers ...]

;; Received 1173 bytes from 192.5.5.241#53(a.root-servers.net) in 87 ms

google.com.             172800  IN      NS      ns1.google.com.
google.com.             172800  IN      NS      ns2.google.com.
[... other google.com name servers ...]

;; Received 836 bytes from 192.12.94.30#53(e.gtld-servers.net) in 103 ms

google.com.             300     IN      A       142.250.185.46

;; Received 55 bytes from 216.239.32.10#53(ns1.google.com) in 11 ms

This shows the complete resolution process:

  1. Query root servers → get .com TLD servers
  2. Query .com TLD servers → get google.com authoritative servers
  3. Query google.com authoritative servers → get final answer
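
You can reproduce these steps by hand with iterative (non-recursive) queries. A sketch; the delegation records may appear in the AUTHORITY rather than the ANSWER section, and any root or TLD server from the trace output can be substituted:

dig +norecurse @a.root-servers.net com NS
dig +norecurse @a.gtld-servers.net google.com NS
dig +norecurse @ns1.google.com google.com A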

Deliverable A

Submit screenshots showing:

  1. Output of dig commands for A, AAAA, MX, NS, and TXT records for two different domains (suggestion: use a domain like station01.internal.arh.pub.ro, or stud.etti.pub.ro for MX)
  2. Output of dig +trace for any domain
  3. Output of two consecutive dig queries showing TTL countdown

Exercise B: SSH Configuration and Key-Based Authentication

Objective: Configure the SSH server and client, generate Ed25519 key pairs, implement key-based authentication, understand the Trust On First Use (TOFU) security model, and observe host key verification in action.

Part 1: Verifying SSH Installation

Step 1: Check if SSH server is installed:

dpkg -l | grep openssh-server

Expected output:

ii  openssh-server  1:8.9p1-3ubuntu0.4  amd64  secure shell (SSH) server

If not installed:

sudo apt install openssh-server openssh-client

Step 2: Check SSH service status:

sudo systemctl status sshd

Note: On Ubuntu the unit is named ssh.service (sshd is an alias); if systemd reports that sshd.service cannot be found, use ssh instead.

If not running:

sudo systemctl start sshd
sudo systemctl enable sshd

Part 2: Generating SSH Key Pairs

We'll generate Ed25519 keys because they offer strong security with small key sizes and fast performance.

Step 1: Generate a key pair as your current user:

ssh-keygen -t ed25519 -C "lab-key"

Expected interaction:

Generating public/private ed25519 key pair.
Enter file in which to save the key (/home/youruser/.ssh/id_ed25519): [press Enter]
Enter passphrase (empty for no passphrase): [press Enter for no passphrase]
Enter same passphrase again: [press Enter]
Your identification has been saved in /home/youruser/.ssh/id_ed25519
Your public key has been saved in /home/youruser/.ssh/id_ed25519.pub
The key fingerprint is:
SHA256:xyz789abc123def456ghi789jkl012mno345pqr678 lab-key

Key Decisions:

  • File location: Accept default (~/.ssh/id_ed25519)
  • Passphrase: For this lab, skip the passphrase for simplicity. In production, always use a strong passphrase to protect the private key. A passphrase encrypts the private key file—even if an attacker steals the file, they cannot use it without the passphrase.
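
If you skip the passphrase now, you can add or change one later without regenerating the key. A sketch using ssh-keygen's passphrase-change mode:

ssh-keygen -p -f ~/.ssh/id_ed25519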

Step 2: Examine the generated keys:

ls -la ~/.ssh/

Expected output:

total 16
drwx------ 2 youruser youruser 4096 Dec  5 14:30 .
drwxr-x--- 8 youruser youruser 4096 Dec  5 14:30 ..
-rw------- 1 youruser youruser  411 Dec  5 14:30 id_ed25519
-rw-r--r-- 1 youruser youruser   99 Dec  5 14:30 id_ed25519.pub

Critical: Private key (id_ed25519) has mode 600 (readable/writable only by owner). Public key (.pub) has mode 644 (world-readable).

Step 3: View your public key:

cat ~/.ssh/id_ed25519.pub

This public key can be freely shared. You'll copy this to accounts you want to access.

Part 3: Setting Up Key-Based Authentication

Configure SSH so that you can connect to your own account on localhost using the key you just generated.

Step 1: Append your public key to your authorized_keys file:

cat ~/.ssh/id_ed25519.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
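
Alternatively, when password authentication to the target account is still enabled, ssh-copy-id performs the append and the permission fixes for you (shown here against your own account on localhost):

ssh-copy-id -i ~/.ssh/id_ed25519.pub $USER@localhost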

Step 2: Verify the setup:

ls -la ~/.ssh/
cat ~/.ssh/authorized_keys

Critical Permissions:

  • ~/.ssh/: 700 (drwx------)
  • authorized_keys: 600 (-rw-------)
  • Owner must be the target user

If permissions are wrong, SSH will silently ignore the keys and fall back to password authentication.

Part 4: First Connection - Trust On First Use (TOFU)

Step 1: Connect to localhost as the test user:

ssh $USER@localhost

Expected output (first connection):

The authenticity of host 'localhost (127.0.0.1)' can't be established.
ED25519 key fingerprint is SHA256:abcd1234efgh5678ijkl9012mnop3456qrst7890uvwx.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])?

This is the TOFU (Trust On First Use) moment.

SSH is asking you to verify the server's identity. In production, you would:

  1. Check the server's host key fingerprint: ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub
  2. Compare the fingerprint displayed by SSH with the server's actual fingerprint
  3. If they match, type yes. If they don't match, do not proceed (possible MITM attack)

For this lab, type yes:

Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'localhost' (ED25519) to the list of known hosts.
youruser@hostname:~$

You are now connected! The connection succeeded using key-based authentication (no password prompt).

Step 2: Explore the connection:

whoami
pwd
hostname

Step 3: Exit the SSH session:

exit

Part 5: Understanding known_hosts

Step 1: Examine the known_hosts file:

cat ~/.ssh/known_hosts

Expected output:

localhost ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAbc123def456ghi789jkl012mno345pqr678stu901vwx

This file stores trusted host keys. Format: hostname/ip algorithm public_key

On subsequent connections, SSH verifies the server presents the same host key. If the key changes, SSH displays a warning.
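
You can query and maintain this file with ssh-keygen. Two useful operations (note that -H rewrites known_hosts with hashed hostnames and keeps a backup in known_hosts.old):

# Show the stored entry for a host
ssh-keygen -F localhost

# Hash hostnames in known_hosts so the file does not reveal where you connect
ssh-keygen -H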

Step 2: Connect again:

ssh $USER@localhost

Expected behavior: No host key prompt this time—immediate connection. SSH verified the host key matches the one in known_hosts.

Step 3: Exit:

exit

Part 6: Simulating a Host Key Change

We'll deliberately change the server's host key to see SSH's security warning.

Step 1: Regenerate the host key:

sudo rm /etc/ssh/ssh_host_ed25519_key*
sudo ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ""
sudo systemctl restart sshd

Step 2: Attempt to connect:

ssh $USER@localhost

Expected output:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:new_fingerprint_here.
Please contact your system administrator.
Add correct host key in /home/youruser/.ssh/known_hosts to get rid of this message.
Offending ED25519 key in /home/youruser/.ssh/known_hosts:1
  remove with:
  ssh-keygen -f "/home/youruser/.ssh/known_hosts" -R "localhost"
Host key for localhost has changed and you have requested strict checking.
Host key verification failed.

SSH refuses to connect! This is the security mechanism in action. SSH detected that the server's host key changed, which could indicate:

  1. Legitimate server reinstallation
  2. Man-in-the-middle attack

SSH conservatively assumes the worst case and blocks the connection.

Step 3: Remove the old host key:

ssh-keygen -f ~/.ssh/known_hosts -R "localhost"

Expected output:

# Host localhost found: line 1
/home/youruser/.ssh/known_hosts updated.
Original contents retained as /home/youruser/.ssh/known_hosts.old

Step 4: Reconnect (you'll see the TOFU prompt again):

ssh $USER@localhost

Type yes to trust the new host key.

Deliverable B

Submit screenshots showing:

  1. The TOFU prompt on first SSH connection
  2. Contents of ~/.ssh/id_ed25519.pub
  3. Contents of ~/.ssh/authorized_keys
  4. Contents of ~/.ssh/known_hosts
  5. The "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!" message
  6. Successful SSH connection after resolving the host key change

Exercise C: HTTP Services and Reverse Proxy

Objective: Build simple HTTP backend services in network namespaces and configure Caddy as a reverse proxy with path-based routing.

Part 1: Installing Caddy

Step 1: Install Caddy:

sudo apt update
sudo apt install -y caddy curl

Step 2: Verify installation:

caddy version

Expected output: v2.6.2 or similar

Part 2: Creating Simple Backend Services

We'll create basic file-serving HTTP servers in the Red and Blue namespaces.

Step 1: Create content directories and files for Red service:

# Create directory for Red service
mkdir -p ~/red-service
cd ~/red-service

# Create an index file
cat > index.html << 'EOF'
<!DOCTYPE html>
<html>
<head><title>Red Service</title></head>
<body>
    <h1>Red Service</h1>
    <p>Hello from Red namespace!</p>
    <p>IP: 10.0.0.1 | Port: 8001</p>
</body>
</html>
EOF

# Create an API response file
mkdir -p api
cat > api/data.json << 'EOF'
{
  "service": "Red Service",
  "namespace": "red",
  "ip": "10.0.0.1",
  "status": "running"
}
EOF

Step 2: Start Red service in its namespace:

# Start HTTP server in Red namespace (run in background with &)
sudo ip netns exec red python3 -m http.server 8001 --bind 10.0.0.1 --directory ~/red-service &

Expected output: Serving HTTP on 10.0.0.1 port 8001 (http://10.0.0.1:8001/) ...

Step 3: Create content for Blue service:

# Create directory for Blue service
mkdir -p ~/blue-service
cd ~/blue-service

# Create an index file
cat > index.html << 'EOF'
<!DOCTYPE html>
<html>
<head><title>Blue Service</title></head>
<body>
    <h1>Blue Service</h1>
    <p>Hello from Blue namespace!</p>
    <p>IP: 10.0.0.2 | Port: 8002</p>
</body>
</html>
EOF

# Create an API response file
mkdir -p api
cat > api/data.json << 'EOF'
{
  "service": "Blue Service",
  "namespace": "blue",
  "ip": "10.0.0.2",
  "status": "running"
}
EOF

Step 4: Start Blue service in its namespace:

# Start HTTP server in Blue namespace
sudo ip netns exec blue python3 -m http.server 8002 --bind 10.0.0.2 --directory ~/blue-service &

Step 5: Test the backend services directly:

# Test Red service
curl http://10.0.0.1:8001/

# Test Blue service
curl http://10.0.0.2:8002/

# Test JSON endpoints
curl http://10.0.0.1:8001/api/data.json
curl http://10.0.0.2:8002/api/data.json

You should see the HTML and JSON content from each service.

Part 3: Configuring Caddy Reverse Proxy

Step 1: Create a Caddyfile:

cd ~
cat > Caddyfile << 'EOF'
# Caddy Reverse Proxy Configuration

:8080 {
    # Health check endpoint
    handle /health {
        respond "OK - API Gateway Running" 200
    }

    # Root path
    handle / {
        respond "API Gateway - Use /red/ or /blue/ paths" 200
    }

    # Route /red/* to Red service (strips /red prefix)
    handle_path /red/* {
        reverse_proxy 10.0.0.1:8001
    }

    # Route /blue/* to Blue service (strips /blue prefix)
    handle_path /blue/* {
        reverse_proxy 10.0.0.2:8002
    }

    # Enable logging
    log {
        output stdout
        format console
    }
}
EOF

Understanding the Configuration:

  • :8080 - Listen on port 8080
  • handle /health - Health check endpoint (doesn't go to backends)
  • handle_path /red/* - Strips /red prefix before forwarding
    • Client requests /red/api/data.json → Backend receives /api/data.json
  • reverse_proxy 10.0.0.1:8001 - Forward to Red service
  • Similar logic for Blue service
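
For comparison, handle_path is effectively shorthand for a handle block combined with an explicit uri strip_prefix. The following sketch writes an equivalent route to a separate file, Caddyfile.alt (a hypothetical name, not needed for this lab), listening on a different port to avoid clashing with the gateway:

cat > Caddyfile.alt << 'EOF'
:8081 {
    handle /red/* {
        uri strip_prefix /red
        reverse_proxy 10.0.0.1:8001
    }
}
EOF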

Step 2: Start Caddy:

caddy run --config ~/Caddyfile

Expected output:

INFO    using config from file
INFO    serving initial configuration

Leave this terminal open to see request logs.

Part 4: Testing Path-Based Routing

Open a new terminal for testing.

Step 1: Test the root path:

curl http://localhost:8080/

Expected output: API Gateway - Use /red/ or /blue/ paths

Step 2: Test health check:

curl http://localhost:8080/health

Expected output: OK - API Gateway Running

Step 3: Test Red service through proxy:

curl http://localhost:8080/red/

You should see the Red service HTML page.

curl http://localhost:8080/red/api/data.json

Expected output:

{
  "service": "Red Service",
  "namespace": "red",
  "ip": "10.0.0.1",
  "status": "running"
}

Step 4: Test Blue service through proxy:

curl http://localhost:8080/blue/

You should see the Blue service HTML page.

curl http://localhost:8080/blue/api/data.json

Expected output:

{
  "service": "Blue Service",
  "namespace": "blue",
  "ip": "10.0.0.2",
  "status": "running"
}

Step 5: Test with verbose output to see headers:

curl -v http://localhost:8080/red/api/data.json

Expected output (key parts):

> GET /red/api/data.json HTTP/1.1
> Host: localhost:8080
...
< HTTP/1.1 200 OK
< Content-Type: application/json
< Server: Caddy
...
{JSON response}

Observation:

  • Client requested /red/api/data.json
  • Caddy stripped /red and forwarded /api/data.json to backend
  • Response comes back through Caddy (notice Server: Caddy header)

Part 5: Understanding the Flow

Request Flow Diagram:

Client Request: http://localhost:8080/red/api/data.json
       |
       v
   Caddy (port 8080)
       |
       | Strips /red prefix
       | Path becomes: /api/data.json
       v
   Red Service (10.0.0.1:8001)
       |
       | Serves file: ~/red-service/api/data.json
       v
   Response back through Caddy
       |
       v
   Client receives JSON

Part 6: Cleanup

When finished testing:

# Stop Caddy (Ctrl+C in Caddy terminal)

# Kill the Python servers (they were started with sudo, so they run as root)
sudo pkill -f "python3 -m http.server 8001"
sudo pkill -f "python3 -m http.server 8002"

# Or kill all Python HTTP servers
sudo pkill -f "http.server"

Deliverable C

Submit screenshots showing:

  1. Output of curl http://localhost:8080/ (root path)
  2. Output of curl http://localhost:8080/health
  3. Output of curl http://localhost:8080/red/api/data.json
  4. Output of curl http://localhost:8080/blue/api/data.json
  5. Output of curl -v http://localhost:8080/red/ showing response headers
  6. Your complete Caddyfile contents

Common Troubleshooting

This section covers common issues you may encounter during the lab exercises.

DNS Issues

Problem: DNS queries fail or time out

Diagnostics:

# Check DNS resolver configuration
cat /etc/resolv.conf

# Test with different DNS servers
dig @8.8.8.8 google.com
dig @1.1.1.1 google.com

Solutions:

  • Verify network connectivity: ping 8.8.8.8
  • Check firewall rules: sudo iptables -L -n | grep 53
  • Try alternative DNS servers

Problem: "connection timed out; no servers could be reached"

Cause: No DNS resolver configured or resolver unreachable

Solution:

# Add Google DNS to resolv.conf
echo "nameserver 8.8.8.8" | sudo tee -a /etc/resolv.conf

SSH Issues

Problem: "Permission denied (publickey)"

Diagnostics:

# Check SSH with verbose output
ssh -v testuser@10.0.0.2

# Verify key permissions
ls -la ~/.ssh/id_ed25519

# Check authorized_keys on server
sudo ls -la /home/testuser/.ssh/authorized_keys

Common Causes:

  1. Wrong permissions on private key (must be 600)
  2. Wrong permissions on authorized_keys (must be 600)
  3. Wrong ownership of .ssh directory
  4. Public key not in authorized_keys
  5. Private key not loaded

Solutions:

# Fix permissions
chmod 600 ~/.ssh/id_ed25519
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh

# Verify key is loaded
ssh-add -l

# Add key if not loaded
ssh-add ~/.ssh/id_ed25519

Problem: "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!"

Cause: Server's host key changed (legitimate reinstall or potential MITM attack)

Solution (if change is legitimate):

# Remove old host key
ssh-keygen -R 10.0.0.2

# Reconnect and verify new fingerprint
ssh testuser@10.0.0.2

Problem: SSH connection hangs

Diagnostics:

# Check if SSH server is running
sudo ss -tlnp | grep :22

# Test connectivity
nc -zv 10.0.0.2 22

Solutions:

  • Restart SSH server: sudo systemctl restart sshd
  • Check firewall rules
  • Verify network namespace connectivity

HTTP/Caddy Issues

Problem: Caddy fails to start

Diagnostics:

# Validate configuration (point --config at your Caddyfile, e.g. ~/Caddyfile in this lab)
caddy validate --config ~/Caddyfile

# Check if port is already in use
sudo ss -tlnp | grep :8080

# View Caddy logs (applies when Caddy runs as a systemd service; in this lab
# the logs appear in the terminal running caddy run)
sudo journalctl -u caddy -n 50

Common Causes:

  1. Port already in use
  2. Syntax error in Caddyfile
  3. Permission issues

Solutions:

# Kill process using port 8080
sudo fuser -k 8080/tcp

# Fix Caddyfile formatting
caddy fmt --overwrite ~/Caddyfile

# Run Caddy with elevated privileges if needed (e.g. to bind privileged ports)
sudo caddy run --config ~/Caddyfile

Problem: Reverse proxy returns "502 Bad Gateway"

Cause: Backend service not running or unreachable

Diagnostics:

# Test backend directly
curl http://10.0.0.1:8001
curl http://10.0.0.2:8002

# Check if backend processes are running
sudo ip netns exec red ps aux | grep python
sudo ip netns exec blue ps aux | grep python

# Check connectivity from host to namespace
ping -c 2 10.0.0.1
ping -c 2 10.0.0.2

Solutions:

  • Restart backend services
  • Verify namespace network configuration
  • Check backend logs for errors

Problem: Path routing not working correctly

Diagnostics:

# Test with verbose curl
curl -v http://localhost:8080/red/

# Check Caddy access logs (when logging to a file or running the packaged service;
# in this lab the access log is printed in the terminal running caddy run)
sudo tail -f /var/log/caddy/access.log

Solution: Verify Caddyfile syntax, especially handle_path directives. Ensure path prefixes match exactly.

Network Namespace Issues

Problem: Cannot ping between namespaces

Diagnostics:

# Check interface status
sudo ip netns exec red ip link show
sudo ip netns exec blue ip link show

# Check IP addresses
sudo ip netns exec red ip addr show
sudo ip netns exec blue ip addr show

# Check routing
sudo ip netns exec red ip route show

Solutions:

  • Recreate namespace configuration from Lab 8
  • Verify veth pairs are connected
  • Check bridge configuration

Problem: Services in namespace cannot access internet

Cause: No default gateway or NAT configured

Solution: Configure NAT on host (if needed for external access):

# Enable IP forwarding
sudo sysctl -w net.ipv4.ip_forward=1

# Add NAT rule
sudo iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -j MASQUERADE
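
To confirm the rule took effect, check the forwarding setting and the NAT table counters:

# Should print net.ipv4.ip_forward = 1
sysctl net.ipv4.ip_forward

# The MASQUERADE rule should appear, and its packet counter should grow with traffic
sudo iptables -t nat -L POSTROUTING -n -v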

Additional Protocol Information

This appendix provides information about other application protocols you may encounter.

Email Protocols

SMTP (Simple Mail Transfer Protocol)

  • Port: 25 (unencrypted), 587 (submission with STARTTLS), 465 (SMTPS)
  • Purpose: Sending email between servers
  • Client-to-server: Port 587
  • Server-to-server: Port 25
  • Note: Port 25 often blocked by ISPs to prevent spam
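
As a quick way to see these ports in action, openssl's s_client can open a connection and negotiate STARTTLS on the submission port (mail.example.com is a placeholder for a real mail server):

openssl s_client -starttls smtp -connect mail.example.com:587 -quiet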

IMAP (Internet Message Access Protocol)

  • Port: 143 (unencrypted), 993 (IMAPS)
  • Purpose: Reading email with server synchronization
  • Advantages: Emails stay on server, accessible from multiple devices
  • Modern replacement for POP3

POP3 (Post Office Protocol v3)

  • Port: 110 (unencrypted), 995 (POP3S)
  • Purpose: Downloading email to client
  • Downloads emails and typically deletes from server
  • Legacy protocol, IMAP preferred today

Database Protocols

MySQL

  • Port: 3306
  • Protocol: MySQL wire protocol (proprietary)
  • Common tools: mysql client, MySQL Workbench

PostgreSQL

  • Port: 5432
  • Protocol: PostgreSQL wire protocol
  • Common tools: psql client, pgAdmin

MongoDB

  • Port: 27017
  • Protocol: MongoDB wire protocol
  • NoSQL document database

Redis

  • Port: 6379
  • Protocol: RESP (Redis Serialization Protocol)
  • In-memory data structure store

Other Common Protocols

FTP (File Transfer Protocol)

  • Port: 21 (control), 20 (data)
  • Legacy file transfer protocol
  • Security issues: Transmits credentials in cleartext
  • Modern alternatives: SFTP (SSH File Transfer Protocol), FTPS (FTP over SSL/TLS)

LDAP (Lightweight Directory Access Protocol)

  • Port: 389 (unencrypted), 636 (LDAPS)
  • Purpose: Directory services (user authentication, organizational data)
  • Common in enterprises for centralized authentication

RDP (Remote Desktop Protocol)

  • Port: 3389
  • Purpose: Remote desktop access (Windows)
  • Proprietary Microsoft protocol

VNC (Virtual Network Computing)

  • Port: 5900+ (5900, 5901, etc.)
  • Purpose: Remote desktop access (cross-platform)
  • Open protocol, multiple implementations

NTP (Network Time Protocol)

  • Port: 123
  • Purpose: Clock synchronization
  • Critical for distributed systems, logging, security

Syslog

  • Port: 514 (UDP/TCP)
  • Purpose: Logging infrastructure
  • Centralized log collection

Ports to Avoid

Insecure Legacy Protocols (never use in production):

  • Port 21: FTP (use SFTP or FTPS instead)
  • Port 23: Telnet (use SSH instead)
  • Port 69: TFTP (use SFTP or SCP instead)
  • Port 110: POP3 unencrypted (use POP3S or IMAP)
  • Port 143: IMAP unencrypted (use IMAPS)
  • Port 389: LDAP unencrypted (use LDAPS)
  • Port 445: SMB (often exploited, use VPN if needed)
  • Port 512-514: rlogin, rsh, rexec (use SSH instead)

These protocols transmit data (including credentials) in cleartext and/or have known security vulnerabilities.

Deliverables and Assessment

Submit a single PDF document containing all required elements. Organize clearly with section headers matching exercise labels.

Additional Resources

This lab introduced the dominant application layer protocols that form the foundation of modern internet services. You now understand how DNS enables service discovery, SSH provides secure remote access, and HTTP(S) serves as the universal protocol for APIs and web services.

For Further Study

DNS Deep Dives:

  • Setting up authoritative DNS with BIND9 or PowerDNS
  • DNS security: DNSSEC for cryptographic verification of DNS responses
  • DNS over HTTPS (DoH) and DNS over TLS (DoT) for encrypted DNS queries
  • Dynamic DNS (DDNS) and service discovery patterns
  • Split-horizon DNS for internal vs. external name resolution
  • DNS-based load balancing and geographic routing (GeoDNS)

SSH Advanced Topics:

  • SSH tunneling: Local forwarding (-L), remote forwarding (-R), dynamic forwarding (-D)
  • ProxyJump (-J) for accessing hosts through bastion servers
  • SSH certificates (different from host keys) for scalable authentication
  • SSH agent forwarding security considerations
  • SSH config files (~/.ssh/config) for managing multiple hosts
  • SSH as SOCKS proxy for secure browsing

HTTP/HTTPS Evolution:

  • HTTP/2 improvements: multiplexing, server push, header compression, binary protocol
  • HTTP/3 and QUIC: UDP-based transport with built-in encryption
  • WebSockets for real-time bidirectional communication
  • Server-Sent Events (SSE) for server-push updates
  • gRPC for efficient RPC over HTTP/2
  • GraphQL as alternative to REST for flexible API queries

Web Server Configuration:

  • Nginx advanced reverse proxy patterns
  • HAProxy for high-performance load balancing
  • Load balancing algorithms: round-robin, least connections, IP hash, consistent hashing
  • SSL/TLS termination best practices
  • Certificate management with Let's Encrypt
  • HTTP caching: proxy caching, CDN integration, cache invalidation strategies
  • Security headers: HSTS, Content-Security-Policy, X-Frame-Options, CORS

Modern Architecture Patterns:

  • Microservices architecture: benefits and challenges
  • Service mesh: Istio, Linkerd, Consul Connect
  • API gateways: Kong, Ambassador, Traefik, AWS API Gateway
  • Kubernetes Ingress controllers and service routing
  • Serverless architecture and function-as-a-service (FaaS)
  • Event-driven architectures: message queues, pub/sub patterns

Other Application Protocols:

  • Email protocols: SMTP for sending, IMAP for retrieval, SPF/DKIM/DMARC for authentication
  • Database wire protocols: MySQL, PostgreSQL, MongoDB, Redis
  • Message queue protocols: AMQP (RabbitMQ), Kafka protocol
  • Monitoring and observability: Prometheus metrics, StatsD, OpenTelemetry, syslog

Relevant Manual Pages

man dig              # DNS query tool
man nslookup         # DNS lookup utility
man host             # DNS lookup utility
man ssh              # SSH client
man sshd             # SSH server daemon
man sshd_config      # SSH server configuration
man ssh_config       # SSH client configuration
man ssh-keygen       # SSH key generation and management
man ssh-add          # SSH agent key management
man curl             # HTTP client
man wget             # HTTP client alternative

Practice Environments