| author | mo khan <mo@mokhan.ca> | 2025-09-27 13:36:26 -0600 |
|---|---|---|
| committer | mo khan <mo@mokhan.ca> | 2025-09-27 13:36:26 -0600 |
| commit | bd759873cec2cab2dea8bc3ff2d631344494e9e9 (patch) | |
| tree | 0acf36c2cf8c21618d5fdf491f93e3449a986b96 /NOTES.md | |
| parent | 9deea2c2f36d845a3203b98630d65613210d156a (diff) | |
fixup chapter 2
Diffstat (limited to 'NOTES.md')
| -rw-r--r-- | NOTES.md | 554 |
1 files changed, 0 insertions, 554 deletions
@@ -1,554 +0,0 @@
-# Chapter 1: Computer Networks and the Internet - Study Notes
-
-## 1.1 What Is the Internet?
-
-### Nuts-and-Bolts Description
-- **Hosts/End Systems**: Computing devices connected to the Internet (computers, smartphones, servers, IoT devices)
-  - ~18 billion devices in 2017, projected 28.5 billion by 2022
-- **Communication Links**: Physical media connecting devices
-  - Different types: coaxial cable, copper wire, optical fiber, radio spectrum
-  - **Transmission rate**: Measured in bits/second
-- **Packet Switches**: Forward packets between links
-  - **Routers**: Used in network core
-  - **Link-layer switches**: Used in access networks
-- **Packets**: Segments of data with added headers
-- **Route/Path**: Sequence of links and switches traversed by packet
-- **Internet Service Providers (ISPs)**: Networks of packet switches and links
-- **Protocols**: TCP/IP are the principal protocols
-- **Internet Standards**: Developed by IETF as RFCs (Requests for Comments)
-
-### Services Description
-- Internet as infrastructure providing services to distributed applications
-- **Socket Interface**: Rules for how programs request Internet to deliver data
-- Applications run on end systems, not in network core
-
-### What Is a Protocol?
-**Definition**: A protocol defines the format and order of messages exchanged between communicating entities, as well as the actions taken on transmission/receipt of messages
-
-## 1.2 The Network Edge
-
-### Access Networks
-
-#### Home Access
-1. **DSL (Digital Subscriber Line)**
-   - Uses existing telephone lines
-   - Asymmetric speeds (different up/down)
-   - Up to 24/52 Mbps down, 3.5/16 Mbps up
-   - Limited by distance from central office (5-10 miles)
-
-2. **Cable Internet**
-   - Uses cable TV infrastructure (HFC - Hybrid Fiber Coax)
-   - Shared broadcast medium
-   - DOCSIS standards: up to 1.2 Gbps down, 100 Mbps up
-
-3. **FTTH (Fiber to the Home)**
-   - Direct optical fiber to homes
-   - Can provide gigabit speeds
-   - PON (Passive Optical Networks) commonly used
-
-4. **5G Fixed Wireless**
-   - Wireless data transmission to home modem
-   - No physical cabling required
-
-#### Enterprise Access
-- **Ethernet**: Most prevalent in corporate/university networks
-  - 100 Mbps to 10+ Gbps speeds
-- **WiFi (802.11)**: Wireless LAN access
-  - Up to 100+ Mbps shared transmission rate
-
-#### Wide-Area Wireless
-- **4G/5G**: Cellular networks for mobile devices
-  - 4G: up to 60 Mbps real-world speeds
-  - Coverage within tens of kilometers of base station
-
-### Physical Media
-
-#### Guided Media
-- **Twisted-Pair Copper Wire**: Most common, up to 10 Gbps (Cat 6a)
-- **Coaxial Cable**: Used in cable TV/Internet systems
-- **Fiber Optics**: High bandwidth, low attenuation, immune to interference
-
-#### Unguided Media
-- **Terrestrial Radio**: Short-range, local area, wide area
-- **Satellite**: Geostationary (280ms delay) and LEO satellites
-
-## 1.3 The Network Core
-
-### Packet Switching
-- **Store-and-Forward Transmission**: Switch must receive entire packet before forwarding
-  - End-to-end delay = N × (L/R) for N links
-- **Queuing Delays**: Variable delays when packets wait in output buffers
-- **Packet Loss**: Occurs when buffers overflow
-- **Forwarding Tables**: Map destination addresses to outbound links
-- **Routing Protocols**: Automatically set forwarding tables
-
-### Circuit Switching
-- Resources reserved for duration of connection
-- **Multiplexing Methods**:
-  - FDM (Frequency-Division): Divides frequency spectrum
-  - TDM (Time-Division): Divides time into slots
-- Guaranteed constant rate but wastes resources during idle periods
-
-### Packet vs Circuit Switching
-- **Packet switching advantages**:
-  - Better resource sharing
-  - Simpler implementation
-  - More efficient for bursty data
-- **Circuit switching advantages**:
-  - Guaranteed resources
-  - Predictable performance
-
-### Network of Networks
-- Internet is interconnection of ISPs at different tiers:
-  - **Tier-1 ISPs**: Global backbone networks
-  - **Regional ISPs**: Connect access ISPs to Tier-1
-  - **Access ISPs**: Provide Internet access to end users
-- **IXPs (Internet Exchange Points)**: Where ISPs peer
-- **Content Provider Networks**: (e.g., Google, Amazon) bypass upper tiers when possible
-
-## 1.4 Delay, Loss, and Throughput
-
-### Types of Delay (d_nodal = d_proc + d_queue + d_trans + d_prop)
-
-1. **Processing Delay (d_proc)**: Time to examine packet header
-2. **Queuing Delay (d_queue)**: Time waiting in queue
-   - **Traffic Intensity = La/R** (arrival rate × packet size / transmission rate)
-   - Must be < 1 to avoid infinite delays
-3. **Transmission Delay (d_trans)**: L/R (packet size / transmission rate)
-4. **Propagation Delay (d_prop)**: d/s (distance / propagation speed)
-
-### Packet Loss
-- Occurs when queue capacity exceeded
-- Performance measured by delay and loss probability
-
-### Throughput
-- **Instantaneous throughput**: Rate at any instant
-- **Average throughput**: F/T (file size / transfer time)
-- Limited by **bottleneck link** (minimum rate along path)
-
-## 1.5 Protocol Layers
-
-### Five-Layer Internet Protocol Stack
-1. **Application Layer**: Network applications (HTTP, SMTP, FTP, DNS)
-   - Data unit: Message
-2. **Transport Layer**: Process-to-process data transfer (TCP, UDP)
-   - Data unit: Segment
-3. **Network Layer**: Routing datagrams (IP protocol)
-   - Data unit: Datagram
-4. **Link Layer**: Data transfer between neighboring network elements
-   - Data unit: Frame
-5. **Physical Layer**: Bits on the wire
-
-### Encapsulation
-- Each layer adds its own header information
-- Payload = packet from layer above
-
-## 1.6 Networks Under Attack
-
-### Types of Attacks
-1. **Malware**: Viruses, worms, spyware, botnets
-2. **Denial-of-Service (DoS)**:
-   - Vulnerability attacks
-   - Bandwidth flooding
-   - Connection flooding
-   - DDoS uses multiple sources
-3. **Packet Sniffing**: Passive interception of packets
-4. **IP Spoofing**: Fake source addresses
-
-## Key Performance Metrics
-- **Bandwidth-Delay Product**: R × d_prop (capacity of "pipe")
-- **Utilization**: Actual use vs available capacity
-- **Traffic Intensity**: Must be < 1 for stable operation
-
-## Important Formulas
-- Store-and-forward delay (N links): d = N × (L/R)
-- Total nodal delay: d_nodal = d_proc + d_queue + d_trans + d_prop
-- Transmission delay: d_trans = L/R
-- Propagation delay: d_prop = d/s
-- Traffic intensity: La/R (must be < 1)
-- Throughput = min(R1, R2, ..., RN) for path with N links
-
-## Key Concepts to Remember
-- Internet is a "network of networks"
-- Packet switching more efficient than circuit switching for bursty data
-- Protocols define communication rules between entities
-- Layering provides modularity and abstraction
-- Performance limited by delays, loss, and throughput constraints
-- Security was not originally built into Internet design
-
-# Chapter 2: Application Layer - Study Notes
-
-## 2.1 Principles of Network Applications
-
-### Network Application Architectures
-
-**Two Main Architectures:**
-
-1. **Client-Server Architecture**
-   - Always-on server with fixed, well-known IP address
-   - Clients communicate with server, not directly with each other
-   - Server often housed in data centers for scalability
-   - Examples: Web, FTP, Telnet, email
-
-2. **P2P (Peer-to-Peer) Architecture**
-   - Minimal reliance on dedicated servers
-   - Direct communication between intermittently connected hosts (peers)
-   - **Self-scalability**: Each peer adds both workload and service capacity
-   - Cost-effective (no significant server infrastructure needed)
-   - Challenges: security, performance, reliability
-
-### Processes Communicating
-
-- **Process**: A program running within an end system
-- **Socket**: Software interface between application layer and transport layer
-  - Analogy: Process = house, Socket = door
-- **Client Process**: Initiates communication
-- **Server Process**: Waits to be contacted
-
-### Addressing Processes
-
-To identify a receiving process, need:
-1. **IP Address**: 32-bit quantity identifying the host
-2. **Port Number**: Identifies the receiving process/socket in the host
-   - Well-known ports: HTTP (80), SMTP (25)
-
-### Transport Services Available to Applications
-
-**Four Dimensions of Service:**
-
-1. **Reliable Data Transfer**
-   - Guaranteed delivery without errors
-   - Important for: email, file transfer, Web documents
-
-2. **Throughput**
-   - Guaranteed available throughput at specified rate
-   - **Bandwidth-sensitive apps**: multimedia applications
-   - **Elastic apps**: email, file transfer, Web
-
-3. **Timing**
-   - Guarantees on delivery time (e.g., <100ms)
-   - Important for: real-time apps, games, telephony
-
-4. **Security**
-   - Encryption, data integrity, authentication
-
-### Transport Services Provided by Internet
-
-**TCP (Transmission Control Protocol)**
-- Connection-oriented service (handshaking)
-- Reliable data transfer
-- Full-duplex connection
-- Congestion control
-- NO throughput or timing guarantees
-
-**UDP (User Datagram Protocol)**
-- Connectionless (no handshaking)
-- Unreliable data transfer
-- No congestion control
-- Lightweight, minimal services
-
-**TLS (Transport Layer Security)**
-- Enhancement for TCP
-- Provides encryption, data integrity, authentication
-- Implemented in application layer
-
-## 2.2 The Web and HTTP
-
-### HTTP Overview
-
-- **HyperText Transfer Protocol**: Web's application-layer protocol
-- Client-server model: browsers (clients) and Web servers
-- Uses TCP as transport protocol
-- **Stateless protocol**: Server maintains no client information
-
-### HTTP Connections
-
-**Non-Persistent Connections (HTTP/1.0)**
-- Each request/response over separate TCP connection
-- At most one object sent over each connection
-- Response time: 2 RTTs + transmission time per object
-- Overhead of establishing multiple TCP connections
-
-**Persistent Connections (HTTP/1.1 default)**
-- Multiple objects sent over single TCP connection
-- Server leaves connection open after response
-- Pipelining: Back-to-back requests without waiting
-
-### HTTP Message Format
-
-**Request Message:**
-```
-GET /somedir/page.html HTTP/1.1
-Host: www.someschool.edu
-Connection: close
-User-agent: Mozilla/5.0
-Accept-language: fr
-```
-
-Components:
-- Request line: method, URL, version
-- Header lines: Host, User-agent, Accept-language, etc.
-- Entity body (empty for GET, used with POST)
-
-**Response Message:**
-```
-HTTP/1.1 200 OK
-Connection: close
-Date: Tue, 18 Aug 2015 15:44:04 GMT
-Server: Apache/2.2.3
-Last-Modified: Tue, 18 Aug 2015 15:11:03 GMT
-Content-Length: 6821
-Content-Type: text/html
-(data...)
-```
-
-Components:
-- Status line: version, status code, status message
-- Header lines
-- Entity body (the requested object)
-
-**Common Status Codes:**
-- 200 OK: Success
-- 301 Moved Permanently
-- 400 Bad Request
-- 404 Not Found
-- 505 HTTP Version Not Supported
-
-### Cookies
-
-**Four Components:**
-1. Cookie header in HTTP response
-2. Cookie header in HTTP request
-3. Cookie file on client
-4. Back-end database at Web site
-
-**Uses:** User identification, shopping carts, recommendations
-
-### Web Caching (Proxy Servers)
-
-**Benefits:**
-- Reduces response time for clients
-- Reduces traffic on institution's access link
-- Cost savings (bandwidth)
-
-**Conditional GET:**
-- Uses `If-Modified-Since:` header
-- Server responds with 304 Not Modified if unchanged
-- Reduces unnecessary data transfer
-
-### HTTP/2
-
-**Goals:**
-- Reduce latency through multiplexing
-- Request prioritization
-- Server push
-- Header compression
-
-**Key Features:**
-- **Framing**: Messages broken into small frames, interleaved
-- Binary protocol (more efficient than text)
-- Solves Head-of-Line (HOL) blocking problem
-- Single TCP connection for entire page
-
-## 2.3 Electronic Mail
-
-### Components
-
-1. **User Agents**: Mail clients (Outlook, Gmail app)
-2. **Mail Servers**: Store mailboxes, run SMTP
-3. **SMTP**: Simple Mail Transfer Protocol
-
-### SMTP
-
-- Uses TCP, port 25
-- Push protocol
-- Three phases: handshaking, transfer, closure
-- Commands: HELO, MAIL FROM, RCPT TO, DATA, QUIT
-- 7-bit ASCII restriction (legacy)
-- Uses persistent connections
-
-### Mail Access Protocols
-
-**Problem:** SMTP is push; need pull for retrieval
-
-**Solutions:**
-- **HTTP**: Web-based email (Gmail)
-- **IMAP**: Internet Mail Access Protocol
-  - Allows folder management on server
-  - More features than POP3
-
-## 2.4 DNS (Domain Name System)
-
-### Services Provided
-
-1. **Hostname to IP translation** (main service)
-2. **Host aliasing**: Canonical and alias names
-3. **Mail server aliasing**
-4. **Load distribution**: Rotate IP addresses for replicated servers
-
-### DNS Hierarchy
-
-1. **Root DNS servers** (~1000 instances of 13 servers)
-2. **Top-Level Domain (TLD) servers** (com, org, edu, country codes)
-3. **Authoritative DNS servers** (organization's own servers)
-4. **Local DNS servers** (ISP's default name servers)
-
-### DNS Operation
-
-**Iterative Queries** (typical):
-- Local DNS server queries on behalf of client
-- Gets referrals to other servers
-
-**Recursive Queries**:
-- DNS server obtains mapping on behalf of requester
-
-**DNS Caching**:
-- Servers cache mappings to improve performance
-- TTL (Time To Live) determines cache duration
-- Reduces load on root servers
-
-### DNS Records (Resource Records)
-
-Format: `(Name, Value, Type, TTL)`
-
-**Types:**
-- **Type=A**: Name is hostname, Value is IP address
-- **Type=NS**: Name is domain, Value is authoritative DNS server
-- **Type=CNAME**: Name is alias, Value is canonical name
-- **Type=MX**: Name is alias, Value is mail server name
-
-## 2.5 Peer-to-Peer File Distribution
-
-### P2P Scalability
-
-**Distribution Time Formulas:**
-
-Client-Server: `D_cs = max{NF/u_s, F/d_min}`
-- Linear increase with N peers
-
-P2P: `D_P2P = max{F/u_s, F/d_min, NF/(u_s + Σu_i)}`
-- Self-scaling: More peers = more capacity
-
-### BitTorrent
-
-**Key Concepts:**
-- **Torrent**: Collection of peers distributing a file
-- **Chunks**: Equal-size pieces (typically 256KB)
-- **Tracker**: Infrastructure node tracking peers
-
-**Mechanisms:**
-1. **Rarest First**: Request rarest chunks first
-2. **Tit-for-Tat**: Upload to peers providing best download rates
-   - Top 4 uploaders get priority (unchoked)
-   - 1 random peer (optimistically unchoked) every 30 seconds
-
-## 2.6 Video Streaming and CDNs
-
-### Video Characteristics
-- High bit rate (100 kbps to 4+ Mbps)
-- Can be compressed to different quality levels
-- Storage/bandwidth intensive
-
-### HTTP Streaming and DASH
-
-**Simple HTTP Streaming:**
-- Video stored as file with URL
-- Client buffers before playback
-
-**DASH (Dynamic Adaptive Streaming over HTTP):**
-- Multiple versions at different bit rates
-- Client requests chunks adaptively
-- Manifest file lists all versions
-- Client measures bandwidth, selects quality
-
-### Content Distribution Networks (CDNs)
-
-**Server Placement Strategies:**
-
-1. **Enter Deep** (Akamai approach)
-   - Servers in many access ISPs
-   - Get close to users
-   - Higher maintenance overhead
-
-2. **Bring Home** (Limelight approach)
-   - Large clusters at IXPs
-   - Lower maintenance
-   - Potentially higher latency
-
-**CDN Operation:**
-- DNS redirect to select CDN server
-- Cluster selection based on:
-  - Geographic proximity
-  - Real-time performance measurements
-  - Load balancing
-
-### Case Studies
-
-**Netflix:**
-- Uses Amazon cloud for website/processing
-- Own private CDN for video delivery
-- Push caching during off-peak hours
-- No DNS redirect needed
-
-**YouTube:**
-- Google's private CDN
-- Pull caching
-- DNS redirect for server selection
-- HTTP streaming (not adaptive)
-
-## 2.7 Socket Programming
-
-### UDP Socket Programming
-
-**Key Concepts:**
-- Connectionless
-- Must attach destination address to each packet
-- No guarantee of delivery
-
-**Python Functions:**
-- `socket(AF_INET, SOCK_DGRAM)`: Create UDP socket
-- `sendto()`: Send with destination address
-- `recvfrom()`: Receive data and source address
-- `bind()`: Assign port to socket
-
-### TCP Socket Programming
-
-**Key Concepts:**
-- Connection-oriented (three-way handshake)
-- Reliable, in-order delivery
-- Server has welcoming socket and connection sockets
-
-**Python Functions:**
-- `socket(AF_INET, SOCK_STREAM)`: Create TCP socket
-- `bind()`: Assign port to server socket
-- `listen()`: Server listens for connections
-- `accept()`: Create connection socket for client
-- `connect()`: Client initiates connection
-- `send()/recv()`: Data transfer
-
-## Key Formulas and Metrics
-
-- **RTT (Round-Trip Time)**: Time for small packet to travel from client to server and back
-- **HTTP Response Time (non-persistent)**: 2 RTTs per object + transmission time
-- **Traffic Intensity**: (arrival rate × packet size) / link rate (must be < 1)
-- **Cache Hit Rate**: Fraction of requests satisfied by cache
-
-## Important Port Numbers
-
-- HTTP: 80
-- HTTPS: 443
-- SMTP: 25
-- DNS: 53
-- IMAP: 143
-- POP3: 110
-
-## Key Takeaways
-
-1. Application layer protocols define message format, order, and actions
-2. Client-server is simpler but less scalable than P2P
-3. HTTP is stateless; cookies add state
-4. DNS is critical Internet infrastructure using hierarchy
-5. CDNs bring content closer to users
-6. DASH enables adaptive video streaming
-7. Socket programming allows custom network applications
-8. TCP provides reliability; UDP provides simplicity
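The delay, throughput, and distribution-time formulas listed in the removed notes can be checked numerically. The following is a minimal Python sketch, not part of the commit: the function names and all example values (packet size, link rates, peer count) are assumptions chosen for illustration, and only the formulas themselves come from the notes.

```python
"""Numeric sketch of formulas from the removed NOTES.md (names and values are illustrative)."""


def store_and_forward_delay(n_links, packet_bits, rate_bps):
    """d = N * (L/R): transmission delay only, ignoring queuing and propagation."""
    return n_links * packet_bits / rate_bps


def traffic_intensity(arrivals_per_sec, packet_bits, rate_bps):
    """La/R: must stay below 1 for the queue to remain stable."""
    return arrivals_per_sec * packet_bits / rate_bps


def path_throughput(link_rates_bps):
    """Throughput = min(R1, ..., RN): the bottleneck link limits the path."""
    return min(link_rates_bps)


def client_server_time(n_peers, file_bits, server_up_bps, min_down_bps):
    """D_cs = max{NF/u_s, F/d_min}: grows linearly with the number of peers."""
    return max(n_peers * file_bits / server_up_bps, file_bits / min_down_bps)


def p2p_time(file_bits, server_up_bps, min_down_bps, peer_up_bps):
    """D_P2P = max{F/u_s, F/d_min, NF/(u_s + sum(u_i))}."""
    n_peers = len(peer_up_bps)
    total_up_bps = server_up_bps + sum(peer_up_bps)
    return max(
        file_bits / server_up_bps,
        file_bits / min_down_bps,
        n_peers * file_bits / total_up_bps,
    )


# Hypothetical scenario: 1500-byte packets over three 1 Mbps links.
print("store-and-forward delay (s):", store_and_forward_delay(3, 12_000, 1e6))
print("traffic intensity:", traffic_intensity(50, 12_000, 1e6))
print("path throughput (bps):", path_throughput([10e6, 2e6, 50e6]))

# Hypothetical scenario: 8 Gbit file, 100 Mbps server uplink, 100 peers
# that each upload at 5 Mbps and download at no less than 20 Mbps.
print("client-server time (s):", client_server_time(100, 8e9, 100e6, 20e6))
print("P2P time (s):", p2p_time(8e9, 100e6, 20e6, [5e6] * 100))
```

With these assumed numbers the server uplink dominates the client-server case, while in the P2P case the combined upload capacity of the peers shortens the distribution time, which is the self-scaling effect the notes describe.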
