| author | mo khan <mo@mokhan.ca> | 2025-09-27 12:59:13 -0600 |
|---|---|---|
| committer | mo khan <mo@mokhan.ca> | 2025-09-27 12:59:13 -0600 |
| commit | 5e508ae387e9ca1c7337c48844a80abe0326cafe (patch) | |
| tree | c7161afcc3a66b2f0059244cafce6584efb1d728 | |
| parent | 96f54d8585e2e38e7b9c10340cabee53fab69a4c (diff) | |
Add study notes for chapter 1 and 2
| -rw-r--r-- | NOTES.md | 554 |
1 files changed, 554 insertions, 0 deletions
diff --git a/NOTES.md b/NOTES.md
new file mode 100644
index 0000000..7721d32
--- /dev/null
+++ b/NOTES.md
@@ -0,0 +1,554 @@
+# Chapter 1: Computer Networks and the Internet - Study Notes
+
+## 1.1 What Is the Internet?
+
+### Nuts-and-Bolts Description
+- **Hosts/End Systems**: Computing devices connected to the Internet (computers, smartphones, servers, IoT devices)
+  - ~18 billion devices in 2017, projected 28.5 billion by 2022
+- **Communication Links**: Physical media connecting devices
+  - Different types: coaxial cable, copper wire, optical fiber, radio spectrum
+  - **Transmission rate**: Measured in bits/second
+- **Packet Switches**: Forward packets between links
+  - **Routers**: Used in network core
+  - **Link-layer switches**: Used in access networks
+- **Packets**: Segments of data with added headers
+- **Route/Path**: Sequence of links and switches traversed by packet
+- **Internet Service Providers (ISPs)**: Networks of packet switches and links
+- **Protocols**: TCP/IP are the principal protocols
+- **Internet Standards**: Developed by IETF as RFCs (Requests for Comments)
+
+### Services Description
+- Internet as infrastructure providing services to distributed applications
+- **Socket Interface**: Rules for how programs request Internet to deliver data
+- Applications run on end systems, not in network core
+
+### What Is a Protocol?
+**Definition**: A protocol defines the format and order of messages exchanged between communicating entities, as well as the actions taken on transmission/receipt of messages
+
+## 1.2 The Network Edge
+
+### Access Networks
+
+#### Home Access
+1. **DSL (Digital Subscriber Line)**
+   - Uses existing telephone lines
+   - Asymmetric speeds (different up/down)
+   - Up to 24/52 Mbps down, 3.5/16 Mbps up
+   - Limited by distance from central office (5-10 miles)
+
+2. **Cable Internet**
+   - Uses cable TV infrastructure (HFC - Hybrid Fiber Coax)
+   - Shared broadcast medium
+   - DOCSIS standards: up to 1.2 Gbps down, 100 Mbps up
+
+3. **FTTH (Fiber to the Home)**
+   - Direct optical fiber to homes
+   - Can provide gigabit speeds
+   - PON (Passive Optical Networks) commonly used
+
+4. **5G Fixed Wireless**
+   - Wireless data transmission to home modem
+   - No physical cabling required
+
+#### Enterprise Access
+- **Ethernet**: Most prevalent in corporate/university networks
+  - 100 Mbps to 10+ Gbps speeds
+- **WiFi (802.11)**: Wireless LAN access
+  - Up to 100+ Mbps shared transmission rate
+
+#### Wide-Area Wireless
+- **4G/5G**: Cellular networks for mobile devices
+  - 4G: up to 60 Mbps real-world speeds
+  - Coverage within tens of kilometers of base station
+
+### Physical Media
+
+#### Guided Media
+- **Twisted-Pair Copper Wire**: Most common, up to 10 Gbps (Cat 6a)
+- **Coaxial Cable**: Used in cable TV/Internet systems
+- **Fiber Optics**: High bandwidth, low attenuation, immune to interference
+
+#### Unguided Media
+- **Terrestrial Radio**: Short-range, local area, wide area
+- **Satellite**: Geostationary (280ms delay) and LEO satellites
+
+## 1.3 The Network Core
+
+### Packet Switching
+- **Store-and-Forward Transmission**: Switch must receive entire packet before forwarding
+  - End-to-end delay = N × (L/R) for N links
+- **Queuing Delays**: Variable delays when packets wait in output buffers
+- **Packet Loss**: Occurs when buffers overflow
+- **Forwarding Tables**: Map destination addresses to outbound links
+- **Routing Protocols**: Automatically set forwarding tables
+
+### Circuit Switching
+- Resources reserved for duration of connection
+- **Multiplexing Methods**:
+  - FDM (Frequency-Division): Divides frequency spectrum
+  - TDM (Time-Division): Divides time into slots
+- Guaranteed constant rate but wastes resources during idle periods
+
+### Packet vs Circuit Switching
+- **Packet switching advantages**:
+  - Better resource sharing
+  - Simpler implementation
+  - More efficient for bursty data
+- **Circuit switching advantages**:
+  - Guaranteed resources
+  - Predictable performance
+
+### Network of Networks
+- Internet is interconnection of ISPs at different tiers:
+  - **Tier-1 ISPs**: Global backbone networks
+  - **Regional ISPs**: Connect access ISPs to Tier-1
+  - **Access ISPs**: Provide Internet access to end users
+- **IXPs (Internet Exchange Points)**: Where ISPs peer
+- **Content Provider Networks**: (e.g., Google, Amazon) bypass upper tiers when possible
+
+## 1.4 Delay, Loss, and Throughput
+
+### Types of Delay (d_nodal = d_proc + d_queue + d_trans + d_prop)
+
+1. **Processing Delay (d_proc)**: Time to examine packet header
+2. **Queuing Delay (d_queue)**: Time waiting in queue
+   - **Traffic Intensity = La/R** (arrival rate × packet size / transmission rate)
+   - Must be < 1 to avoid infinite delays
+3. **Transmission Delay (d_trans)**: L/R (packet size / transmission rate)
+4. **Propagation Delay (d_prop)**: d/s (distance / propagation speed)
+
+### Packet Loss
+- Occurs when queue capacity exceeded
+- Performance measured by delay and loss probability
+
+### Throughput
+- **Instantaneous throughput**: Rate at any instant
+- **Average throughput**: F/T (file size / transfer time)
+- Limited by **bottleneck link** (minimum rate along path)
+
+## 1.5 Protocol Layers
+
+### Five-Layer Internet Protocol Stack
+1. **Application Layer**: Network applications (HTTP, SMTP, FTP, DNS)
+   - Data unit: Message
+2. **Transport Layer**: Process-to-process data transfer (TCP, UDP)
+   - Data unit: Segment
+3. **Network Layer**: Routing datagrams (IP protocol)
+   - Data unit: Datagram
+4. **Link Layer**: Data transfer between neighboring network elements
+   - Data unit: Frame
+5. **Physical Layer**: Bits on the wire
+
+### Encapsulation
+- Each layer adds its own header information
+- Payload = packet from layer above
+
+## 1.6 Networks Under Attack
+
+### Types of Attacks
+1. **Malware**: Viruses, worms, spyware, botnets
+2. **Denial-of-Service (DoS)**:
+   - Vulnerability attacks
+   - Bandwidth flooding
+   - Connection flooding
+   - DDoS uses multiple sources
+3. **Packet Sniffing**: Passive interception of packets
+4. **IP Spoofing**: Fake source addresses
+
+## Key Performance Metrics
+- **Bandwidth-Delay Product**: R × d_prop (capacity of "pipe")
+- **Utilization**: Actual use vs available capacity
+- **Traffic Intensity**: Must be < 1 for stable operation
+
+## Important Formulas
+- Store-and-forward delay (N links): d = N × (L/R)
+- Total nodal delay: d_nodal = d_proc + d_queue + d_trans + d_prop
+- Transmission delay: d_trans = L/R
+- Propagation delay: d_prop = d/s
+- Traffic intensity: La/R (must be < 1)
+- Throughput = min(R1, R2, ..., RN) for path with N links
+
+## Key Concepts to Remember
+- Internet is a "network of networks"
+- Packet switching more efficient than circuit switching for bursty data
+- Protocols define communication rules between entities
+- Layering provides modularity and abstraction
+- Performance limited by delays, loss, and throughput constraints
+- Security was not originally built into Internet design
+
+# Chapter 2: Application Layer - Study Notes
+
+## 2.1 Principles of Network Applications
+
+### Network Application Architectures
+
+**Two Main Architectures:**
+
+1. **Client-Server Architecture**
+   - Always-on server with fixed, well-known IP address
+   - Clients communicate with server, not directly with each other
+   - Server often housed in data centers for scalability
+   - Examples: Web, FTP, Telnet, email
+
+2. **P2P (Peer-to-Peer) Architecture**
+   - Minimal reliance on dedicated servers
+   - Direct communication between intermittently connected hosts (peers)
+   - **Self-scalability**: Each peer adds both workload and service capacity
+   - Cost-effective (no significant server infrastructure needed)
+   - Challenges: security, performance, reliability
+
+### Processes Communicating
+
+- **Process**: A program running within an end system
+- **Socket**: Software interface between application layer and transport layer
+  - Analogy: Process = house, Socket = door
+- **Client Process**: Initiates communication
+- **Server Process**: Waits to be contacted
+
+### Addressing Processes
+
+To identify a receiving process, need:
+1. **IP Address**: 32-bit quantity identifying the host
+2. **Port Number**: Identifies the receiving process/socket in the host
+   - Well-known ports: HTTP (80), SMTP (25)
+
+### Transport Services Available to Applications
+
+**Four Dimensions of Service:**
+
+1. **Reliable Data Transfer**
+   - Guaranteed delivery without errors
+   - Important for: email, file transfer, Web documents
+
+2. **Throughput**
+   - Guaranteed available throughput at specified rate
+   - **Bandwidth-sensitive apps**: multimedia applications
+   - **Elastic apps**: email, file transfer, Web
+
+3. **Timing**
+   - Guarantees on delivery time (e.g., <100ms)
+   - Important for: real-time apps, games, telephony
+
+4. **Security**
+   - Encryption, data integrity, authentication
+
+### Transport Services Provided by Internet
+
+**TCP (Transmission Control Protocol)**
+- Connection-oriented service (handshaking)
+- Reliable data transfer
+- Full-duplex connection
+- Congestion control
+- NO throughput or timing guarantees
+
+**UDP (User Datagram Protocol)**
+- Connectionless (no handshaking)
+- Unreliable data transfer
+- No congestion control
+- Lightweight, minimal services
+
+**TLS (Transport Layer Security)**
+- Enhancement for TCP
+- Provides encryption, data integrity, authentication
+- Implemented in application layer
+
+## 2.2 The Web and HTTP
+
+### HTTP Overview
+
+- **HyperText Transfer Protocol**: Web's application-layer protocol
+- Client-server model: browsers (clients) and Web servers
+- Uses TCP as transport protocol
+- **Stateless protocol**: Server maintains no client information
+
+### HTTP Connections
+
+**Non-Persistent Connections (HTTP/1.0)**
+- Each request/response over separate TCP connection
+- At most one object sent over each connection
+- Response time: 2 RTTs + transmission time per object
+- Overhead of establishing multiple TCP connections
+
+**Persistent Connections (HTTP/1.1 default)**
+- Multiple objects sent over single TCP connection
+- Server leaves connection open after response
+- Pipelining: Back-to-back requests without waiting
+
+### HTTP Message Format
+
+**Request Message:**
+```
+GET /somedir/page.html HTTP/1.1
+Host: www.someschool.edu
+Connection: close
+User-agent: Mozilla/5.0
+Accept-language: fr
+```
+
+Components:
+- Request line: method, URL, version
+- Header lines: Host, User-agent, Accept-language, etc.
+- Entity body (empty for GET, used with POST)
+
+**Response Message:**
+```
+HTTP/1.1 200 OK
+Connection: close
+Date: Tue, 18 Aug 2015 15:44:04 GMT
+Server: Apache/2.2.3
+Last-Modified: Tue, 18 Aug 2015 15:11:03 GMT
+Content-Length: 6821
+Content-Type: text/html
+(data...)
+```
+
+Components:
+- Status line: version, status code, status message
+- Header lines
+- Entity body (the requested object)
+
+**Common Status Codes:**
+- 200 OK: Success
+- 301 Moved Permanently
+- 400 Bad Request
+- 404 Not Found
+- 505 HTTP Version Not Supported
+
+### Cookies
+
+**Four Components:**
+1. Cookie header in HTTP response
+2. Cookie header in HTTP request
+3. Cookie file on client
+4. Back-end database at Web site
+
+**Uses:** User identification, shopping carts, recommendations
+
+### Web Caching (Proxy Servers)
+
+**Benefits:**
+- Reduces response time for clients
+- Reduces traffic on institution's access link
+- Cost savings (bandwidth)
+
+**Conditional GET:**
+- Uses `If-Modified-Since:` header
+- Server responds with 304 Not Modified if unchanged
+- Reduces unnecessary data transfer
+
+### HTTP/2
+
+**Goals:**
+- Reduce latency through multiplexing
+- Request prioritization
+- Server push
+- Header compression
+
+**Key Features:**
+- **Framing**: Messages broken into small frames, interleaved
+- Binary protocol (more efficient than text)
+- Solves Head-of-Line (HOL) blocking problem
+- Single TCP connection for entire page
+
+## 2.3 Electronic Mail
+
+### Components
+
+1. **User Agents**: Mail clients (Outlook, Gmail app)
+2. **Mail Servers**: Store mailboxes, run SMTP
+3. **SMTP**: Simple Mail Transfer Protocol
+
+### SMTP
+
+- Uses TCP, port 25
+- Push protocol
+- Three phases: handshaking, transfer, closure
+- Commands: HELO, MAIL FROM, RCPT TO, DATA, QUIT
+- 7-bit ASCII restriction (legacy)
+- Uses persistent connections
+
+### Mail Access Protocols
+
+**Problem:** SMTP is push; need pull for retrieval
+
+**Solutions:**
+- **HTTP**: Web-based email (Gmail)
+- **IMAP**: Internet Mail Access Protocol
+  - Allows folder management on server
+  - More features than POP3
+
+## 2.4 DNS (Domain Name System)
+
+### Services Provided
+
+1. **Hostname to IP translation** (main service)
+2. **Host aliasing**: Canonical and alias names
+3. **Mail server aliasing**
+4. **Load distribution**: Rotate IP addresses for replicated servers
+
+### DNS Hierarchy
+
+1. **Root DNS servers** (~1000 instances of 13 servers)
+2. **Top-Level Domain (TLD) servers** (com, org, edu, country codes)
+3. **Authoritative DNS servers** (organization's own servers)
+4. **Local DNS servers** (ISP's default name servers)
+
+### DNS Operation
+
+**Iterative Queries** (typical):
+- Local DNS server queries on behalf of client
+- Gets referrals to other servers
+
+**Recursive Queries**:
+- DNS server obtains mapping on behalf of requester
+
+**DNS Caching**:
+- Servers cache mappings to improve performance
+- TTL (Time To Live) determines cache duration
+- Reduces load on root servers
+
+### DNS Records (Resource Records)
+
+Format: `(Name, Value, Type, TTL)`
+
+**Types:**
+- **Type=A**: Name is hostname, Value is IP address
+- **Type=NS**: Name is domain, Value is authoritative DNS server
+- **Type=CNAME**: Name is alias, Value is canonical name
+- **Type=MX**: Name is alias, Value is mail server name
+
+## 2.5 Peer-to-Peer File Distribution
+
+### P2P Scalability
+
+**Distribution Time Formulas:**
+
+Client-Server: `D_cs = max{NF/u_s, F/d_min}`
+- Linear increase with N peers
+
+P2P: `D_P2P = max{F/u_s, F/d_min, NF/(u_s + Σu_i)}`
+- Self-scaling: More peers = more capacity
+
+### BitTorrent
+
+**Key Concepts:**
+- **Torrent**: Collection of peers distributing a file
+- **Chunks**: Equal-size pieces (typically 256KB)
+- **Tracker**: Infrastructure node tracking peers
+
+**Mechanisms:**
+1. **Rarest First**: Request rarest chunks first
+2. **Tit-for-Tat**: Upload to peers providing best download rates
+   - Top 4 uploaders get priority (unchoked)
+   - 1 random peer (optimistically unchoked) every 30 seconds
+
+## 2.6 Video Streaming and CDNs
+
+### Video Characteristics
+- High bit rate (100 kbps to 4+ Mbps)
+- Can be compressed to different quality levels
+- Storage/bandwidth intensive
+
+### HTTP Streaming and DASH
+
+**Simple HTTP Streaming:**
+- Video stored as file with URL
+- Client buffers before playback
+
+**DASH (Dynamic Adaptive Streaming over HTTP):**
+- Multiple versions at different bit rates
+- Client requests chunks adaptively
+- Manifest file lists all versions
+- Client measures bandwidth, selects quality
+
+### Content Distribution Networks (CDNs)
+
+**Server Placement Strategies:**
+
+1. **Enter Deep** (Akamai approach)
+   - Servers in many access ISPs
+   - Get close to users
+   - Higher maintenance overhead
+
+2. **Bring Home** (Limelight approach)
+   - Large clusters at IXPs
+   - Lower maintenance
+   - Potentially higher latency
+
+**CDN Operation:**
+- DNS redirect to select CDN server
+- Cluster selection based on:
+  - Geographic proximity
+  - Real-time performance measurements
+  - Load balancing
+
+### Case Studies
+
+**Netflix:**
+- Uses Amazon cloud for website/processing
+- Own private CDN for video delivery
+- Push caching during off-peak hours
+- No DNS redirect needed
+
+**YouTube:**
+- Google's private CDN
+- Pull caching
+- DNS redirect for server selection
+- HTTP streaming (not adaptive)
+
+## 2.7 Socket Programming
+
+### UDP Socket Programming
+
+**Key Concepts:**
+- Connectionless
+- Must attach destination address to each packet
+- No guarantee of delivery
+
+**Python Functions:**
+- `socket(AF_INET, SOCK_DGRAM)`: Create UDP socket
+- `sendto()`: Send with destination address
+- `recvfrom()`: Receive data and source address
+- `bind()`: Assign port to socket
+
+### TCP Socket Programming
+
+**Key Concepts:**
+- Connection-oriented (three-way handshake)
+- Reliable, in-order delivery
+- Server has welcoming socket and connection sockets
+
+**Python Functions:**
+- `socket(AF_INET, SOCK_STREAM)`: Create TCP socket
+- `bind()`: Assign port to server socket
+- `listen()`: Server listens for connections
+- `accept()`: Create connection socket for client
+- `connect()`: Client initiates connection
+- `send()/recv()`: Data transfer
+
+## Key Formulas and Metrics
+
+- **RTT (Round-Trip Time)**: Time for small packet to travel from client to server and back
+- **HTTP Response Time (non-persistent)**: 2 RTTs per object + transmission time
+- **Traffic Intensity**: (arrival rate × packet size) / link rate (must be < 1)
+- **Cache Hit Rate**: Fraction of requests satisfied by cache
+
+## Important Port Numbers
+
+- HTTP: 80
+- HTTPS: 443
+- SMTP: 25
+- DNS: 53
+- IMAP: 143
+- POP3: 110
+
+## Key Takeaways
+
+1. Application layer protocols define message format, order, and actions
+2. Client-server is simpler but less scalable than P2P
+3. HTTP is stateless; cookies add state
+4. DNS is critical Internet infrastructure using hierarchy
+5. CDNs bring content closer to users
+6. DASH enables adaptive video streaming
+7. Socket programming allows custom network applications
+8. TCP provides reliability; UDP provides simplicity
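
The Chapter 1 formulas listed under "Important Formulas" can be checked numerically. The following is a minimal Python sketch and is not part of the commit; every function and parameter name is an illustrative assumption. Units are assumed to be bits, bits/second, packets/second, meters, and meters/second.

```python
# Sketch of the Chapter 1 delay and throughput formulas.
# All names here are hypothetical helpers, not taken from NOTES.md.

def transmission_delay(L_bits: float, R_bps: float) -> float:
    """d_trans = L/R: time to push all bits of one packet onto the link."""
    return L_bits / R_bps


def propagation_delay(d_meters: float, s_mps: float) -> float:
    """d_prop = d/s: time for one bit to travel the length of the link."""
    return d_meters / s_mps


def store_and_forward_delay(N_links: int, L_bits: float, R_bps: float) -> float:
    """d = N * (L/R): one packet over N equal-rate links, ignoring other delays."""
    return N_links * transmission_delay(L_bits, R_bps)


def nodal_delay(d_proc: float, d_queue: float, d_trans: float, d_prop: float) -> float:
    """d_nodal = d_proc + d_queue + d_trans + d_prop."""
    return d_proc + d_queue + d_trans + d_prop


def traffic_intensity(L_bits: float, a_pps: float, R_bps: float) -> float:
    """La/R: should stay below 1 for the queue to remain stable."""
    return L_bits * a_pps / R_bps


def bottleneck_throughput(link_rates_bps: list) -> float:
    """Throughput of a path is limited by its slowest link."""
    return min(link_rates_bps)


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(store_and_forward_delay(N_links=3, L_bits=8000, R_bps=1_000_000))
    print(traffic_intensity(L_bits=8000, a_pps=100, R_bps=1_000_000))
    print(bottleneck_throughput([10e6, 2e6, 100e6]))
```

Keeping each formula in its own function makes it straightforward to plug in the numbers from a textbook exercise and compare against a hand calculation.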
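
The distribution-time bounds in section 2.5 are easier to compare with concrete numbers. The sketch below is not part of the commit; it evaluates `D_cs` and `D_P2P` exactly as written in the notes, with hypothetical function names and made-up example values. `F` is the file size in bits, `u_s` the server upload rate, `d_min` the slowest peer download rate, and `peer_uploads` the list of peer upload rates `u_i`.

```python
# Sketch of the section 2.5 distribution-time lower bounds.
# Function names and example values are assumptions for illustration.

def client_server_time(N: int, F: float, u_s: float, d_min: float) -> float:
    """D_cs = max{NF/u_s, F/d_min}: the server must upload N copies itself."""
    return max(N * F / u_s, F / d_min)


def p2p_time(F: float, u_s: float, d_min: float, peer_uploads: list) -> float:
    """D_P2P = max{F/u_s, F/d_min, NF/(u_s + sum(u_i))}: peers add upload capacity."""
    N = len(peer_uploads)
    return max(F / u_s, F / d_min, N * F / (u_s + sum(peer_uploads)))


if __name__ == "__main__":
    F, u_s, d_min = 1000.0, 100.0, 50.0
    for N in (10, 100, 1000):
        uploads = [10.0] * N  # every peer uploads at the same rate
        print(N, client_server_time(N, F, u_s, d_min), p2p_time(F, u_s, d_min, uploads))
```

With these inputs the client-server bound grows linearly in `N`, while the P2P bound levels off, which is the self-scaling behavior the notes describe.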
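
The UDP calls listed in section 2.7 (`socket(AF_INET, SOCK_DGRAM)`, `bind()`, `sendto()`, `recvfrom()`) can be exercised in one process over the loopback interface. This is a minimal sketch under stated assumptions, not code from the commit: the function name, the uppercase "service", the 2048-byte buffer, and the timeout are all illustrative choices, and it assumes loopback UDP is available on the machine.

```python
# Single-process sketch of a UDP request/response over 127.0.0.1.
# udp_uppercase_roundtrip is a hypothetical helper, not from NOTES.md.
from socket import socket, AF_INET, SOCK_DGRAM


def udp_uppercase_roundtrip(message: bytes, timeout: float = 2.0) -> bytes:
    """Send `message` to a local UDP 'server' socket and return its uppercased reply."""
    with socket(AF_INET, SOCK_DGRAM) as server, socket(AF_INET, SOCK_DGRAM) as client:
        # bind(): give the server socket an address; port 0 lets the OS pick a free port.
        server.bind(("127.0.0.1", 0))
        server.settimeout(timeout)
        client.settimeout(timeout)
        server_address = server.getsockname()

        # sendto(): UDP is connectionless, so the destination rides with each datagram.
        client.sendto(message, server_address)

        # recvfrom(): returns the payload plus the sender's address for the reply.
        data, client_address = server.recvfrom(2048)
        server.sendto(data.upper(), client_address)

        reply, _ = client.recvfrom(2048)
        return reply


if __name__ == "__main__":
    print(udp_uppercase_roundtrip(b"hello udp"))
```

No `listen()` or `accept()` appears here because UDP has no handshake; a TCP version would need both, plus `connect()` on the client side.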
