# Assignment 3 - Computer Science 314: Operating Systems
You should submit this assignment after you have finished Unit 3. It is worth 10% of your final grade.
Instructions:
Please answer the following questions in complete sentences.
Your answer for each question should be about 150 words. (100 marks total)
1. What are the advantages of using dynamic loading? (6 marks)
> With dynamic loading, a routine is not loaded until it is called.
> All routines are kept on disk in a relocatable load format.
> The main program is loaded into memory and is executed.
> When a routine needs to call another routine, the calling routine first
> checks to see whether the other routine has been loaded.
> If it has not, the relocatable linking loader is called to load the desired
> routine into memory and to update the program's address tables to reflect this change.
> Then control is passed to the newly loaded routine.
> The advantage of dynamic loading is that an unused routine is never loaded.
> This method is particularly useful when large amounts of code are needed
> to handle infrequently occurring cases, such as error routines. In this case,
> although the total program size may be large, the portion that is used (and
> hence loaded) may be much smaller.
> Dynamic loading does not require special support from the operating system.
> It is the responsibility of the users to design their programs to take
> advantage of such a method. Operating systems may help the programmer,
> however, by providing library routines to implement dynamic loading.
1. Explain the basic method for implementing paging. (8 marks)
> The basic method for implementing paging involves breaking physical memory
> into fixed-sized blocks called frames and breaking logical memory into blocks
> of the same size called pages. When a process is to be executed, its pages are
> loaded into any available memory frames from their source (a file system or
> the backing store). The backing store is divided into fixed-sized blocks that
> are of the same size as the memory frames.
> Every address generated by the CPU is divided into two parts:
> a page number (p) and a page offset (d). The page number is used as an index
> into a page table. The page table contains the base address of each page in
> physical memory. This base address is combined with the page offset to define
> the physical address that is sent to the memory unit.
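The translation step can be sketched in a few lines of Python (the page size, offset width, and page-table contents below are illustrative assumptions, not values from any particular system):

```python
PAGE_SIZE = 4096          # assumed page size (a power of two)
OFFSET_BITS = 12          # log2(PAGE_SIZE)

# Hypothetical page table: page number -> frame number
page_table = {0: 5, 1: 2, 2: 7}

def translate(logical_addr):
    """Split a logical address into (page number, offset) and combine the
    frame's base address with the offset to form the physical address."""
    p = logical_addr >> OFFSET_BITS          # page number
    d = logical_addr & (PAGE_SIZE - 1)       # page offset
    frame = page_table[p]                    # a KeyError here models a page fault
    return (frame << OFFSET_BITS) | d

# Logical address 4100 = page 1, offset 4 -> frame 2, offset 4 = 8196
print(translate(4100))
```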
1. Briefly describe the segmentation memory management scheme. How does it differ from the paging memory management scheme in terms of the user’s view of memory? (8 marks)
> An important aspect of memory management that became unavoidable with paging
> is the separation of the user's view of memory from the actual physical
> memory. As we have already seen, the user's view of memory is not the same as
> the actual physical memory. The user's view is mapped onto physical memory.
> This mapping allows differentiation between logical memory and physical
> memory.
> Segmentation is a memory-management scheme that supports this user view of
> memory. A logical address space is a collection of segments. Each segment
> has a name and a length. The addresses specify both the segment name and
> the offset within the segment. The user therefore specifies each
> address by two quantities: a segment name and an offset. (Contrast this
> scheme with the paging scheme, in which the user specifies only a single
> address, which is partitioned by the hardware into a page number and an
> offset, all invisible to the programmer.)
> For simplicity of implementation, segments are numbered and are referred
> to by a segment number, rather than by a segment name. Thus, a logical
> address consists of a two-tuple:
> <segment-number,offset>.
> A C compiler might create separate segments for the following:
> 1. The code
> 2. Global variables
> 3. The heap, from which memory is allocated
> 4. The stacks used by each thread
> 5. The standard C library
> Libraries that are linked during compile time might be assigned separate
> segments. The loader would take all these segments and assign them segment
> numbers.
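A segment-table lookup can be sketched the same way (the base/limit values below are illustrative, and the limit check models the hardware trap on an out-of-range offset):

```python
# Hypothetical segment table: segment number -> (base, limit)
segment_table = {
    0: (1400, 1000),   # code
    1: (6300, 400),    # global variables
    2: (4300, 1100),   # heap
}

def translate(segment, offset):
    """Map a <segment-number, offset> pair to a physical address,
    trapping if the offset runs past the segment's limit."""
    base, limit = segment_table[segment]
    if offset >= limit:
        raise MemoryError("addressing error: offset beyond segment limit")
    return base + offset

print(translate(0, 100))   # an address 100 bytes into the code segment
```

Unlike paging, the two-part address here is visible to the programmer: the segment number names a logical unit, while a paged address is split by the hardware invisibly.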
1. Explain the distinction between a demand-paging system and a paging system with swapping. (8 marks)
1. How does the second-chance algorithm for page replacement differ from the FIFO page replacement algorithm? (8 marks)
> The simplest page-replacement algorithm is a first-in, first-out (FIFO)
> algorithm. A FIFO replacement algorithm associates with each page the time
> when that page was brought into memory. When a page must be replaced, the
> oldest page is chosen. Notice that it is not strictly necessary to record
> the time when a page is brought in. We can create a FIFO queue to hold all
> pages in memory. We replace the page at the head of the queue. When a page
> is brought into memory, we insert it at the tail of the queue.
> The FIFO page-replacement algorithm is easy to understand and program.
> However, its performance is not always good. On the one hand, the page
> replaced may be an initialization module that was used a long time ago and is
> no longer needed. On the other hand, it could contain a heavily used variable
> that was initialized early and is in constant use.
> The basic algorithm of second-chance replacement is a FIFO replacement
> algorithm. When a page has been selected, however, we inspect its reference
> bit. If the value is 0, we proceed to replace this page; but if the reference
> bit is set to 1, we give the page a second chance and move on to select the
> next FIFO page. When a page gets a second chance, its reference bit is
> cleared, and its arrival time is reset to the current time. Thus, a page that
> is given a second chance will not be replaced until all other pages have been
> replaced (or given second chances). In addition, if a page is used often
> enough to keep its reference bit set, it will never be replaced.
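A minimal Python sketch of the victim-selection loop (the frame names and reference bits are illustrative; a real kernel inspects hardware-maintained bits):

```python
from collections import deque

def second_chance_victim(queue, ref_bits):
    """Pick a replacement victim. Pages sit in a FIFO queue; a page whose
    reference bit is set gets a second chance (bit cleared, moved to the
    tail) instead of being evicted."""
    while True:
        page = queue.popleft()
        if ref_bits.get(page, 0) == 0:
            return page              # reference bit 0: evict this page
        ref_bits[page] = 0           # clear the bit and re-queue at the tail
        queue.append(page)

frames = deque(["A", "B", "C"])      # "A" is the oldest page (FIFO head)
bits = {"A": 1, "B": 0, "C": 1}
print(second_chance_victim(frames, bits))   # "A" is spared; "B" is evicted
```

With every reference bit clear, the loop degenerates into plain FIFO, which is exactly the relationship the question asks about.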
1. Explain how copy-on-write operates. (8 marks)
> Process creation using the `fork()` system call may initially bypass the need
> for demand paging by using a technique similar to page sharing. This
> technique provides for rapid process creation and minimizes the number of new
> pages that must be allocated to the newly created process.
```plaintext
process1 memory process2
---------- |--------| ----------
| | | | | |
|--------| |--------| |--------|
| |---> | page A | <---| |
|--------| |--------| |--------|
| |---> | page B | <---| |
|--------| |--------| |--------|
| |---> | page C | <---| |
|--------| |--------| |--------|
| | | | | |
| | | | | |
| | | | | |
---------- |--------| ----------
process1 memory process2
---------- |--------| ----------
| | | | | |
|--------| |--------| |--------|
| |---> | page A | <---| |
|--------| |--------| |--------|
| |---> | page B | <---| |
|--------| |--------| |--------|
| |---| | page C | <---| |
|--------| | |--------| |--------|
| | | |--------| | |
| | ->| page C1| | |
| | |--------| | |
---------- |--------| ----------
```
> Recall that the `fork()` system call creates a child process that is a duplicate
> of its parent. Traditionally, `fork()` worked by creating a copy of the parent's
> address space for the child, duplicating the pages belonging to the parent.
> However, considering that many child processes invoke the `exec()` system call
> immediately after creation, the copying of the parent's address space may be
> unnecessary. Instead, we can use a technique known as `copy-on-write`, which
> works by allowing the parent and child processes initially to share the same
> pages. These shared pages are marked as copy-on-write pages, meaning that if
> either process writes to a shared page, a copy of the shared page is created.
> For example, assume that the child process attempts to modify a page containing
> portions of the stack, with the pages set to be copy-on-write. The operating
> system will create a copy of this page, mapping it to the address space of the
> child process. The child process will then modify its copied page and not the
> page belonging to the parent process. Obviously, when the copy-on-write
> technique is used, only the pages that are modified by either process are
> copied; all unmodified pages can be shared by the parent and child processes.
> Note, too, that only pages that can be modified need be marked as
> copy-on-write. Pages that cannot be modified (pages containing executable
> code) can be shared by the parent and child. Copy-on-write is a common
> technique used in several operating systems, including Windows XP, Linux, and
> Solaris.
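The behaviour in the diagram can be simulated in a toy Python model (not real virtual memory: "pages" are lists, "page tables" are dicts, and the write handler stands in for the page-fault handler):

```python
# Both page tables initially point at the same shared page objects.
shared = {"A": ["data-A"], "B": ["data-B"], "C": ["data-C"]}
parent = {name: page for name, page in shared.items()}
child = {name: page for name, page in shared.items()}

def cow_write(table, name, value):
    """On a write to a copy-on-write page, duplicate the page first, then
    modify only the writer's private copy."""
    table[name] = list(table[name])   # fault handler: copy the shared page
    table[name][0] = value

cow_write(child, "C", "child-data")
print(parent["C"][0])                 # parent still sees the original data
print(child["C"][0])                  # child sees its private copy
print(parent["A"] is child["A"])      # unwritten pages remain shared
```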
1. If you were creating an operating system to handle files, what are the six basic file operations that you should implement? (8 marks)
These six basic operations comprise the minimal set of required file operations.
1. Creating a file. Allocate space and add directory entry.
1. Writing a file. Write data starting at the seek position into the file.
1. Reading a file. Read data from the seek position into a specified buffer.
1. Repositioning within a file. Change the seek position in a file to read/write to/from.
1. Deleting a file. Remove the file entry from the directory and release file space.
1. Truncating a file. Reset size of file to 0 and keep other attributes.
Other common operations include appending new information to the end of an
existing file and renaming an existing file. These primitive operations can
then be combined to perform other file operations.
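As a rough illustration (not how a kernel implements them), the six operations map onto the POSIX-style calls exposed by Python's `os` module:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

fd = os.open(path, os.O_CREAT | os.O_RDWR)   # 1. create the file
os.write(fd, b"hello world")                 # 2. write at the seek position
os.lseek(fd, 0, os.SEEK_SET)                 # 4. reposition within the file
data = os.read(fd, 5)                        # 3. read into a buffer
os.ftruncate(fd, 0)                          # 6. truncate to length 0
os.close(fd)
os.unlink(path)                              # 5. delete the directory entry
print(data)                                  # the five bytes read back
```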
1. To create a new file, an application program calls on the logical file system. Describe the steps the logical file system takes to create a file. (8 marks)
1. Find space in the file system for the file.
1. Create an entry for the new file in the directory.
1. How is a hash table superior to a simple linear list structure? What issue must be handled by hash table implementation? (8 marks)
A hash table allows `O(1)` average-case lookup of directory entries, given a hash function with a low collision rate.
A collision occurs when two different values produce the same hash code.
When collisions occur, hash table implementations usually fall back to some form of chaining
or open addressing. Chaining typically stores colliding entries in a per-bucket list, such as a linked list.
Open addressing probes onward from the initial slot until it finds a match or an empty slot.
In both schemes the number of entries to examine is far smaller than a full linear scan, unless a poor-quality
hash function is used.
A linear scan starts at the beginning of the list and steps through it entry by entry until it finds a match,
giving a worst case of `O(n)`, which can be much slower for larger file systems.
If the list is kept sorted, or the entries are stored in a balanced binary search tree, lookups improve to
`O(log n)` via binary search, but the structure is then no longer a simple linear list and insertions and
deletions become more expensive.
A hash table is therefore superior because, with a good-quality hash function, directory lookups take constant time.
Computing the hash costs a little CPU time but reduces the number of disk I/O operations needed to find
the desired inode. The main issue an implementation must handle is collisions, as described above, along with
resizing the table as the directory grows.
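A toy chained hash table for directory lookup (illustrative only; a real file system hashes on-disk structures and the inode numbers here are made up):

```python
class HashDirectory:
    """Chained hash table mapping file names to inode numbers.
    Average O(1) lookup; collisions land in per-bucket lists."""

    def __init__(self, nbuckets=64):
        self.buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, name):
        return self.buckets[hash(name) % len(self.buckets)]

    def add(self, name, inode):
        self._bucket(name).append((name, inode))

    def lookup(self, name):
        for entry, inode in self._bucket(name):   # scan one chain, not all entries
            if entry == name:
                return inode
        raise FileNotFoundError(name)

d = HashDirectory()
d.add("readme.txt", 42)
print(d.lookup("readme.txt"))
```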
1. What are the factors influencing the selection of a disk-scheduling algorithm? (8 marks)
> Given so many disk-scheduling algorithms, how do we choose the best one?
> SSTF is common and has a natural appeal because it increases performance over FCFS.
> SCAN and C-SCAN perform better for systems that place a heavy load on the disk,
> because they are less likely to cause a starvation problem.
> For any particular list of requests, we can define an optimal order of retrieval, but
> the computation needed to find an optimal schedule may not justify the savings over SSTF or SCAN.
> With any scheduling algorithm, however, performance depends heavily on the number and types of requests.
> Suppose, for instance, that the queue usually has just one outstanding request.
> Then, all scheduling algorithms behave the same, because they have only one choice of where to move the disk head:
> they all behave like FCFS scheduling.
> Requests for disk service can be greatly influenced by the file-allocation method.
> A program reading a contiguously allocated file will generate several requests that are close together
> on the disk, resulting in limited head movement.
> The location of directories and index blocks is also important. Since every file must be opened to be used,
> and opening a file requires searching the directory structure, the directories will be accessed frequently.
> Suppose that a directory entry is on the first cylinder and a file's data are on the final cylinder.
> In this case, the disk head has to move the entire width of the disk. If the directory entry were on the middle
> cylinder, the head would have to move only one-half the width. Caching the directories and index blocks in main
> memory can also help to reduce disk-arm movement, particularly for read requests.
> Because of these complexities, the disk-scheduling algorithm should be written as a separate module of the operating
> system, so that it can be replaced with a different algorithm if necessary. Either SSTF or LOOK is a reasonable
> choice for the default algorithm.
> The scheduling algorithms described here consider only the seek distances.
> For modern disks, the rotational latency can be nearly as large as the average seek time.
> It is difficult for the operating system to schedule for improved rotational latency, though, because modern
> disks do not disclose the physical location of logical blocks. Disk manufacturers have been
> alleviating this problem by implementing disk-scheduling algorithms in the controller hardware built
> into the disk drive. If the operating system sends a batch of requests to the controller, the controller
> can queue them and then schedule them to improve both the seek time and the rotational latency.
> If I/O performance were the only consideration, the operating system would gladly turn over the responsibility
of disk scheduling to the disk hardware. In practice, however, the operating system may have other constraints
> on the service order for requests. For instance, demand paging may take priority over application I/O,
> and writes are more urgent than reads if the cache is running out of free pages.
> Also, it may be desirable to guarantee the order of a set of disk writes to make the file system robust
> in the face of system crashes. Consider what could happen if the operating system allocated a disk page
> to a file and the application wrote data into that page before the operating system had a chance to flush
> the file system metadata back to disk. To accommodate such requirements, an operating system may choose
> to do its own disk scheduling and to spoon-feed the requests to the disk controller, one by one, for some
> types of I/O.
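To make the seek-distance comparison concrete, here is a small Python sketch comparing FCFS with LOOK (the SCAN variant that reverses at the last pending request rather than at the disk edge). The head position and request queue are illustrative values on an assumed 0-199 cylinder disk:

```python
def fcfs_seek(head, requests):
    """Total head movement when servicing requests in arrival order."""
    total = 0
    for r in requests:
        total += abs(r - head)
        head = r
    return total

def look_seek(head, requests):
    """LOOK: sweep toward the high end servicing pending requests in
    cylinder order, then reverse and service the rest on the way down."""
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    total = 0
    for r in up + down:
        total += abs(r - head)
        head = r
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_seek(53, queue), look_seek(53, queue))
```

The sweep order dramatically cuts total head movement on the same queue, which is why LOOK is a reasonable default despite its extra bookkeeping.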
1. Explain the disadvantage(s) of the SSTF scheduling algorithm. (8 marks)
The Shortest Seek Time First algorithm will choose the next cylinder to read
from that is closest to the current head position. This ensures that the
distance that the head has to travel is minimized and allows for access to
nearby data quickly.
However, if the queue keeps receiving requests for cylinders near one
another, requests that lie further away can starve. For example, suppose
a request arrives for a cylinder near the current head position, then a
request arrives for a distant cylinder, and then a steady stream of
requests arrives near the head. SSTF will keep choosing the nearby
requests, so the distant request waits while requests that arrived later
are served first. If nearby requests keep arriving, the program waiting
on the distant request may suffer severe performance penalties or give
up entirely.
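The starvation effect can be demonstrated with a short Python sketch (the head position and cylinder numbers are illustrative):

```python
def sstf_order(head, requests):
    """Service order under SSTF: always pick the pending request whose
    cylinder is closest to the current head position."""
    pending, order = list(requests), []
    while pending:
        nxt = min(pending, key=lambda r: abs(r - head))
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order

# The request for cylinder 180 arrived second, yet it is served dead
# last because the cluster of nearby requests always wins.
print(sstf_order(50, [55, 180, 52, 58, 48, 61]))
```

In this static queue the distant request is merely delayed; with requests arriving continuously near the head, it could be postponed indefinitely.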
1. Explain the concepts of a bus and a daisy chain. Indicate how these concepts are related. (8 marks)
1. What are the three reasons that buffering is performed? (6 marks)
## Sources
* [Operating System Concepts][os-book]
[os-book]: https://www.os-book.com/OS10/index.html