# Chapter 2: Operating System Structures

* Chapter 2: 2.1 to 2.5

## Operating-System Services

The OS provides an environment for the execution of programs.
It provides certain services to programs and to the users
of those programs.

services provided to the user:

* User Interface:
  * command line interface: text commands and a method for entering them.
  * batch interface: commands and directives to control those commands are entered into files, and those files are executed.
  * graphical user interface: a window system with a pointing device to direct I/O.
* Program execution: Load a program into memory and run that program.
* I/O operations: a running program may require I/O, which may involve a file or an I/O device.
* File-system manipulation:
  * read and write files
  * create and delete files by name
  * search for files
  * list file information
  * permissions management
* Communication:
  * interprocess communication
  * shared memory
  * message passing
* Error detection:
  * CPU/Memory hardware
  * I/O devices
  * user programs

services for ensuring efficient operation:

* Resource allocation
  * CPU scheduling routines
  * allocate printers, modems, USB storage devices
* Accounting
  * keep track of which users use how much and what kinds of computer resources
* Protection and security
  * control access to information
  * one process should not be able to interfere with another process.
  * access to system resources is controlled
  * authentication

## User Operating-System Interface

### Command Interpreters

Some operating systems include a command interpreter in the kernel.
Others, such as Windows and UNIX, treat the command interpreter as a special program
that is running when a job is initiated or when a user first logs on.

On systems with multiple command interpreters to choose from, the interpreters are
known as shells. For example:

* sh: Bourne shell
* csh: C shell
* bash: Bourne-Again shell
* ksh: Korn shell
* zsh: Z shell

The main function of the command interpreter is to get and execute the next user-specified command.
Many commands will manipulate files:

* create
* delete
* list
* print
* copy
* execute

These can be implemented in two general ways:

1. the command interpreter itself contains the code to execute the command, which makes the appropriate system calls directly.
2. system programs: the interpreter uses the command name to identify a program file, loads it into memory, and executes it.

The second approach allows programmers to add new commands to the system simply by
creating new program files. The command interpreter itself can stay small, because it
delegates the work to system programs and does not change when commands are added.

### Graphical User Interfaces (GUI)

A GUI uses a mouse-based window and menu system characterized by a desktop metaphor.
The user moves the mouse to position the pointer on images or icons that represent files,
directories, and programs, and invokes them with a mouse click.

GUIs first appeared due in part to research taking place in the early 1970s at Xerox PARC.
The first GUI appeared on the Xerox Alto computer in 1973. However, graphical interfaces became
more widespread with the advent of Apple Macintosh computers in the 1980s.

The UI for macOS has undergone many changes, the most significant being the adoption of the `Aqua`
interface that appeared with Mac OS X.

Microsoft's first version of Windows - Version 1.0 - was based on the addition of a GUI interface
to the MS-DOS operating system.

Smartphones and handheld tablets use a touchscreen interface. Users interact using gestures on the touchscreen.

UNIX systems have traditionally been dominated by command-line interfaces.
Available GUIs include:

* Common Desktop Environment (CDE)
* X Window System
* K Desktop Environment (KDE)
* GNOME desktop by the GNU project

Both KDE and GNOME run on Linux and various UNIX systems and are available under open-source licenses.

## System Calls

System calls provide an interface to the services made available by an operating system.
These calls are generally available as routines written in C and C++.

Systems execute thousands of system calls per second. Developers design programs according to an
application programming interface (API). The API specifies a set of functions that are available
to an application programmer, including parameters that are passed to each function and the return
values the programmer can expect.

Common APIs are:

* Windows API for MS Windows
* POSIX API for POSIX-based systems (UNIX, Linux, macOS)
* Java API for programs that run on the JVM.

APIs are accessed via a library of code provided by the operating system.
In the case of UNIX and Linux for programs written in the C language, the library
is called `libc`. Each operating system has its own name for each system call.

The functions that make up the API typically invoke the actual system calls on behalf of the
application programmer.

Example of API

```c
#include <unistd.h>

ssize_t read(int fd, void *buf, size_t count);
```

Advantages of API:

* APIs provide portability.
* system calls are more difficult to work with
* most programming languages provide a `system-call interface` that serves as a link to system calls made available by the OS.
* complex details are handled by API rather than application developer.

System call interface:

* a number is associated with each system call
* the system-call interface maintains a table indexed according to these numbers
* the system-call interface invokes the intended system call in the operating-system kernel and returns the status of the system call and any return values.

General methods to pass parameters to the operating system:

1. params are passed in registers.
2. params are stored in a block, or table, in memory, and the address of the block is passed as a param in a register. (Linux/Solaris way)
3. params are pushed onto the stack by the program and popped off the stack by the operating system.

Some operating systems prefer the block or stack method because it does not limit the number or length of params passed in.

Types of System calls:

* process control
* file manipulation
* device manipulation
* information maintenance
* communications
* protection

### File Management

* `create()`
* `delete()`
* `open()`
* `read()`
* `write()`
* `reposition()`
* `close()`
* `get_file_attributes()`
* `set_file_attributes()`
* `move()`
* `copy()`

### Device Management

A process may need several resources to execute:

* main memory
* disk drives
* access to files

A process requests a device before using it and releases it when it is finished:

* `request()`
* `release()`

Sometimes I/O devices are identified by special file names, directory placement, or file attributes.

### Information Maintenance

* `dump()`
* the `trace` program lists each system call as it is executed.

Even microprocessors provide a CPU mode known as single step, in which a trap is executed by the CPU after every instruction.
Many operating systems provide a time profile of a program to indicate the amount of time that the program executes at a particular location
or set of locations.

The kernel keeps information about all its processes, and system calls are used to access this information.

* `get_process_attributes()`
* `set_process_attributes()`

### Communication

There are two common models of interprocess communication.

* message-passing model
* shared-memory model

In the `message-passing model` the communicating processes exchange messages with one another to transfer information.
Messages can be exchanged between the processes either directly or indirectly through a common mailbox.
Before communication can take place, a connection must be opened. The name of the other communicator must be known.

* another process on the same system
* process on another computer connected by a communications network.

Each computer on a network has a host name by which it is commonly known.
A host also has a network identifier, such as an IP address.

Each process has a name and this name is translated into an identifier by which the operating system can refer to the process.
The `get_hostid()` and `get_processid()` system calls do this translation.

Most processes that receive connections are special purpose daemons, which are system programs provided for that purpose.
They execute a `wait_for_connection()` call and are awakened when a connection is made.

The source of the connection is the client and the daemon is the server.

* `read_message()`
* `write_message()`
* `close_connection()`

In the `shared-memory model`, processes use `shared_memory_create()` and `shared_memory_attach()` system calls to create and
gain access to regions of memory owned by other processes.

### Protection

* `set_permission()`
* `get_permission()`
* `allow_user()`
* `deny_user()`

## Systems Programming

`System programs`, also known as `system utilities`, provide a convenient environment for program development and execution.

* File management: These programs create, delete, copy, rename, print, dump, list and generally manipulate files and directories.
* Status information: Some programs simply ask the system for the date, time, amount of available memory or disk space, number of users, or similar status information.
* File modification: Several text editors may be available to create and modify the content of files stored on disk or other storage devices.
* Programming-language support: Compilers, assemblers, debuggers, and interpreters for common programming languages (such as C, C++, Java, and PERL) are often provided with the operating system or available as a separate download.
* Program loading and execution:
  * Once a program is assembled or compiled, it must be loaded into memory to be executed.
  * The system may provide absolute loaders, relocatable loaders, linkage editors, and overlay loaders.
  * Debugging systems for either higher-level languages or machine language are needed as well.
* Communications:
  * these programs provide the mechanism for creating virtual connections among processes, users, and computer systems.
  * They allow users to send messages to one another's screens, to browse Web pages, to send e-mail messages, to log in remotely, or to transfer files from one machine to another.
* Background Services:
  * All general-purpose systems have methods for launching certain system-program processes at boot time.
  * Some of these processes terminate after completing their tasks, while others continue to run until the system is halted.
  * Constantly running system-program processes are known as services, subsystems or daemons, e.g. the network daemon.

Most operating systems also supply programs that are helpful in solving common problems or performing common operations.

These `application programs` include:

* web browsers
* word processors
* text formatters
* spreadsheets
* database systems
* compilers
* plotting and statistical analysis
* games

## Design Goal

* User goals
  * users want the system to be convenient to use
  * easy to learn and to use
  * reliable
  * safe
  * fast
* System goals
  * system should be easy to design
  * system should be easy to implement
  * system should be easy to maintain
  * system should be flexible
  * system should be reliable
  * system should be error free
  * system should be efficient

### Mechanisms and Policies

One important principle is the separation of `policy` from `mechanism`.
Mechanisms determine `how` to do something.
Policies determine `what` will be done.

E.g.

The timer is a mechanism for ensuring CPU protection.
Deciding how long the timer is to be set is a policy decision.

Separating mechanism from policy is important for flexibility.
Policies are likely to change across places or over time.
In the worst case, each change in policy would require a change
in the underlying mechanism.

### Implementation

Once an operating system is designed, it must be implemented.
Early operating systems were written in assembly language.
The Linux and Windows operating system kernels are written mostly in C.
Small sections are written in assembly code for device drivers and for saving
and restoring the state of registers.

MS-DOS was written in Intel 8088 assembly language. It runs natively only on
the Intel x86 family of CPUs.

The Linux operating system is written mostly in C and is available natively on a number
of different CPUs, including Intel x86, Oracle SPARC and IBM PowerPC.

Major performance improvements in operating systems are more likely to be the result of better
data structures and algorithms than of excellent assembly-language code. In addition, although
operating systems are large, only a small amount of the code is critical to high
performance:

* the interrupt handler
* I/O manager
* memory manager
* CPU scheduler

After the system is written, bottleneck routines can be identified and can be replaced with
assembly-language equivalents.

## Operating-System Structure

### Simple Structure

MS-DOS

```plaintext
---------------------------
| application program     |
---------------------------
| resident system program |
---------------------------
| MS-DOS device drivers   |
---------------------------
| ROM BIOS device drivers |
---------------------------
```

UNIX

```plaintext
|----------------------------------------------------------------|
| the users                                                      |
|----------------------------------------------------------------|
| shells and commands                                            |
| compilers and interpreters                                     |
| system libraries                                               |
|----------------------------------------------------------------|
|----------------------------------------------------------------|
| system-call interface to the kernel                            |
|----------------------------------------------------------------|
|----------------------------------------------------------------|
| signals terminal     | file system           | CPU scheduling  |
| handling             | swapping block I/O    | page replacement|
| character I/O system | system                | demand paging   |
| terminal drivers     | disk and tape drivers | virtual memory  |
|----------------------------------------------------------------|
|----------------------------------------------------------------|
| terminal controllers | device controllers | memory controllers |
| terminals            | disks and tapes    | physical memory    |
|----------------------------------------------------------------|
```

### Layered Approach

With hardware support, operating systems can be broken into pieces
that are smaller and more appropriate than those allowed by the
original MS-DOS and UNIX systems.

The layered approach breaks up the operating system into multiple layers.
The bottom layer (level 0) is the hardware; the highest layer (layer N) is
the user interface.

```plaintext
| layer N (UI)     |
| ...              |
| layer 1          |
| layer 0 hardware |
```

### Microkernels

As UNIX expanded, the kernel became large and difficult to manage.
In the mid-1980s, researchers at Carnegie Mellon University developed
an operating system called 'Mach' that modularized the kernel using
the microkernel approach. This method structures the operating system by
removing all nonessential components from the kernel and implementing
them as system and user-level programs.

The result is a smaller kernel but there is little consensus on which services
should remain in the kernel and which should be implemented in user space.

Microkernels typically provide minimal process and memory management, in addition to
a communication facility.

Communication is provided through message passing.

Microkernels make it easier to extend the operating system.
They are also easier to port from one hardware design to another,
and they provide more security and reliability, since most services
run as user processes rather than in the kernel.

The macOS kernel (aka Darwin) is also partly based on the Mach microkernel.

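The first release of Windows NT had a layered microkernel organization, but its
performance was low. Windows NT 4.0 partially corrected the problem by moving layers
from user space to kernel space, and by the time Windows XP was designed the
architecture had become more monolithic than microkernel.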

### Modules

The kernel has a set of core components and links in additional services via modules,
either at boot time or during run time. These are known as loadable kernel modules.

This design is implemented in Solaris, Linux, macOS and Windows.

The kernel provides core services while other services are implemented dynamically,
as the kernel is running. Linking services dynamically is preferable to adding
new features directly to the kernel, which would require recompiling the kernel
every time a change was made.

The Solaris operating system structure is organized around a core kernel with
seven types of loadable kernel modules:

1. Scheduling classes
1. File systems
1. Loadable system calls
1. Executable formats
1. STREAMS modules
1. Miscellaneous
1. Device and bus drivers

### Hybrid Systems

Few operating systems adopt a single, strictly defined structure.
Instead, they combine different structures, resulting in hybrid
systems that address performance, security and usability issues.
Linux and Solaris are monolithic, because having the operating system
in a single address space provides very efficient performance.
However they are also modular, so new functionality can be dynamically
added to the kernel.

#### Apple macOS

Apple macOS uses a hybrid structure.
The top layers include the Aqua user interface and a set of application
environments and services.
The Cocoa environment specifies an API for the Objective-C programming
language, which is used for writing macOS applications.

Below these layers is the kernel environment, which consists primarily
of the Mach microkernel and BSD UNIX kernel.
Mach provides memory management; support for remote procedure calls (RPCs)
and interprocess communication (IPC) facilities, including message passing;
and thread scheduling.
The BSD component provides a BSD command-line interface, support for networking
and file systems, and an implementation of POSIX APIs, including Pthreads.

In addition to Mach and BSD, the kernel environment provides an I/O kit
for development of device drivers and dynamically loadable modules (which
macOS refers to as kernel extensions).
The BSD application environment can make use of BSD facilities directly.

```plaintext
----------------------------------------
| GUI (Aqua)                           |
----------------------------------------
| application environment and services |
| (Java) (Cocoa) (Quicktime) (BSD)     |
----------------------------------------
| kernel       |                       |
|              |          BSD          |
|              |------------------------
|  Mach                                |
|                                      |
----------------------------------------
| I/O kit      |   kernel extensions   |
----------------------------------------
```

#### iOS

iOS is a mobile operating system designed by Apple to run its smartphone,
the iPhone, as well as its tablet computer, the iPad. iOS is structured on
the Mac OS X operating system, with added functionality pertinent to mobile
devices, but does not directly run macOS applications.

Cocoa Touch is an API for Objective-C that provides several frameworks for
developing applications that run on iOS devices. Cocoa Touch provides
support for hardware features unique to mobile devices, such as touch screens.

The media services layer provides services for graphics, audio and video.
The core services layer provides features like support for cloud computing
and databases. The bottom layer represents the core operating system which
is based on the kernel environment.

```plaintext
------------------
| Cocoa Touch    |
------------------
| Media Services |
------------------
| Core Services  |
------------------
| Core OS        |
------------------
```

#### Android

The Android operating system was designed by the Open Handset Alliance
(led primarily by Google) and was developed for Android smartphones and
tablet computers.

Android runs on a variety of mobile platforms and is open-source.

```plaintext
----------------------------------------------------
|                 Applications                     |
----------------------------------------------------
----------------------------------------------------
|            Application Framework                 |
----------------------------------------------------
-----------------------------  --------------------
|   Libraries               | |  Android runtime   |
| ---------   ----------    | | ------------------ |
| | SQLite |  | openGL |    | | | Core libraries | |
| ---------   ----------    | | ------------------ |
| ----------- ------------- | |  ----------        |
| | surface | | media     | | |  | Dalvik |        |
| | manager | | framework | | |  |   VM   |        |
| ----------- ------------- | |  ----------        |
| ----------  --------      | |                    |
| | webkit |  | libc |      | |                    |
| ----------  --------      | |                    |
----------------------------   --------------------
----------------------------------------------------
|                  Linux kernel                    |
----------------------------------------------------
```

## Operating System Debugging

`debugging` is the activity of finding and fixing errors in a system,
both in hardware and in software.
`debugging` can also include `performance tuning` which seeks to
improve the performance by removing `bottlenecks`.

### Failure Analysis

If a process fails, most operating systems write the error information
to a log file to alert system operators or users that the problem occurred.
The OS can also take a `core dump`: a capture of the process's memory,
stored in a file for later analysis. The name dates from when memory was
referred to as "core" memory.

A failure in the kernel is called a `crash`. When a crash occurs, error
information is saved to a log file, and the memory state is saved to a `crash dump`.

A common technique is to save the kernel's memory state to a section of disk
set aside for this purpose that contains no file system. If the kernel detects
an unrecoverable error, it writes the entire contents of memory, or at least the
kernel-owned parts of the system memory, to the disk area.
When the system reboots, a process runs to gather the data from that area and write
it to a crash dump file within a file system for analysis.

### Performance Tuning

The operating system must have some means of computing and displaying
measures of system behaviour.

In many systems, the OS does this by providing `trace listings` of system behaviour.
E.g. `top`.

### DTrace

DTrace is a facility that dynamically adds probes to a running system, both
in user processes and in the kernel. These probes can be queried via the
D programming language to determine an astonishing amount about the kernel,
the system state, and process activities.

In DTrace output that follows a call from user code into the kernel, lines ending with "U" are executed in user mode, and lines ending in "K" in kernel mode.

Debugging interactions between user-level and kernel code is difficult without tools
that understand both sets of code and can instrument the interaction.

`profiling`, which periodically samples the instruction pointer to determine which code
is being executed, can show statistical trends but not individual activities.

Code can be included in the kernel to emit specific data under specific circumstances, but
that code slows down the kernel and tends not to be included in the part of the kernel
where the specific problem being debugged is occurring.

`DTrace` runs on production systems and causes no harm to the system.
It slows activities while enabled, but after execution it resets the system to its
pre-debugging state.

It can broadly debug everything happening in the system (both the user and kernel levels
and between the user and kernel layers).

DTrace is composed of a compiler, a framework, `providers` of `probes` written within
that framework, and `consumers` of those probes.

## Operating-System Generation

Operating systems are designed to run on any of a class of machines at a variety of sites with a
variety of peripheral configurations. The system must then be configured or generated
for each specific computer site, a process sometimes known as system generation (`SYSGEN`).

Operating systems are usually distributed as an ISO image, which is a file in the format
of a CD-ROM or DVD-ROM. To generate a system we use a special program.
The SYSGEN program reads from a given file, or asks the operator of the system for info
concerning the specific configuration of the hardware system, or probes the hardware
directly to determine what components are there.

The following kinds of info must be determined:

* What CPU is to be used?
* What options are installed? e.g. extended instruction sets, floating point arithmetic
* How will the boot disk be formatted?
* How many sections or partitions will it be separated into?
* What will go in each partition?
* How much memory is available?
* What devices are available?
* What operating-system options are desired or what parameter values are to be used?

Once this information is determined, it can be used in several ways.

1. A sysadmin can modify a copy of the source code and compile a custom copy of the OS.
1. At a less tailored level, the system description can lead to the creation of tables and the selection of modules from a precompiled library.

The major differences are the size and generality of the generated system and the ease of
modifying it as the hardware configuration changes.

## System Boot

How does the hardware know where the kernel is or how to load the kernel?
The procedure of starting a computer by loading the kernel is known as booting the system.

On most computer systems, a small piece of code known as the bootstrap program or bootstrap loader
locates the kernel, loads it into main memory, and starts its execution.

Some computer systems use a two-step process in which a simple bootstrap loader fetches a more
complex boot program from disk, which in turn loads the kernel.

When a CPU receives a reset event the instruction register is loaded with a predefined
memory location, and execution starts there. At that location is the initial bootstrap
program. This program is in the form of read-only memory (ROM) because the RAM is in an unknown
state at system startup. ROM is convenient because it needs no initialization and cannot
easily be infected by a computer virus.

The bootstrap program can run diagnostics to determine the state of the machine.
If the diagnostics pass, the program can continue with the booting steps.
It can also initialize all aspects of the system, from CPU registers to device controllers
and the contents of main memory.

Some systems store the entire operating system in ROM.
Storing the OS in ROM is suitable for small operating systems,
simple supporting hardware, and rugged operation.

A problem with this approach is that changing the bootstrap code requires changing the ROM
hardware chips. Some systems use erasable programmable read-only memory (EPROM), which
is read-only except when explicitly given a command to become writable.

All forms of ROM are also known as firmware, since their characteristics fall somewhere
between those of hardware and those of software. Executing code on firmware is slower
than executing code in RAM.

Some systems store the OS in firmware and copy it to RAM for fast execution.

For large operating systems the bootstrap loader is stored in firmware and the operating
system is on disk. In this case, the bootstrap runs diagnostics and has a bit of code
that can read a single block at a fixed location from disk into memory and execute the
code from that boot block.

The program stored in the boot block may be sophisticated enough to load the entire OS
into memory and begin its execution.

GRUB is an example of an open-source bootstrap program for Linux systems. All of the
disk-bound bootstrap, and the operating system itself, can be easily changed by writing
new versions to disk.
A disk that has a boot partition is called a boot disk or system disk.

Once the bootstrap program has been loaded, it can traverse the file system
to find the operating system kernel, load it into memory, and start its execution.
At this point the system is running.

## Summary

Operating systems provide a number of services. At the lowest level, system calls allow a running
program to make requests from the operating system directly. At a higher level, the command
interpreter or shell provides a mechanism for a user to issue a request without writing a
program.

Commands may come from files during batch-mode execution or directly from a terminal
or desktop GUI when in an interactive or time-shared mode. System programs are provided
to satisfy many common user requests.