1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
|
## Question 1
Consider the following dimensions, dimension attributes, and dimension sizes for Farm Coop-Insurance.
```text
Member(member_ID, Name, Address)
InsuredItem(Item_ID, Description, Coverage_Type)
CoopOffice(Office_ID, Address, Manager_name)
Policy(Policy_ID, Type)
Period(Date_Key, Fiscal_Period)
Claim(Claim_ID, Claim_Description, Claim_Type)
```

* On average, there are two members for each policy and item covered by the policy.
* There is an average of ten covered items per policy.
* Each policy is registered and managed by only one office of the cooperative.
* The company has approximately one million policies at the present time.
* Approximately five percent of these policies experience some change each month.
* The length of the fiscal period is one month.
* The decision system reports are supposed to be based on five years of data.
The facts to be recorded for each combination of these dimensions are:
* `Policy_Premium`
* `Deductible`
* `Monthly_Claim_Total`
**Design a star schema for this problem.**

**Using the assumptions stated above, estimate the number of rows in the fact table.**
* 1 policy has 2 members
* 1 policy has 10 items
* 1 office for each policy
* 1M total policies
* 5% of 1M policies = 50,000 policy changes each month
* 5 years is 60 months
* In 60 months there are 3M policy changes
```text
Total rows = 1 * 2 * 10 * 1 * 50K * 60
= 60,000,000 rows
```
~= 60,000,000 rows
**Estimate the total size of the fact table (in bytes), assuming an average of 5 bytes per field.**
```text
Total size = 9 columns * 5 bytes * N rows
= 45 bytes * N rows
= 45 bytes * 60,000,000 rows
= 2,700,000,000 bytes
```
~= 2.7 gigabytes
## Question 2
Suggest an appropriate recovery technique that a database administrator could use to resolve each of the following situations.
**A network disconnection occurs while a user is entering a transaction at an ATM bank machine.**
Perform a `Backward Recovery` i.e. rollback the transaction.
**A disk drive fails during regular operations while a clerk is entering data about newly registered students.**
If disk mirroring is enabled then one of the secondary nodes can be promoted to become the new primary.
If mirroring is not enabled, then replacing the drive and restoring from the most recent backup can
help to restore service. The failed drive can be analyzed to see what bin/transaction logs can be
recovered so that they can be replayed against the new primary from the time of the last backup.
**The registration office at a university entered an incorrect amount for a student tuition payment. The error was discovered by the financial services department several weeks later.**
Apply a compensating transaction to the system in order to correct the error.
**The database administrator of a financial institution performed a full database backup, but forgot to activate the journalizing facility. Afterwards, data entry clerks at the financial institution entered transactions for two hours before the database became corrupt. It is discovered that the journalizing facility of the database has not been activated since the backup was made.**
Restore the database to the most recent database backup and apply the after images from the database change log.
## Question 3
Suggest the most appropriate security measures for each of the situations described below.
**The Western Union bank uses an electronic funds transfer (EFT) system to transmit sensitive financial data between its branches all over the world.**
* Server security: Ensure all servers are patched and hardened appropriately. Ideally data is transmitted over an authenticated and authorized API endpoints with short lived authn/authz tokens that can be revoked.
* Network security: Ensure that all data is transported over an encrypted transport layer like TLS and that the forward secrecy and that only trusted cipher suites are enabled on server. Mutual TLS handshake for client/server would be nice as well.
* Data Privacy: Consider storing the data on disk in an encrypted form and restrict access to the data via the proper access controls.
**A fighter jet simulation company has set up an off-site computer-based training centre for the F35.
The company wishes to restrict access to the site to authorized employees.
Since each employee's use of the centre is occasional, it does not wish to provide the employees with keys to access the centre.**
Physical access to the data should not be the only way to restrict access.
Data should be protected by multiple layers of security in the event that there is a breach in the physical facility.
The physical location should be secured via some form of physical access controls such as proximity access cards that can be configured to provided
access during specific operational hours. Installation of physical cameras with on-premise security might be worth considering.
Access to and from the facility should be recorded. Employees should wear their badges so that an image of their face with a name and clearance level
can clearly be seen. It should be easy to identify people in the facility that do not belong in the facility.
**A golf club uses a simple password system to protect its database.
The club has created a new Web site to allow both its members and its employees to use the new Web-based system to access and update information.
The club finds that it needs a more comprehensive security system to grant different privileges (such as read-only versus create or update) to different users.**
Considering adding roles based access controls or permissions based access controls at the application layer if possible.
Otherwise, you can use the access controls built into most databases.
Databases offer the ability to create different users/roles with different levels of access to perform specific operations on the specific tables.
You could create separate users for each member and employee and separate roles for members and employees.
You could control access by specifying individual permissions for a user or role.
Then ensure that users are attached to the appropriate role.
**A training centre at the Northern Technical University has experienced considerable difficulty with unauthorized users who access files and databases by appropriating passwords from legitimate users.**
Require a second factor of authentication for all users of the system. If passwords are continuously being compromised it's possible that a stronger password policy is necessary
but adding a second factor of authentication such as a time based one time password ([TOTP](https://tools.ietf.org/html/rfc6238)) should aid in blocking unprivileged access.
## Question 4
**The UBS broker company has a database server with three disks.
Both the accounting and stock exchange applications share the same disk, and they are experiencing performance problems.
Discuss potential reasons for the performance problem, and suggest how to reduce I/O contention.**
All I/O operations are occurring on a single disk which is causing a bottleneck. Specific tables or datafiles can be moved to distribute the I/O across the 3 disks
or the data can be striped across the three disks ensuring that the I/O is equally balanced across all three disks.
Another area to look at, is the query logs. The query logs will allow the DBA to see if there are expensive queries due to many disk seeks due to a missing database index.
An index can help by trying to make sure related data is logically grouped on disk to reduce disk seeks back and forth.
If the applications are read heavy applications then it would make sense to ensure that solid state drives are used to optimize the reads.
## Question 5
Consider the concurrent execution of the following transactions.
```markdown
| T1 | T2 |
| ------- | ------- |
| Read A | Read B |
| Read B | Write A |
| Write C | Read C |
| Write A | Write B |
| Commit | Commit |
```
**List two problems that may occur from the concurrent execution of these two transactions.**
1. T1/T2 will overwrite data `A` depending on the order the transactions are committed.
2. T2 could be operating on a stale read of data `C` that is being modified in T1.
## Question 6
The Edmonton International Airport would like to implement a database that will be used to keep track of airplanes, their owners, airport employees, and pilots.
From the requirements for this database, the following information was collected.
* Each airplane has a registration number, is of a particular plane type, and is stored in a particular hangar.
* Each hangar is managed by an employee who supervises the maintenance services performed in that hangar.
* Each plane type has a model number, a capacity, and a weight.
* Each hangar has a number, a capacity, and a location.
* The database keeps track of the owner of each plane, and the employees who have maintained each plane.
* The database keeps track of each airplane’s purchase date.
* Each maintenance service record is identified by a work code, and includes the employee who performed the service, the date and time of the service, and the number of hours the maintenance service required.
* Each plane undergoes service many times, and all its service records are kept.
* An owner can be either a person or a corporation.
* A person can be an owner, a pilot, or an employee of the airport.
* Each pilot has specific attributes, including license number and restrictions.
* Each employee has specific attributes, including salary and shift worked.
* The database stores social insurance number, name, address, and telephone number for all person entities.
* The database stores name, address, and telephone number for all corporation entities.
* The database also keeps track of the types of plane each pilot is authorized to fly, and the types of plane each employee is qualified to service.
**Draw an object-oriented diagram for the Edmonton International Airport database.**

|