## Project Module 1 For a long time, the Royal Victoria Hospital (RVH) worked with an information system that consisted of a mix of paper-based files and small independent databases developed within some departments. The new administration has created an information system (IS) department, and they hired you as information officer: head of the team in charge of the design and implementation of a new global information system. The following information was collected by the first team from the IS department, who conducted interviews with some of the hospital administration and staff to identify entity types for the hospital. The hospital depends primarily on four groups of people: * employees * physicians * patients * volunteers Of course, some common attributes are shared by all of these groups: * person_ID (identifier) * name * address * birth date * phone number Each group also has at least one unique attribute of its own. Employees have a date hired, volunteers have a skill, physicians have a specialty and a pager number, and patients have a contact date (date of the first contact with the hospital). Some people may belong to two or more of these groups at a given time (e.g., patient and volunteer). Patient: A person who is either admitted to the hospital, or is registered in an outpatient program. One, and only one, physician is responsible for each patient. Patients are divided into two groups: resident and outpatient. Each outpatient is scheduled for zero or more visits. The entity visit has two attributes: date (partial identifier), and comments. Note that an instance of visit cannot exist without an outpatient owner entity. Only resident patients are assigned to a bed, and a bed may or may not be assigned to a patient. Physician: A member of the hospital staff who may admit patients to the hospital, and who can administer medical treatments. A given physician can be responsible for zero or more patients at a given time. A patient must be referred to the hospital by exactly one physician. A physician can refer any number of patients, or may not refer any patient. Physicians may perform any number of treatments on behalf of any number of patients, or may not perform any treatment. A patient may have treatments performed by any number of physicians. For each treatment performed on behalf of a given patient by a particular physician, the hospital records the treatment date, treatment time, and results. Employee: Any person employed as part of the hospital staff. Employees are subdivided into three groups: nurse, staff, and technician. Only nurse has the attribute certificate, which indicates a qualification (RN, LPN, etc.). Only staff has the attribute job class, and only technician has the attribute skill. Each nurse is assigned to one (and only one) care centre. Each technician is assigned to one or more laboratories. Care centre: A treatment centre within the hospital. Examples of care centres are maternity, emergency, and cardiology. Attributes of care centre are name (identifier) and location. A care centre may have one or more nurses assigned to it. Also, one of the nurses assigned to each care centre is appointed nurse-in-charge. A nurse cannot be appointed nurse-in-charge of a care centre unless s/he has an RN certificate. Each hospital employee is assigned to work in one or more care centre. Each care centre has at least one employee, and may have any number of employees. The hospital records the number of hours per week that a given employee works in a particular care centre. Each physician can be assigned to one or more care centres, and a care centre can have one or more physicians assigned to it. Laboratory: A unit in the hospital where clinical tests (i.e., blood, urine, tissue, etc.) are performed to obtain information about the health of a patient. Attributes of laboratory include name (identifier) and location. A laboratory must have one or more technicians assigned to it. Bed: A hospital bed that may be assigned to a resident patient who is admitted to the hospital. Each bed has a bed number, a room number, and a care centre ID. There may be no bed assigned to a care centre, or a care centre may have one or more beds assigned to it. Item: Any medical or surgical item that is used in treating a patient. Each item has an item number, description, and unit cost. A patient may optionally consume any number of items. A given item may be consumed by one or more patients, or may not be consumed. For each item consumed by a patient, the hospital records the date, time, quantity, and total cost (which can be computed by multiplying quantity by unit cost). Treatment: Any test or procedure performed by a physician on behalf of a patient. Each treatment has a treatment ID, which consists of a treatment number and a treatment name. **Is the ability to model subtype/supertype relationships likely to be important in a hospital environment such as the RVH?** It's possible that implementation inheritance may need to be modeled in this case due to the following reasons: * All the people described above share some common attributes which could indicate a need for a subtype/supertype relationship. * Employees are subdivided into three groups: nurse, staff and technician. The supertype would allow grouping the common attributes of the subtypes to reduce duplication. The subtypes are important because the ensure that logic related to the subtype can be isolated to the subtype. I think that when people model data they tend to jump towards class/subclass relationship too early in the design process. I prefer to favour composition over inheritance. If the [Liskov substitution principle](https://en.wikipedia.org/wiki/Liskov_substitution_principle) is satisfied for any relationship, then I think utilizing implementation inheritance is acceptable. **Can the business rules paradigm, and the ability to easily define, implement, and maintain business rules, be used as a competitive advantage in a hospital environment such as the RVH?** Yes. Change is the only constant in software development. Designing software so that it can be extended, changed and modified allows the needs of the organization to change with the freedom of knowing that the software can change with it. **Do there appear to be any weak entities in the description of the data requirements in this project module?** > A weak entity is a type of entity that exists when some other type exists so it is dependent on an identifying owner. The `Visit` entity appears to be a weak entity due to the fact that it's existance relies on requiring an associated outpatient owner entity. **Draw an EER diagram to accurately represent this set of requirements. State any assumption you had to make in developing the diagram.** ![EER](./module-1.png) I have chosen to model the different types of staff as subclasses of a `Staff` class. Each of the different types of Staff can have different skills modelled as a text field in the `qualifications` attribute on the `Staff` relation. Every entity has a primary key called `id`. Foreign keys are modeled using a convention of `_id`. **Are there any universal data models that can be reused as a starting point for modeling RVH’s data requirements?** The [Oracle Healthcare Data Model](https://docs.oracle.com/cd/E18811_01/doc/doc.112/e18026/logical_physical_hdm.htm#CACBAAAH) is a large model that covers many different aspects of health care. For the purpose of this project this data model includes more functionality than what is necessary for this project. ## Project Module 2 Use the relational schema of the EER diagram you developed in Module 1 to answer the following questions. **Should the RVH use normalization when designing its database?** Yes, minimizing duplication will make it easier to keep changes synchronized. At the moment there is no identified need to de-normalize the database. **Why are entity integrity and referential integrity constraints of importance to the hospital?** It's possible that people will need to change their address or phone number. By ensuring that the proper referential integrity constraints in place we make it easier to update records and limit the risk of creating orphaned records. This helps to keep out single source of truth organized, which makes it easier for scheduling personnel and ensuring that patients receive the care that they need. **Map the EER diagram to a relational schema, and transform the relation into 3NF** ![relational schema](./module-2.png) **Besides the 3NF relations, what additional types of information are required to create a physical database design?** * Usage patterns * Regional policies related to location of storage. .e.g EU data regulations. * Data retention policies * Data segregation policies * Backup * Security * Throughput * Maintenance **Are there opportunities for horizontal or vertical partitioning of the database? Are there other opportunities to denormalize the relations of this database? If no, explain why? If yes, how might you denormalize the database?** > Horizontal partitioning is similar to creating supertype/subtype relationship because different types of the entity are involved in different relationships. This schema has some supertype/subtype relationships so horizontal scaling is possible. We could also look at partitioning the data by care centre, so that there would be a separate "staff" table for each care centre. Performing a vertical partition like this would isolate the data related to each care centre separately but does increase the complexity of the program working with this data. This may also lead to some unnecessary duplication for scenarios like staff who work in multiple care centres. I chose to denormalize the data by flattening the supertype/subtype hierachy into a single table (staff). This is more commonly known as Single Table Inheritance. A 'type' column is used as a discriminator to decide what type of subclass and portions of the table to load. **Suppose the date treatment performed was not entered. What procedures are required to handle the missing data?** * Add logic in the application layer to validate the input data. * Add a NOT NULL check constraint on the column in the database. **Consider the following query against the RVH database.** For each treatment performed in the past two weeks, list the physicians performing the treatment (grouped by treatment), and the number of times this physician performed that particular treatment, on that particular day. Order the list by treatment ID, and by reverse chronological order for each treatment ID. ```sql SELECT treatments.id, date(treatments.occurred_at), staff.id, count(treatments.id) FROM treatments INNER JOIN patients ON patients.id = treatments.patient_id INNER JOIN staff ON staff.id = patients.physician_id AND staff.type = "Physician" INNER JOIN accounts ON accounts.id = staff.account_id WHERE occurred_at > current_date - interval '14 days' GROUP BY treatments.id, DATE(treatments.occurred_at), staff.id ORDER BY treatments.id DESC ``` Create secondary key indexes to optimize the performance of this query. State any assumptions. Each of the joins above take advantage of foreign key indexes except for the group by on the `treatments.occurred_at` column and the filter to find all staff with a type of `Physician`. The query above groups by date so we can reduce the size of the index by applying the `DATE` function to the `occurred_at` column. ```sql CREATE INDEX index_occurred_at_on_treatments ON treatments ((occurred_at::DATE)); CREATE INDEX intex_type_on_staff ON staff (type); ``` ## Project Module 3 Consider the following relations: ```text Patients(pid, name, address, telephone, care_centre_id) Care_centres(cid, name, location, nurse_charge_id) Treatments(tid, patient_id, physician_id, treatment_name, date) Nurses(nid, name, care_centre_id, certificate_type, telephone, salary) Physicians(phid, name, pager_number, specialization, salary). ``` Use Oracle to complete the following tasks. **Create the tables that correspond to these relations in your Oracle home database.** ```sql CREATE TABLE Nurses( nid bigint primary key, name varchar, care_centre_id bigint, certificate_type varchar, telephone varchar, salary decimal ); CREATE TABLE Care_centres( cid bigint primary key, name varchar, location text, nurse_charge_id bigint references Nurses(nid) ); CREATE TABLE Patients( pid bigint primary key, name varchar, address text, telephone varchar, care_centre_id bigint references Care_centres(cid) ); CREATE TABLE Physicians( phid bigint primary key, name varchar, pager_number varchar, specialization varchar, salary decimal ); CREATE TABLE Treatments( tid bigint primary key, patient_id bigint references Patients(pid), physician_id bigint references Physicians(phid), treatment_name varchar, "date" timestamp ); ``` **If not automatically created by the DBMS, create indexes corresponding to the primary and foreign keys.** ```sql ALTER TABLE Nurses ADD CONSTRAINT care_centres_fk FOREIGN KEY (care_centre_id) REFERENCES Care_centres (cid); ``` **Populate these tables with some sample data, and write SQL queries that show the content of each table after entering the data.** ```sql INSERT INTO Care_centres VALUES(1, 'Intensive Care Unit', 'MT-M'); INSERT INTO Nurses VALUES (1, 'Sharp', 1, 'birth', '111-222-3333', 100000); INSERT INTO Nurses VALUES (2, 'Barton', 1, 'RN', '111-222-3333', 100000); INSERT INTO Nurses VALUES (3, 'Nightingale', 1, 'birth', '111-222-3333', 100000); INSERT INTO Nurses VALUES (4, 'Seacole', 1, 'RN', '111-222-3333', 50000); INSERT INTO Nurses VALUES (5, 'Dix', 1, 'birth', '111-222-3333', 100000); INSERT INTO Nurses VALUES (6, 'Mahoney', 1, 'birth', '111-222-3333', 50000); INSERT INTO Nurses VALUES (7, 'Wald', 1, 'birth', '111-222-3333', 100000); INSERT INTO Nurses VALUES (8, 'Sanger', 1, 'RN', '111-222-3333', 100000); INSERT INTO Nurses VALUES (9, 'Breckinridge', 1, 'birth', '111-222-3333', 50000); INSERT INTO Nurses VALUES (10, 'Teresa', 1, 'birth', '111-222-3333', 0); INSERT INTO Nurses VALUES (11, 'Bertschinger', 1, 'RN', '111-222-3333', 50); UPDATE Care_centres SET nurse_charge_id = 1 where cid = 1; INSERT INTO Physicians VALUES (1, 'dr. mo', '555-555-5555', 'love', 250000); INSERT INTO Physicians VALUES (2, 'dr. allison', '555-555-0001', 'head aches', 250000); INSERT INTO Physicians VALUES (3, 'dr. didi', '555-555-0002', 'foot pain', 200000); INSERT INTO Physicians VALUES (4, 'dr. nini', '555-555-0003', 'back pain', 200000); INSERT INTO Patients VALUES (1, 'moe sislack', '123 street nw', '777-777-7777', 1); INSERT INTO Patients VALUES (2, 'mose allison', '234 street nw', '888-888-8888', 1); INSERT INTO Patients VALUES (3, 'didi kong', '345 street nw', '999-999-9999', 1); INSERT INTO Treatments VALUES(1, 1, 1, 'Surgery', '2020-03-01'); SELECT * FROM Care_centres; SELECT * FROM Nurses; SELECT * FROM Physicians; SELECT * FROM Patients; SELECT * FROM Treatments; cid | name | location | nurse_charge_id -----+---------------------+----------+----------------- 1 | Intensive Care Unit | MT-M | 1 (1 row) nid | name | care_centre_id | certificate_type | telephone | salary -----+--------------+----------------+------------------+--------------+-------- 1 | Sharp | 1 | birth | 111-222-3333 | 100000 2 | Barton | 1 | RN | 111-222-3333 | 100000 3 | Nightingale | 1 | birth | 111-222-3333 | 100000 4 | Seacole | 1 | RN | 111-222-3333 | 50000 5 | Dix | 1 | birth | 111-222-3333 | 100000 6 | Mahoney | 1 | birth | 111-222-3333 | 50000 7 | Wald | 1 | birth | 111-222-3333 | 100000 8 | Sanger | 1 | RN | 111-222-3333 | 100000 9 | Breckinridge | 1 | birth | 111-222-3333 | 50000 10 | Teresa | 1 | birth | 111-222-3333 | 0 11 | Bertschinger | 1 | RN | 111-222-3333 | 50 (11 rows) phid | name | pager_number | specialization | salary ------+-------------+--------------+----------------+-------- 1 | dr. mo | 555-555-5555 | love | 250000 2 | dr. allison | 555-555-0001 | head aches | 250000 3 | dr. didi | 555-555-0002 | foot pain | 200000 4 | dr. nini | 555-555-0003 | back pain | 200000 (4 rows) pid | name | address | telephone | care_centre_id -----+--------------+---------------+--------------+---------------- 1 | moe sislack | 123 street nw | 777-777-7777 | 1 2 | mose allison | 234 street nw | 888-888-8888 | 1 3 | didi kong | 345 street nw | 999-999-9999 | 1 (3 rows) tid | patient_id | physician_id | treatment_name | date -----+------------+--------------+----------------+--------------------- 1 | 1 | 1 | Surgery | 2020-03-01 00:00:00 (1 row) ``` **For some strategic decisions, the president of the hospital needs summary data about the care centres. For each care centre, s/he needs to know the number of nurses holding an RN certificate, as well as their total and average salaries. Does the following view answer the president’s request? If not, write the correct view that will satisfy the president’s request.** ```sql CREATE VIEW NURSE_SUMMARY (D, C, TOTAL_S, AVERAGE_S) AS SELECT cid, COUNT (*), SUM (salary), AVG (salary) FROM Care_centres, Nurses WHERE nurse_charge_id = nid and certificate_type like ‘RN’ GROUP BY cid; ``` This is incorrect. It is only including nurses who are in charge of the care centre. ```sql CREATE VIEW NURSE_SUMMARY (D, C, TOTAL_S, AVERAGE_S) AS SELECT c.cid, COUNT(*), SUM(n.salary), AVG(n.salary) FROM Nurses n INNER JOIN Care_Centres c ON c.cid = n.care_centre_id AND n.certificate_type LIKE 'RN' GROUP BY c.cid; ``` State which of the following queries and updates would be allowed in this view. If a particular query or update would be allowed, show what the corresponding query or update on the base relations would look like, and give its result when applied to the database. Q1 ```sql SELECT * FROM NURSE_SUMMARY; ``` This query is allowed. ```sql d | c | total_s | average_s ---+---+---------+-------------------- 1 | 4 | 250050 | 62512.500000000000 (1 row) ``` Q2. ```sql SELECT D, C FROM NURSE_SUMMARY WHERE TOTAL_S > 100000; ``` This query is allowed. ```sql d | c ---+--- 1 | 4 (1 row) ``` Q3. ```sql SELECT D, AVERAGE_S FROM NURSE_SUMMARY WHERE C > (SELECT C FROM NURSE_SUMMARY WHERE D=4); ``` This query is allowed. ```sql d | average_s ---+----------- (0 rows) ``` Q4. ```sql UPDATE NURSE_SUMMARY SET D=3 WHERE D=4; ``` The above statement is not allowed. An attempt to update the nid for a specific Nurse could be attempted with the following: ```sql UPDATE Nurses SET nid = 3 WHERE nid = 4; ``` This will cause an error due to another Nurse already having the `nid` of 3. It would yield an error similar to: ```sql ERROR: 23505: duplicate key value violates unique constraint "nurses_pkey" DETAIL: Key (nid)=(3) already exists. SCHEMA NAME: public TABLE NAME: nurses CONSTRAINT NAME: nurses_pkey LOCATION: _bt_check_unique, nbtinsert.c:427 ``` Q5. ```sql DELETE FROM NURSE_SUMMARY WHERE C > 4; ``` The above statement is not allowed. Instead the following can be used. ```sql DELETE FROM Nurses WHERE nid IN (SELECT D FROM NURSE_SUMMARY WHERE C > 4); DELETE 0 SELECT * FROM Nurses; nid | name | care_centre_id | certificate_type | telephone | salary -----+--------------+----------------+------------------+--------------+-------- 1 | Sharp | 1 | birth | 111-222-3333 | 100000 2 | Barton | 1 | RN | 111-222-3333 | 100000 3 | Nightingale | 1 | birth | 111-222-3333 | 100000 4 | Seacole | 1 | RN | 111-222-3333 | 50000 5 | Dix | 1 | birth | 111-222-3333 | 100000 6 | Mahoney | 1 | birth | 111-222-3333 | 50000 7 | Wald | 1 | birth | 111-222-3333 | 100000 8 | Sanger | 1 | RN | 111-222-3333 | 100000 9 | Breckinridge | 1 | birth | 111-222-3333 | 50000 10 | Teresa | 1 | birth | 111-222-3333 | 0 11 | Bertschinger | 1 | RN | 111-222-3333 | 50 (11 rows) ``` This statement has no affect because in the current data set there are zero `NURSE_SUMMARY` results with a count of greater than 4. **Use Oracle Forms, or write a program (using PL/SQL and/or Java), to display the following report for a given patient (the user will only enter the patient number).** ```text Patient Number: _______________ Patient Name: ______________________ Care Centre Name: ___________________ Name of Nurse-in-Charge: ___________________ Treatment ID Treatment Name Physician ID Date ___________ ___________ ___________ ___________ . . . . . . . . . . . . ``` The following [Ruby](https://www.ruby-lang.org/en/) program will request a patient id when run and then query the necessary information from a [PostgreSQL](https://www.postgresql.org/) database to print a report to the terminal. ```ruby #!/usr/bin/env ruby require 'bundler/inline' gemfile do source 'https://rubygems.org' gem 'pg' end connection = PG.connect(dbname: ENV.fetch('PGDBNAME', 'postgres'), host: ENV.fetch('PGHOST', Pathname.pwd.join('tmp/sockets'))) print "\nPatient Number: " sql = <<~SQL SELECT p.name AS p_name, c.name as c_name, n.name as n_name FROM Patients p INNER JOIN Care_centres c ON c.cid = p.care_centre_id INNER JOIN Nurses n ON n.nid = c.nurse_charge_id WHERE pid = $1; SQL patient_id = gets.to_i connection.exec_params(sql, [patient_id]) do |result| result.each do |row| puts "\nPatient Name: #{row['p_name']}" puts "Care Centre Name: #{row['c_name']}" puts "Name of Nurse-in-Charge: #{row['n_name']}" end end puts "\nTreatment ID | Treatment Name | Physician ID | Date" connection.exec_params("SELECT * FROM Treatments WHERE patient_id = $1;", [patient_id]) do |treatments| treatments.each do |x| puts [x['tid'].ljust(12), x['treatment_name'].ljust(14), x['physician_id'].ljust(12), x['date']].join(' | ') end end ``` The program can be run like this: ```bash モ ruby program.rb Patient Number: 1 Patient Name: moe sislack Care Centre Name: Intensive Care Unit Name of Nurse-in-Charge: Sharp Treatment ID | Treatment Name | Physician ID | Date 1 | Surgery | 1 | 2020-03-01 00:00:00 ```