summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
Diffstat (limited to 'doc')
-rw-r--r--doc/unit-4.md100
1 files changed, 100 insertions, 0 deletions
diff --git a/doc/unit-4.md b/doc/unit-4.md
index 7705154..b60e126 100644
--- a/doc/unit-4.md
+++ b/doc/unit-4.md
@@ -25,6 +25,106 @@ Important terms:
* horizontal partitioning:
* vertical partitioning:
+
+### The physical database design process
+
+* goal: data procesing effeciency
+
+#### critical decisions
+
+* storage format: data type
+* provide dbms with hints on how to group attributes
+* provide dmbs with hints on how to arrange similar data
+* select structures for storing and connecting files efficiently. (indexes, db architecture)
+* prepare strategies for handling queries. optimize for perf.
+
+#### Usage analysis
+
+* add notation to the EER diagram
+ * represents final set of normalized relations
+
+* record data volume
+* record access frequency
+
+#### Designing Fields
+
+> Field: The smallest unit of application data reconized by system software.
+
+* type of data
+* data integrity controls
+* handling of missing values for the field
+
+> data type: a detailed coding scheme recognized by software system for representing organizational data.
+
+* space to store data
+* speed required to access data
+
+1. represent all possible values
+2. improve data integrity
+3. support all data manipulations
+4. minimize storage space
+
+[Postgresql Data Types](https://www.postgresql.org/docs/current/datatype.html)
+
+* Numeric
+* monetary
+* character
+* binary
+* date/time
+* boolean
+* enum
+* geometric
+* network address types
+* bit string types
+* text search types
+* uuid types
+* xml types
+* json types
+* arrays
+* composite types
+* range types
+* domain types
+* object identifier types
+
+#### Data Integrity
+
+* default value
+* range control
+* null value control
+* referential integrity control
+
+#### Handling Missing Data
+
+* default value
+* do not allow null
+* substitute estimate of missing data
+* track missing data so it can be resolved manually
+* perform sensitivity testing
+
+### Denormalizing
+
+noramlized relations solve maintenance anomolies and
+minimize redundancies. This may not yield efficient data processing
+if the data required is across many relations. i.e JOINS are costly
+if the data isn't near each other.
+
+> Denormalization: process of transforming normalized relations into non-normalized physical record specifications.
+
+Opportunities for denormalization
+
+1. two entities with a one-to-one relationship
+2. many-to-many relationship (associative entity) with nonkey attributes
+ * instead of joining three files to extract data from two basic entities
+3. reference data
+
+Dangers of denorm
+
+* increase chance of errors and inconsistencies
+* optimization for one type of processing make affect a different type of processing
+* more storage space requirements
+
+Partitioning
+
## Section 2: File Organization and Database Architectures
Outcomes: