diff options
| -rw-r--r-- | doc/unit-4.md | 62 |
1 files changed, 61 insertions, 1 deletions
diff --git a/doc/unit-4.md b/doc/unit-4.md index e16b8fb..eab8723 100644 --- a/doc/unit-4.md +++ b/doc/unit-4.md @@ -125,9 +125,69 @@ Dangers of denorm ### Partitioning +Horizontal Partitioning + > Horizontal partitioning: Distribution of the rows of a logical relation into several separate tables. -** Continue from page 218. +Partition key + +* single column +* date: usually a qualifier in queries + +Makes table maintainenance easier because fragementing and rebuliding can be isolated to single partitions +as storage space needs to be organized. + +File level security can be applied to prohibit users from seeing certain rows of data. +Taking one of the partitioned files out of service because it was damaged or so it can be recovered +still allows processing against the other partitioned files to continue. + +Each partitioned file can be placed on a separate disk drive to reduce contention for the same drive +and hence improve query and maintenance performance across the database. + + +Horizontal partitioning is similar to creating supertype/subtype relationship because different types +of the entity are involved in different relationships. + +Advantages + +1. Efficiency: Data queried together are stored close to one another and separate from data not used together. +2. Local optimization: Each partition of data can be stored to optimize performance for it's own use. +3. Security: Data not relevant to one group of users can be segregated from data those users are allowed to use. +4. Recovery and uptime: Smaller files take less time to back up and recover and other files are still accessible if one file is damaged, so the effects of damage are isolated. +5. Load balancing: Files can be allocated to different storage areas, which minimizes contention for access to the same storage area or even allows for parallel access to the different areas. + +Disadvantages + +1. Inconsistent access speed: Different partitions may have different access speeds, thus confusing users. If data needs to be accessed across partitions, this produces slower response times. +2. Complexity: is usually not transparent to application code, which requires knowledge of which partition to read/write from. +3. Extra space and update time: Data may be duplicated across partitions, taking extra storage space compared to all data in normalized files. Updates that affect multiple partitions will be slower. + +Range partitioning + +* Each partition is defined by a range of values for one or more columns of the normalized table. +* A table row is inserted in the proper partition based on it's initial values for the range fields. +* Each partition may hold a different number of rows. +* A partition key that can produce a more balanced distribution of rows is better. +* A row may be restricted fro moving between partitions when key values are updated. + +Hash Partitioning + +* data are evenly spread across partitions independent of any partition key value. +* overcomes the uneven distribution of rows that is possible with range partitioning. +* works well when distributing data evenly across devices is necessary. + +List Partitioning + +* partitions are defined based on predefined list of values of the partitioning key. +* E.g. `province` partition key + +Vertical Partitioning + +> Vertical Partitioning: Distribution of the columns of a logical relation into several separate physical tables. + +Data replication + +* The same data is stored in multiple places in the database. ## Section 2: File Organization and Database Architectures |
