author mo khan <mo.khan@gmail.com> 2020-02-15 15:32:47 -0700
committer mo khan <mo.khan@gmail.com> 2020-02-15 15:32:47 -0700
commit 3faece6f29bc057bb823210e5eea29521e81ad26
tree eb2d65cd862a6e7cd83f2aaf6a2c4540e458ddbb
parent 06471347cfeb5de96a6d6e563386fb9f9241f3dc
Work on q2'
-rw-r--r-- assignments/3/README.md | 28
-rw-r--r-- doc/unit-7.md | 11
2 files changed, 39 insertions, 0 deletions
diff --git a/assignments/3/README.md b/assignments/3/README.md
index 9fcbfc4..080a1fe 100644
--- a/assignments/3/README.md
+++ b/assignments/3/README.md
@@ -33,8 +33,36 @@ The facts to be recorded for each combination of these dimensions are:
**Using the assumptions stated above, estimate the number of rows in the fact table.**
+* 1 policy has 2 members
+* 1 policy has 10 items
+* 1 office for each policy
+* 1M total policies
+* 5% of 1M policies = 50,000 policy changes each month
+* 5 years is 60 months
+* In 60 months there are 3M policy changes
+
+Total rows = 1 * 2 * 10 * 1 * 50K * 60
+= 60,000,000 rows
+
+Total rows = 1M offices * (5% of 1M) * 2 members per policy * (5 * 12 months)
+= 1M * 50,000 * 2 * 60
+= 6,000,000,000,000 rows
+
+* 2M members
+* 10M items
+* X offices
+* 1M policies
+* 60 periods
+* 3M claims (50K claims/month * 60 months)
+
+= 50K * 60 * 1M * 10M * 2M
+
+
**Estimate the total size of the fact table (in bytes), assuming an average of 5 bytes per field.**
+Total size = 9 columns * 5 bytes * N rows
+= 45 bytes * N rows
+
## Question 2
Suggest an appropriate recovery technique that a database administrator could use to resolve each of the following situations.
diff --git a/doc/unit-7.md b/doc/unit-7.md
index cf471a0..e9edc2f 100644
--- a/doc/unit-7.md
+++ b/doc/unit-7.md
@@ -250,6 +250,17 @@ A common grain would be each business transaction, such as an individual line it
Clicks on a website is possibly the lowest level of granularity.
+Size of the fact table
+
+The grain and duration of the fact table have a direct impact on the size of that table.
+
+We can estimate the number of rows in the fact table as follows:
+
+1. Estimate the number of possible values for each dimension associated with the fact table.
+2. Multiply the values obtained in the first step, after making any necessary adjustments.
+
+Total rows = 1,000 stores * 5,000 products (50% of 10,000) * 24 months
+ = 120,000,000 rows
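Editorial sketch, not part of the commit: the two-step estimate above can be checked with a few lines of Python. The function name `estimate_fact_rows` and its parameters `dimension_counts` and `adjustments` are hypothetical names chosen for this illustration. The sketch assumes each adjustment is a simple multiplicative factor, such as the 50% of products used in the worked example.

```python
from math import prod


def estimate_fact_rows(dimension_counts, adjustments=None):
    """Estimate fact table rows as the product of adjusted dimension sizes.

    dimension_counts: mapping of dimension name -> number of possible values.
    adjustments: optional mapping of dimension name -> multiplicative factor
                 (assumed here to be a plain fraction, e.g. 0.5 for 50%).
    """
    adjustments = adjustments or {}
    # Step 1: take each dimension's count and apply its adjustment (default 1.0).
    adjusted = (
        count * adjustments.get(name, 1.0)
        for name, count in dimension_counts.items()
    )
    # Step 2: multiply the adjusted values together.
    return round(prod(adjusted))


# Figures from the worked example: 1,000 stores, 10,000 products of which
# 50% are counted, and 24 months.
rows = estimate_fact_rows(
    {"stores": 1_000, "products": 10_000, "months": 24},
    adjustments={"products": 0.5},
)
print(f"{rows:,} rows")
```

With the figures from the worked example this reproduces the 120,000,000 row estimate. Changing a single dimension count or factor shows how sensitive the total is to the chosen grain and duration.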
Multiple fact tables