author mo khan <mo.khan@gmail.com> 2020-01-31 12:16:36 -0700
committer mo khan <mo.khan@gmail.com> 2020-01-31 12:16:36 -0700
commit f8ded072912b37ca72c1981516c07101fbbaa6ef
tree a24b0134fb5278783978f4cc024d904961691a6d /assignments/2
parent b9e71838ad0db47988e4631fb3d8f9872a01c706
Answer more questions
Diffstat (limited to 'assignments/2')
-rw-r--r-- assignments/2/README.md | 49
1 files changed, 46 insertions, 3 deletions
diff --git a/assignments/2/README.md b/assignments/2/README.md
index 323ec76..abdc03e 100644
--- a/assignments/2/README.md
+++ b/assignments/2/README.md
@@ -111,9 +111,52 @@ Answer the following questions (250 words max/question).
Answer the following questions (250 words max/question).
-* What factors should be considered when choosing a file organization?
-* What is the purpose of clustering data in a file?
-* Compare hashed file organization versus indexed file organization. List two advantages of indexed over hashed, and two advantages of hashed over indexed.
+**What factors should be considered when choosing a file organization?**
+
+The following is a list of general guidelines to consider when choosing a file organization.
+
+1. fast data retrieval
+2. high throughput for processing data input and maintenance transactions
+3. efficient use of storage space
+4. protection from failures or data loss
+5. minimizing need for reorganization
+6. accommodating growth
+7. security from unauthorized use
+
+The needs of the application and the distribution of the software may also change the priority of
+some of the points listed above.
+
+In a system where the data might be shipped as an embedded database into a resource-restricted
+environment, such as an on-premises appliance, factors such as disk space may have a higher priority
+than protection from failures or data loss.
+
+In a system where the write traffic significantly outweighs the read traffic, you may want
+to consider giving up efficient use of storage space in favour of high throughput for processing
+data input and accommodating growth. In such a system, it might be okay to partition the data
+across nodes in a ring of servers and accept that the data is only eventually consistent.
+
+In a system where the speed of retrieving the data is the most important factor, such as a
+cache, you might be okay with storing the data in memory rather than persisting it to disk.
+
+**What is the purpose of clustering data in a file?**
+
+Clustering data stores related records close together so that they can be read sequentially,
+which improves read performance for that data. It reduces the need to find related information by
+accessing multiple files and therefore speeds up the response time for the data.
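
As a rough illustration of the point above, the following sketch counts how many pages must be read to fetch one customer's records from a clustered and an unclustered layout. The page size, the customer labels, and the `pages_touched` helper are assumptions made up for this example, not part of the assignment.

```python
PAGE_SIZE = 4  # records per page (an assumed value for illustration)


def pages_touched(layout, customer):
    """Count the distinct pages that hold at least one record for `customer`.

    `layout` lists, for each file position, the customer owning that record.
    """
    return len(
        {pos // PAGE_SIZE for pos, owner in enumerate(layout) if owner == customer}
    )


# Hypothetical file in which each customer's records are scattered.
unclustered = ["a", "b", "c", "d", "b", "a", "d", "c", "c", "d", "a", "b"]

# The same records, clustered so that each customer's records are adjacent.
clustered = sorted(unclustered)

print(pages_touched(unclustered, "a"))
print(pages_touched(clustered, "a"))
```

In this toy layout the clustered file needs fewer page reads for the same customer, which is the effect clustering aims for.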
+
+**Compare hashed file organization versus indexed file organization. List two advantages of indexed over hashed, and two advantages of hashed over indexed.**
+
+Hashed file advantages over indexed file
+
+1. Allows for O(1) lookup on average, which can mean much faster access to a specific record by its key.
+2. Deleting a record is simple: the hash of its key locates it directly, and there is no separate index to update.
+
+Indexed file advantages over hashed file
+
+1. Allows for efficient access to multiple related pieces of data, such as a range of keys read in order.
+2. No space is wasted on empty hash buckets in the data file, although additional space is required to maintain the index.
+3. Lower CPU overhead than computing a hash key for each record, and hash collisions are not a concern.
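
The trade-off between the two organizations can be sketched in a few lines. This is a simplified in-memory model, not a real file layout: a Python `dict` stands in for the hashed file, a sorted key list searched with `bisect` stands in for the index, and the record values and function names are hypothetical.

```python
import bisect

# Hypothetical records keyed by an integer id.
records = {17: "q", 3: "c", 42: "z", 8: "h", 23: "w"}

# Hashed organization: the key hashes directly to the record's location.
hashed = dict(records)

# Indexed organization: keys are kept in sorted order and searched by bisection.
index_keys = sorted(records)


def hashed_lookup(key):
    """Point lookup through the hash: constant time on average."""
    return hashed.get(key)


def indexed_lookup(key):
    """Point lookup through the index: logarithmic in the number of keys."""
    pos = bisect.bisect_left(index_keys, key)
    if pos < len(index_keys) and index_keys[pos] == key:
        return records[key]
    return None


def indexed_range(low, high):
    """Range query through the index: find the start, then read keys in order."""
    start = bisect.bisect_left(index_keys, low)
    end = bisect.bisect_right(index_keys, high)
    return [(k, records[k]) for k in index_keys[start:end]]


def hashed_range(low, high):
    """Range query on the hashed file: no key order, so every record is examined."""
    return sorted((k, v) for k, v in hashed.items() if low <= k <= high)
```

Both range functions return the same rows, but the indexed version only visits the keys inside the range, while the hashed version has to scan everything and sort the result.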
## Question 4 (18 marks)