| field | value | date |
|---|---|---|
| author | mo khan <mo.khan@gmail.com> | 2020-01-31 12:16:36 -0700 |
| committer | mo khan <mo.khan@gmail.com> | 2020-01-31 12:16:36 -0700 |
| commit | f8ded072912b37ca72c1981516c07101fbbaa6ef | |
| tree | a24b0134fb5278783978f4cc024d904961691a6d /assignments | |
| parent | b9e71838ad0db47988e4631fb3d8f9872a01c706 | |
Answer more questions
Diffstat (limited to 'assignments')
| -rw-r--r-- | assignments/2/README.md | 49 |
1 files changed, 46 insertions, 3 deletions
diff --git a/assignments/2/README.md b/assignments/2/README.md
index 323ec76..abdc03e 100644
--- a/assignments/2/README.md
+++ b/assignments/2/README.md
@@ -111,9 +111,52 @@ Answer the following questions (250 words max/question).
 
 Answer the following questions (250 words max/question).
 
-* What factors should be considered when choosing a file organization?
-* What is the purpose of clustering data in a file?
-* Compare hashed file organization versus indexed file organization. List two advantages of indexed over hashed, and two advantages of hashed over indexed.
+**What factors should be considered when choosing a file organization?**
+
+The following is a list of general guidelines to consider when choosing a file organization.
+
+1. fast data retrieval
+2. high throughput for processing data input and maintenance transactions
+3. efficient use of storage space
+4. protection from failures or data loss
+5. minimizing need for reorganization
+6. accommodating growth
+7. security from unauthorized use
+
+The needs of the application and the distribution of the software may also raise the priority of
+some of the points raised above.
+
+In a system where the data might be shipped as an embedded database into a resource restricted
+environment such as an on-premises appliance factors such as disk space may have a higher priority
+than protecting from failures or data loss.
+
+In a system where the write traffic significantly outweighs the read traffic you may want
+to consider giving up efficient use of storage space in favour of high throughput for processing
+data input and accommodating growth. In such a system it might be okay to allow data
+to be eventually consistent by partitioning the data across nodes in a ring of servers.
+
+In a system where the speed of retrieving the data is the most important factor such as a
+cache then you might be okay with storing the data in memory rather than persisting it to disk.
+This is a common requirement for a cache.
+
+**What is the purpose of clustering data in a file?**
+
+Clustering data allows for sequential access of related data to improve the read
+performance of the desired data. It reduces the need to find related information by
+accessing multiple files and therefore speeds up the response time for the data.
+
+**Compare hashed file organization versus indexed file organization. List two advantages of indexed over hashed, and two advantages of hashed over indexed.**
+
+Hashed file advantages over indexed file
+
+1. Allows for O(1) lookup to find data, which can mean much faster access time to access a specific piece of data.
+2. Deleting data is very easy
+
+Indexed file advantages over hashed file
+
+1. Allows for efficient access to multiple related pieces of data.
+2. No wasted space for data. Additional space is required to maintain the index.
+3. Lower CPU overhead than computing a hash key for each record and hash collisions are not a concern.
 
 ## Question 4 (18 marks)
 
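
The hashed-versus-indexed comparison in the answer above can be illustrated with a small in-memory sketch. It is not part of the commit. A Python `dict` stands in for a hashed file and a sorted list stands in for an index; the sample records and the function names `hashed_lookup`, `hashed_range`, and `indexed_range` are hypothetical and chosen only for illustration.

```python
import bisect

# Hypothetical sample records (key, name); not taken from the assignment.
records = [(17, "carol"), (3, "alice"), (42, "erin"), (8, "bob"), (25, "dave")]

# Hashed organization: a dict stands in for a hash file (key -> record).
hashed = {key: name for key, name in records}

# Indexed organization: records kept in key order, with a parallel key list
# standing in for the index that is searched.
index = sorted(records)
index_keys = [key for key, _ in index]


def hashed_lookup(key):
    """Exact-match lookup: hash the key and go straight to the record."""
    return hashed.get(key)


def hashed_range(low, high):
    """Range query on a hashed file: every record must be examined,
    because hashing scatters neighbouring keys."""
    return sorted((k, v) for k, v in hashed.items() if low <= k <= high)


def indexed_range(low, high):
    """Range query on an indexed file: binary-search the bounds, then
    read the matching records sequentially."""
    start = bisect.bisect_left(index_keys, low)
    end = bisect.bisect_right(index_keys, high)
    return index[start:end]


print(hashed_lookup(8))
print(indexed_range(5, 25))
```

Under these assumptions, the exact-match lookup favours the hashed structure, while the range query favours the index: `indexed_range` touches only the matching slice, whereas `hashed_range` has to scan every record and sort the result.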
