The University of Hong Kong
COMP 3323
COMP3323
Advanced Database Systems
Midterm Examination
Questions and Answers
31st October, 2019
Number of questions: 3; Number of pages: 5
Total points: 70; Total time: 70 minutes
Name:
Student No.: [Please write your UID on every page.]
1. (25 points) Query Evaluation. Consider two relations r and s, which have
...[Show More]
COMP3323
Advanced Database Systems
Midterm Examination
Questions and Answers
31st October, 2019
Number of questions: 3; Number of pages: 5
Total points: 70; Total time: 70 minutes
Name:
Student No.: [Please write your UID on every page.]
1. (25 points) Query Evaluation. Consider two relations r and s, which have a common attribute
A. A is the primary key for relation s and a foreign key in r referencing s. Assume that r has nr
records, occupying br blocks on disk and s has ns records, occupying bs blocks on disk. Assume
that the memory buffer has M blocks and M is smaller than br and bs. Assume that there is a B+-
tree primary index on A for relation s and that the tree has three levels. Assume that the records in
relation r are not sorted based on the values of A.
(a) (20 points) What is the expected I/O cost for computing r ⋈ s (i) using index nested loops,
with r being the outer relation and (ii) using sort-merge join? (Hint: in the sort-merge join, the
output of the sorting algorithm does not need to be written to the disk.)
Answer (i): For index-nested loops, the expected I/O cost is the cost of an index probe for
each record of r. The cost of an index probe is (tree height)+1 (we expect that all probes will
be successful because A is the primary key for s). Therefore, the overall cost of the join is
br+nr×4.
Answer (ii): For sort-merge join, s is already sorted on A, since there is a primary B+-tree
index on A for s. The cost to sort r, excluding the cost of writing the output is (2[logM-
1(br/M)]+1)×br. Then, the cost for the join is bs, assuming that the output of the sorting is not
written, but it is immediately joined with s. Thus, the overall cost is
(2[logM-1(br/M)]+1)×br+bs
Page 2
UID: XXXXXXXX
(b) (5 points) Which of the two methods above is the best choice if br=100, bs=10000, nr=15000,
M=20? Explain your answer and explain under what condition(s) method (i) or (ii) should be
used.
[Show Less]
Access Full Document
Instant download after payment
Card Payments
₿
Crypto Accepted