The University of Hong Kong
COMP 3323
COMP3323
ASSIGNMENT 1 SOLUTION
Reynold Cheng, Chenhao Ma
September 2, 2019
QUESTION 1
Q1 [30%]. Consider two relations r(A,B) and s(C,D). Relation r has 50000 tuples, and s has 6000
tuples. 25 tuples of r fit in one block, 40 tuples of s fit in one block. Assume that the system has a
memory of M =100 blocks.
¢ (i)
...[Show More]
COMP3323
ASSIGNMENT 1 SOLUTION
Reynold Cheng, Chenhao Ma
September 2, 2019
QUESTION 1
Q1 [30%]. Consider two relations r(A,B) and s(C,D). Relation r has 50000 tuples, and s has 6000
tuples. 25 tuples of r fit in one block, 40 tuples of s fit in one block. Assume that the system has a
memory of M =100 blocks.
¢ (i) [20%] Estimate the minimum cost (in number of block accesses) of the equi-join with
r.B=s.C using the following algorithms:
¢ (a) nested-loops join,
r is outer relation: br + nr * bs = 2000 + 50000* 150 = 7,502,000 -2.5%
s is outer relation: 150 + 6000 * 2000 = 12,000,150 -2.5%
¢ (b) block nested-loops join,
br + ([br / M-2])*bs
r is outer relation: 2000 + 2000 /98 * 150 = 2000 + 21*150 = 5150 -2.5%
s is outer relation: 150 + 150 /98 * 2000 = 150+ 2* 2000 = 4150 -2.5%
¢ (c) merge-join (assume that relations r and s are initially sorted),
br + bs = 2000+150 = 2150 -5%
¢ (d) hash-join (assume no overflow occurs and no recursive partitioning occurs.)
3 * (br + bs) = 6450 disk accesses. -5% 2
QUESTION 1
¢ (ii) [10%] Assume that the block size is 4096 bytes. Suppose there are B+-tree indexes on
both r.B and s.C. Neither of these two B+-tree indexes is primary index. For each block in
the two B+-tree indexes, 96 bytes space is reserved for block header information (e.g.,
is_leaf flag and number of entries). Except one sibling access pointer, only key values and
record-ids are stored in leaf nodes. Both key values, r.B and s.C, are 12 bytes long. A
record-id and a block-id each occupies 12 bytes. Please estimate the minimum cost (in
number of block accesses) of running the index nested loops equi-join algorithm with r.B
=s.C. (Assume that the join result contains N=30000 tuples.)
¢ 4096-96=4000 and (4000-12)/(12+12) = 166 (floor operation).
¢ r is outer relation.
¢ The height of B+ tree on s.B: 1 + [log166+1(6000 /166)] = 2
¢ 2000 + 50000*2 + 30000 = 132,000 -3%
¢ s is outer relation:
¢ The height of B+ tree on r.B: 1+ [log166+1(50000/166)] = 3 -3%
¢ 150 + 6000 *3 + 30000 = 48,150 -4%
¢ minimum cost = bs + ns * Index_search_cost + Join_Result_Records = 48150.
3
QUESTION 2
Q2 [30%]. SQL query:
¢ SELECT * FROM R, S WHERE (R.c=S.c) AND (71 <= R.a AND R.a <= 80) AND (S.b=5)
¢
•R has 5000 records and S has 10000 records. Each block can hold 10 records.
¢
•The domain of the attribute R.a is [1,200]. The domain of the attribute S.b is [1,10].
¢
•The smaller temporary table (between T1 and T2) is used as the outer relation of the block
nested loop join.
Join: T1 is smaller and it is used as the outer relation for the block nested loop join.
The cost of join = 25*100+25=2525 blocks. (Cost = br + br*bs)
Total cost = (500+25) + (1000+100) + 2525 = 4150
4
br=5000/10=500 bs=10000/10=1000
[Show Less]