1. A website requires users to register in order to view it, and
users can also post comments to the site. Now let's assume there
are 10,000 users, and 500 comments, made by 100 different people
(1% of all users). We want to join comments with users (in
order to get their names and email addresses.)
Schema:
user U(id, name, email, aboutme), with a unique index on id.
comment C(id, date, subject, thecomment) variable
length records, no indexes.
Please pick and describe a reasonable join algorithm,
based upon the following statistics. Estimate the number of
disk I/O's required.
You could compare the I/O requirements if a 2-pass join with
that of using an index.
|
U |
C |
T |
10000 |
500 |
B |
500 |
250 |
V(id) |
10000 |
100 |
M available |
200 |
R |
S |
|
T |
25000 |
4000 |
B |
5000 |
500 |
V(Y) |
2500 |
2000 |