Csc 417/517 Assignment 7
Due Wednesday Nov. 14
For the following joins (SQL queries),
One: The Logical query plan in
relational algebra is a natural join
of the 4 tables below. (I added Professors to the tables of the
last assignment)
- Estimate the size of the total join, using the estimates
- Pick a join order by enumerating all the reasonable left-deep
trees of
the 4 relations (You can rule out joins that are simply cross products)
and
- Estimate the size of the intermediate results
- Choose the one with the smallest size for the intermediate results
Two: Now add the selection
criterion: WHERE n = 'your name',
and push this selection down your trees, and redo the estimates, choose
what is now the best join order.
The schema is, you can use the short forms:
Students(id, name,
major)
Enrollment(id, coursecode, grade)
Courses (coursecode,
title, professor)
Professors (professor,
department, email) -- professor is the name of
S (i, n, m )
E(i, c, g)
C(c, t, p)
P(p, d, e)
Estimates
Table
|
Tuples
|
n
|
i
|
c
|
p
|
S (i, n, m ) |
2500
|
V(S,n)=2400
|
V(S,i)=2500 |
|
V(S,m)=30 |
E(i, c, g)
|
10,000
|
|
V(E,i)=2000 |
V(E,c)=300 |
|
C(c, t, p)
|
500
|
|
|
V(C,c)=500 |
V(C,p)=100 |
P(p, d, e)
|
120
|
V(P,d)=30 |
|
|
V(P,p)=120 |
This is based upon Bishop's University:
There are about 2500 students, but 500 are not currently
registered in any courses. Most have unique names, but there are 100
duplicate names.
Professors take a sabbatical every 6 years, so 1/6 are not teaching in
the current semester. Those that teach, teach 3 courses.
Not all courses are taught every semester.
There are 30 departments, all have professors, and major students.
Graduate Students, also:
Three: Consider the join with
also the condition S.m = P.d
This should give only "Enrollments of students with a professor of
their major department".
Would a "Greedy Algorithm" give the same result as your exhaustive
search?