Csc 417/517 Assignment 7

Due Wednesday Nov. 14

For the following joins (SQL queries),
One: The Logical query plan in relational algebra is a natural join of the 4 tables below. (I added Professors to the tables of the last assignment)
  1. Estimate the size of the total join, using the estimates
  2. Pick a join order by enumerating all the reasonable left-deep trees of the 4 relations (You can rule out joins that are simply cross products) and
  3. Estimate the size of the intermediate results
  4. Choose the one with the smallest size for the intermediate results
Two: Now add the selection criterion: WHERE n = 'your name', and push this selection down your trees, and redo the estimates, choose what is now the best join order.

The schema is, you can use the short forms:

Students(id, name, major)
Enrollment(id, coursecode, grade)
Courses (coursecode, title, professor)
Professors (professor, department, email)  -- professor is the name of

S (i, n, m )
E(i, c, g)
C(c, t, p)
P(p, d, e)


Estimates

Table
Tuples
n
i
c
p
S (i, n, m ) 2500
V(S,n)=2400
V(S,i)=2500
V(S,m)=30
E(i, c, g)
10,000

V(E,i)=2000 V(E,c)=300
C(c, t, p)
500


V(C,c)=500 V(C,p)=100
P(p, d, e)
120
V(P,d)=30

V(P,p)=120

This is based upon Bishop's University:
There are about  2500 students, but 500 are not currently registered in any courses. Most have unique names, but there are 100 duplicate names.
Professors take a sabbatical every 6 years, so 1/6 are not teaching in the current semester. Those that teach, teach 3 courses.
Not all courses are taught every semester.
There are 30 departments, all have professors, and major students.

Graduate Students, also:

Three: Consider the join with also the condition S.m = P.d  This should give only "Enrollments of students with a professor of their major department".
Would a "Greedy Algorithm" give the same result as your exhaustive search?