CS 457/557 Assignment 8 - Query plans: Size estimation, join order

Due Friday, 23 March 2018

A. Exercise 16.4.1 (p.802)
    Sizes of (a) - (g)














B. Choose a join order
  1. For the natural join of W,X,Y,Z of part A (16.4.1 (a)), use a "greedy algorithm" to choose a join order for a left-deep tree. You should probably start by calculating the sizes of W join X, X join Y  and  Y join Z (any others?)  [Greedy alg, p 824: Join first the pair whose estimated join size is smallest, and so forth.]
  2. Here are some statistics for the second query of last week's assignment:
    SELECT n, t, g
    FROM  S, C, E
    WHERE C.cc like 'CS%'
       AND S.i =E.i
       AND C.cc = E.cc
    
    T(S) = 3000
    T(E) = 10000
    T(C) = 600
    V(S,i)  = 3000
    V(E,i) = 2500

    V(S, n) = 2900
    V(E,cc) = 500
    V(C,cc) = 600
    V(S, m) = 50
    V(E, g) = 60
    V(C, p) = 150
    There would probably be no statistics to estimate the "like". There could be anwhere from 0 to T tuples. I think it is reasonable to use the same intuition as for an inequality, namely that it would tend to be rather less than half, so use T/3.
    1. How many tuples do you estimate for the final result?
    2. Pick an order for the 3 joins.