We would like to introduce a technique that we use to tune SQL statements correctly without the trial and error that we were often faced with before. The concept of SQL Diagramming was introduced to us through a book named SQL Diagramming by Dan Tow. For a complete understanding of this topic we encourage you to buy the book and get more information from http://www.singingsql.com.
For our purposes, I will explain the basics of the following SQL diagram:
registration .04–>student
registration .04–>class .002
The diagram looks a lot like an ERD diagram. To build it:
· Start with any table in the FROM clause and put it on paper (the easiest way to draw
these diagrams is by hand). In our case we started with the first table in the FROM clause
named STUDENT.
· Take the next table, which is REGISTRATION and place it either above or
below the existing tables based on the relationship. Since the REGISTRATION relates to
the STUDENT table uniquely, i.e. one row in REGISTRATION points to one row in the
STUDENT table, we put it above and draw and arrow downwards.
· Take the next table, CLASS, and we know that one row in REGISTRATION points to one
row in CLASS so we put it below REGISTRATION with another downward pointing arrow.
The next step is to further understand the criteria used in the query to limit the number of rows
from each of the tables. The first criteria to explore is anything in the WHERE clause to limit
the rows from REGISTRATION. This criteria is:
r.signup_date BETWEEN
DATEADD(day, -1, current_timestamp) AND current_timestamp
AND r.cancelled = ‘N’
To understand the selectivity of the criteria, run a query against the table and use the same
criteria:
select count(*) from registration r
WHERE r.signup_date BETWEEN
DATEADD(day, -1, current_timestamp) AND current_timestamp
AND r.cancelled = ‘N’
Results – 3,562 / 80,000 (total rows in REGISTRATION) = 0.0445 = 4.4% selectivity
Use this number on the diagram next to the REGISTRATION table and underline it. This represents the selectivity of the criteria against that table. The next table to review in the same manner is the CLASS table. The query we would run becomes:
SELECT count(1) FROM class
WHERE name = ‘SQL TUNING’
Results – 2 / 1000 (total rows in CLASS) = 0.002 = 0.2% selectivity
Add this to the diagram next to the CLASS table. The next table to explore is the STUDENT
table, but there are no direct criteria against this table, so no underlined number next to that table.
The diagram has now been completed for our example, however, Dan Tow goes into more detail
than we will include for now.