SQL Style

Background

Jim & Reto developed this style guide at LoyaltyMatrix and have carried it with us ever since. 

With large development teams and more folks writing code it is really important that we can all easily read & understand each other’s code. And by “code” I don’t just mean long involved load sequences. Simple queries, after all, should be simple to understand and should be written with the same care, style and pride as more complex tasks. This page essentially documents the SQL coding style I have been using for the last few years. I adapted the ideas of Steven Feuerstein from his book Oracle PL/SQL Programming, 2nd Edition from O’Reilly. (Unfortunately, the chapter is not included in the 3rd edition.)

Why we are doing this

Reading each other’s code should be a no brainer – focus should be on how it works (or not works)
All code can, and should, be re-used
A large part of the value of any organization’s value is its IP. A large part of our IP is SQL code.
Here are the fundamentals, to be expanded below
  • Looks good on the screen/page – open, lots of white space, no clutter…
  • If you use a generated statement, take the time to eliminate the copious clutter
  • Uppercase reserved words, lowercase application words
  • Vertical layout
  • Right align reserved words
  • Single things on single lines
  • Break long statements (eg CASE) to emphasize continuation
  • Maintain vertical alignment in sub-blocks
  • It’s OK to be a bit verbose
  • AS is required (a new rule for me!)
  • Assume SQL-92 functionality
  • Use JOIN/ON not FROM/WHERE
  • Use consistent sequence in conditions
  • Comment for clarity & explanation, not to paraphrase syntax
This standard can evolve, but only very slowly! If there is anything here you strongly object to, please hash it out with the BI team now.

Physical Layout Style

Looking Good

Jim’s creative writing teacher in college first judged a poem by looking at the page from afar to see the form on the page. Good SQL looks good on the page. It is inviting to read.
  • Use plenty of white space.
  • Keep line lengths to 80, or 100, characters max!
  • Always uses spaces never tabs, to indent. That way you are in control, not some subsequent editor.

UPPER RESERVED WORDS. lower application words

This means all reserved words including SQL functions should be upper case.
User functions should be lower case. Table names, column names are lower case
-- Good:
SELECT stuff
       FROM table
-- Bad:
select STUFF from TABLE

Vertical Layout

The eye should easily find the part of a statement you are looking for. Complex SQL is hard enough to understand without needing to track down the phrase of interest.
One key principle is to right align key words as in these examples:

SELECT block 

SELECT <stuff>
  FROM <main table>
  JOIN <next table>
    ON <join condition(s)>
   AND ...
    OR ...
 WHERE <select condition(s)>
   AND ...
    OR ...
 GROUP BY <columns>
HAVING <aggregate conditions>
   AND ...
    OR ...
 ORDER BY <columns>

INSERT blocks 

INSERT INTO <table, columns>
       VALUES <stuff>
or
INSERT INTO <table, columns>
       SELECT <stuff>
         FROM <etc...>

UPDATE blocks 

UPDATE <table>
   SET <set expression(s)>
 WHERE <update condition>

DELETE blocks 

DELETE
  FROM <table>
 WHERE <delete condition(s)>

Single things on single lines

SELECT columns, UPDATE sets, etc. should each be on a single line
-- Good:
SELECT customer_id,
       first_name,
       last_name,
       ...etc...
  FROM ...etc...   
-- Bad:
SELECT customer_id, first_name, last_name, ...etc. FROM ...etc...
and
UPDATE customer_dim
   SET first_name = 'Joe',
       last_name = 'Doe'
 WHERE ...etc...
This goes with the vertical alignment philosophy and, more importantly, encourages annotation with inline comments (see below)

Break long expressions

And wrap with trailing operators to emphasize continuation. Use 2-character indents on continuation lines.
SELECT 123.456 * very_long_column_name +
         another_long_column_name
         AS my_calc_field,
       ...etc...

Break CASE clauses

at each WHEN and, perhaps, at each THEN. ELSE aligns with THEN. Right align END on new line:
SELECT CASE state WHEN 'CA' THEN 'Great'
                  WHEN 'TX' THEN 'Bad'
                            ELSE 'Midwest'
        END AS state_class,
       state_name,
       ...etc...
  FROM ...etc... 
or, for longer WHEN terms:
SELECT CASE WHEN state IN ('CA', 'OR', 'WA', 'HI')
            THEN 'Great'
            WHEN state IN {'TX', 'OK', 'ND', 'SD', 'IN')
            THEN 'Bad'
            ELSE 'Midwest'
        END AS state_class,
       state_name,
       ...etc...
  FROM ...etc...

Maintain vertical alignment within sub-blocks

For sub-blocks in your code maintain the alignment. For example, a sub-SELECT has it’s own alignment column
SELECT <stuff>
  FROM (SELECT <more stuff>
          FROM <table>
         WHERE <sub-select condition>
       ) s
 WHERE <condition>

Verbosity is OK in moderation

In general, less is more – as in “more better.” Some optional reserved words add to clarity.

AS when aliasing a column or expression

SELECT qty_sole * unit_price AS ext_price
  FROM ...etc...

Alias the FROM & JOIN table names

Even the simplest single table queries have a way of growing more complex. Using table aliases, always!
SELECT cus.customer_key
  FROM customer_dim AS cus
Special case – joining to the same table more than once (aka self-join)
In sub-selects use the standard alias with an integer suffix
SELECT <stuff>
  FROM sales_fact AS sf
 WHERE sf.sales_fact_identity = ( SELECT MAX(sf1.sales_fact_identity )
                                    FROM sales_fact AS sf1
                                   WHERE ...etc...
                                )
Use descriptive suffix when a table takes different roles 
SELECT <stuff>
  FROM sales_fact sf
  JOIN time_by_day AS date_sale
    ON date_sale.date_id = sf.sale_date_id
  JOIN time_by_day AS date_order
    ON date_order.date_id = sf.order_date_id
 WHERE <select condition(s)>   

Assume SQL-92 is supported

Use JOIN/ON rather than FROM/WHERE

The “new” JOIN/ON syntax clearly states just what is going on & keeps the join conditions where they belong – just after the join declaration.
SELECT <stuff>
  FROM sales_fact sf
  JOIN customers c
    ON c.customer_id = sf.customer_id
  JOIN product p
    ON p.product_id = sf.product_id
   …
PLEASE — no more FROM/WHERE joins!
The above example shows the basics. Here are the rules for more realistic JOINs:
  • Pick the FROM table as table most important to the query. Typically a fact table
  • JOIN to a table, specifying an alias using AS
  • The join ON condition is of the form <new table column/expression> = <prior table column condition>
  • The WHERE clause should be reserved for conditions on the entire select, not additional join conditions.
For example,
SELECT <stuff>
  FROM sales_fct AS sf
  JOIN customer_dim AS cus
    ON cus.customer_identity = sf.customer_identity
   AND cus.is_cur_flag = 1                          -- just current SCD rows in join
 WHERE cus.last_name LIKE 'Sm_th%'
   AND sf.sales_amount > 1000
-- (very) bad:
select <stuff> from sales_fct, customer_dim where  sales_amount > 1000 and last_name like 'Sm_th%' and is_cur_flag = 1 and sales_fact.customer_identity = customer_dim.customer_identity

Never, Never, Ever

SELECT * FROM …

Never in production code! Since this is a golden rule, better make sure you don’t use it in Development or QA as things have a tendency to slip into our source control system.

Assume formats or type casting

When exporting
Don’t assume the DBMS is your friend! Always use
  • CAST(), or
  • CONVERT()

PostgreSQL Specifics

… to be added as we go along …

WITH … SELECT instead of JOIN (SELECT …)

Defining stuff before you use it; what a concept!

:: Casting

Really handy when your result set is to be read by real people!

Window Functions

Always explicitly define and name the window

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Recent Comments
Archives
Categories
R-bloggers

R news and tutorials contributed by (750) R bloggers

%d bloggers like this: