All Articles

Snowflake EDW Naming Conventions

Photo by Zdenek Machacek on Unsplash
Photo by Zdenek Machacek on Unsplash

Overview

Consistent and standardized naming conventions can help speed up both development and understanding. It’s critical to establish a pattern and while exceptions do occur, make sure it’s communicated and followed to the extent possible.

General Rules

In addition to more specific naming standards, there are some general rules worth considering. Some of these are subjective, so certainly not something that is set in stone or even agreed to across most organizations. However, while I’m biased, I do think these are probably the more popular/common approaches listed below.

  1. Use singular table names (e.g., dim_product not dim_products, fact_order not fact_orders). The idea is that we’re describing what a singular instance/row contains.
  2. Use lowercase only for object names. Personally, I think it’s both easier to read and type—reason enough to go with this standard vs. all uppercase (besides the one exception below).
  3. Use uppercase for keywords only. While remembering to hold shift is annoying, this does improve readability. Knowing that SQL keywords are uppercase helps separate them from the actual business table/column names and makes queries a lot easier to read.
  4. Do not use abbreviations or vendor/technology specific names in table or column names. Not everyone will understand the meaning of your abbreviations. It’s better to just take the extra few seconds and type out the full word. Some exceptions where an acronym is common (e.g., lbs for pounds on a weight column) are okay. Regarding vendor specific names, putting the name of a vendor (e.g., dim_oracle_account if you use an Oracle ERP) leaves you open to a renaming risk if you change vendors. There might be some exceptions for certain cross reference translation tables where it’s nearly impossible not to do this, but worth trying to avoid where it makes sense.

Class Words

Naming columns in a table can be tricky. One thing that helps standardize is class words. These are descriptors that occur at the end of the column names and help define what type of data exists in the column. For example, if we have a gross sales column that contains a dollar amount, it makes sense to call that column gross_sales_amount. If we have a date column (e.g., date of a sales transaction), we could call that sales_date. What the entire list should be is subjective, but here are some starting points:

Class Word Snowflake Data Type Examples Notes
amount DECIMAL(19,4) sales_amount, price_amount, total_amount Used for monetary values and other decimal amounts
code STRING postal_code, state_code, product_code Short identifiers or reference codes
count INT or BIGINT customer_count, item_count, order_count Used for counting items or records
date DATE order_date, birth_date, calendar_date Date values without time component
datetime DATETIME or TIMESTAMP_NTZ create_datetime, last_updated_datetime, transaction_datetime Date and time values together
description STRING product_description, customer_description Longer text describing an entity
flag INT etl_inferred_member_flag Reserved for ETL related columns only. Typically 0 or 1
indicator STRING taxable_indicator, active_indicator Values should be positive/negative statements in business language (e.g., ‘Taxable’ and ‘Not Taxable’)
key BIGINT customer_key, order_key, dim_date_key Only for table keys for facts/dims or foreign keys for facts
name STRING customer_name, city_name, product_name Human-readable names or labels
number INT order_number, vendor_number, account_number Numeric identifiers (not counts or amounts)
quantity DECIMAL(19,4) order_quantity, item_quantity, stock_quantity Numeric quantities or measurements
status STRING order_status, account_status, payment_status Current state or condition of an entity
text STRING address_text, notes_text, comment_text Free-form text fields
type STRING order_type, customer_type, payment_type Classification or category of an entity
url STRING product_url, image_url, website_url URLs or web links
value STRING etl_row_hash_value Typically only used for ETL. Hash values or encoded data

Audit Columns

This can be up to the specific team/process, but various create_ and last_update_ columns at the end of the table can be helpful. For example batch_id, user_name, datetime for both create and last update (~6 total columns). This gives a row level trace back to when records were created or updated.

Summary

Naming standardization helps remove ambiguity and contributes to a more understandable data model for all stakeholders. By establishing consistent patterns for table names, column names, and class words, teams can reduce confusion, speed up development, and improve maintainability of data warehouse objects. While these conventions are guidelines rather than strict rules, communicating and following them consistently will pay dividends as your data warehouse grows in complexity.