Overview
Consistent and standardized naming conventions can help speed up both development and understanding. It’s critical to establish a pattern and while exceptions do occur, make sure it’s communicated and followed to the extent possible.
General Rules
In addition to more specific naming standards, there are some general rules worth considering. Some of these are subjective, so certainly not something that is set in stone or even agreed to across most organizations. However, while I’m biased, I do think these are probably the more popular/common approaches listed below.
- Use singular table names (e.g., dimproduct not dimproducts, factorder not factorders). The idea is that we’re describing what a singular instance/row contains.
- Use lowercase only for object names. Personally, I think it’s both easier to read and type—reason enough to go with this standard vs. all uppercase (besides the one exception below).
- Use uppercase for keywords only. While remembering to hold shift is annoying, this does improve readability. Knowing that SQL keywords are uppercase helps separate them from the actual business table/column names and makes queries a lot easier to read.
- Do not use abbreviations or vendor/technology specific names in table or column names. Not everyone will understand the meaning of your abbreviations. It’s better to just take the extra few seconds and type out the full word. Some exceptions where an acronym is common (e.g., lbs for pounds on a weight column) are okay. Regarding vendor specific names, putting the name of a vendor (e.g., dimoracleaccount if you use an Oracle ERP) leaves you open to a renaming risk if you change vendors. There might be some exceptions for certain cross reference translation tables where it’s nearly impossible not to do this, but worth trying to avoid where it makes sense.
Class Words
Naming columns in a table can be tricky. One thing that helps standardize is class words. These are descriptors that occur at the end of the column names and help define what type of data exists in the column. For example, if we have a gross sales column that contains a dollar amount, it makes sense to call that column grosssalesamount. If we have a date column (e.g., date of a sales transaction), we could call that sales_date. What the entire list should be is subjective, but here are some starting points:
| Class Word | Snowflake Data Type | Examples | Notes |
|---|---|---|---|
| amount | DECIMAL(19,4) | salesamount, priceamount, total_amount | Used for monetary values and other decimal amounts |
| code | STRING | postalcode, statecode, product_code | Short identifiers or reference codes |
| count | INT or BIGINT | customercount, itemcount, order_count | Used for counting items or records |
| date | DATE | orderdate, birthdate, calendar_date | Date values without time component |
| datetime | DATETIME or TIMESTAMP_NTZ | createdatetime, lastupdateddatetime, transactiondatetime | Date and time values together |
| description | STRING | productdescription, customerdescription | Longer text describing an entity |
| flag | INT | etlinferredmember_flag | Reserved for ETL related columns only. Typically 0 or 1 |
| indicator | STRING | taxableindicator, activeindicator | Values should be positive/negative statements in business language (e.g., ‘Taxable’ and ‘Not Taxable’) |
| key | BIGINT | customerkey, orderkey, dimdatekey | Only for table keys for facts/dims or foreign keys for facts |
| name | STRING | customername, cityname, product_name | Human-readable names or labels |
| number | INT | ordernumber, vendornumber, account_number | Numeric identifiers (not counts or amounts) |
| quantity | DECIMAL(19,4) | orderquantity, itemquantity, stock_quantity | Numeric quantities or measurements |
| status | STRING | orderstatus, accountstatus, payment_status | Current state or condition of an entity |
| text | STRING | addresstext, notestext, comment_text | Free-form text fields |
| type | STRING | ordertype, customertype, payment_type | Classification or category of an entity |
| url | STRING | producturl, imageurl, website_url | URLs or web links |
| value | STRING | etlrowhash_value | Typically only used for ETL. Hash values or encoded data |
Audit Columns
This can be up to the specific team/process, but various “create” and “lastupdate” columns at the end of the table can be helpful. For example batchid, user_name, datetime for both create and last update (~6 total columns). This gives a row level trace back to when records were created or updated.
Summary
Naming standardization helps remove ambiguity and contributes to a more understandable data model for all stakeholders. By establishing consistent patterns for table names, column names, and class words, teams can reduce confusion, speed up development, and improve maintainability of data warehouse objects. While these conventions are guidelines rather than strict rules, communicating and following them consistently will pay dividends as your data warehouse grows in complexity.