This is the second article in the learning SQL series:
- Part 1: Setup your environment
- Part 2: Basic Select Statements (this article)
- Part 3: Where Statements
- Part 4: More Where Statements
- Part 5: Aggregations and Group By Statements
- Part 6: Joins
Once your environment is setup, the first thing you need to know how to do is pull data from a table. SQL is pretty easy to read and the gist of a SELECT statement is to provide the names of the columns and table you are pulling data from. Execute the statement and you’ll see the database return the rows and columns that you requested. You could pull data from one column, a few columns, or all the columns in a table. Let’s go through a few examples.
To pull all columns and rows of a table, you’ll write what is commonly referred to as a “select star” SQL query. For the examples in this article, we are going to be looking at the DimCustomer table. The SQL and results are as follows…the screenshot is just a sample - test it out on your own and you’ll see that the entire table’s contents are retrieved.
The first line is the SELECT portion of the statement, with star representing a wildcard of all column names. The second part of the statement, the FROM clause specifies which table we are pulling data from. This could be swapped out for any table name to pull the table contents. These two components (which columns and which table) make up the basic SELECT statement.
SELECT * FROM [AdventureWorksDW2017].[dbo].[DimCustomer]
Pulling all rows and columns in a table is probably overkill in most situations, especially when initially inspecting the data. Even more important, it can tie up a lot of resources on the database server when tables are large. To pull a same of just a few rows, we use the keyword TOP and the number of rows we want to pull.
Note - the keyword could change a bit depending on which flavor of SQL you are using (ex: some database servers you’ll need to use a FETCH FIRST N ROWS ONLY at the end of the SQL statement instead).
SELECT TOP 10 * FROM [AdventureWorksDW2017].[dbo].[DimCustomer]
Often you will not want every single column in a table. In those cases, you need to name the columns you want to pull, seperated by commas. For example, if we wanted to just pull the first 5 FirstName and LastName columns, the SQL would be as follows:
SELECT TOP 5 FirstName ,LastName FROM [AdventureWorksDW2017].[dbo].[DimCustomer]
How could we pull a list of just distinct values? There are likely duplicate names in the customer table. If we wanted to pull a distinct list of first names, the DISTINCT keyword can be used. For any columns that are in a query with a DISTINCT, only the unique combinations will be returned.
SELECT DISTINCT FirstName FROM [AdventureWorksDW2017].[dbo].[DimCustomer]
That’s it for this article. Select statements are not complex, they are foundational to writing SQL, and a main component of what data analysts use to pull data using SQL. Almost any query that you write to pull data from a database table will utilize a SELECT statement.
Check out additional articles in this series here: