Home > Resources > Blog

Querying Data in Snowflake

May 3, 2021 by Bibhas Bhattacharya
Categories: Big Data , Data Science and Business Analytics

In this tutorial, you will learn how to create and run queries in Snowflake. You will also learn about query profiles and views. We will also review the new Snowsight experience UI.

According to Snowflake’s documentation, “Snowflake supports standard SQL, including a subset of ANSI SQL:1999 and the SQL:2003 analytic extensions. Snowflake also supports common variations for a number of commands where those variations do not conflict with each other.“

Part 1 – Sign in to Snowflake Web UI

1. If you have not done so, sign in to the Snowflake Web UI using your credentials.

You should be logged in as a SYSADMIN user.

Part 2 – Querying Data in Snowflake

1. Click Worksheets

2. Click + to add a new tabbed worksheet, name it Querying Data

3. Make sure you are using the LAB_WH warehouse and the LAB_DB database.

Note: You can programmatically set up both resources as follows:

USE LAB_DB;
USE WAREHOUSE LAB_WH;

Also, whenever you wish to suspend a warehouse, issue this command:

ALTER WAREHOUSE LAB_WH SUSPEND;

4. Enter and run the following commands:

SELECT ORDER_ID, SUM (QUANTITY * UNIT_PRICE) "ORDER TOTAL"
FROM ORDER_DETAILS    
    GROUP BY ORDER_ID;

Review the results.

5. Here is an illustration of the INNER JOIN SQL statement that you can copy and paste in the same worksheet tab (select it before running):

SELECT O.ORDER_ID, OD.INVENTORY_ID, OD.STATUS_ID, O.EMPLOYEE_ID
    FROM  ORDER_DETAILS OD JOIN  ORDERS O
    ON OD.ORDER_ID = O.ORDER_ID
    ORDER BY O.ORDER_DATE;

Now let’s see how we can use the pagination feature based on the OFFSET and FETCH parameters.

Note: Snowflake supports both LIMIT (the Postgres syntax) and FETCH (the ANSI syntax) for constraining the maximum number of rows returned by a statement or subquery; they produce the same results.

6. Enter and execute the following command:

SELECT ORDER_ID, SUM (QUANTITY * UNIT_PRICE) "ORDER TOTAL"
FROM ORDER_DETAILS    
    GROUP BY ORDER_ID
    OFFSET 0 FETCH 5;

7. Replace the last line of the above command with the following line to see how you can progress through the available orders ids:

    OFFSET 5 FETCH 5;

Part 3 – Create a View

Here is what you need to do to create and query a standard view.

1. Enter the following command to create a new view:

CREATE OR REPLACE VIEW V_INVOICED_ORDERS
AS
SELECT O.ORDER_ID, OD.INVENTORY_ID, O.EMPLOYEE_ID
    FROM  ORDER_DETAILS OD JOIN  ORDERS O
    ON OD.ORDER_ID = O.ORDER_ID WHERE OD.STATUS_ID = 'Invoiced'
    ORDER BY O.ORDER_DATE

2. Enter the following command to execute a query against the above view:

SELECT * FROM V_INVOICED_ORDERS WHERE EMPLOYEE_ID = 9;

Part 4 – The Query Profile (the Query Execution Plan)

1. In the Results pane at the bottom of the Web UI, click Query ID and in the popup that appears, click the query id link in the form of a hash:

Review the query’s overall runtime details, like Scanned Bytes and Total Duration.

2. Click Profile

3. In the Profile screen, click the Box button in the bottom left corner to center the picture; you should see the following overview of the query map along with the relative costs in the form of a percentage of the total query execution time.

Spend some time analyzing the results.

You could notice that the most expensive part of the query is the table scan of the ORDER_DETAILS table, which takes 66.7% of the total processing time. This scan comes from this FROM clause:

...
FROM  ORDER_DETAILS OD JOIN  ORDERS O  
...

The query plan starts with the table scan of the ORDER table scan followed by the filter on the EMPLOYEE_ID = 9 executed against the ORDER table . Some operations may be performed by Snowflake in parallel given that you have sufficient compute resources (the cloud platform’s virtual CPUs — vCPUs, more specifically) .

Part 5 – The Snowsignt Experience

1. Click Preview App in the toolbar, and log in with your Snowflake credentials.

You will be logged in a new browser tab.

2. Click +Worksheet in the top right-hand corner

3. Select the LAB_WH for the warehouse engine and LAB_DB.PUBLIC for the database in places shown in the screenshot below.

We are going to demonstrate the datebucket-based functionality that is only available in Snowsight worksheets (the datebucket function is not recognized in the Classic Console’s worksheets).

4. Enter the following command (enjoying the auto-completion feature), and execute it:

SELECT COUNT(ORDER_DATE) as COUNT_ORDER_DATE, ORDER_DATE 
FROM ORDERS 
GROUP BY :datebucket(ORDER_DATE), ORDER_DATE 
ORDER BY COUNT_ORDER_DATE DESC;

The Snowsight UI will display the query results along with the visual summary view on the right side.

The visual elements are grouped into date buckets in the form of bar graphs that offer the drill-down capability.

5. Click the COUNT_ORDER_DATE section in the bar graph to select it for review.

You will see the overall summary of the output:

There are 6 date buckets in the ranges:

[1,2), [2,3), [3,4), — no data in bucket 4 –, [5,6), and [6,7)

You can always go back to the original view of the visual by clicking Clear Selection located above the graphs.

Spend some time reviewing the available options for visual (exploratory) data analysis.

6. Switch back to the Classic Console view (the Web UI) tab and stop the LAB_WH warehouse.

Now, if you go back to the Snowsight and refresh the page, you will see that the LAB_WH is stopped (the green indicator has turned to gray). As you have probably guessed, Snowflake activities happening in both browser tabs are synchronized.

Part 6 – Workspace Clean Up

1. Close the Snowsight browser tab.

2. Make sure that the LAB_WH warehouse is in suspend mode; issue the following command, if necessary:

ALTER WAREHOUSE LAB_WH SUSPEND;

Part 7 – Review

In this tutorial, you learned how to create and run queries in Snowflake and the new Snowsight experience UI. You also learned about query profiles and how to create views