Data Glossary
• Glossary
Cloud Data Architecture
Clustered Database
Data Partitioning
Primary Key in Database
Snowflake Stages
Snowgrid
Unity Catalog

Snowflake Stages

What Are Snowflake Stages?

Snowflake stages are storage locations within Snowflake where data can be temporarily or permanently stored before being loaded into a table or extracted for use in external systems. They act as an intermediary storage layer that simplifies the process of importing and exporting data in Snowflake. These stages are crucial for optimizing data pipelines, especially when handling large datasets, by reducing the complexity of data movement.

When working with stages in Snowflake, users can upload files to a stage and then use SQL commands to load that data into Snowflake tables or unload data from tables into stage locations for further processing. Utilizing Snowflake stages effectively is key to improving query performance and managing large data workflows. To further optimize these workflows, mastering the Snowflake query profile can help users analyze query execution and understand how data movement through stages impacts overall performance.

There are two main types of stages in Snowflake: internal stages and external stages. Snowflake external stages allow users to integrate with cloud storage services like AWS S3, Azure Blob Storage, and Google Cloud Storage. Meanwhile, internal stages are managed entirely within Snowflake’s platform. Additionally, when handling incremental data updates, merge statements in Snowflake can be used to efficiently merge staged data with existing tables, ensuring that data remains consistent and up to date.

Find out where you stand

Ready to see where you stand on these predictions? Let us take a peek under the hood with a free assessment and no commitment.

Find your savings

 

Types of Snowflake Stages: Internal and External

Snowflake offers two primary types of stages to meet different data storage and integration needs: internal stages and external stages.

1. Internal Stages

Internal stages in Snowflake are storage locations that exist within the Snowflake environment. These stages are fully managed by Snowflake, making them easy to set up and use without additional cloud storage configurations. Internal stages are further divided into three categories:

  • User Stages: Every Snowflake user is automatically assigned a user stage. These stages are private to each user and can be used to upload files for personal use.
  • Table Stages: Each table in Snowflake has an associated table stage that is automatically created when the table is created. Table stages are used to load or unload data directly to and from a specific table.
  • Named Stages: These are manually created stages that are reusable and can be shared across multiple tables or users. Named stages offer more flexibility and control compared to user and table stages.

 

2. External Stages

Snowflake external stages connect Snowflake with external cloud storage services like AWS S3, Azure Blob Storage, and Google Cloud Storage. These stages are ideal for organizations that already store large datasets in external cloud storage and want to load that data into Snowflake without manually transferring files.

To set up an external stage, users need to configure a storage integration between Snowflake and the cloud provider. Once connected, users can reference the external stage in their SQL queries to load or unload data directly from cloud storage.

Steps to Set Up Snowflake Stages

Setting up stages in Snowflake is a straightforward process that involves creating a stage, uploading files, and loading data into tables. Below are the steps to set up both internal and external stages in Snowflake:

1. Setting Up an Internal Stage

Step 1: Create a Named Internal Stage

To create a named internal stage, use the following SQL command:

CREATE STAGE my_internal_stage;

 

This command creates an internal stage that you can use to store and manage files.

Step 2: Upload Files to the Internal Stage

Use the PUT command to upload files from your local machine to the internal stage:

PUT file://path_to_your_file.csv @my_internal_stage;

 

Step 3: Load Data into a Table

After uploading files to the stage, use the COPY INTO command to load the data into a Snowflake table:

COPY INTO my_table
FROM @my_internal_stage/file.csv
FILE_FORMAT = (TYPE = ‘CSV’);

 

2. Setting Up an External Stage

Step 1: Create a Storage Integration

Before creating an external stage, you need to set up a storage integration between Snowflake and your cloud provider. For example, in AWS S3, use the following command:

CREATE STORAGE INTEGRATION my_s3_integration
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = ‘S3’
ENABLED = TRUE;

 

Step 2: Create an External Stage

Once the storage integration is set up, create an external stage using the following command:

CREATE STAGE my_external_stage
URL = ‘s3://my-bucket/data/’
STORAGE_INTEGRATION = my_s3_integration;

 

Step 3: Load Data from the External Stage into a Table

To load data from the external stage into a Snowflake table, use the COPY INTO command:

COPY INTO my_table
FROM @my_external_stage
FILE_FORMAT = (TYPE = ‘CSV’);

Save Big in 30 min

Ready to take the plunge? Hop on a 30 minute demo to see how much you can save in the first 30 days with Seemore.

Oink a demo

Best Practices for Using Snowflake Stages

To maximize the efficiency of snowflake stages, it’s essential to follow best practices for managing and securing data.

1. Use Named Stages for Reusability

Named internal stages are more flexible than table or user stages. They can be shared across multiple tables and users, making them ideal for reusable data pipelines.

2. Secure External Stages with IAM Roles

When setting up Snowflake external stages, ensure that your cloud storage is secured using IAM roles or access policies. This prevents unauthorized access to your data and ensures compliance with data security standards.

3. Monitor and Manage Data in Stages

Regularly monitor the data stored in your stages to ensure they don’t accumulate outdated files. Use the LIST command to see the files in a stage and the REMOVE command to delete unnecessary files.

LIST @my_internal_stage;
REMOVE @my_internal_stage/file.csv;

 

4. Use Appropriate File Formats

To optimize data loading performance, use the appropriate file format and compression. For example, use Parquet or ORC files for large datasets, as they are more efficient than CSV files.

5. Automate Data Loading with Snowpipe

To streamline the data loading process, consider using Snowflake’s Snowpipe feature. Snowpipe automates the process of loading data from stages into tables, reducing manual effort.

Prev
Next

Let's start by spending 40% less on data

With end-to-end data product level lineage visibility, data cost root-cause analysis and the perfect mix of automation, we help implement transparent cost allocation models that run with really minimum effort and on a daily basis

Wanna see how?

Seemore resources