TL;DR
If your team assumes that a bigger Snowflake warehouse always costs more and simply runs queries faster, you are using the wrong model.
Snowflake query behavior changes based on memory pressure, operator type, spillage, and workload overlap.
A memory-heavy query running on a small warehouse can spill to slower storage, run much longer, and cost more than the same query on a larger warehouse.
The practical approach is to monitor the right signals, such as:
- Warehouse size
- Query load percentage
- Bytes scanned
- Spillage metrics
These signals help data teams predict query behavior and implement Snowflake cost optimization strategies when vertically scaling, rather than focusing only on individual query speed.
Vertical Scaling Snowflake: What It Means to Increase Your Warehouse Size
Each warehouse tier increases:
- Compute resources
- Available memory
- Parallel processing capacity
Many teams assume this creates a simple trade-off:
More power → higher cost
Real query behavior is more complex.
Queries typically follow one of four patterns when scaling warehouses:
- Linear improvement
Runtime decreases steadily as compute increases.
- Diminishing returns
Performance improves initially, then levels off.
- Performance plateau
Additional compute produces minimal improvement.
- Cost reduction through scaling
Runtime drops enough that the total cost decreases.
The fourth pattern occurs when queries become memory-bound.
When additional memory prevents remote spillage, the runtime can drop dramatically.
In short:
A larger Snowflake warehouse can sometimes reduce both runtime and total cost.
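One way to see which pattern a recurring query follows is to compare its runs across warehouse sizes in query history. A rough sketch, assuming the same statement text recurs on different sizes (the WAREHOUSE_SIZE column and HASH function are standard Snowflake features; the 30-day window is an arbitrary choice):

```sql
SELECT
    HASH(query_text) AS query_signature,
    warehouse_size,
    COUNT(*) AS runs,
    AVG(total_elapsed_time / 1000) AS avg_elapsed_seconds,
    AVG(bytes_spilled_to_remote_storage) AS avg_remote_spill
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
  AND warehouse_size IS NOT NULL
GROUP BY 1, 2
ORDER BY 1, 2;
```

Signatures whose runtime collapses on larger sizes while remote spill drops to zero are the fourth-pattern queries.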
Three KPIs That Predict Snowflake Query Behavior
Cost per query is a lagging metric.
Three signals provide stronger predictive insight into query behavior.
Warehouse Size
Warehouse size determines the maximum compute and memory available for query execution and is a key component of Snowflake warehouse performance optimization.
This affects:
- Parallel processing capacity
- Query memory limits
- Intermediate data processing
Warehouse sizing, therefore, directly influences query memory usage and spill risk.
Query Load Percentage
Query load percentage indicates warehouse concurrency pressure.
High load typically signals:
- Queued queries
- Resource contention
- Reduced performance
This metric helps distinguish between:
- Scaling problems – warehouse too small
- Concurrency problems – warehouse overloaded
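Concurrency pressure can be approximated from the WAREHOUSE_LOAD_HISTORY view, which reports average running and queued query counts per interval. A sketch, assuming a 7-day lookback:

```sql
SELECT
    warehouse_name,
    DATE_TRUNC('hour', start_time) AS hour_bucket,
    AVG(avg_running) AS avg_running_queries,
    AVG(avg_queued_load) AS avg_queued_for_capacity
FROM snowflake.account_usage.warehouse_load_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY avg_queued_for_capacity DESC
LIMIT 25;
```

Sustained queuing with modest running counts points to concurrency problems; long runtimes without queuing point to scaling problems.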
Bytes Scanned
Bytes scanned measures how much data a query reads.
It does not equal memory usage, but it strongly correlates with:
- Data movement
- Processing workload
- Potential memory pressure
When paired with query operator type, bytes scanned becomes a useful proxy for predicting memory demand.
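Operator types for a specific query can be inspected directly with Snowflake's GET_QUERY_OPERATOR_STATS table function. A minimal sketch; the query ID placeholder must be replaced with a real value from query history:

```sql
-- Operator-level profile for one finished query.
-- '<query_id>' is a placeholder, not a real identifier.
SELECT operator_id, operator_type, operator_statistics
FROM TABLE(GET_QUERY_OPERATOR_STATS('<query_id>'))
ORDER BY operator_id;
```

Joins and WindowFunction operators appearing over large inputs are the combinations most likely to amplify memory demand.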
In short:
Warehouse size, query load percentage, and bytes scanned together explain most Snowflake query performance behavior.
Why Bytes Scanned Matters for Snowflake Query Optimization
Most teams treat bytes scanned as a simple measure of data access, but it becomes far more useful when analyzed as part of usage-based data pipeline optimization, where query behavior is tied to actual workload demand.
The more useful interpretation is how it relates to memory demand.
Snowflake exposes bytes scanned through query history tables.
SELECT
query_id,
warehouse_name,
total_elapsed_time / 1000 AS elapsed_seconds,
bytes_scanned,
rows_produced,
query_text
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND warehouse_name IS NOT NULL
ORDER BY bytes_scanned DESC
LIMIT 25;
Queries scanning large datasets often place heavy pressure on memory.
However, the query operator shape significantly affects how memory demand grows.
Typical patterns include:
Filter-heavy queries
- Low memory amplification
- Data filtered early in execution
Aggregation-heavy queries
- Moderate working memory requirements
Join-heavy queries
- Large intermediate datasets
Window-function queries
- High memory amplification due to partition processing
Two queries scanning identical data volumes can behave very differently.
Example:
- Query A scans 200 GB and filters most rows early.
- Query B scans 200 GB but performs large window operations.
Both have identical bytes scanned.
Only one is likely to become memory-bound.
In short:
Bytes scanned becomes far more useful when combined with query operator patterns.
Snowflake Spillage: The Hidden Driver of Query Runtime
Spillage occurs when Snowflake cannot keep intermediate data in memory.
Data moves through a storage hierarchy:
- Memory
- Local SSD
- Remote storage
Each step increases latency.
Remote storage introduces the largest performance penalty.
When queries spill remotely, runtime can increase by multiples, not just percentages.
This explains why a larger warehouse size can sometimes reduce cost when organizations apply warehouse optimization techniques that identify memory pressure and remote spillage.
A larger warehouse provides more memory.
If that memory eliminates remote spillage, the query may finish dramatically faster.
Snowflake exposes spillage metrics through query history.
SELECT
query_id,
warehouse_name,
bytes_scanned,
bytes_spilled_to_local_storage,
bytes_spilled_to_remote_storage,
total_elapsed_time / 1000 AS elapsed_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND bytes_spilled_to_remote_storage > 0
ORDER BY bytes_spilled_to_remote_storage DESC
LIMIT 50;
Queries with heavy remote spillage are strong candidates for vertical scaling tests.
In short:
Remote spillage is often the largest contributor to slow Snowflake queries.
Snowflake Warehouse Sizing: Why Smaller Warehouses Are Not Always Cheaper
A memory-heavy query illustrates why smaller warehouses are not always optimal.
Consider the following pattern.
Small warehouse
- Heavy remote spillage
- Long runtime
- High total cost
Medium warehouse
- Reduced spillage
- Moderate runtime improvement
Large warehouse
- Minimal spillage
- Short runtime
- Potentially lower cost
The goal is not to find the smallest warehouse.
The goal is to find the configuration where:
- Remote spillage disappears
- Parallel processing still provides a benefit
Beyond that point, scaling only increases price.
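The arithmetic behind this pattern can be sketched with illustrative numbers. The credits-per-hour values follow Snowflake's standard ladder (Small = 2, Medium = 4, Large = 8); the runtimes are hypothetical figures for a single memory-heavy query, not measured results:

```sql
SELECT
    size,
    credits_per_hour,
    runtime_minutes,
    ROUND(credits_per_hour * runtime_minutes / 60, 2) AS total_credits
FROM VALUES
    ('Small',  2, 45),
    ('Medium', 4, 12),
    ('Large',  8,  7)
  AS t(size, credits_per_hour, runtime_minutes)
ORDER BY total_credits;
```

With these numbers, Medium is cheapest: the jump from Small to Medium eliminates spillage and cuts runtime enough to lower total credits, while the jump from Medium to Large saves little runtime at double the rate.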
In short:
Optimal Snowflake warehouse sizing eliminates remote spillage without unnecessary compute.
Workload Behavior Matters More Than Individual Queries
Snowflake bills based on warehouse uptime, not query complexity. This is why teams increasingly rely on usage-based workload optimization to reduce idle compute and unnecessary warehouse runtime.
This means workload behavior determines cost more than individual queries.
Many queries run simultaneously and compete for the same resources.
The queries that matter most are those that extend warehouse runtime.
A workload-level view can be generated using query history.
SELECT
warehouse_name,
DATE_TRUNC('hour', start_time) AS hour_bucket,
COUNT(*) AS query_count,
AVG(total_elapsed_time / 1000) AS avg_elapsed_seconds,
SUM(bytes_scanned) AS total_bytes_scanned
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 2 DESC, 5 DESC;
This analysis highlights:
- Concurrency peaks
- Workload bursts
- Queries extending warehouse uptime
In short:
Optimizing the queries that extend warehouse runtime produces the largest cost improvements.
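Because billing follows uptime, it helps to pair query history with the credits actually billed per warehouse. A sketch against the WAREHOUSE_METERING_HISTORY view, assuming a 7-day lookback:

```sql
SELECT
    warehouse_name,
    DATE_TRUNC('day', start_time) AS day_bucket,
    SUM(credits_used) AS credits_used
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY credits_used DESC;
```

Days where credits spike without a matching rise in query counts usually indicate a few long-running queries holding the warehouse open.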
Predicting Query Behavior for Snowflake Compute Cost Optimization
Predicting query behavior requires continuous Snowflake cost monitoring that correlates query behavior with warehouse runtime.
A structured workflow produces useful guidance.
Step 1: Identify dominant query patterns
Group queries by operator type:
- Filters
- Aggregations
- Joins
- Window functions
Queries with joins and window functions often become memory-bound.
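At workload scale, a crude text-based heuristic can approximate this grouping. This is an assumption-laden sketch: query_text patterns only roughly indicate operator type (GET_QUERY_OPERATOR_STATS gives exact operators, but only per query):

```sql
SELECT
    CASE
        WHEN query_text ILIKE '%over (%'   THEN 'window'
        WHEN query_text ILIKE '% join %'   THEN 'join'
        WHEN query_text ILIKE '%group by%' THEN 'aggregation'
        ELSE 'filter/other'
    END AS pattern,
    COUNT(*) AS query_count,
    AVG(bytes_spilled_to_remote_storage) AS avg_remote_spill
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY avg_remote_spill DESC;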
Step 2: Measure key workload signals
Track three metrics per warehouse:
- Warehouse size
- Query load percentage
- Bytes scanned
These signals distinguish concurrency pressure from memory pressure.
Step 3: Identify memory-bound queries
Look for queries that combine:
- Large scan volumes
- Complex operators
These queries are likely candidates for vertical scaling tests.
Step 4: Detect spillage patterns
Review spillage metrics:
- bytes_spilled_to_local_storage
- bytes_spilled_to_remote_storage
Frequent remote spillage indicates insufficient memory.
Step 5: Test warehouse sizes against the workload
Run the same workload across different warehouse sizes.
Measure:
- Runtime changes
- Spillage reduction
- Total warehouse uptime
The optimal configuration usually occurs where remote spillage disappears, and the runtime drops significantly.
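Test runs are easiest to compare when each pass is tagged, for example via ALTER SESSION SET QUERY_TAG = 'scaling_test' before replaying the workload on each size. A sketch assuming that hypothetical tag:

```sql
SELECT
    warehouse_size,
    COUNT(*) AS queries,
    AVG(total_elapsed_time / 1000) AS avg_elapsed_seconds,
    SUM(bytes_spilled_to_remote_storage) AS total_remote_spill
FROM snowflake.account_usage.query_history
WHERE query_tag = 'scaling_test'
GROUP BY 1
ORDER BY 1;
```

The size where total_remote_spill reaches zero and avg_elapsed_seconds stops improving is the candidate configuration.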
In short:
Snowflake compute cost optimization works best when evaluating entire workloads instead of single queries.
Where Seemore Fits
Building workload-level optimization models internally is possible.
It is also complex.
Teams must continuously correlate:
- Query performance
- Warehouse configuration
- Cost patterns
- Data pipeline activity
Seemore Data addresses this challenge by analyzing cost, performance, and usage signals across the entire data stack through an AI-driven data efficiency agent that continuously detects optimization opportunities.
The platform continuously evaluates:
- Snowflake warehouse performance
- Query behavior and spillage
- Pipeline usage patterns
- BI workload impact
It then surfaces optimization opportunities and automates actions such as scheduling adjustments, workload-aware scaling, and cost anomaly detection.
In short:
Seemore turns warehouse tuning from manual analysis into continuous optimization.
Vertical scaling in Snowflake often behaves differently from what teams expect.
Key principles include:
- Larger warehouses do not always cost more
- Smaller warehouses do not always save money
- Bytes scanned helps estimate memory pressure
- Remote spillage is a major driver of runtime
- Workload behavior determines total warehouse cost
Teams that monitor these signals can predict query behavior more accurately.
Once the correct model is applied, warehouse tuning becomes predictable rather than guesswork.
FAQ
What is vertical scaling in Snowflake?
Vertical scaling means increasing or decreasing warehouse size so queries receive more or less compute and memory.
Why can a larger Snowflake warehouse cost less?
A larger warehouse can eliminate remote spillage, reducing runtime enough to lower total warehouse compute time.
Which Snowflake metrics predict query behavior?
Warehouse size, query load percentage, bytes scanned, and spillage metrics are the most useful indicators.
Why is bytes scanned difficult to interpret?
Bytes scanned does not directly represent memory usage. It must be combined with query operator type to estimate memory pressure.
Is all Snowflake spillage bad?
No. Some local spillage is acceptable. Remote spillage usually creates the largest performance penalty.
Should teams optimize queries or workloads?
Workloads. Snowflake bills based on warehouse uptime, so the queries extending warehouse runtime provide the largest optimization opportunities.


