
Snowflake AI Observability for Cortex Agents: Tracking Token-Level Costs with SQL

Snowflake Cortex Agent cost is a black box. Snowflake AI Observability sheds some light on it.

TL;DR
Snowflake Cortex Agent costs are not fully visible in native dashboards. By querying SNOWFLAKE.LOCAL.AI_OBSERVABILITY_EVENTS and applying model-specific pricing, you can calculate 100% of agent costs, attribute spend to agents/users/teams, and build daily cost monitoring with SQL – today.

Intro

If you’ve recently deployed Snowflake Cortex Agents in production, you’ve probably discovered the same painful truth many data engineering teams have: tracking Agent costs is nearly impossible with Snowflake AI Observability dashboards.

You check your AI Services dashboard and see $45.50 in total daily usage, but only ~$15 is accounted for in the detailed breakdown. Where did the other $30.50 go? The answer: Cortex Agent costs are invisible in standard monitoring – until now.

This article provides data engineering teams with a practical, SQL-based solution to monitor Cortex Agent usage and calculate actual costs using Snowflake’s AI_OBSERVABILITY_EVENTS table.

The Problem: The Cortex Agent Cost Black Box

Snowflake’s Cortex AI services are billed based on token consumption – input and output tokens processed through LLMs. While this works transparently for functions like AI_COMPLETE or AI_EMBED, Cortex Agents operate as orchestration layers that make multiple LLM calls under the hood.

Each agent interaction involves:

  • Planning calls to determine which tools to use
  • Tool execution (SQL queries, API calls, retrievers)
  • Response generation through LLMs
  • Multi-turn conversations with context management

 

These interactions generate token consumption across multiple models (Mistral Large 2, Llama 3.1-70B, etc.), but Snowflake’s standard Cost & Usage dashboards, including Snowflake AI Observability, don’t provide agent-specific breakdowns. This creates several critical problems:

  1. Cost attribution: You can’t allocate costs to specific agents, teams, or projects
  2. Budget management: Impossible to set spending alerts for agent usage
  3. Optimization: No visibility into which agents are expensive vs. efficient
  4. Forecasting: Can’t predict future costs based on agent usage patterns

 

Read more: The Hidden Cost of Snowflake Cortex Pricing.

The Solution: AI_OBSERVABILITY_EVENTS Table

Until native Cortex Agent cost dashboards arrive in Snowflake AI Observability, Snowflake provides a powerful workaround:

SNOWFLAKE.LOCAL.AI_OBSERVABILITY_EVENTS

This internal table captures comprehensive telemetry for all AI operations in your account, including:

  • Agent names and versions
  • User and role information
  • Model names used for planning
  • Token counts (input, output, total)
  • Execution duration
  • SQL execution time

By querying this table and applying Snowflake’s pricing structure, you can calculate actual agent costs with token-level precision.

Understanding Snowflake Cortex Agent Pricing

Before diving into SQL, let’s understand how Cortex AI is priced. According to Snowflake’s documentation, costs are based on credits per million tokens.

Key Pricing Concepts

What’s a token?
A token is the smallest unit of text processed by an LLM. Industry convention: ~4 characters = 1 token, though this varies by model.
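As a rough illustration of the ~4-characters-per-token convention, you can estimate token counts from text length. This is a heuristic only; actual tokenization is model-specific:

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Ballpark token estimate using the ~4 chars/token convention.

    Real tokenizers vary by model; use this for rough cost projections,
    never for billing reconciliation.
    """
    return max(1, round(len(text) / chars_per_token))
```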

Billable tokens for AI_COMPLETE (used by Agents):

  • Both input AND output tokens are billed
  • The model determines the credit rate
  • Popular models like Mistral Large 2 and Llama 3.1-70B have different rates

Example credit costs (from Snowflake’s consumption table):

  • Mistral Large 2: ~0.01-0.02 credits per 1M tokens (check the current consumption table)
  • Llama 3.1-70B: ~0.015 credits per 1M tokens
  • Claude 4 Sonnet: ~0.03 credits per 1M tokens

Converting credits to dollars:

  • Standard Edition: $2.00 per credit
  • Enterprise Edition: $3.00 per credit
  • Business Critical Edition: $4.00 per credit

For example, suppose an Enterprise Edition account runs an agent that processes 10M tokens on Llama 3.1-70B:

10M tokens ÷ 1M × 0.015 credits = 0.15 credits × $3.00 per credit = $0.45
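That arithmetic generalizes to a small helper. The rate and credit price below are the illustrative figures from this article, not authoritative pricing; substitute your edition’s actual values:

```python
def agent_cost_usd(tokens: int, credits_per_million: float, usd_per_credit: float) -> float:
    """Convert billable tokens to USD: tokens -> credits -> dollars."""
    credits = tokens / 1_000_000 * credits_per_million
    return credits * usd_per_credit

# 10M tokens on Llama 3.1-70B, Enterprise Edition ($3.00/credit):
# 10 * 0.015 credits * $3.00 = $0.45
enterprise_example = agent_cost_usd(10_000_000, 0.015, 3.00)
```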

The SQL Query: Extracting Snowflake Cortex Agent Cost Data

Here’s the production-ready query to extract Cortex Agent usage and calculate costs:

SELECT
    MIN(TIMESTAMP) AS start_ts,
    MAX(TIMESTAMP) AS end_ts,
    RECORD_ATTRIBUTES:"snow.ai.observability.object.name"::STRING AS agent_name,
    -- User and role attribution
    RECORD_ATTRIBUTES:"ai.observability.record_id"::STRING AS ai_observability_record_id,
    RESOURCE_ATTRIBUTES:"snow.user.name"::STRING AS user_name,
    RESOURCE_ATTRIBUTES:"snow.session.role.primary.name"::STRING AS role_name,
    -- Model and performance metrics
    MAX(RECORD_ATTRIBUTES:"snow.ai.observability.agent.planning.model"::STRING) AS model_name,
    SUM(RECORD_ATTRIBUTES:"snow.ai.observability.agent.tool.sql_execution.duration"::NUMBER) AS sql_execution_duration_ms,
    SUM(RECORD_ATTRIBUTES:"snow.ai.observability.agent.duration"::NUMBER) AS agent_duration_ms,
    SUM(RECORD_ATTRIBUTES:"snow.ai.observability.agent.planning.duration"::NUMBER) AS agent_planning_duration_ms,
    -- Token consumption (the key to cost calculation)
    SUM(RECORD_ATTRIBUTES:"snow.ai.observability.agent.planning.token_count.input"::NUMBER) AS tokens_input,
    SUM(RECORD_ATTRIBUTES:"snow.ai.observability.agent.planning.token_count.output"::NUMBER) AS tokens_output,
    SUM(RECORD_ATTRIBUTES:"snow.ai.observability.agent.planning.token_count.total"::NUMBER) AS tokens_total
FROM SNOWFLAKE.LOCAL.AI_OBSERVABILITY_EVENTS
WHERE
    RECORD_ATTRIBUTES:"snow.ai.observability.object.type"::STRING = 'Cortex Agent'
    AND TIMESTAMP > '2025-12-31'
GROUP BY agent_name, user_name, role_name, ai_observability_record_id
ORDER BY start_ts DESC;
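If you prefer to post-process events client-side, the same attribute keys can be extracted in Python. The sample row below is hypothetical and only mirrors the keys queried above; real events carry many more fields:

```python
# Hypothetical event row shaped like the RECORD_ATTRIBUTES keys queried
# above (illustrative only; real AI_OBSERVABILITY_EVENTS rows differ).
sample_event = {
    "snow.ai.observability.object.type": "Cortex Agent",
    "snow.ai.observability.object.name": "sales_agent",
    "snow.ai.observability.agent.planning.model": "mistral-large2",
    "snow.ai.observability.agent.planning.token_count.input": 1200,
    "snow.ai.observability.agent.planning.token_count.output": 350,
    "snow.ai.observability.agent.planning.token_count.total": 1550,
}

def planning_tokens(event: dict) -> dict:
    """Pull the planning token-count attributes used for cost calculation."""
    prefix = "snow.ai.observability.agent.planning.token_count."
    return {k.removeprefix(prefix): v for k, v in event.items() if k.startswith(prefix)}
```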

Calculating Agent Costs

Once you have the token counts, calculate costs using this enhanced query:

WITH agent_usage AS (
    SELECT
        MIN(TIMESTAMP) AS start_ts,
        MAX(TIMESTAMP) AS end_ts,
        RECORD_ATTRIBUTES:"snow.ai.observability.object.name"::STRING AS agent_name,
        RESOURCE_ATTRIBUTES:"snow.user.name"::STRING AS user_name,
        MAX(RECORD_ATTRIBUTES:"snow.ai.observability.agent.planning.model"::STRING) AS model_name,
        SUM(RECORD_ATTRIBUTES:"snow.ai.observability.agent.planning.token_count.total"::NUMBER) AS tokens_total
    FROM SNOWFLAKE.LOCAL.AI_OBSERVABILITY_EVENTS
    WHERE
        RECORD_ATTRIBUTES:"snow.ai.observability.object.type"::STRING = 'Cortex Agent'
        AND TIMESTAMP > DATEADD('days', -30, CURRENT_DATE())
    GROUP BY agent_name, user_name, RECORD_ATTRIBUTES:"ai.observability.record_id"::STRING
),
cost_calculation AS (
    SELECT
        agent_name,
        user_name,
        model_name,
        tokens_total,
        -- Apply model-specific credit rates (adjust based on your models)
        CASE
            WHEN model_name LIKE '%mistral-large%' THEN tokens_total / 1000000.0 * 0.015
            WHEN model_name LIKE '%llama3.1-70b%' THEN tokens_total / 1000000.0 * 0.015
            WHEN model_name LIKE '%claude%' THEN tokens_total / 1000000.0 * 0.03
            ELSE tokens_total / 1000000.0 * 0.02 -- default rate
        END AS credits_consumed,
        -- Convert to USD (adjust $3.00 to your edition's rate)
        CASE
            WHEN model_name LIKE '%mistral-large%' THEN tokens_total / 1000000.0 * 0.015 * 3.00
            WHEN model_name LIKE '%llama3.1-70b%' THEN tokens_total / 1000000.0 * 0.015 * 3.00
            WHEN model_name LIKE '%claude%' THEN tokens_total / 1000000.0 * 0.03 * 3.00
            ELSE tokens_total / 1000000.0 * 0.02 * 3.00
        END AS cost_usd
    FROM agent_usage
)
SELECT
    agent_name,
    user_name,
    model_name,
    SUM(tokens_total) AS total_tokens,
    SUM(credits_consumed) AS total_credits,
    SUM(cost_usd) AS total_cost_usd
FROM cost_calculation
GROUP BY agent_name, user_name, model_name
ORDER BY total_cost_usd DESC;

Important: Update the credit rates in the CASE statements based on:

  1. Your Snowflake edition (Standard, Enterprise, Business Critical)
  2. The actual models used by your agents
  3. Current pricing from the Snowflake Service Consumption Table
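Rather than editing CASE branches in several places, you can keep the rates in one lookup table. The figures below are the illustrative rates from this article, not authoritative pricing; replace them with values from the Snowflake Service Consumption Table:

```python
# Illustrative credit rates (credits per 1M tokens) and USD per credit;
# replace with your edition's actual values before production use.
CREDIT_RATES = {
    "mistral-large": 0.015,
    "llama3.1-70b": 0.015,
    "claude": 0.03,
}
DEFAULT_RATE = 0.02
USD_PER_CREDIT = 3.00  # Enterprise Edition in this article's example

def rate_for(model_name: str) -> float:
    """Substring match against the rate table, mirroring the
    LIKE '%...%' branches in the SQL CASE statements."""
    for pattern, rate in CREDIT_RATES.items():
        if pattern in model_name.lower():
            return rate
    return DEFAULT_RATE

def tokens_to_usd(tokens: int, model_name: str) -> float:
    return tokens / 1_000_000 * rate_for(model_name) * USD_PER_CREDIT
```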

Important note: for a more accurate picture, also factor in the warehouse compute costs incurred when agents execute SQL tools.

 

Best Practices for Cortex Agent Cost Optimization

Based on the visibility this solution provides, here are optimization strategies:

  1. Model Selection: If you discover an agent using Claude 4 Sonnet when Llama 3.1-70B would suffice, you can reduce costs by 50%
  2. Prompt Engineering: High token counts often indicate verbose prompts – refactor for conciseness
  3. Context engineering: Be descriptive and accurate in your Orchestration instructions and Tools description
  4. Caching: Implement context caching for repeated agent interactions
  5. Rate Limiting: Set per-agent or per-user daily token budgets
  6. Tool Optimization: Reduce SQL execution time to minimize overall agent latency
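Strategy 5 (rate limiting) can be prototyped as a per-user daily token budget tracker. This is a client-side, in-memory sketch, not a native Snowflake feature; a production version would persist counters and read usage from AI_OBSERVABILITY_EVENTS:

```python
from collections import defaultdict
from datetime import date

class DailyTokenBudget:
    """Track per-user token spend against a daily cap (in-memory sketch)."""

    def __init__(self, daily_limit: int):
        self.daily_limit = daily_limit
        self._spend = defaultdict(int)  # (user, day) -> tokens consumed

    def record(self, user: str, tokens: int, day: date = None) -> None:
        self._spend[(user, day or date.today())] += tokens

    def allowed(self, user: str, day: date = None) -> bool:
        """True while the user is still under their daily cap."""
        return self._spend[(user, day or date.today())] < self.daily_limit
```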

 

Required Privileges

To access AI_OBSERVABILITY_EVENTS, ensure your role has:

GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE your_monitoring_role;
GRANT APPLICATION ROLE SNOWFLAKE.AI_OBSERVABILITY_EVENTS_LOOKUP TO ROLE your_monitoring_role;  

Limitations and Future State

Current limitations:

  • Requires manual credit rate maintenance
  • No native UI for agent cost visualization
  • Tracing data doesn’t include tool-specific breakdowns

Coming soon (per Snowflake roadmap):

  • Native agent cost dashboards in Snowsight
  • Automated cost allocation by project/workspace
  • Real-time cost alerting

 

How to Monitor Snowflake Cortex Agent Costs with Seemore Data

While the SQL queries above work for manual analysis, production teams need continuous, automated AI cost control. Seemore Data transforms raw Snowflake AI Observability telemetry into automated governance by providing:

Advanced Observability Dashboard

  • Token-level visibility across all Cortex Agents, not just aggregated totals
  • Real-time cost attribution by agent, user, team, and project
  • Model efficiency tracking with cost-per-interaction benchmarking
  • Historical trending to identify optimization opportunities

Intelligent Spike Monitoring

  • Automated anomaly detection for day-over-day cost increases
  • Proactive alerts when agents exceed expected token consumption patterns
  • Root cause analysis linking spikes to specific agents, queries, or users
  • Slack/email notifications before budget overruns occur

Cortex Budget Limiting

  • Set spending caps at the agent, user, or team level
  • Automatic throttling when budgets approach thresholds
  • Forecasting based on current usage trajectories
  • Budget allocation recommendations based on historical patterns

 

Conclusion

Cortex Agents are powerful, but without proper cost monitoring, they can become budget black holes. The AI_OBSERVABILITY_EVENTS table provides the raw data needed to build comprehensive cost tracking – today.

Seemore Data helps data engineering teams move from raw observability data to continuous, automated AI cost control by enabling them to:

  • Account for Cortex costs with token-level precision
  • Attribute costs to agents, users, and teams
  • Detect anomalies before they become budget crises
  • Cap Cortex AI budgets before costs spike

If you’re exploring how to operationalize Snowflake AI Observability beyond manual queries, you can book a quick meeting with our data experts, and look at how Seemore approaches AI cost visibility and governance.

Ready to unlock Snowflake AI Observability?

Contact us to move to an automated AI cost control method.

Get Your Cortex Analysis

 

FAQ: Snowflake AI Observability & Cortex

What is Snowflake AI Observability?

Snowflake AI Observability is Snowflake’s built-in telemetry framework for AI services that captures execution metadata such as model usage, token counts, duration, and user context for Cortex functions and Agents.

Does Snowflake AI Observability include Cortex Agents?

Yes. Snowflake AI Observability captures telemetry for Cortex Agents, including planning models, token usage, execution duration, and user metadata, via the AI_OBSERVABILITY_EVENTS system table.

Why doesn’t Snowflake AI Observability show full Cortex Agent costs?

Snowflake AI Observability records raw usage events, but native dashboards do not aggregate these events into agent-level or user-level cost views, leaving part of Cortex Agent spend unaccounted for.

What is the AI_OBSERVABILITY_EVENTS table in Snowflake?

SNOWFLAKE.LOCAL.AI_OBSERVABILITY_EVENTS is a system table that stores detailed observability events for Snowflake AI services, including Cortex Agents, with token counts, models used, timing, and execution context.

How are Cortex Agents billed in Snowflake?

Cortex Agents are billed based on LLM token consumption. Both input and output tokens generated during agent planning and response generation are billable, with credit rates determined by the underlying model.

Can I calculate Cortex Agent costs using Snowflake AI Observability data?

Yes. By querying AI_OBSERVABILITY_EVENTS and applying Snowflake’s model-specific credit rates to recorded token counts, you can calculate Cortex Agent costs with token-level accuracy.

Can Snowflake AI Observability attribute AI costs to users or teams?

Snowflake AI Observability captures user and role metadata, but does not natively attribute AI costs by user, team, or project. Attribution requires custom SQL or an automated cost-governance layer.

What is the difference between Cortex functions and Cortex Agents in observability?

Cortex functions generate single LLM calls, while Cortex Agents orchestrate multiple LLM calls and tools. This makes Cortex Agent costs more complex and less visible in Snowflake AI Observability dashboards.
