Skip to content

Investigating & improving warehouse performance

This guide is a repeatable monitor → diagnose → tune/scale playbook for answering "the warehouse feels slow / we're being throttled — what now?" using fabric-dw. It operates at the warehouse / capacity (compute) level — the whole-warehouse story — not the tuning of a single query.

For single-query tuning (execution plans, missing statistics on one query, clustering), see the query-optimizer Agent Skill and the fdw sql plan command instead. This guide is about the warehouse and the capacity it runs on.

Audience. Fabric workspace admins, data-platform / DWH operators, and SREs running Fabric Data Warehouses or SQL Analytics Endpoints from the CLI or via an MCP-connected assistant. It assumes you are comfortable with T-SQL and Fabric workspaces, but not with Fabric's internal compute model — that model is explained below, because it changes which lever is the right one.

Every command appears in both its CLI form (fdw …) and its MCP tool name, so you can drive this workflow by hand or through an AI assistant. CLI subcommand names and MCP tool names mostly match; where they diverge (the queries group), both are given.


Relationship to the warehouse-performance Agent Skill

fabric-dw ships a warehouse-performance Agent Skill (/fabric-dw:warehouse-performance) that automates exactly this monitor → diagnose → tune workflow: it surfaces long-running and frequent queries plus SQL pool insights, audits statistics health, checks result-set caching, reviews and tunes SQL pool configuration, and produces a prioritized findings report — with every mutating action gated on your confirmation.

The guide and the skill are complementary, not duplicates:

  • This guide is the human narrative — the why, and how Fabric serverless compute works — for an operator driving fabric-dw (or any SQL/REST client) by hand and reasoning about whether the right lever is statistics, caching, workload isolation, or a capacity change.
  • The skill is the AI-assistant automation — the do it for me path: it runs the same read commands, synthesizes the report, and proposes the mutating actions for you to confirm via an MCP-connected assistant.

If you have an MCP-connected assistant configured, you can let the skill drive the steps below for you. Either way, the underlying commands and their guarantees are identical.


How Fabric Warehouse compute actually works

Before reaching for a lever, it helps to know what you can and cannot tune. Fabric's Warehouse compute model is different from a provisioned MPP warehouse, and that difference decides the fix.

Serverless, autonomously scaling compute

A Fabric Warehouse and a SQL Analytics Endpoint share the same serverless, autonomously scaling distributed compute engine. Backend nodes are provisioned in seconds, scaling is an online operation, and there is no compute to provision or resize — scaling happens autonomously as the workload demands it. You do not pick a node count or a "DWU" tier per warehouse.

See Workload management in Fabric Data Warehouse.

Capacity, bursting, and smoothing

Performance and cost headroom come from the Fabric capacity SKU the workspace runs on, measured in Capacity Units (CU):

  • Bursting — a query can temporarily use more CU than the SKU baseline (up to 12× baseline) so a single heavy operation finishes fast. The available burst is governed by a per-SKU scale factor (CU / duration / Baseline CU); F2–F2048 each have a guardrail. See Burstable capacity.
  • Smoothing — CU accounting is spread over time (roughly 5 min for interactive queries, up to 24 h for background operations) rather than billed at the instant of execution. Smoothing does not change how long a query takes; it only spreads the accounting so short spikes don't immediately exhaust the capacity. See Smoothing and throttling and the Fabric throttling policy (overage, carryforward, burndown, and the CapacityLimitExceeded error).

The practical consequence: when the warehouse "feels slow across the board" or you see CapacityLimitExceeded / "query rejected due to current capacity constraints", the problem is usually sustained CU demand exceeding the SKU (a capacity decision) or a few expensive queries consuming the shared engine (a query/statistics/caching decision) — not an under-provisioned warehouse you can resize.

The default SELECT vs non-SELECT pool split

Even with no custom configuration, the engine isolates work into two pools, split 50/50:

  • a SELECT pool for read/analytics queries, and
  • a NON-SELECT pool for DML/DDL/ETL/ingestion statements.

This keeps ingestion from starving interactive reads (and vice-versa) by default. You can change this baseline with custom SQL pools — covered in Step 3. See Workload management — compute pool isolation.


When to use this guide

Reach for this playbook when you see warehouse-wide symptoms, not a single slow query:

  • queries are slow across the board, not just one report;
  • you see CapacityLimitExceeded or "query rejected due to current capacity constraints";
  • you are getting throttling alerts on the capacity;
  • ingestion and interactive workloads appear to be contending for the same compute.

If instead a specific query regressed, start with the query-optimizer skill.


Set your defaults

Store the workspace and warehouse once so you do not repeat them on every command:

fdw config set workspace MyWorkspace
fdw config set warehouse SalesWH

The rest of this guide assumes these defaults are set, so the examples omit -w MyWorkspace and drop the warehouse positional where it is optional. Any command still accepts an explicit -w/--workspace or a positional [WAREHOUSE]/[ITEM] to override them. Commands that take a trailing required argument (such as queries kill … SESSION_ID or settings result-set-caching … on|off) keep the warehouse positional so the remaining arguments stay unambiguous. The workspace-wide commands (warehouses list, workspaces list-capacities) take no per-warehouse target. See Configuration & defaults.


Step 1 — Monitor: is there a problem, and where?

Establish what exists, what is running right now, and whether the capacity itself is the constraint.

Inventory the warehouses

Confirm which items exist and which kind each is (Data Warehouse vs SQL Analytics Endpoint — they behave differently for several tuning levers below).

Action CLI MCP tool
List warehouses (optionally across all workspaces) fdw warehouses list (-A/--all-workspaces, --warehouses-only) list_warehouses (all_workspaces)
Inspect one warehouse fdw warehouses get get_warehouse
# Everything visible, across every workspace
fdw warehouses list --all-workspaces

Look at live load

What is executing right now, and how many connections are open?

Action CLI MCP tool
Currently running queries fdw queries running list_running_queries
Active SQL connections (incl. idle) fdw queries connections list_connections
fdw queries running
fdw queries connections

CLI vs MCP names diverge here

The queries CLI subcommands map to differently-named MCP tools: queries runninglist_running_queries, queries connectionslist_connections, queries historylist_request_history, and queries sessionslist_session_history.

Read SQL pool pressure

sql-pools insights reads queryinsights.sql_pool_insights — the warehouse's own resource-pressure signal, including is_pool_under_pressure and max_resource_percentage. This is the single most direct "is compute the bottleneck?" read.

Action CLI MCP tool
SQL pool insight events fdw sql-pools insights list_sql_pool_insights
fdw sql-pools insights --since 2026-06-13T00:00:00

See Custom SQL pools and the queryinsights.sql_pool_insights column reference.

Check the capacity state

Is the capacity even active, and what SKU is it? A paused capacity, or an undersized SKU, explains warehouse-wide slowness on its own.

Action CLI MCP tool
List capacities with sku and state fdw workspaces list-capacities list_capacities
fdw workspaces list-capacities
 id            displayName    sku   region        state
 ------------- -------------- ----- ------------- ------
 ab12cd34-...  MyCapacity     F64   West Europe   Active

fabric-dw also reads GET /v1/capacities internally so that an all-workspaces scan (warehouses list -A) automatically skips paused/inactive capacities.


Step 2 — Diagnose: what is consuming compute, and why?

Once you know there's pressure, find the cost drivers. Fabric's Query Insights views retain about 30 days of history and aggregate repeated statements by query_hash. All of the commands below accept --since / --until / --limit to scope a window (--limit defaults to 100, capped at 10 000). There is no --order-by flag — each view is ordered server-side, so "top N" is simply --limit N.

See Query Insights and Monitor Fabric Data Warehouse.

Find the worst offenders

Action CLI MCP tool
Long-running queries (by median elapsed, DESC) fdw queries long-running list_long_running_queries
Frequently-run queries (by run count, DESC) fdw queries frequent list_frequent_queries
# Top 20 slowest queries over the last week
fdw queries long-running --limit 20

# Top 20 most-frequently-run queries — a cheap query run 100k times is also a cost driver
fdw queries frequent --limit 20

A query that is individually fast but runs constantly can dominate CU usage just as much as one slow query — check both lists. See the queryinsights.long_running_queries column reference.

Drill into completed requests and sessions

Per-request detail tells you why a query is expensive: CPU time, how much data it scanned, and whether the result-set cache helped.

Action CLI MCP tool
Completed requests (exec_requests_history) fdw queries history list_request_history
Completed sessions (exec_sessions_history) fdw queries sessions list_session_history
fdw queries history --limit 50 --since 2026-06-13T00:00:00

Useful fields in queries history include allocated_cpu_time_ms, data_scanned_remote_storage_mb / _memory_mb / _disk_mb, result_cache_hit, sql_pool_name, program_name, query_hash, and label.

Correlate across runs and tools

Use query_hash to aggregate every run of the same statement, and add OPTION (LABEL = '...') to your queries so they surface under a known label in the history. program_name / sql_pool_name tell you which application and pool a request landed in. For end-to-end traceability between the portal and these views, see Observe Fabric Data Warehouse utilization (Operation Id ↔ distributed_statement_id).

Insights latency

Query Insights is populated asynchronously: a request you just ran may take a short while to appear in the history views. If a recent query is "missing", give it a moment and re-query.

Check statistics health

Stale or missing statistics make the optimizer choose bad plans, which inflates CPU and data scanned warehouse-wide. Audit whether the optimizer has good statistics on hot columns. Reads work on both Data Warehouses and SQL Analytics Endpoints.

Action CLI MCP tool
List statistics (--schema, --table, --user-only, --auto-only) fdw statistics list list_statistics
Show one statistic's histogram fdw statistics show … [--histogram] show_statistics
fdw statistics list --table dbo.sales --user-only
fdw statistics show SalesWH dbo.sales _stat_order_date --histogram   # warehouse positional kept — names follow

See Statistics in Fabric Data Warehouse for when automatic statistics suffice and when to create/update manually, and the umbrella Performance guidelines.


Step 3 — Tune: apply the right lever

Now act on the diagnosis. Levers are ordered from cheapest/safest to most structural.

MCP write guards

Every mutating MCP tool below is gated behind the server's write guard (assert_writes_allowed), and the destructive ones additionally behind assert_destructive_allowed (FABRIC_MCP_ALLOW_DESTRUCTIVE=1). In CLI, destructive commands prompt for confirmation unless you pass --yes. None of these are read-only — treat them as changes.

Result-set caching (Data Warehouse only — the fast win)

If the diagnosis shows the same identical queries repeating with low result_cache_hit, enabling result-set caching can cut elapsed time for repeated identical queries dramatically.

Action CLI MCP tool
Read current server-side settings fdw settings show get_warehouse_settings
Toggle result-set caching on/off fdw settings result-set-caching … on\|off set_result_set_caching
fdw settings show
fdw settings result-set-caching SalesWH on   # warehouse positional kept — on|off follows

settings result-set-caching runs ALTER DATABASE CURRENT SET RESULT_SET_CACHING { ON | OFF }. The toggle is Data-Warehouse-only — SQL Analytics Endpoints are rejected with an error. (Note that settings show itself works on both kinds; only the toggle is DW-only.) See Result set caching in Fabric Data Warehouse.

Note

settings retention --days N (MCP set_time_travel_retention, also DW-only) lives in the same settings group but controls time-travel retention, not performance.

Statistics tuning (Data Warehouse only — DDL)

If Step 2 found missing or stale statistics on hot columns, create or refresh them. Fabric supports single-column, histogram-based statistics only.

Action CLI MCP tool Notes
Create a statistic fdw statistics create --table --column --name [--fullscan \| --sample-percent] create_statistics write-guarded
Update a statistic fdw statistics update … [--fullscan \| --sample-percent] update_statistics write-guarded
Drop a statistic fdw statistics delete … delete_statistics destructive
fdw statistics create \
  --table dbo.sales --column order_date --name _stat_order_date --fullscan

The MCP equivalents exist and are not read-only: create_statistics, update_statistics, and delete_statistics (the last flagged destructive) all run behind the MCP write/destructive guards. SQL Analytics Endpoints are rejected for these DDL operations.

Workload isolation via custom SQL pools (workspace admin)

If Step 2 showed one workload (e.g. heavy ETL) starving interactive reads, override the default 50/50 split with custom SQL pools: you cap each workload's maxResourcePercentage and route work to a pool by an application-name classifier. These are workspace-scoped and require the workspace admin role.

Action CLI MCP tool Notes
Read full configuration fdw sql-pools get get_sql_pools_configuration read
List pools fdw sql-pools list list_sql_pools read
Show one pool fdw sql-pools show --name get_sql_pool read
Create a pool fdw sql-pools create create_sql_pool write
Update a pool fdw sql-pools update update_sql_pool write
Delete a pool fdw sql-pools delete --name delete_sql_pool destructive
Enable custom pools fdw sql-pools enable enable_sql_pools write
Disable custom pools fdw sql-pools disable disable_sql_pools write
# Carve out an isolated ETL pool capped at 30% so it can't starve interactive reads
fdw sql-pools create \
  --name ETL \
  --max-percent 30 \
  --no-optimize-for-reads \
  --classifier-type "Application Name" \
  --classifier-value "ETL" \
  --classifier-value "Load"

Default pool behaviour

When no custom pools exist, sql-pools list reports the default autonomous 50/50 split (one SELECT pool, one NON-SELECT pool) rather than an empty list. The sum of every pool's maxResourcePercentage must be ≤ 100 and exactly one pool is the default. See Custom SQL pools and Configure custom SQL pools via REST API.

Relieve live contention

If a single runaway session is blocking everything else, terminate it.

Action CLI MCP tool
Kill a session fdw queries kill … <session_id> kill_session
fdw --yes queries kill SalesWH 42   # warehouse positional kept — SESSION_ID follows

Step 4 — Decide whether to scale

If the cost drivers are genuinely justified work — not a tuning problem — and you are still throttled, the lever is the capacity, not the warehouse. Distinguish the two cases:

  • A design / tuning problem — a handful of queries dominate CU, statistics are stale, or one workload is starving another. Fix it in Step 2 / Step 3; scaling just pays more for the same waste.
  • Sustained, legitimate demand exceeding the SKU — throttling persists even after tuning, and the Capacity Metrics app shows steady overage. This is a capacity decision.

For a real capacity decision you have two levers, only one of which fabric-dw performs:

Lever Tool support
Move the workspace to another capacity (load-balance across capacities) fdw workspaces assign-capacity <workspace> --capacity-id <uuid> · MCP assign_workspace_to_capacity
Resize the capacity SKU (e.g. F64 → F128) Out of CLI scope — a capacity-admin action in the Fabric/Azure portal
# Spread load by moving a workspace onto a different (less-loaded) capacity
fdw workspaces assign-capacity MyWorkspace \
  --capacity-id ab12cd34-ef56-7890-abcd-ef1234567890

assign-capacity moves the workspace; it does not resize a SKU. A SKU resize is a capacity-admin action — see Scale your Fabric capacity and Evaluate and optimize your Fabric capacity for right-sizing and load-balancing guidance.

Auditing is supporting, not diagnostic

The fdw audit commands (get / enable / set-retention / set-groups / add-group / remove-group) provide traceability that complements performance work but is not itself a diagnosis tool. Enable auditing for accountability, not to find slow queries.


Worked example

A copy-pasteable session that walks the whole loop on a warehouse that "feels slow". Or let the warehouse-performance skill drive these same steps for you.

# 1. MONITOR — confirm the warehouse exists, check live load and pool pressure, check the SKU
fdw warehouses get
fdw queries running
fdw sql-pools insights --since 2026-06-13T00:00:00
fdw workspaces list-capacities

# 2. DIAGNOSE — find the worst offenders and inspect their CPU / data-scanned / cache hits
fdw queries long-running --limit 20
fdw queries frequent --limit 20
fdw queries history --limit 50 --since 2026-06-13T00:00:00
fdw statistics list --table dbo.sales --user-only

# 3. TUNE — apply the cheapest effective lever(s) for what you found
fdw settings result-set-caching SalesWH on          # repeated identical queries (on|off follows, so warehouse positional kept)
fdw statistics create \
  --table dbo.sales --column order_date --name _stat_order_date --fullscan   # stale/missing stats
fdw sql-pools create \
  --name ETL --max-percent 30 --no-optimize-for-reads \
  --classifier-type "Application Name" --classifier-value "ETL"     # ETL starving reads

# 4. RE-MEASURE — confirm the change helped before reaching for a capacity change
fdw queries long-running --limit 20
fdw sql-pools insights --since 2026-06-13T00:00:00

Only if throttling persists after re-measuring should you consider Step 4 — reassigning the workspace to another capacity, or escalating a SKU resize to a capacity admin.


Limits of this tool

fabric-dw covers the read and most of the tune levers, but a few things are deliberately out of scope:

  • No capacity SKU resize. fabric-dw reads capacity state, lists capacities, and can reassign a workspace to another capacity — but a SKU resize (e.g. F64 → F128) is a capacity-admin action in the Fabric/Azure portal.
  • No Capacity Metrics app. For CU-usage and throttling trends over time, use the Capacity Metrics app — this tool does not replace it.
  • Application-name classifier only. The only SQL-pool routing key today is the application-name classifier; statement-type routing is observe-only with no API (#596).
  • No cache-cooldown control for result-set caching (#595).

See also