Investigating & improving warehouse performance¶

This guide is a repeatable monitor → diagnose → tune/scale playbook for answering "the warehouse feels slow / we're being throttled — what now?" using fabric-dw. It operates at the warehouse / capacity (compute) level — the whole-warehouse story — not the tuning of a single query.

For single-query tuning (execution plans, missing statistics on one query, clustering), see the query-optimizer Agent Skill and the fdw sql plan command instead. This guide is about the warehouse and the capacity it runs on.

Audience. Fabric workspace admins, data-platform / DWH operators, and SREs running Fabric Data Warehouses or SQL Analytics Endpoints from the CLI or via an MCP-connected assistant. It assumes you are comfortable with T-SQL and Fabric workspaces, but not with Fabric's internal compute model — that model is explained below, because it changes which lever is the right one.

Every command appears in both its CLI form (fdw …) and its MCP tool name, so you can drive this workflow by hand or through an AI assistant. CLI subcommand names and MCP tool names mostly match; where they diverge (the queries group), both are given.

Relationship to the `warehouse-performance` Agent Skill¶

fabric-dw ships a warehouse-performance Agent Skill (/fabric-dw:warehouse-performance) that automates exactly this monitor → diagnose → tune workflow: it surfaces long-running and frequent queries plus SQL pool insights, audits statistics health, checks result-set caching, reviews and tunes SQL pool configuration, and produces a prioritized findings report — with every mutating action gated on your confirmation.

The guide and the skill are complementary, not duplicates:

This guide is the human narrative — the why, and how Fabric serverless compute works — for an operator driving fabric-dw (or any SQL/REST client) by hand and reasoning about whether the right lever is statistics, caching, workload isolation, or a capacity change.
The skill is the AI-assistant automation — the do it for me path: it runs the same read commands, synthesizes the report, and proposes the mutating actions for you to confirm via an MCP-connected assistant.

If you have an MCP-connected assistant configured, you can let the skill drive the steps below for you. Either way, the underlying commands and their guarantees are identical.

How Fabric Warehouse compute actually works¶

Before reaching for a lever, it helps to know what you can and cannot tune. Fabric's Warehouse compute model is different from a provisioned MPP warehouse, and that difference decides the fix.

Serverless, autonomously scaling compute¶

A Fabric Warehouse and a SQL Analytics Endpoint share the same serverless, autonomously scaling distributed compute engine. Backend nodes are provisioned in seconds, scaling is an online operation, and there is no compute to provision or resize — scaling happens autonomously as the workload demands it. You do not pick a node count or a "DWU" tier per warehouse.

See Workload management in Fabric Data Warehouse.

Capacity, bursting, and smoothing¶

Performance and cost headroom come from the Fabric capacity SKU the workspace runs on, measured in Capacity Units (CU):

Bursting — a query can temporarily use more CU than the SKU baseline (up to 12× baseline) so a single heavy operation finishes fast. The available burst is governed by a per-SKU scale factor (CU / duration / Baseline CU); F2–F2048 each have a guardrail. See Burstable capacity.
Smoothing — CU accounting is spread over time (roughly 5 min for interactive queries, up to 24 h for background operations) rather than billed at the instant of execution. Smoothing does not change how long a query takes; it only spreads the accounting so short spikes don't immediately exhaust the capacity. See Smoothing and throttling and the Fabric throttling policy (overage, carryforward, burndown, and the CapacityLimitExceeded error).

The practical consequence: when the warehouse "feels slow across the board" or you see CapacityLimitExceeded / "query rejected due to current capacity constraints", the problem is usually sustained CU demand exceeding the SKU (a capacity decision) or a few expensive queries consuming the shared engine (a query/statistics/caching decision) — not an under-provisioned warehouse you can resize.

The default SELECT vs non-SELECT pool split¶

Even with no custom configuration, the engine isolates work into two pools, split 50/50:

a SELECT pool for read/analytics queries, and
a NON-SELECT pool for DML/DDL/ETL/ingestion statements.

This keeps ingestion from starving interactive reads (and vice-versa) by default. You can change this baseline with custom SQL pools — covered in Step 3. See Workload management — compute pool isolation.

When to use this guide¶

Reach for this playbook when you see warehouse-wide symptoms, not a single slow query:

queries are slow across the board, not just one report;
you see CapacityLimitExceeded or "query rejected due to current capacity constraints";
you are getting throttling alerts on the capacity;
ingestion and interactive workloads appear to be contending for the same compute.

If instead a specific query regressed, start with the query-optimizer skill.

Set your defaults¶

Store the workspace and warehouse once so you do not repeat them on every command:

fdw config set workspace MyWorkspace
fdw config set warehouse SalesWH

The rest of this guide assumes these defaults are set, so the examples omit -w MyWorkspace and drop the warehouse positional where it is optional. Any command still accepts an explicit -w/--workspace or a positional [WAREHOUSE]/[ITEM] to override them. Commands that take a trailing required argument (such as queries kill … SESSION_ID or settings result-set-caching … on|off) keep the warehouse positional so the remaining arguments stay unambiguous. The workspace-wide commands (warehouses list, workspaces list-capacities) take no per-warehouse target. See Configuration & defaults.

Step 1 — Monitor: is there a problem, and where?¶

Establish what exists, what is running right now, and whether the capacity itself is the constraint.

Inventory the warehouses¶

Confirm which items exist and which kind each is (Data Warehouse vs SQL Analytics Endpoint — they behave differently for several tuning levers below).

Action	CLI	MCP tool
List warehouses (optionally across all workspaces)	`fdw warehouses list` (`-A/--all-workspaces`, `--warehouses-only`)	`list_warehouses` (`all_workspaces`)
Inspect one warehouse	`fdw warehouses get`	`get_warehouse`

# Everything visible, across every workspace
fdw warehouses list --all-workspaces

Look at live load¶

What is executing right now, and how many connections are open?

Action	CLI	MCP tool
Currently running queries	`fdw queries running`	`list_running_queries`
Active SQL connections (incl. idle)	`fdw queries connections`	`list_connections`

fdw queries running
fdw queries connections

CLI vs MCP names diverge here

The queries CLI subcommands map to differently-named MCP tools: queries running → list_running_queries, queries connections → list_connections, queries history → list_request_history, and queries sessions → list_session_history.

Read SQL pool pressure¶

sql-pools insights reads queryinsights.sql_pool_insights — the warehouse's own resource-pressure signal, including is_pool_under_pressure and max_resource_percentage. This is the single most direct "is compute the bottleneck?" read.

Action	CLI	MCP tool
SQL pool insight events	`fdw sql-pools insights`	`list_sql_pool_insights`

fdw sql-pools insights --since 2026-06-13T00:00:00

See Custom SQL pools and the queryinsights.sql_pool_insights column reference.

Check the capacity state¶

Is the capacity even active, and what SKU is it? A paused capacity, or an undersized SKU, explains warehouse-wide slowness on its own.

Action	CLI	MCP tool
List capacities with `sku` and `state`	`fdw workspaces list-capacities`	`list_capacities`

fdw workspaces list-capacities

 id            displayName    sku   region        state
 ------------- -------------- ----- ------------- ------
 ab12cd34-...  MyCapacity     F64   West Europe   Active

fabric-dw also reads GET /v1/capacities internally so that an all-workspaces scan (warehouses list -A) automatically skips paused/inactive capacities.

Step 2 — Diagnose: what is consuming compute, and why?¶

Once you know there's pressure, find the cost drivers. Fabric's Query Insights views retain about 30 days of history and aggregate repeated statements by query_hash. All of the commands below accept --since / --until / --limit to scope a window (--limit defaults to 100, capped at 10 000). There is no --order-by flag — each view is ordered server-side, so "top N" is simply --limit N.

See Query Insights and Monitor Fabric Data Warehouse.

Find the worst offenders¶

Action	CLI	MCP tool
Long-running queries (by median elapsed, DESC)	`fdw queries long-running`	`list_long_running_queries`
Frequently-run queries (by run count, DESC)	`fdw queries frequent`	`list_frequent_queries`

# Top 20 slowest queries over the last week
fdw queries long-running --limit 20

# Top 20 most-frequently-run queries — a cheap query run 100k times is also a cost driver
fdw queries frequent --limit 20

A query that is individually fast but runs constantly can dominate CU usage just as much as one slow query — check both lists. See the queryinsights.long_running_queries column reference.

Drill into completed requests and sessions¶

Per-request detail tells you why a query is expensive: CPU time, how much data it scanned, and whether the result-set cache helped.

Action	CLI	MCP tool
Completed requests (`exec_requests_history`)	`fdw queries history`	`list_request_history`
Completed sessions (`exec_sessions_history`)	`fdw queries sessions`	`list_session_history`

fdw queries history --limit 50 --since 2026-06-13T00:00:00

Useful fields in queries history include allocated_cpu_time_ms, data_scanned_remote_storage_mb / _memory_mb / _disk_mb, result_cache_hit, sql_pool_name, program_name, query_hash, and label.

Correlate across runs and tools

Use query_hash to aggregate every run of the same statement, and add OPTION (LABEL = '...') to your queries so they surface under a known label in the history. program_name / sql_pool_name tell you which application and pool a request landed in. For end-to-end traceability between the portal and these views, see Observe Fabric Data Warehouse utilization (Operation Id ↔ distributed_statement_id).

Insights latency

Query Insights is populated asynchronously: a request you just ran may take a short while to appear in the history views. If a recent query is "missing", give it a moment and re-query.

Check statistics health¶

Stale or missing statistics make the optimizer choose bad plans, which inflates CPU and data scanned warehouse-wide. Audit whether the optimizer has good statistics on hot columns. Reads work on both Data Warehouses and SQL Analytics Endpoints.

Action	CLI	MCP tool
List statistics (`--schema`, `--table`, `--user-only`, `--auto-only`)	`fdw statistics list`	`list_statistics`
Show one statistic's histogram	`fdw statistics show … [--histogram]`	`show_statistics`

fdw statistics list --table dbo.sales --user-only
fdw statistics show SalesWH dbo.sales _stat_order_date --histogram   # warehouse positional kept — names follow

See Statistics in Fabric Data Warehouse for when automatic statistics suffice and when to create/update manually, and the umbrella Performance guidelines.

Step 3 — Tune: apply the right lever¶

Now act on the diagnosis. Levers are ordered from cheapest/safest to most structural.

MCP write guards

Every mutating MCP tool below is gated behind the server's write guard (assert_writes_allowed), and the destructive ones additionally behind assert_destructive_allowed (FABRIC_MCP_ALLOW_DESTRUCTIVE=1). In CLI, destructive commands prompt for confirmation unless you pass --yes. None of these are read-only — treat them as changes.

Result-set caching (Data Warehouse only — the fast win)¶

If the diagnosis shows the same identical queries repeating with low result_cache_hit, enabling result-set caching can cut elapsed time for repeated identical queries dramatically.

Action	CLI	MCP tool
Read current server-side settings	`fdw settings show`	`get_warehouse_settings`
Toggle result-set caching on/off	`fdw settings result-set-caching … on\\|off`	`set_result_set_caching`

fdw settings show
fdw settings result-set-caching SalesWH on   # warehouse positional kept — on|off follows

settings result-set-caching runs ALTER DATABASE CURRENT SET RESULT_SET_CACHING { ON | OFF }. The toggle is Data-Warehouse-only — SQL Analytics Endpoints are rejected with an error. (Note that settings show itself works on both kinds; only the toggle is DW-only.) See Result set caching in Fabric Data Warehouse.

Note

settings retention --days N (MCP set_time_travel_retention, also DW-only) lives in the same settings group but controls time-travel retention, not performance.

Statistics tuning (Data Warehouse only — DDL)¶

If Step 2 found missing or stale statistics on hot columns, create or refresh them. Fabric supports single-column, histogram-based statistics only.

Action	CLI	MCP tool	Notes
Create a statistic	`fdw statistics create --table --column --name [--fullscan \\| --sample-percent]`	`create_statistics`	write-guarded
Update a statistic	`fdw statistics update … [--fullscan \\| --sample-percent]`	`update_statistics`	write-guarded
Drop a statistic	`fdw statistics delete …`	`delete_statistics`	destructive

fdw statistics create \
  --table dbo.sales --column order_date --name _stat_order_date --fullscan

The MCP equivalents exist and are not read-only: create_statistics, update_statistics, and delete_statistics (the last flagged destructive) all run behind the MCP write/destructive guards. SQL Analytics Endpoints are rejected for these DDL operations.

Workload isolation via custom SQL pools (workspace admin)¶

If Step 2 showed one workload (e.g. heavy ETL) starving interactive reads, override the default 50/50 split with custom SQL pools: you cap each workload's maxResourcePercentage and route work to a pool by an application-name classifier. These are workspace-scoped and require the workspace admin role.

Action	CLI	MCP tool	Notes
Read full configuration	`fdw sql-pools get`	`get_sql_pools_configuration`	read
List pools	`fdw sql-pools list`	`list_sql_pools`	read
Show one pool	`fdw sql-pools show --name`	`get_sql_pool`	read
Create a pool	`fdw sql-pools create`	`create_sql_pool`	write
Update a pool	`fdw sql-pools update`	`update_sql_pool`	write
Delete a pool	`fdw sql-pools delete --name`	`delete_sql_pool`	destructive
Enable custom pools	`fdw sql-pools enable`	`enable_sql_pools`	write
Disable custom pools	`fdw sql-pools disable`	`disable_sql_pools`	write

# Carve out an isolated ETL pool capped at 30% so it can't starve interactive reads
fdw sql-pools create \
  --name ETL \
  --max-percent 30 \
  --no-optimize-for-reads \
  --classifier-type "Application Name" \
  --classifier-value "ETL" \
  --classifier-value "Load"

Default pool behaviour

When no custom pools exist, sql-pools list reports the default autonomous 50/50 split (one SELECT pool, one NON-SELECT pool) rather than an empty list. The sum of every pool's maxResourcePercentage must be ≤ 100 and exactly one pool is the default. See Custom SQL pools and Configure custom SQL pools via REST API.

Relieve live contention¶

If a single runaway session is blocking everything else, terminate it.

Action	CLI	MCP tool
Kill a session	`fdw queries kill … <session_id>`	`kill_session`

fdw --yes queries kill SalesWH 42   # warehouse positional kept — SESSION_ID follows

Step 4 — Decide whether to scale¶

If the cost drivers are genuinely justified work — not a tuning problem — and you are still throttled, the lever is the capacity, not the warehouse. Distinguish the two cases:

A design / tuning problem — a handful of queries dominate CU, statistics are stale, or one workload is starving another. Fix it in Step 2 / Step 3; scaling just pays more for the same waste.
Sustained, legitimate demand exceeding the SKU — throttling persists even after tuning, and the Capacity Metrics app shows steady overage. This is a capacity decision.

For a real capacity decision you have two levers, only one of which fabric-dw performs:

Lever	Tool support
Move the workspace to another capacity (load-balance across capacities)	`fdw workspaces assign-capacity <workspace> --capacity-id <uuid>` · MCP `assign_workspace_to_capacity`
Resize the capacity SKU (e.g. F64 → F128)	Out of CLI scope — a capacity-admin action in the Fabric/Azure portal

# Spread load by moving a workspace onto a different (less-loaded) capacity
fdw workspaces assign-capacity MyWorkspace \
  --capacity-id ab12cd34-ef56-7890-abcd-ef1234567890

assign-capacity moves the workspace; it does not resize a SKU. A SKU resize is a capacity-admin action — see Scale your Fabric capacity and Evaluate and optimize your Fabric capacity for right-sizing and load-balancing guidance.

Auditing is supporting, not diagnostic

The fdw audit commands (get / enable / set-retention / set-groups / add-group / remove-group) provide traceability that complements performance work but is not itself a diagnosis tool. Enable auditing for accountability, not to find slow queries.

Worked example¶

A copy-pasteable session that walks the whole loop on a warehouse that "feels slow". Or let the warehouse-performance skill drive these same steps for you.

# 1. MONITOR — confirm the warehouse exists, check live load and pool pressure, check the SKU
fdw warehouses get
fdw queries running
fdw sql-pools insights --since 2026-06-13T00:00:00
fdw workspaces list-capacities

# 2. DIAGNOSE — find the worst offenders and inspect their CPU / data-scanned / cache hits
fdw queries long-running --limit 20
fdw queries frequent --limit 20
fdw queries history --limit 50 --since 2026-06-13T00:00:00
fdw statistics list --table dbo.sales --user-only

# 3. TUNE — apply the cheapest effective lever(s) for what you found
fdw settings result-set-caching SalesWH on          # repeated identical queries (on|off follows, so warehouse positional kept)
fdw statistics create \
  --table dbo.sales --column order_date --name _stat_order_date --fullscan   # stale/missing stats
fdw sql-pools create \
  --name ETL --max-percent 30 --no-optimize-for-reads \
  --classifier-type "Application Name" --classifier-value "ETL"     # ETL starving reads

# 4. RE-MEASURE — confirm the change helped before reaching for a capacity change
fdw queries long-running --limit 20
fdw sql-pools insights --since 2026-06-13T00:00:00

Only if throttling persists after re-measuring should you consider Step 4 — reassigning the workspace to another capacity, or escalating a SKU resize to a capacity admin.

Limits of this tool¶

fabric-dw covers the read and most of the tune levers, but a few things are deliberately out of scope:

No capacity SKU resize. fabric-dw reads capacity state, lists capacities, and can reassign a workspace to another capacity — but a SKU resize (e.g. F64 → F128) is a capacity-admin action in the Fabric/Azure portal.
No Capacity Metrics app. For CU-usage and throttling trends over time, use the Capacity Metrics app — this tool does not replace it.
Application-name classifier only. The only SQL-pool routing key today is the application-name classifier; statement-type routing is observe-only with no API (#596).
No cache-cooldown control for result-set caching (#595).