# Tableau Best Practices: Extracts, Performance, and Fast Dashboards

Every Tableau deployment I've been called in to rescue had the same shape of problem: the analysis was good, the dashboards were useful, and they were unbearably slow — slow enough that people stopped opening them. Almost none of it was Tableau's fault. It was a pile of small, reasonable-looking choices (a live connection here, a high-cardinality field on a tooltip there, a workbook with fourteen sheets on one dashboard) that compounded into a thirty-second load. The good news: the levers that fix it are few, and they're the same ones every time.

This is the companion to my [Tableau internals](tableau-internals) piece — if that one explained *how* a viz becomes a query and a set of marks, this one is the playbook built on that model. Everything here reduces to two goals: make the **queries** cheaper, and make the **rendering** lighter. I'll go through extracts, the performance levers that matter, calculation choices, data-source design, and dashboards.

## Extracts vs live: pick deliberately

The first decision is the data connection, and it sets your performance ceiling. A **live connection** is the right call when data must be current to the second and your source is a fast analytical database (or you're legally required not to copy the data). An **extract** — Tableau's own columnar file, now powered by the Hyper engine — is the right default for nearly everything else, because it's purpose-built for the aggregate queries Tableau generates and it takes the load off your source.

And an extract is not all-or-nothing: you can make it leaner, which makes every query against it faster.

- **Aggregate the extract to the visible level of detail** if you never need row-level data — turning millions of transaction rows into thousands of daily summaries.

- **Filter the extract** to the rows that matter (last two years, relevant regions) instead of hauling the entire history.

- **Hide unused fields** before extracting so they're never materialized.

A rule that has never let me down: **start with an extract unless you have a specific reason to go live.** Teams default to live "to be safe" and then fight latency forever. Real-time freshness is a genuine requirement far less often than people assume — a scheduled refresh every hour or every morning is fine for the overwhelming majority of dashboards, and the speed difference is night and day.

## The performance levers that actually matter

When a workbook is slow, resist the urge to start tweaking calculations. Open the **Performance Recorder** (Help → Settings and Performance) first — it tells you exactly where the time goes: executing queries, geocoding, computing layouts, or rendering. Then pull the right lever. In rough order of how often they're the culprit:

| Lever | What it fixes | How |
| --- | --- | --- |
| Reduce marks | Slow rendering | Aggregate higher; keep high-cardinality fields off Detail; split dense views |
| Context filters | Heavy repeated queries | Make a big "first cut" filter a context filter so other filters run against the smaller set |
| Extract instead of live | Slow source queries | Move to a Hyper extract; aggregate/filter it |
| Fewer worksheets per dashboard | Everything-at-once load | Each sheet is its own query; trim or defer with a tabbed layout |
| Reduce filter cardinality | Slow "show relevant values" filters | Avoid "Only Relevant Values" on high-cardinality quick filters; prefer wildcard or range |

```mermaid
graph TD
    SLOW["Dashboard is slow"]
    REC["Run the Performance Recorder"]
    Q{"Where is the time?"}
    QUERY["Executing query"]
    REND["Rendering / layout"]
    QFIX["Extract + aggregate;context filters;simpler calc types"]
    RFIX["Cut marks;fewer sheets per dashboard;drop high-cardinality from Detail"]
    SLOW --> REC --> Q
    Q -->|"query time dominates"| QUERY --> QFIX
    Q -->|"layout/render dominates"| REND --> RFIX
          
```

Diagnose before you tune. The Performance Recorder splits a slow load into query time vs layout/render time. Heavy queries point you at extracts, aggregation, and context filters; heavy rendering points you at the mark count and the number of sheets on the dashboard. Guessing which one it is wastes more time than the recording takes.

## Calculations: choose the cheapest type that works

Not all Tableau calculations cost the same, and picking the wrong type is a quiet performance tax. The three families, cheapest to most expensive in the wrong hands:

- **Row-level and aggregate calculations** are pushed into the database/Hyper query, so they're computed where the data lives — generally the cheapest. Prefer them.

- **Level-of-Detail (LOD) expressions** (`FIXED`, `INCLUDE`, `EXCLUDE`) compute aggregations at a grain different from the view, in the query. They're powerful for "value per customer regardless of the view" problems — but a `FIXED` on a high-cardinality dimension can be heavy, so use them with intent.

- **Table calculations** run *after* the query, in Tableau, on the returned result set. For running totals and rank over a modest result that's fine; over a huge result set they get slow, because Tableau is doing the work the database could have.

```text
Cheapest → push work to the data layer:
  aggregate calc   SUM([Sales]) / SUM([Profit])         -- in the query
  LOD (use w/ care) { FIXED [Customer] : SUM([Sales]) }  -- in the query, different grain
Most expensive when the result set is large:
  table calc       RUNNING_SUM(SUM([Sales]))             -- computed in Tableau after the query
```

The principle mirrors every analytical system: do work where the data already is, and bring back as little as possible. A calculation that runs in the query touches all the data once, efficiently; a table calculation over a giant result set drags that whole set into Tableau first.

## Data-source design: joins and blends

Tableau is fastest against a well-shaped source, and in 2019 you have two main ways to combine data — and they behave very differently.

**Joins** happen in the data layer (in the extract or pushed to the live source): you join tables once into a single logical table, ideally a clean **star schema** — a fact table with dimension tables around it. This is the performant default. The trap is the *fan-out*: joining a fact to a dimension at the wrong grain duplicates rows and silently inflates your `SUM`s, so mind the cardinality of every join.

**Blends** are different — they combine data from separate sources *after* each is queried, aggregating each source to a common linking dimension and stitching the results in Tableau. Blending is the right tool when the data genuinely lives in different places at different grains, but it's slower and more constrained than a join (the secondary source can only contribute aggregated measures), so reach for it when you must, not by default. When both datasets can live in one source, join them; blend only across sources.

**The fan-out join is the bug that corrupts numbers without an error.** Join orders to a line-items table and suddenly every order-level measure is multiplied by its number of line items — and nothing turns red; the totals are just wrong. It's the analytics cousin of a duplicate-row join in SQL. Always sanity-check a total against a known number after adding a join, and keep your fact at one consistent grain. A dashboard that's fast and wrong is worse than one that's slow.

## Dashboards that stay fast

A dashboard's load time is the sum of its sheets' queries plus the layout work, so the design choices matter as much as the data:

- **Fewer sheets.** Every worksheet is its own query. Six focused sheets beat fourteen, and a dashboard that opens fast gets used.

- **Use a fixed dashboard size** rather than "Automatic" where you can — it lets Tableau cache layouts and avoids recomputing on every resize.

- **Prefer dashboard actions over a wall of quick filters.** Filter and highlight actions (and the newer set and parameter actions) are interactive and cheaper than many high-cardinality quick filters re-querying for relevant values.

- **Design for the device.** Use the Device Designer to give phone and tablet layouts fewer, simpler sheets rather than shrinking the desktop view.

## What to carry away

Fast Tableau isn't a bag of tricks — it's two questions asked relentlessly: *is this query as cheap as it can be, and is this view drawing as few marks as it can?* Default to a **Hyper extract** and make it lean (aggregate, filter, hide fields); diagnose with the **Performance Recorder** before tuning; cut **marks** and worksheets to fix rendering; push calculations into the query and treat **table calculations** over large result sets as a smell; shape the source as a clean **star schema** with joins, blending only across truly separate sources, and watch for fan-out.

Do those, and Tableau delivers what it's good at: dashboards people actually open. For the engine-level reasoning behind all of it — VizQL, Hyper, and why marks dominate — start with the [Tableau internals](tableau-internals) piece.
