For years, the most common Tableau bug wasn't a bug at all — it was a join doing exactly what you told it to. You'd join Orders to a Targets table, throw SUM([Sales]) on a sheet, and the number would be inflated — doubled, tripled — because the join had duplicated every order row once per matching target row. The fix was a folklore of LOD tricks and careful aggregation, passed around like a survival skill. Tableau 2020.2 introduced relationships — the connecting line everyone calls "the noodle" — and quietly made that whole class of duplication problems go away by changing when and at what granularity the join happens.
The key idea: a relationship is not a join that you set up once to flatten the data. It's a declared association between tables, and Tableau decides the actual join (its type and granularity) per sheet, based on the fields you use. The data is no longer pre-flattened into one wide, duplicated table. That single change is why the numbers stop lying.
Two layers: logical and physical
2020.2 split the data source canvas into two layers, and understanding the split is understanding relationships.
graph TD
subgraph LOGICAL["Logical layer (the new top canvas)"]
T1["Orders"]
T2["Targets"]
T3["Customers"]
T1 ---|"relationship (noodle)"| T2
T1 ---|"relationship (noodle)"| T3
end
subgraph PHYSICAL["Physical layer (open a logical table to see it)"]
P1["Orders + OrderDetails
(joined / unioned into
one physical table)"]
end
NOTE["Relationships stay separate & keep each table's
level of detail; joins flatten into one table"]
LOGICAL --> NOTE
T1 -.->|"double-click to open"| PHYSICAL
The two-layer model. The logical layer is the new top-level canvas where tables are connected by relationships (noodles) — they stay distinct and each keeps its own granularity. The physical layer lives inside each logical table (double-click to open it) and is where the old-style joins and unions still happen, flattening tables into one. Relationships sit above joins: you relate logical tables, and Tableau works out the physical join per query.
- Logical layer — the top canvas. Tables here are connected by relationships. They are not merged; each keeps its own level of detail, and Tableau figures out the join contextually at query time.
- Physical layer — inside each logical table (double-click to enter). This is the classic world of joins and unions that flatten multiple tables into one physical table. It still exists, for when you genuinely want a flattened table.
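One way to picture the two layers is as a small data structure. The sketch below is purely illustrative and is not how Tableau stores a data source: the class names (`PhysicalJoin`, `LogicalTable`, `Relationship`) and the matching fields (`Order ID`, `Customer ID`, `Region`) are assumptions made for the example. The point it tries to show is that a join lives inside a logical table, while a relationship only connects logical tables and merges nothing.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalJoin:
    """A classic join inside one logical table; it flattens its inputs."""
    left: str
    right: str
    on: str              # hypothetical matching field
    kind: str = "inner"


@dataclass
class LogicalTable:
    """A table on the top canvas; it may wrap several joined physical tables."""
    name: str
    physical_tables: list
    joins: list = field(default_factory=list)


@dataclass
class Relationship:
    """A noodle: declares matching fields but does not merge the tables."""
    left: str
    right: str
    on: str              # hypothetical matching field


# Physical layer: Orders and OrderDetails are flattened inside one logical table.
orders = LogicalTable(
    name="Orders",
    physical_tables=["Orders", "OrderDetails"],
    joins=[PhysicalJoin("Orders", "OrderDetails", on="Order ID")],
)
targets = LogicalTable(name="Targets", physical_tables=["Targets"])
customers = LogicalTable(name="Customers", physical_tables=["Customers"])

# Logical layer: the three tables stay separate and are only related.
model = {
    "tables": [orders, targets, customers],
    "relationships": [
        Relationship("Orders", "Targets", on="Region"),
        Relationship("Orders", "Customers", on="Customer ID"),
    ],
}
```

Nothing in `model["relationships"]` produces rows; only the `PhysicalJoin` inside `orders` describes a flattening step.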
Why relationships stop the duplication
Here's the mechanism that matters. With a classic join, Tableau flattens the tables once, up front, into a single wide table — and that flattening is where fan-out duplication happens: one order joined to three targets becomes three rows, and SUM([Sales]) triple-counts. With a relationship, nothing is flattened up front. When you build a sheet, Tableau looks at which fields you've used and generates the appropriate query at the right level of detail for that sheet — aggregating each table to its own grain before combining, so SUM([Sales]) reflects orders at order-grain and SUM([Target]) reflects targets at target-grain, with neither inflating the other.
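The difference is easy to reproduce outside Tableau. The following is a minimal sketch in plain Python, using made-up sample rows and hypothetical helper names (`flattened_join`, `sum_by`); it imitates the two strategies described above and is not a reproduction of the queries Tableau actually generates.

```python
from collections import defaultdict

# Hypothetical sample data: two orders and three targets for the same region.
orders = [
    {"order_id": 1, "region": "West", "sales": 100},
    {"order_id": 2, "region": "West", "sales": 50},
]
targets = [
    {"region": "West", "quarter": "Q1", "target": 80},
    {"region": "West", "quarter": "Q2", "target": 90},
    {"region": "West", "quarter": "Q3", "target": 70},
]


def flattened_join(left, right, key):
    """Classic join: every left row is repeated once per matching right row."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def sum_by(rows, key, measure):
    """Aggregate one table to the grain of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


# Strategy 1: join first, then aggregate. Each order now appears three times
# and each target appears twice, so both sums are inflated.
flat = flattened_join(orders, targets, "region")
joined_sales = sum_by(flat, "region", "sales")
joined_target = sum_by(flat, "region", "target")

# Strategy 2: aggregate each table at its own grain, then combine on the key.
sales = sum_by(orders, "region", "sales")
target = sum_by(targets, "region", "target")
related = {
    region: {"sales": sales[region], "target": target[region]}
    for region in sales.keys() & target.keys()
}
```

With this data the flattened table has six rows, so the join-first sales total for West is three times the true figure, while the aggregate-first result keeps sales at 150 and target at 240.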
The mental upgrade: stop thinking "I'm joining tables" and start thinking "I'm telling Tableau how these tables relate." With a relationship you declare the matching fields once, and then trust Tableau to choose the join type (inner, left, right, or full outer) and granularity per visualization. Put a measure from Orders on a sheet alone and you get all orders; bring in a dimension from Customers and Tableau joins only as much as it needs, at the right grain. You're describing the model, not pre-computing one flattened result, which is also why a single relationships-based data source can serve many sheets that would each have needed a different join before.
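To make "joins only as much as it needs" concrete, here is a rough sketch of the idea that the fields on a sheet determine which tables take part in the query. The field-to-table mapping, the matching fields, and both function names are hypothetical, and the logic is deliberately simplified: it only considers relationships whose two tables are both in use, and ignores tables that sit between them.

```python
# Hypothetical model: which logical table owns each field.
FIELD_OWNER = {
    "Sales": "Orders",
    "Order Date": "Orders",
    "Customer Name": "Customers",
    "Segment": "Customers",
    "Target": "Targets",
}

# Declared relationships (the noodles) and their assumed matching fields.
RELATIONSHIPS = {
    frozenset({"Orders", "Customers"}): "Customer ID",
    frozenset({"Orders", "Targets"}): "Region",
}


def tables_for_sheet(fields):
    """Return the logical tables a sheet needs, given the fields it uses."""
    return {FIELD_OWNER[f] for f in fields}


def relationships_for_sheet(fields):
    """Return only the relationships whose two tables are both in use."""
    used = tables_for_sheet(fields)
    return {pair: key for pair, key in RELATIONSHIPS.items() if pair <= used}


# A sheet with only SUM([Sales]) touches Orders and nothing else.
sales_only = relationships_for_sheet(["Sales"])

# Adding a Customers dimension brings in exactly one relationship.
sales_by_segment = relationships_for_sheet(["Sales", "Segment"])
```

In this sketch `sales_only` is empty, so no join is needed at all, and `sales_by_segment` contains only the Orders to Customers relationship; Targets is never touched.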
Relationships vs. joins: when to use which
| | Relationship (logical layer) | Join (physical layer) |
|---|---|---|
| When the join happens | Per sheet, at query time, contextually | Once, up front — flattens the tables |
| Level of detail | Each table keeps its own grain | Single flattened grain |
| Duplication / fan-out | Avoided — aggregates before combining | Classic risk — rows multiply |
| Best for | Most multi-table models (the default now) | Deliberately denormalizing; scaffolding / row-level needs |
Relationships should be your default for combining tables in 2020.2 and later — they're safer and more flexible. But joins didn't disappear, and there are real reasons to drop into the physical layer: when you specifically need a flattened, denormalized table; for some row-level scaffolding or densification tricks; or when you want full manual control over the exact join type and don't want Tableau choosing per sheet. The skill is knowing that relationships are the high-level default and joins are the low-level tool you reach into the physical layer for when you mean it.
Relationships changed the defaults, so old habits and old workbooks behave differently; don't assume otherwise. Two things bite teams adopting 2020.2. First, muscle memory: people still reach for a physical join out of habit and re-introduce the exact duplication relationships were built to prevent. If you don't need a flattened table, relate, don't join. Second, a relationship can behave like an inner or outer join depending on the fields on the sheet and on its Performance Options settings (Cardinality and Referential Integrity), so results can differ from a hand-built join in subtle ways. If a number looks off, check those settings before concluding that Tableau has computed it wrongly. The model is better, but it's genuinely a different model: treat a migrated workbook as something to re-validate, not something that automatically behaves the same.
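When re-validating, it can help to check what the data actually looks like before trusting or changing the cardinality and referential integrity settings. The sketch below assumes you have exported sample rows from both tables; the helper names (`key_is_unique`, `unmatched_keys`) and the sample data are hypothetical, and the checks run entirely outside Tableau.

```python
from collections import Counter


def key_is_unique(rows, key):
    """True if no value of `key` repeats, i.e. this side looks like the 'one' side."""
    counts = Counter(row[key] for row in rows)
    return all(n == 1 for n in counts.values())


def unmatched_keys(left, right, key):
    """Key values present in `left` but missing from `right`."""
    return {row[key] for row in left} - {row[key] for row in right}


# Hypothetical exported rows.
orders = [
    {"order_id": 1, "customer_id": "C1"},
    {"order_id": 2, "customer_id": "C1"},
    {"order_id": 3, "customer_id": "C9"},  # no matching customer row
]
customers = [
    {"customer_id": "C1", "name": "Acme"},
    {"customer_id": "C2", "name": "Globex"},
]

# Cardinality evidence: is the key unique on each side?
customers_side_is_one = key_is_unique(customers, "customer_id")
orders_side_is_one = key_is_unique(orders, "customer_id")

# Referential integrity evidence: which keys have no partner on the other side?
orders_without_customer = unmatched_keys(orders, customers, "customer_id")
customers_without_order = unmatched_keys(customers, orders, "customer_id")
```

Here the key is unique in `customers` but repeats in `orders`, which points to a many-to-one shape, and both sides have unmatched keys (`C9` and `C2`), so assuming that all records match would not be safe for this sample.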
Why this echoes warehouse modeling
If this feels familiar from the database world, it should: relationships let you build something close to a star schema directly in Tableau — a central fact table related to dimension tables, each kept at its own grain, joined contextually per query. Before relationships, you either flattened everything (and fought duplication) or maintained the modeling outside Tableau. Now the dimensional model the data team designed can be represented faithfully in the Tableau data source, with the BI tool respecting grain the way a well-designed warehouse does. It's the analytical data model finally expressed natively in the analytical tool.
What to carry away
Tableau 2020.2 relationships fixed the duplication that classic joins quietly caused by changing the model: instead of flattening tables into one wide, fan-out-prone table up front, a relationship declares how tables associate and lets Tableau generate the right join at the right level of detail per sheet. The data source now has two layers — the logical layer of related tables (the noodle), and the physical layer inside each table where old-style joins and unions still live.
Make relationships your default for combining tables: declare how tables relate and trust Tableau to handle grain, which kills the SUM-is-doubled bug and lets one source serve many sheets. Drop into the physical layer for joins only when you deliberately want a flattened table or full manual control. And re-validate migrated workbooks rather than assuming identical behavior — it's a better model, but a genuinely different one. In effect, relationships brought faithful dimensional modeling into Tableau itself, which is why they're one of the most consequential changes the product ever made. For the engine that executes these queries, see Tableau internals.