The bar chart shows 7 categories of mis-estimation:
- Estimate is more than 16x too low
- Estimate is more than 8x too low
- Estimate is more than 4x too low
- Within +/- 2x - this is good!
- Estimate is more than 4x too high
- Estimate is more than 8x too high
- Estimate is more than 16x too high
The height of each bar is the number of nodes in the query plan with that error. When estimates are good, the data gathers around the center of the bar chart.
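The bucketing described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function names `bucket` and `histogram`, the `(estimated_rows, actual_rows)` input shape, and the clamping of zero row counts to 1 are all assumptions. The list does not say where an error between 2x and 4x lands, so the sketch assumes it falls into the 4x bucket.

```python
from collections import Counter

# Bucket labels in chart order, left to right (names are illustrative).
BUCKETS = [
    "16x too low", "8x too low", "4x too low",
    "within 2x",
    "4x too high", "8x too high", "16x too high",
]

def bucket(estimated_rows, actual_rows):
    """Return the chart bucket for a single query plan node."""
    # Assumption: clamp to 1 row so an empty result cannot divide by zero.
    est = max(estimated_rows, 1)
    act = max(actual_rows, 1)
    if est >= act:
        factor, side = est / act, "high"   # estimate above actual
    else:
        factor, side = act / est, "low"    # estimate below actual
    if factor <= 2:
        return "within 2x"
    if factor > 16:
        return f"16x too {side}"
    if factor > 8:
        return f"8x too {side}"
    # Assumption: anything above 2x and up to 8x shares the 4x bucket.
    return f"4x too {side}"

def histogram(nodes):
    """Count plan nodes per bucket; `nodes` is a list of (estimated, actual)."""
    counts = Counter(bucket(est, act) for est, act in nodes)
    return [(label, counts[label]) for label in BUCKETS]
```

Calling `histogram` with one `(estimated, actual)` pair per plan node yields the seven bar heights in chart order, so a well-estimated plan puts most of its count in the middle entry.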
Example
Consider this data:
| Workload | Joining: Rows Processed | Joining: Estimation Accuracy |
|---|---|---|
| TPC-H | 1.2M | (bar chart) |
Here, we can see that the Database Engine we are looking at tends to overestimate the row count of joins - at least for the TPC-H workload.
We can also see that the TPC-H workload as a whole processed 1.2M rows in join operations.