SQL Arena

The bar chart shows 7 categories of mis-estimation:

Estimate is more than 16x too low
Estimate is more than 8x too low
Estimate is more than 4x too low
Within +/- 2x - this is good!
Estimate is more than 4x too high
Estimate is more than 8x too high
Estimate is more than 16x too high

The height of each bar is the number of nodes in the query plan with that error. When estimates are good, the data gathers around the center of the bar chart.
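The bucketing described above can be sketched in a few lines of Python. This is an illustration only, not the site's actual code: the function names `classify` and `histogram` are hypothetical, and two points are assumptions. First, the list names "+/- 2x" for the center bucket but 4x for its neighbours, so the sketch puts every node that is not more than 4x off into the center bucket. Second, row counts are clamped to at least 1 to avoid dividing by zero.

```python
from collections import Counter

# Bucket labels, ordered left to right as on the bar chart.
BUCKETS = [
    ">16x too low", ">8x too low", ">4x too low",
    "within range",
    ">4x too high", ">8x too high", ">16x too high",
]

def classify(estimated, actual):
    """Return the bucket label for one plan node (hypothetical helper)."""
    # Assumption: clamp to 1 row so empty results do not divide by zero.
    est = max(float(estimated), 1.0)
    act = max(float(actual), 1.0)
    if est >= act:
        factor, side = est / act, "high"
    else:
        factor, side = act / est, "low"
    # Check the widest threshold first so each node lands in one bucket.
    for threshold in (16, 8, 4):
        if factor > threshold:
            return f">{threshold}x too {side}"
    # Assumption: anything not more than 4x off counts as the center bucket.
    return "within range"

def histogram(nodes):
    """Count plan nodes per bucket; `nodes` is a list of (estimated, actual)."""
    counts = Counter(classify(est, act) for est, act in nodes)
    return [counts.get(bucket, 0) for bucket in BUCKETS]
```

Calling `histogram` on a list of `(estimated, actual)` pairs gives the seven bar heights in chart order, with the center entry being the "good" bucket.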

Example

Consider this data:

Operation: Joining
Workload	Rows Processed	Estimation Accuracy
TPC-H	1.2M	(estimation error bar chart)

Here, we can see that the database engine we are looking at tends to overestimate the row count of joins, at least for the TPC-H workload.

We can also see that the entire TPC-H workload processed 1.2M rows in join operations.

Rank is assigned for each category of operation the database engine will have to do. You can win for one operation and lose for another.

A lower number of operations is better (less work needs doing). The three lowest values for each operation are marked:

- Best Score
- Second Best
- Third Best

The ranking is dense, so you can share the top seat with another engine. Databases that are not in the top 3 earn no stars. Because some databases (notably PostgreSQL) don't report exact row counts, the row counts are rounded down to the nearest 10 before ranking.
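As a rough sketch of the ranking rule for a single operation: round each row count down to the nearest 10, then assign dense ranks so that ties share a seat. The function name `rank_engines` and the engine names in the usage note are hypothetical, and the real scoring code may differ in its details.

```python
def rank_engines(row_counts):
    """Map engine name -> dense rank (1 = best) for the top 3 ranks only.

    `row_counts` maps an engine name to the rows it processed for one
    category of operation. Engines outside the top 3 are left out,
    since they earn no stars.
    """
    # Round down to the nearest 10 before comparing, as the text describes.
    rounded = {engine: (rows // 10) * 10 for engine, rows in row_counts.items()}
    # Dense ranking: each distinct value gets the next rank, with no gaps.
    distinct = sorted(set(rounded.values()))
    rank_of = {value: position + 1 for position, value in enumerate(distinct)}
    return {
        engine: rank_of[value]
        for engine, value in rounded.items()
        if rank_of[value] <= 3
    }
```

For example, with made-up engines `A` at 1204 rows and `B` at 1209 rows, both round to 1200 and share rank 1; the next distinct value takes rank 2 rather than rank 3.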

The final scoreboard adds up row counts for all workloads, operators and queries and then ranks the overall best.
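A minimal sketch of that aggregation, assuming each measurement arrives as an `(engine, workload, operator, query, rows)` tuple. The tuple layout and the name `scoreboard` are assumptions made for illustration, and the text does not say whether rounding is applied before or after summing, so the sketch leaves rounding out.

```python
from collections import defaultdict

def scoreboard(results):
    """Sum rows per engine and return (engine, total) pairs, lowest first."""
    totals = defaultdict(int)
    for engine, _workload, _operator, _query, rows in results:
        totals[engine] += rows
    # Lower totals mean less work, so they sort to the top of the board.
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))
```

Ties are broken by engine name here only to keep the output order stable; under dense ranking, tied engines would share the same seat.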
