PlannerTPC-H — TPCH-Q08

Engine Compare

Accuracy chart, rows processed ?
Scan
Scan
Seek
Seek
Join Probe
Join
Sort
Sort
Hash Build
Hash
Aggregate
Agg
Distribute
Dist
ClickHouse
Estimation Error
Est Err
Row Operations
Ops
6,619,932
6.6M
Rank
Estimation Error
Est Err
Row Operations
Ops
6,623,797
6.6M
Rank
Estimation Error
Est Err
Row Operations
Ops
2
2
Rank
Estimation Error
Est Err
Row Operations
Ops
868,417
868K
Rank
Estimation Error
Est Err
Row Operations
Ops
2,647
2.6K
Rank
Estimation Error
Est Err
Row Operations
Ops
0
Rank
Databricks
Estimation Error
Est Err
Row Operations
Ops
6,607,232
6.6M
Rank
Estimation Error
Est Err
Row Operations
Ops
694,976
695K
Rank
Estimation Error
Est Err
Row Operations
Ops
2
2
Rank
Estimation Error
Est Err
Row Operations
Ops
6,090,855
6.1M
Rank
Estimation Error
Est Err
Row Operations
Ops
2,649
2.6K
Rank
Estimation Error
Est Err
Row Operations
Ops
0
Rank
DuckDB
Estimation Error
Est Err
Row Operations
Ops
6,041,438
6M
Rank
Estimation Error
Est Err
Row Operations
Ops
6,056,314
6.1M
Rank
Estimation Error
Est Err
Row Operations
Ops
2
2
Rank
Estimation Error
Est Err
Row Operations
Ops
104,959
105K
Rank
Estimation Error
Est Err
Row Operations
Ops
2,647
2.6K
Rank
Estimation Error
Est Err
Row Operations
Ops
0
Rank
PostgreSQL
Estimation Error
Est Err
Row Operations
Ops
101,107
101K
Rank
Estimation Error
Est Err
Row Operations
Ops
99,653
100K
Rank
Estimation Error
Est Err
Row Operations
Ops
115,237
115K
Rank
Estimation Error
Est Err
Row Operations
Ops
2,646
2.6K
Rank
Estimation Error
Est Err
Row Operations
Ops
101,429
101K
Rank
Estimation Error
Est Err
Row Operations
Ops
2,646
2.6K
Rank
Estimation Error
Est Err
Row Operations
Ops
0
Rank
SQL Server
Estimation Error
Est Err
Row Operations
Ops
60,525
61K
Rank
Estimation Error
Est Err
Row Operations
Ops
68,194
68K
Rank
Estimation Error
Est Err
Row Operations
Ops
2
2
Rank
Estimation Error
Est Err
Row Operations
Ops
107,601
108K
Rank
Estimation Error
Est Err
Row Operations
Ops
2,647
2.6K
Rank
Estimation Error
Est Err
Row Operations
Ops
0
Rank

Actual Query Plans

Query Plan per Engine ?
Query Plan
ClickHouse
Estimate    Actual  Operator
       -         2  PROJECT o_year, a2 / a3 AS mkt_share
       -         2  SORT o_year
       -         2  PROJECT o_year
       -         2  AGGREGATE sumIf(volume,a9) AS a2, SUM(volume) AS a3 GROUP BY HASH o_year
       -      2647  PROJECT EXTRACT(YEAR FROM o_orderdate) AS o_year, l_extendedprice * (1 - l_discount) AS volume, n_name = 'FRANCE' AS a9
       -      2647  PROJECT l_extendedprice, l_discount, o_orderdate, n_name
       -      2647  INNER JOIN HASH ON PROJECTION_222.n_nationkey = PROJECTION_183.s_nationkey
       -      2647  │└PROJECT s_nationkey, l_extendedprice, l_discount, o_orderdate
       -      2647   PROJECT l_extendedprice, l_discount, o_orderdate, s_nationkey
       -      2647   INNER JOIN HASH ON PROJECTION_219.s_suppkey = PROJECTION_186.l_suppkey
       -      2647   │└PROJECT l_suppkey, l_extendedprice, l_discount, o_orderdate
       -      2647    PROJECT l_extendedprice, l_discount, l_suppkey, o_orderdate
       -      2647    INNER JOIN HASH ON PROJECTION_216.p_partkey = PROJECTION_189.l_partkey
       -    370103    │└PROJECT l_partkey, l_extendedprice, l_discount, l_suppkey, o_orderdate
       -    370103     PROJECT l_extendedprice, l_discount, l_partkey, l_suppkey, o_orderdate
       -    370103     INNER JOIN HASH ON PROJECTION_213.l_orderkey = PROJECTION_192.o_orderkey
       -     92715     │└PROJECT o_orderkey, o_orderdate
       -     92715      PROJECT o_orderdate, o_orderkey
       -     92715      INNER JOIN HASH ON PROJECTION_210.o_custkey = PROJECTION_195.c_custkey
       -     30197      │└PROJECT c_custkey
       -     30197       PROJECT c_custkey
       -     30197       INNER JOIN HASH ON PROJECTION_207.c_nationkey = PROJECTION_198.n_nationkey
       -         5       │└PROJECT n_nationkey
       -         5        PROJECT n_nationkey
       -         5        INNER JOIN HASH ON PROJECTION_204.n_regionkey = PROJECTION_201.r_regionkey
       -         1        │└PROJECT r_regionkey
       -         1         PROJECT r_regionkey
       -         1         TABLE SCAN region WHERE r_name = 'EUROPE'
       -        25        PROJECT n_regionkey, n_nationkey
       -        25        PROJECT n_nationkey, n_regionkey
       -        25        TABLE SCAN nation
       -    150000       PROJECT c_nationkey, c_custkey
       -    150000       PROJECT c_custkey, c_nationkey
       -    150000       TABLE SCAN customer
       -    457263      PROJECT o_custkey, o_orderdate, o_orderkey
       -    457263      PROJECT o_orderkey, o_orderdate, o_custkey
       -    457263      TABLE SCAN orders WHERE (o_orderdate >= '1995-01-01') AND (o_orderdate <= '1996-12-31')
       -   6001215     PROJECT l_orderkey, l_extendedprice, l_discount, l_partkey, l_suppkey
       -   6001215     PROJECT l_extendedprice, l_discount, l_orderkey, l_partkey, l_suppkey
       -   6001215     TABLE SCAN lineitem
       -      1403    PROJECT p_partkey
       -      1403    PROJECT p_partkey
       -      1403    TABLE SCAN part WHERE p_type = 'SMALL POLISHED NICKEL'
       -     10000   PROJECT s_suppkey, s_nationkey
       -     10000   PROJECT s_suppkey, s_nationkey
       -     10000   TABLE SCAN supplier
       -        25  PROJECT n_nationkey, n_name
       -        25  PROJECT n_name, n_nationkey
       -        25  TABLE SCAN nation
DuckDB
Estimate    Actual  Operator
       -         2  SORT all_nations.o_year
    1588         2  PROJECT o_year, mkt_share
       -         2  AGGREGATE SUM(#1), SUM(#2) GROUP BY HASH #0
    1757      2647  PROJECT o_year, CASE WHEN (nation = 'FRANCE') THEN volume ELSE 0.0000 END, volume
    1757      2647  PROJECT o_year, volume, nation
    1757      2647  INNER JOIN HASH ON s_nationkey = n_nationkey
      25        25  │└TABLE SCAN nation
    1827      2647  INNER JOIN HASH ON s_suppkey = l_suppkey
    1827      2647  │└INNER JOIN HASH ON c_nationkey = n_nationkey
       5         5   │└INNER JOIN HASH ON n_regionkey = r_regionkey
       1         1    │└TABLE SCAN region WHERE r_name = 'EUROPE'
      25        25    TABLE SCAN nation
    9502     12905   INNER JOIN HASH ON c_custkey = o_custkey
    9003     12905   │└INNER JOIN HASH ON o_orderkey = l_orderkey
   44646     42054    │└INNER JOIN HASH ON l_partkey = p_partkey
    1419      1403     │└TABLE SCAN part WHERE p_type = 'SMALL POLISHED NICKEL'
 6001215   6000555     TABLE SCAN lineitem
  300000     17277    TABLE SCAN orders WHERE o_orderdate >= '1995-01-01' AND o_orderdate <= '1996-12-31'
  150000     12152   TABLE SCAN customer WHERE c_custkey <= 149999
   10000     10000  TABLE SCAN supplier
Databricks
Estimate    Actual  Operator
   41200         2  SORT all_nations.o_year ASC NULLS FIRST
   41200         2  AGGREGATE SUM(casewhen((all_nations.nation = 'FRANCE'collate UTF8_BINARY),all_nations.volume,0.0000BD)), SUM(all_nations.volume) GROUP BY HASH all_nations.o_year
   41200         2  DISTRIBUTE HASH ON all_nations.o_year
   41200         2  AGGREGATE SUM(casewhen((all_nations.nation = 'FRANCE'collate UTF8_BINARY),all_nations.volume,0.0000BD)), SUM(all_nations.volume) GROUP BY HASH all_nations.o_year
   41200      2647  INNER JOIN HASH ON l_suppkey = s_suppkey
   41200      2647  │└INNER JOIN HASH ON o_custkey = c_custkey
   40700     12874   │└INNER JOIN HASH ON l_orderkey = o_orderkey
   40700     42054    │└DISTRIBUTE GATHER
       5     42054     INNER JOIN HASH ON l_partkey = p_partkey
       5   5993078     │└TABLE SCAN lineitem
       5      1403     DISTRIBUTE GATHER
   10000      1403     TABLE SCAN part WHERE p_type = 'SMALL POLISHED NICKEL'collate UTF8_BINARY
  200000    456028    TABLE SCAN orders WHERE (o_orderdate >= DATE'1995-01-01') AND (o_orderdate <= DATE'1996-12-31')
   41200     30197   DISTRIBUTE GATHER
   30000     30197   INNER JOIN HASH ON n1.n_nationkey = c_nationkey
   30000         5   │└DISTRIBUTE GATHER
   10000         5    INNER JOIN HASH ON n1.n_regionkey = r_regionkey
  150000         5    │└TABLE SCAN nation
   10000         1    DISTRIBUTE GATHER
      25         1    TABLE SCAN region WHERE r_name = 'EUROPE'collate UTF8_BINARY
      25    146692   TABLE SCAN customer
   41200     10000  DISTRIBUTE GATHER
   41200     10000  INNER JOIN HASH ON s_nationkey = n2.n_nationkey
 6000000     10000  │└TABLE SCAN supplier
   41200        25  DISTRIBUTE GATHER
 1500000        25  TABLE SCAN nation
PostgreSQL
Estimate    Actual  Operator
    2406         2  AGGREGATE SUM(CASE WHEN (n_name = 'FRANCE') THEN (l_extendedprice * ('1' - l_discount)) ELSE '0' END) / SUM(l_extendedprice * ('1' - l_discount)) GROUP BY SORT EXTRACT(year FROM o_orderdate)
    3015      2646  SORT EXTRACT(year FROM o_orderdate)
    1005       882  INNER JOIN HASH ON s_nationkey = n_nationkey
      25        25  │└TABLE SCAN nation AS n2
    3015      2646  INNER JOIN LOOP ON s_suppkey = l_suppkey
    1005       882  │└INNER JOIN HASH ON c_nationkey = n_nationkey
       5         5   │└INNER JOIN HASH ON n_regionkey = r_regionkey
       1         1    │└TABLE SCAN region AS region WHERE r_name = 'EUROPE'
      25        25    TABLE SCAN nation AS n1
   15066     12905   INNER JOIN LOOP ON c_custkey = o_custkey
   15066     12905   │└INNER JOIN LOOP ON o_orderkey = l_orderkey
   49548     42054    │└INNER JOIN LOOP ON l_partkey = p_partkey
    1650      1403     │└TABLE SCAN part AS part WHERE p_type = 'SMALL POLISHED NICKEL'
   43493     42047     TABLE SEEK lineitem AS lineitem
   42054     42054    TABLE SEEK orders AS orders WHERE (o_orderdate >= '1995-01-01') AND (o_orderdate <= '1996-12-31')
   12905     12905   TABLE SEEK customer AS customer
    2647      2647  TABLE SEEK supplier AS supplier
SQL Server
Estimate    Actual  Operator
       2         2  SORT Expr1022
       2         2  PROJECT Expr1024 / Expr1025 AS Expr1026
       2         2  AGGREGATE SUM(Expr1051) AS Expr1024, SUM(Expr1027) AS Expr1025 GROUP BY HASH Expr1022
    3012      2647  PROJECT CASE WHEN n_name as n_name = 'FRANCE' THEN l_extendedprice * (1. - l_discount) ELSE 0.0000 END AS Expr1051
    3012      2647  INNER JOIN HASH ON n1.n_regionkey = r_regionkey
       1         1  │└TABLE SCAN region WHERE r_name = 'EUROPE'
    1506      2647  INNER JOIN HASH ON s_nationkey = n2.n_nationkey
      25        25  │└TABLE SCAN nation AS n2
    1506      2647  INNER JOIN HASH ON c_nationkey = n1.n_nationkey
       2         5  │└TABLE SCAN nation AS n1 WHERE BLOOM(n_regionkey as n_regionkey)
    1506      2647  INNER JOIN HASH ON s_suppkey = l_suppkey
    1506      2647  │└INNER JOIN HASH ON c_custkey = o_custkey
   15060     12905   │└INNER JOIN HASH ON o_orderkey = l_orderkey
   42117     42054    │└INNER JOIN HASH ON l_partkey = p_partkey
    1403      1403     │└TABLE SCAN part WHERE p_type = 'SMALL POLISHED NICKEL'
   60012     42054     PROJECT l_extendedprice * (1. - l_discount) AS Expr1027
   60012     42054     TABLE SCAN lineitem WHERE BLOOM(l_partkey)
   45706     12714    PROJECT datepart(year,o_orderdate) AS Expr1022
   45706     12714    TABLE SCAN orders WHERE o_orderdate >= '1995-01-01' AND o_orderdate <= '1996-12-31' AND BLOOM(o_orderkey)
    1500      2432   TABLE SCAN customer WHERE BLOOM(c_custkey) AND BLOOM(c_nationkey)
    1000      1891  TABLE SCAN supplier WHERE BLOOM(s_suppkey)

Commentary

Optimal Join Order

The most important optimization for Query 8 is taking the highly selective filter on part ( p_type = 'MEDIUM PLATED BRASS') as early as possible. This reduces the lineitem stream to less than 1% before joining with larger tables like orders.

Bushy Joins

Query 8 presents an opportunity for a bushy join. Instead of joining the main stream directly to customer and then through nation and region, an optimizer can pre-join nation and region (applying the r_name = 'EUROPE' filter) and then join the result. This reduces the total number of join operations.

Join Algorithms

The choice of join algorithm (Hash, Merge, or Loop) affects both performance and memory usage. While Hash joins are typically faster for analytical workloads, some optimizers like SQL Server may choose Merge joins to minimize memory consumption when the runtime is comparable.