Обсуждение: Possible regression in PG18 beta1
Dear all,
I was testing PG18 beta 1 new features (noticeably, AIO and index skip scan), and I came up with this example:
===========
drop table if exists t;
create TEMP table t(i,j,k)
as select n,n,n
from generate_series(1,10_000_000) as n;
analyze t;
create index on t(i,j,k);
explain (analyze,buffers,costs off,timing off)
select * from t where k = 1;
create TEMP table t(i,j,k)
as select n,n,n
from generate_series(1,10_000_000) as n;
analyze t;
create index on t(i,j,k);
explain (analyze,buffers,costs off,timing off)
select * from t where k = 1;
===========
On my laptop (MacBook Air M3), Postgres 17.5 runs the query over 2x faster than 18 beta1. The former uses index-only scan, while the latter uses seq scan:
=== PG 17.5 =====
QUERY PLAN
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Buffers: local hit=1 read=38317
Planning Time: 0.315 ms
Execution Time: 242.711 ms
(6 rows)
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Buffers: local hit=1 read=38317
Planning Time: 0.315 ms
Execution Time: 242.711 ms
(6 rows)
=== PG 18 beta 1 =====
QUERY PLAN
------------------------------------------
Seq Scan on t (actual rows=1.00 loops=1)
Filter: (k = 1)
Rows Removed by Filter: 9999999
Buffers: local hit=1 read=54079
Planning Time: 0.115 ms
Execution Time: 520.465 ms
(6 rows)
------------------------------------------
Seq Scan on t (actual rows=1.00 loops=1)
Filter: (k = 1)
Rows Removed by Filter: 9999999
Buffers: local hit=1 read=54079
Planning Time: 0.115 ms
Execution Time: 520.465 ms
(6 rows)
If I turn off enable_seqscan on the latter, it will use the index-only scan, with the query time similar to PG17 (yet a tad slower):
=== PG 18 beta 1, with enable_seqscan disabled =====
QUERY PLAN
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Index Searches: 1
Buffers: local hit=1 read=38317
Planning Time: 0.200 ms
Execution Time: 281.742 ms
(7 rows)
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Index Searches: 1
Buffers: local hit=1 read=38317
Planning Time: 0.200 ms
Execution Time: 281.742 ms
(7 rows)
* Also, I noticed "explain analyze" reports actual rows as a decimal number (1.00) as opposed to an integer (1); not sure if that's intentional.
* Changing the table from TEMP to UNLOGGED makes both versions use "Parallel Seq Scan", with PG18 being ~25% faster. (190ms vs. 150ms).
Best Regards,
Sadeq Dousti
Hi,
You're right, with settings, it revealed that PG17 had random_page_cost=1.1 configured.
Adding that to PG18 sets the plan to index only; however, PG18 is still somewhat (~15%) slower:
============ PG17.5 =================
QUERY PLAN
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Buffers: local read=38318 written=424
Settings: random_page_cost = '1.1'
Planning:
Buffers: shared hit=29, local read=1 written=1
Planning Time: 0.098 ms
Execution Time: 137.209 ms
(9 rows)
----------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Buffers: local read=38318 written=424
Settings: random_page_cost = '1.1'
Planning:
Buffers: shared hit=29, local read=1 written=1
Planning Time: 0.098 ms
Execution Time: 137.209 ms
(9 rows)
QUERY PLAN
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Index Searches: 1
Buffers: local read=38318 written=443
Settings: random_page_cost = '1.1'
Planning:
Buffers: shared hit=30, local read=1
Planning Time: 0.097 ms
Execution Time: 160.595 ms
(10 rows)
-------------------------------------------------------------------
Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
Index Cond: (k = 1)
Heap Fetches: 1
Index Searches: 1
Buffers: local read=38318 written=443
Settings: random_page_cost = '1.1'
Planning:
Buffers: shared hit=30, local read=1
Planning Time: 0.097 ms
Execution Time: 160.595 ms
(10 rows)
So, one mystery solved (no planner regression), still three questions:
* Somewhat slower execution in PG18
* Planner differences in TEMP vs. UNLOGGED
* Actual rows with decimal (1.00) vs. integer (1)
Best Regards,
Sadeq Dousti
On Sat, May 17, 2025 at 6:18 PM Christophe Courtois <christophe.courtois@dalibo.com> wrote:
Hi,
- I cannot reproduce your regression, I have always a Seq Scan (PG 13,
17, 18, default config, last versions).
I suggest that you add a SETTINGS clause in EXPLAIN and \d+ and \di+
before, just in case.
- I see the difference between TEMP and UNLOGGED too (since PG13),
this is funny but I have no explanation. Something to do with the access
to shared buffers, I suppose.
Yours,
Le 17/05/2025 à 17:45, Sadeq Dousti a écrit :
> Dear all,
>
> I was testing PG18 beta 1 new features (noticeably, AIO and index skip
> scan), and I came up with this example:
>
> ===========
> drop table if exists t;
>
> create TEMP table t(i,j,k)
> as select n,n,n
> from generate_series(1,10_000_000) as n;
>
> analyze t;
>
> create index on t(i,j,k);
>
> explain (analyze,buffers,costs off,timing off)
> select * from t where k = 1;
> ===========
>
>
> On my laptop (MacBook Air M3), Postgres 17.5 runs the query over 2x
> faster than 18 beta1. The former uses index-only scan, while the latter
> uses seq scan:
>
> === PG 17.5 =====
> QUERY PLAN
> ----------------------------------------------------------------
> Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1)
> Index Cond: (k = 1)
> Heap Fetches: 1
> Buffers: local hit=1 read=38317
> Planning Time: 0.315 ms
> Execution Time: 242.711 ms
> (6 rows)
>
>
>
> === PG 18 beta 1 =====
> QUERY PLAN
> ------------------------------------------
> Seq Scan on t (actual rows=1.00 loops=1)
> Filter: (k = 1)
> Rows Removed by Filter: 9999999
> Buffers: local hit=1 read=54079
> Planning Time: 0.115 ms
> Execution Time: 520.465 ms
> (6 rows)
>
>
> If I turn off enable_seqscan on the latter, it will use the index-only
> scan, with the query time similar to PG17 (yet a tad slower):
>
> === PG 18 beta 1, with enable_seqscan disabled =====
> QUERY PLAN
> -------------------------------------------------------------------
> Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1)
> Index Cond: (k = 1)
> Heap Fetches: 1
> Index Searches: 1
> Buffers: local hit=1 read=38317
> Planning Time: 0.200 ms
> Execution Time: 281.742 ms
> (7 rows)
>
>
> * Also, I noticed "explain analyze" reports actual rows as a decimal
> number (1.00) as opposed to an integer (1); not sure if that's intentional.
>
> * Changing the table from TEMP to UNLOGGED makes both versions use
> "Parallel Seq Scan", with PG18 being ~25% faster. (190ms vs. 150ms).
>
>
> Best Regards,
> Sadeq Dousti
--
_________ ____
| || | Christophe Courtois
| ||__ | Consultant DALIBO
| | | | 43, rue du Faubourg Montmartre
| - | / / 75009 Paris
|___| |___| \/ www.dalibo.com
On Sat, May 17, 2025 at 9:38 AM Sadeq Dousti <msdousti@gmail.com> wrote:
So, one mystery solved (no planner regression), still three questions:* Somewhat slower execution in PG18* Planner differences in TEMP vs. UNLOGGED* Actual rows with decimal (1.00) vs. integer (1)
Thanks,
Maciek
>> > * Also, I noticed "explain analyze" reports actual rows as a decimal >> > number (1.00) as opposed to an integer (1); not sure if that's intentional. Yes, this is intentional as of 95dbd827f2edc >> > * Changing the table from TEMP to UNLOGGED makes both versions use >> > "Parallel Seq Scan", with PG18 being ~25% faster. (190ms vs. 150ms). I'll be curious about tests with a normal table as well with a sufficiently large shared_buffers. -- Sami Imseih Amazon Web Services (AWS)
Yes, this is intentional as of 95dbd827f2edc
Thanks!
>> > * Changing the table from TEMP to UNLOGGED makes both versions use
>> > "Parallel Seq Scan", with PG18 being ~25% faster. (190ms vs. 150ms).
I'll be curious about tests with a normal table as well with a
sufficiently large shared_buffers.
Here are results for a normal table with default shared_buffers (128 MB) and large shared_buffers (4GB):
==== PG17.5, shared_buffers=128MB ======
QUERY PLAN
------------------------------------------------------
Gather (actual rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=3584 read=50496
-> Parallel Seq Scan on t (actual rows=0 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=3584 read=50496
Settings: random_page_cost = '1.1'
Planning Time: 0.119 ms
Execution Time: 141.374 ms
(11 rows)
------------------------------------------------------
Gather (actual rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=3584 read=50496
-> Parallel Seq Scan on t (actual rows=0 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=3584 read=50496
Settings: random_page_cost = '1.1'
Planning Time: 0.119 ms
Execution Time: 141.374 ms
(11 rows)
==== PG17.5, shared_buffers=4GB ======
select * from t where k = 1;QUERY PLAN
------------------------------------------------------
Gather (actual rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=3872 read=50208
-> Parallel Seq Scan on t (actual rows=0 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=3872 read=50208
Settings: random_page_cost = '1.1'
Planning Time: 0.118 ms
Execution Time: 141.846 ms
(11 rows)
==== PG18 beta1, shared_buffers=128MB ======
QUERY PLAN
---------------------------------------------------------
Gather (actual rows=1.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15986 read=38094
-> Parallel Seq Scan on t (actual rows=0.33 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=15986 read=38094
Settings: random_page_cost = '1.1'
Planning Time: 0.123 ms
Execution Time: 110.650 ms
(11 rows)
---------------------------------------------------------
Gather (actual rows=1.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15986 read=38094
-> Parallel Seq Scan on t (actual rows=0.33 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=15986 read=38094
Settings: random_page_cost = '1.1'
Planning Time: 0.123 ms
Execution Time: 110.650 ms
(11 rows)
==== PG18 beta1, shared_buffers=4GB ======
QUERY PLAN
---------------------------------------------------------Gather (actual rows=1.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15991 read=38089
-> Parallel Seq Scan on t (actual rows=0.33 loops=3)
Filter: (k = 1)
Rows Removed by Filter: 3333333
Buffers: shared hit=15991 read=38089
Settings: random_page_cost = '1.1'
Planning Time: 0.095 ms
Execution Time: 109.073 ms
(11 rows)
Best Regards,
Sadeq Dousti
>> I'll be curious about tests with a normal table as well with a >> sufficiently large shared_buffers. > Here are results for a normal table with default shared_buffers (128 MB) and large shared_buffers (4GB): thanks. I don't see regression for a normal table, at least for this test. In terms of your original test, I tried it out on my Ubuntu machine and with your test as-is, I see 2.8 seconds on 17.5 and 3.3 seconds on HEAD if the plan performs a seq scan without parallelism. However, the test as you have it is indexing all columns on the table. If I just index on the filtered column ``` create index on t(k); explain (analyze,buffers,costs off,timing off) select k from t where k = 1; ``` I see similar behavior between HEAD ``` test=# explain (analyze,buffers,costs off,timing off) select k from t where k = 1; QUERY PLAN --------------------------------------------------------------- Index Only Scan using t_k_idx on t (actual rows=1.00 loops=1) Index Cond: (k = 1) Heap Fetches: 1 Index Searches: 1 Buffers: local hit=4 Planning Time: 0.088 ms Execution Time: 0.059 ms ``` and 17.5 ``` test=# explain (analyze,buffers,costs off,timing off) select k from t where k = 1; QUERY PLAN ------------------------------------------------------------ Index Only Scan using t_k_idx on t (actual rows=1 loops=1) Index Cond: (k = 1) Heap Fetches: 1 Buffers: local hit=4 Planning Time: 0.084 ms Execution Time: 0.053 ms (6 rows) ``` -- Sami
thanks. I don't see regression for a normal table, at least for this test.
No, there isn't. I just added them as per your request ;)
In terms of your original test, I tried it out on my Ubuntu machine
and with your test as-is, I see 2.8 seconds on 17.5 and 3.3 seconds
on HEAD if the plan performs a seq scan without parallelism.
Which is unexpected, no?
However, the test as you have it is indexing all columns
on the table. If I just index on the filtered column
Yes, I agree. Changing the indexing setup will diminish the difference. However, given the sub-optimal index, PG18 seems to be slower.
Also, there's a meaningful difference in the plans for TEMP table vs. UNLOGGED, which is interesting.
Best regards,
Sadeq Dousti
On Sat, May 17, 2025 at 12:38 PM Sadeq Dousti <msdousti@gmail.com> wrote: > So, one mystery solved (no planner regression), still three questions: > * Somewhat slower execution in PG18 I cannot recreate the problem. The fastest plan for this query is a parallel sequential scan -- the t_i_j_k_idx index is useless. The only reason it's used for an index-only scan when random_page_cost is reduced to 1.1 is because it happens to have no fragmentation (meaning the index itself can be read in physical order). Even if I force an index-only scan, I still see no significant performance differences when I compare REL_17_STABLE to master/18. Here's what I see on master/18: regression=# explain (analyze,buffers,costs off,timing off) select * from t where k = 1; QUERY PLAN ------------------------------------------------------------------- Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1) Index Cond: (k = 1) Heap Fetches: 0 Index Searches: 1 Buffers: shared hit=38318 Planning Time: 0.025 ms Execution Time: 152.502 ms (7 rows) Here's the same query on current REL_17_STABLE: regression=# explain (analyze,buffers,costs off,timing off) select * from t where k = 1; QUERY PLAN ---------------------------------------------------------------- Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1) Index Cond: (k = 1) Heap Fetches: 0 Buffers: shared hit=38318 Planning Time: 0.042 ms Execution Time: 155.890 ms (6 rows) I used a regular/logged table for this. I repeatedly executed the query, to minimize noise (I'm showing the last execution for each of master and REL_17_STABLE). For what it's worth, there *are* known regressions with more complicated cases, most of which involve multiple inequality/range conditions on multiple high cardinality columns. This was discussed extensively over the course of work on skip scan -- they were deemed acceptable. But I see no reason to believe that this particular query will run any slower on 18. FWIW, similar queries that don't have to scan the full index (which makes an index-only scan attractive to the planner) are much faster on Postgres 18, compared to 17. Here's one executed on 18: regression=# explain (analyze,buffers,costs off,timing off) select * from t where i between 1 and 500_000 and k = 1; QUERY PLAN ------------------------------------------------------------------- Index Only Scan using t_i_j_k_idx on t (actual rows=1.00 loops=1) Index Cond: ((i >= 1) AND (i <= 500000) AND (k = 1)) Heap Fetches: 0 Index Searches: 1 Buffers: shared hit=1919 Planning: Buffers: shared hit=4 Planning Time: 0.039 ms Execution Time: 7.141 ms (9 rows) Here's the same query executed on Postgres 17: regression=# explain (analyze,buffers,costs off,timing off) select * from t where i between 1 and 500_000 and k = 1; QUERY PLAN ---------------------------------------------------------------- Index Only Scan using t_i_j_k_idx on t (actual rows=1 loops=1) Index Cond: ((i >= 1) AND (i <= 500000) AND (k = 1)) Heap Fetches: 0 Buffers: shared hit=1919 Planning: Buffers: shared hit=4 Planning Time: 0.039 ms Execution Time: 10.771 ms (8 rows) -- Peter Geoghegan
On Sat, May 17, 2025 at 2:00 PM Sadeq Dousti <msdousti@gmail.com> wrote: >> >> thanks. I don't see regression for a normal table, at least for this test. > > > No, there isn't. I just added them as per your request ;) > > >> In terms of your original test, I tried it out on my Ubuntu machine >> >> and with your test as-is, I see 2.8 seconds on 17.5 and 3.3 seconds >> >> on HEAD if the plan performs a seq scan without parallelism. > > > Which is unexpected, no? For the temp table test, it seems like I can account mostly all of the extra time to the fact that checksums are enabled by default in 18, due to 04bec894a04c I ran the below script which runs the select 100 times against the temp table on HEAD ``` drop table if exists t; create TEMP table t(i,j,k) as select n,n,n from generate_series(1,10_000_000) as n; analyze t; create index on t(i,j,k); SELECT 'select * from t where k = 1;' FROM generate_series(1, 100) \gexec ``` and looked at perf top at the time, which shows pg_checksum_block at the top using a cluster that was created with initdb without any flags. ``` 12.12% postgres [.] pg_checksum_block 11.97% postgres [.] ExecInterpExpr 10.57% postgres [.] slot_deform_heap_tuple_internal 5.90% postgres [.] fetch_att 4.18% postgres [.] heapgettup_pagemode ``` and explain analyze for a single execution ``` test=# EXPLAIN (ANALYZE, timing on) select * from t where k = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------- Seq Scan on t (cost=0.00..179080.00 rows=1 width=12) (actual time=0.065..3375.125 rows=1.00 loops=1) Filter: (k = 1) Rows Removed by Filter: 9999999 Buffers: local read=54080 Planning Time: 0.090 ms Execution Time: 3375.149 ms (6 rows) ``` Now, with initdb and --no-data-checksums ``` 13.32% postgres [.] ExecInterpExpr 12.44% postgres [.] slot_deform_heap_tuple_internal 6.64% postgres [.] fetch_att 4.70% postgres [.] heapgettup_pagemode 4.22% postgres [.] slot_deform_heap_tuple 3.75% postgres [.] TupleDescCompactAttr ``` and explain for a single execution ``` test=# EXPLAIN (ANALYZE, timing on) select * from t where k = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------- Seq Scan on t (cost=0.00..179080.00 rows=1 width=12) (actual time=0.043..2939.101 rows=1.00 loops=1) Filter: (k = 1) Rows Removed by Filter: 9999999 Buffers: local read=54080 Planning Time: 0.087 ms Execution Time: 2939.125 ms (6 rows) ``` v18 with --no-data-checksums gives me close performance to v17 Can you try the same test ( with --no-data-checksums) on you mac and see if that makes a difference? [0] states "Only data pages are protected by checksums; internal data structures and temporary files are not." Is what is observed with temp files being protected by checksums correct? [0] https://www.postgresql.org/docs/current/checksums.html -- Sami
Thanks, very interesting insights!
Can you try the same test ( with --no-data-checksums) on you mac
and see if that makes a difference?
I disabled checksums on PG18, and retried the tests, with and without modifying random_page_cost, and for TEMP tables only.
When random_page_cost is the default value (=4), both versions do Seq Scan, and PG18 is indeed a bit faster (313 ms vs. 347 ms).
However, when random_page_cost = 1.1, both versions do Index Only Scan, and PG 17.5 is faster (135 ms vs. 152 ms).
Another counter-intuitive part is that when I run "explain" again, both versions show slower results. (both 170-180 ms)
Using pg_prewarm to load both the table and the index into the shared buffers does not change anything.
To summarize:
* Test setup works for TEMP table with random_page_cost = 1.1; data-checksums did not play a role
* It's fast initially and slows down on both versions in subsequent runs
* PG 17.5 is faster, but much more noticeably on the first run
I'm now thinking maybe the compilation options for PG 17.5 have been different. I'm using the default version that comes with Homebrew, but for PG 18, I compiled it myself. Here are the results for `select version();` on both:
PostgreSQL 17.5 (Homebrew) on aarch64-apple-darwin24.4.0,
compiled by Apple clang version 17.0.0 (clang-1700.0.13.3), 64-bit
vs.
PostgreSQL 18beta1 on aarch64-darwin, compiled by clang-17.0.0, 64-bit
Best Regards,
Sadeq Dousti
I'm now thinking maybe the compilation options for PG 17.5 have been different. I'm using the default version that comes with Homebrew, but for PG 18, I compiled it myself. Here are the results for `select version();` on both:PostgreSQL 17.5 (Homebrew) on aarch64-apple-darwin24.4.0,compiled by Apple clang version 17.0.0 (clang-1700.0.13.3), 64-bit
In my tests I build from source from both the 17 stable and head branch
and use the same build options.
Why temp uses checksum, which also happens on 17 and presumably
earlier versions, is still not too clear to me.
—
Sami
In my tests I build from source from both the 17 stable and head branchand use the same build options.
OK, I built 17 from REL_17_STABLE, and got the same behavior as the Homebrew PG17:
* First run: 135 ms
* Subsequent runs: 169 ms, 181 ms, 187 ms, 177 ms, 170 ms
For comparison, with PG18:
* First run: 150 ms
* Subsequent runs: 183 ms, 182 ms, 185 ms, 187 ms, 180 ms
On the first and second runs, PG17 is consistently faster.
It's also strange to me that subsequent runs are consistently slower than the first run.
Best Regards,
Sadeq Dousti