Bad planner performance for tables with empty tuples when using JIT

От

Jurrie Overgoor

Тема

Дата

27 октября 2020 г. в 12:13:48

Msg-id

70ec49f7-e9c4-bea7-c689-a4dfbdd66f78@jurr.org

обсуждение

Список

pgsql-general

Дерево обсуждения

Bad planner performance for tables with empty tuples when using JIT Jurrie Overgoor <postgresql-mailinglist@jurr.org> 27 октября 2020 г. в 12:13:48

Re: Bad planner performance for tables with empty tuples when using JIT Michael Lewis <mlewis@entrata.com> 27 октября 2020 г. в 21:14:49

Hello everyone,

I'm currently in the process of upgrading our PostgreSQL installations
from 9.6 to 13. I am experiencing very slow query performance for empty
tables.

Our test environments get build from scratch every run, and thus contain
a lot of empty tables at first. We hit the issue discussed and resolved
in [1]. Problem is, this fix is not included in PG13... but JIT is :)

Some of our queries join a lot of tables, so the planner thinks this
will result in very high costs, and the JIT feature kicks in. This gives
bad performance. Executing the query is very fast of course; it's just
the planning stage that is time consuming. We do not see this behavior
in PG9.6, as there is no JIT. EXPLAIN gives the same high costs on PG9.6
and PG13, but on PG9.6 the planning time is low enough.

Now, how can I circumvent this?

- I could wait for PG14. I verified that PG14 solves my case and queries
are fast. But this would take at least until the third quarter of 2021,
so the website tells me.

- I could ask for a back-port of [2]. The commit is API breaking; is
this even a feasible option?

- I could turn off JIT in the server config, but I'd like to use the JIT
feature where it's appropriate!

- I could turn off JIT using a GUC prior to query execution. But then I
would need to detect the cases where I need to do this. (I.e. cases
where I query a table that has no tuples.) It would be very cumbersome
for me to write this in my application, and I feel this is more the
responsibility of the database than of my program.

- Another way is to fake the number of pages; set it to 1 where it is 0
everywhere. I verified that this produces fast query performance. But
fiddling with pg_class does not "feel" right... Is this okay to use in a
test setup (and maybe even in a production scenario)? Could I do the
following query just after our database is initialized:

update pg_catalog.pg_class c set relpages = 1 where c.relpages = 0 and
c.relnamespace = (select ns.oid from pg_catalog.pg_namespace ns where
ns.nspname = current_schema);

Could someone give me advice on what would be the best strategy?

With kind regads,

Jurrie

[1] https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
[2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3d351d916b20534f973eda760cde17d96545d4c4

В списке pgsql-general по дате отправления

Предыдущее

От: xiebin (F)

Дата: 27 октября 2020 г. в 11:34:31

Сообщение: 答复: Security issues concerning pgsql replication

Следующее

От: Magnus Hagander

Дата: 27 октября 2020 г. в 13:40:21

Сообщение: Re: Security issues concerning pgsql replication