Обсуждение: Performance issue on temporary relations

Поиск

Список

Период

Сортировка

Performance issue on temporary relations

От

"章晨曦"

Дата:

19 августа, 18:26:12

Hi there!

Recently I noticed a performance issue on temporary relation. The issue will happened on
ON COMMIT DELETE temporary relations. If one session only create a few temporary relations,
well, it's fine. But if one session creates plenty of ON COMMIT DELETE kind temporary relations,
say 3,000, it will face a significant performance degradation issue. Check below:


One temporary relation
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

psql (19devel)
Type "help" for help.

postgres=# CREATE LOCAL TEMP TABLE a_lttk1(n INT) ON COMMIT DELETE ROWS;
CREATE TABLE
postgres=# \timing
Timing is on.
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 3.004 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 3.884 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 4.041 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 3.827 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 3.783 ms


3,000 temporary relation
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

psql (19devel)
Type "help" for help.

postgres=# DO $$
postgres$# DECLARE
postgres$#   v_sql VARCHAR(100);
postgres$# BEGIN
postgres$#   FOR i IN 1..3000 LOOP
postgres$#     v_sql := 'CREATE LOCAL TEMP TABLE a_lttk'||i||'(n INT) ON COMMIT DELETE ROWS';
postgres$#     EXECUTE v_sql;
postgres$#   END LOOP;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;
DO
postgres=# \timing
Timing is on.
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 45.471 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 27.320 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 27.482 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 26.907 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 31.055 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 28.624 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 25.277 ms


The performance has decreased by nearly 10 times. The reason is we just check if there is
operation on any ON COMMIT DELETE kind temporary relations. Regardless of how many temporary 
tables are actually accessed, even if only one is accessed, it will do the truncate on all 
the temporary tables.

To overcome this issue, A new list named in_use has been introduced to record the actually 
accessed temporary relations, and then will do the truncate only on the actually accessed 
temporary relations. And it seems works well.



After patch:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

psql (19devel)
Type "help" for help.

postgres=# DO $$
postgres$# DECLARE
postgres$#   v_sql VARCHAR(100);
postgres$# BEGIN
postgres$#   FOR i IN 1..3000 LOOP
postgres$#     v_sql := 'CREATE LOCAL TEMP TABLE a_lttk'||i||'(n INT) ON COMMIT DELETE ROWS';
postgres$#     EXECUTE v_sql;
postgres$#   END LOOP;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;
DO
postgres=# \timing
Timing is on.
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 5.253 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 5.512 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 5.095 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 5.119 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 5.057 ms
postgres=# SELECT count(*) FROM a_lttk1;
 count
-------
     0
(1 row)

Time: 5.006 ms

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

Вложения

improve_temporary_rel_performance_v01.patch

Re: Performance issue on temporary relations

От

Tom Lane

Дата:

19 августа, 18:35:51

"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes:
> Recently I noticed a performance issue on temporary relation. The issue will happened on
> ON COMMIT DELETE temporary relations. If one session only create a few temporary relations,
> well, it's fine. But if one session creates plenty of ON COMMIT DELETE kind temporary relations,
> say 3,000, it will face a significant performance degradation issue.

Do you think that's supposed to be free?

> To overcome this issue, A new list named in_use has been introduced to record the actually
> accessed temporary relations, and then will do the truncate only on the actually accessed
> temporary relations. And it seems works well.

I do not think this is something we ought to consider.  It might help
certain corner use-cases, but it's probably a net loss for most.
In particular, I don't think that creating thousands of temp tables in
a session but then touching only a few of them in any one transaction
is a very plausible usage pattern.

            regards, tom lane

Re: Performance issue on temporary relations

От

"章晨曦"

Дата:

19 августа, 18:45:03

> I do not think this is something we ought to consider.  It might help
> certain corner use-cases, but it's probably a net loss for most.
> In particular, I don't think that creating thousands of temp tables in
> a session but then touching only a few of them in any one transaction
> is a very plausible usage pattern.

Acturely, we just facing such problem in some real systems. More than 3,700
temporary tables created! I accept such case is not that common, but it does exist.

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

Re: Performance issue on temporary relations

От

"David G. Johnston"

Дата:

19 августа, 18:55:41

On Tue, Aug 19, 2025 at 8:45 AM 章晨曦 <zhangchenxi@halodbtech.com> wrote:

> I do not think this is something we ought to consider. It might help
> certain corner use-cases, but it's probably a net loss for most.
> In particular, I don't think that creating thousands of temp tables in
> a session but then touching only a few of them in any one transaction
> is a very plausible usage pattern.

Acturely, we just facing such problem in some real systems. More than 3,700
temporary tables created! I accept such case is not that common, but it does exist.

It is unfair to add a performance penalty to everyone just because some people write bad code. I concur that adding complexity to the system to gracefully handle this corner-case doesn't seem justified. A use case description, not mere existence, is needed to provide such justification.

David J.

Re: Performance issue on temporary relations

От

Tom Lane

Дата:

19 августа, 19:02:39

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Tue, Aug 19, 2025 at 8:45 AM 章晨曦 <zhangchenxi@halodbtech.com> wrote:
>> Acturely, we just facing such problem in some real systems. More than 3,700
>> temporary tables created! I accept such case is not that common, but it
>> does exist.

> It is unfair to add a performance penalty to everyone just because some
> people write bad code.  I concur that adding complexity to the system to
> gracefully handle this corner-case doesn't seem justified.  A use case
> description, not mere existence, is needed to provide such justification.

Yeah, the real sub-text here is "should we believe that this
application is well designed?".  It sounds like a fairly brute-force
solution.

            regards, tom lane

Re: Performance issue on temporary relations

От

"章晨曦"

Дата:

19 августа, 19:18:34

>> It is unfair to add a performance penalty to everyone just because some
>> people write bad code.  I concur that adding complexity to the system to
>> gracefully handle this corner-case doesn't seem justified.  A use case
>> description, not mere existence, is needed to provide such justification.

> Yeah, the real sub-text here is "should we believe that this
> application is well designed?".  It sounds like a fairly brute-force
> solution.

I agree the application is not well designed for PostgreSQL because it was migrated
from Oracle, and may not do such optimization. But back to this issue, even though
we only create 10 temporary relations, it will cause 10 truncates on every transaction.
Is that a good design?

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

Re: Performance issue on temporary relations

От

Tom Lane

Дата:

19 августа, 19:44:06

"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes:
> I agree the application is not well designed for PostgreSQL because it was migrated
> from Oracle, and may not do such optimization. But back to this issue, even though
> we only create 10 temporary relations, it will cause 10 truncates on every transaction.
> Is that a good design?

[ shrug... ]  If you create an ON COMMIT DELETE temp table, you
are explicitly asking for a truncation to happen at every commit.
I don't think you have much room to beef about the fact that one
happens.

Maybe you could merge some of these tables (adding an additional
key column, probably) so that a single truncate suffices for all?

            regards, tom lane

Re: Performance issue on temporary relations

От

"章晨曦"

Дата:

19 августа, 20:06:20

> [ shrug... ]  If you create an ON COMMIT DELETE temp table, you
> are explicitly asking for a truncation to happen at every commit.
> I don't think you have much room to beef about the fact that one
> happens.

Yes. ON COMMIT DELETE temp table will be truncated at every commit.
But if we can control that only accessed temp tables will be truncated
may be better. When an temp tables accessed, it will be stored in in_use
list, and when on commit, it will check the in_use list if a truncation needed 
to happen on that temp table, and reduce unnecessary truncations.

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

Re: Performance issue on temporary relations

От

Tom Lane

Дата:

19 августа, 20:37:01

"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes:
> Yes. ON COMMIT DELETE temp table will be truncated at every commit.
> But if we can control that only accessed temp tables will be truncated
> may be better. When an temp tables accessed, it will be stored in in_use
> list, and when on commit, it will check the in_use list if a truncation needed
> to happen on that temp table, and reduce unnecessary truncations.

The problem with this proposal is that you are ignoring the cost
of maintaining that list.  That's going to slow down every operation
on temp tables, and in common scenarios (where a truncate would have
to happen anyway) applications will get zero benefit for the extra
overhead.  So I'm not excited about adding complexity and long-term
maintenance burden to do this.

BTW, it appears to me that doing it this way is O(N^2) in the number
of active temp tables.  So it's not hard to believe that the patch
as-presented would actually be a fairly serious performance drag for
some use cases with lots of temp tables.  There are certainly ways
we could do better than that (hash table, bloom filter, etc) but
there would be even more engineering effort needed.

And it's also fair to wonder if you've found all the places where we'd
need to mark temp tables dirty, and what we'd need to do to be sure
we didn't introduce any oops-forgot-to-mark-it-dirty bugs in future.
If you search the commit log for mentions of bugs associated with
ON COMMIT DELETE, you'll find quite a few, which is one reason why
I'm allergic to adding more complexity here.

In short, I think you're underestimating the engineering costs and
overestimating the benefit of doing this.

            regards, tom lane

Re: Performance issue on temporary relations

От

"章晨曦"

Дата:

19 августа, 21:13:01

> BTW, it appears to me that doing it this way is O(N^2) in the number
> of active temp tables.  So it's not hard to believe that the patch
> as-presented would actually be a fairly serious performance drag for
> some use cases with lots of temp tables.  There are certainly ways
> we could do better than that (hash table, bloom filter, etc) but
> there would be even more engineering effort needed.

Yes, you're right. I also consider using like hash table to do more better and try
to merge the in_use list and on_commits list into one hashtable. But, as just you
said, it needs much more effort. Thanks any way.

Regards,
Jet

Halo Tech (www.halodbtech.com)
openHalo (www.openhalo.org)

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Performance issue on temporary relations

Вложения