Обсуждение: Performance issue on temporary relations
Hi there! Recently I noticed a performance issue on temporary relation. The issue will happened on ON COMMIT DELETE temporary relations. If one session only create a few temporary relations, well, it's fine. But if one session creates plenty of ON COMMIT DELETE kind temporary relations, say 3,000, it will face a significant performance degradation issue. Check below: One temporary relation +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ psql (19devel) Type "help" for help. postgres=# CREATE LOCAL TEMP TABLE a_lttk1(n INT) ON COMMIT DELETE ROWS; CREATE TABLE postgres=# \timing Timing is on. postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 3.004 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 3.884 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 4.041 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 3.827 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 3.783 ms 3,000 temporary relation +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ psql (19devel) Type "help" for help. postgres=# DO $$ postgres$# DECLARE postgres$# v_sql VARCHAR(100); postgres$# BEGIN postgres$# FOR i IN 1..3000 LOOP postgres$# v_sql := 'CREATE LOCAL TEMP TABLE a_lttk'||i||'(n INT) ON COMMIT DELETE ROWS'; postgres$# EXECUTE v_sql; postgres$# END LOOP; postgres$# END; postgres$# $$ LANGUAGE plpgsql; DO postgres=# \timing Timing is on. postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 45.471 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 27.320 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 27.482 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 26.907 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 31.055 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 28.624 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 25.277 ms The performance has decreased by nearly 10 times. The reason is we just check if there is operation on any ON COMMIT DELETE kind temporary relations. Regardless of how many temporary tables are actually accessed, even if only one is accessed, it will do the truncate on all the temporary tables. To overcome this issue, A new list named in_use has been introduced to record the actually accessed temporary relations, and then will do the truncate only on the actually accessed temporary relations. And it seems works well. After patch: +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ psql (19devel) Type "help" for help. postgres=# DO $$ postgres$# DECLARE postgres$# v_sql VARCHAR(100); postgres$# BEGIN postgres$# FOR i IN 1..3000 LOOP postgres$# v_sql := 'CREATE LOCAL TEMP TABLE a_lttk'||i||'(n INT) ON COMMIT DELETE ROWS'; postgres$# EXECUTE v_sql; postgres$# END LOOP; postgres$# END; postgres$# $$ LANGUAGE plpgsql; DO postgres=# \timing Timing is on. postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 5.253 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 5.512 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 5.095 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 5.119 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 5.057 ms postgres=# SELECT count(*) FROM a_lttk1; count ------- 0 (1 row) Time: 5.006 ms Regards, Jet Halo Tech (www.halodbtech.com) openHalo (www.openhalo.org)
Вложения
"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes: > Recently I noticed a performance issue on temporary relation. The issue will happened on > ON COMMIT DELETE temporary relations. If one session only create a few temporary relations, > well, it's fine. But if one session creates plenty of ON COMMIT DELETE kind temporary relations, > say 3,000, it will face a significant performance degradation issue. Do you think that's supposed to be free? > To overcome this issue, A new list named in_use has been introduced to record the actually > accessed temporary relations, and then will do the truncate only on the actually accessed > temporary relations. And it seems works well. I do not think this is something we ought to consider. It might help certain corner use-cases, but it's probably a net loss for most. In particular, I don't think that creating thousands of temp tables in a session but then touching only a few of them in any one transaction is a very plausible usage pattern. regards, tom lane
> I do not think this is something we ought to consider. It might help > certain corner use-cases, but it's probably a net loss for most. > In particular, I don't think that creating thousands of temp tables in > a session but then touching only a few of them in any one transaction > is a very plausible usage pattern. Acturely, we just facing such problem in some real systems. More than 3,700 temporary tables created! I accept such case is not that common, but it does exist. Regards, Jet Halo Tech (www.halodbtech.com) openHalo (www.openhalo.org)
On Tue, Aug 19, 2025 at 8:45 AM 章晨曦 <zhangchenxi@halodbtech.com> wrote:
> I do not think this is something we ought to consider. It might help
> certain corner use-cases, but it's probably a net loss for most.
> In particular, I don't think that creating thousands of temp tables in
> a session but then touching only a few of them in any one transaction
> is a very plausible usage pattern.
Acturely, we just facing such problem in some real systems. More than 3,700
temporary tables created! I accept such case is not that common, but it does exist.
It is unfair to add a performance penalty to everyone just because some people write bad code. I concur that adding complexity to the system to gracefully handle this corner-case doesn't seem justified. A use case description, not mere existence, is needed to provide such justification.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Tue, Aug 19, 2025 at 8:45 AM 章晨曦 <zhangchenxi@halodbtech.com> wrote: >> Acturely, we just facing such problem in some real systems. More than 3,700 >> temporary tables created! I accept such case is not that common, but it >> does exist. > It is unfair to add a performance penalty to everyone just because some > people write bad code. I concur that adding complexity to the system to > gracefully handle this corner-case doesn't seem justified. A use case > description, not mere existence, is needed to provide such justification. Yeah, the real sub-text here is "should we believe that this application is well designed?". It sounds like a fairly brute-force solution. regards, tom lane
>> It is unfair to add a performance penalty to everyone just because some >> people write bad code. I concur that adding complexity to the system to >> gracefully handle this corner-case doesn't seem justified. A use case >> description, not mere existence, is needed to provide such justification. > Yeah, the real sub-text here is "should we believe that this > application is well designed?". It sounds like a fairly brute-force > solution. I agree the application is not well designed for PostgreSQL because it was migrated from Oracle, and may not do such optimization. But back to this issue, even though we only create 10 temporary relations, it will cause 10 truncates on every transaction. Is that a good design? Regards, Jet Halo Tech (www.halodbtech.com) openHalo (www.openhalo.org)
"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes: > I agree the application is not well designed for PostgreSQL because it was migrated > from Oracle, and may not do such optimization. But back to this issue, even though > we only create 10 temporary relations, it will cause 10 truncates on every transaction. > Is that a good design? [ shrug... ] If you create an ON COMMIT DELETE temp table, you are explicitly asking for a truncation to happen at every commit. I don't think you have much room to beef about the fact that one happens. Maybe you could merge some of these tables (adding an additional key column, probably) so that a single truncate suffices for all? regards, tom lane
> [ shrug... ] If you create an ON COMMIT DELETE temp table, you > are explicitly asking for a truncation to happen at every commit. > I don't think you have much room to beef about the fact that one > happens. Yes. ON COMMIT DELETE temp table will be truncated at every commit. But if we can control that only accessed temp tables will be truncated may be better. When an temp tables accessed, it will be stored in in_use list, and when on commit, it will check the in_use list if a truncation needed to happen on that temp table, and reduce unnecessary truncations. Regards, Jet Halo Tech (www.halodbtech.com) openHalo (www.openhalo.org)
"=?utf-8?B?56ug5pmo5pum?=" <zhangchenxi@halodbtech.com> writes: > Yes. ON COMMIT DELETE temp table will be truncated at every commit. > But if we can control that only accessed temp tables will be truncated > may be better. When an temp tables accessed, it will be stored in in_use > list, and when on commit, it will check the in_use list if a truncation needed > to happen on that temp table, and reduce unnecessary truncations. The problem with this proposal is that you are ignoring the cost of maintaining that list. That's going to slow down every operation on temp tables, and in common scenarios (where a truncate would have to happen anyway) applications will get zero benefit for the extra overhead. So I'm not excited about adding complexity and long-term maintenance burden to do this. BTW, it appears to me that doing it this way is O(N^2) in the number of active temp tables. So it's not hard to believe that the patch as-presented would actually be a fairly serious performance drag for some use cases with lots of temp tables. There are certainly ways we could do better than that (hash table, bloom filter, etc) but there would be even more engineering effort needed. And it's also fair to wonder if you've found all the places where we'd need to mark temp tables dirty, and what we'd need to do to be sure we didn't introduce any oops-forgot-to-mark-it-dirty bugs in future. If you search the commit log for mentions of bugs associated with ON COMMIT DELETE, you'll find quite a few, which is one reason why I'm allergic to adding more complexity here. In short, I think you're underestimating the engineering costs and overestimating the benefit of doing this. regards, tom lane
> BTW, it appears to me that doing it this way is O(N^2) in the number > of active temp tables. So it's not hard to believe that the patch > as-presented would actually be a fairly serious performance drag for > some use cases with lots of temp tables. There are certainly ways > we could do better than that (hash table, bloom filter, etc) but > there would be even more engineering effort needed. Yes, you're right. I also consider using like hash table to do more better and try to merge the in_use list and on_commits list into one hashtable. But, as just you said, it needs much more effort. Thanks any way. Regards, Jet Halo Tech (www.halodbtech.com) openHalo (www.openhalo.org)