Re: BUG #18141: sorry, too many clients error occurring very frequently
От | Tom Lane |
---|---|
Тема | Re: BUG #18141: sorry, too many clients error occurring very frequently |
Дата | |
Msg-id | 1860466.1696189555@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: BUG #18141: sorry, too many clients error occurring very frequently (Joe Conway <mail@joeconway.com>) |
Список | pgsql-bugs |
Joe Conway <mail@joeconway.com> writes: > On 10/1/23 14:09, Tom Lane wrote: >> Joe Conway <mail@joeconway.com> writes: >>> But it is possible that something else backpatched to both branches >>> between Aug 09, 2021 and Aug 8, 2022 has caused a more general >>> performance regression which we have yet to track down. >> Hmm. My first instinct is to wonder about changes in plan selection. >> How complex were the troublesome queries? > I think both of these cases involved a number of common attributes: > * The queries against pg_stat_statements were > relatively complex > * The other queries on the system were relatively > long and complex (and thus the query string length > in pg_stat_statements) > * Prior to the upgrade the systems were overall > keeping up, but extremely busy > In one case it seems that the upgrade caused a significant increase of > temp file usage. This impacted the system enough that other active > queries took longer, and thus number of active connections increased. > Raising work_mem eliminated the temp file usage and cpu loads dropped > back to similar levels as they were prior to the minor upgrade. Interesting. After some desultory trawling through the commit log, I'm wondering if there could be some connection to Author: Tom Lane <tgl@sss.pgh.pa.us> Branch: REL_13_STABLE Release: REL_13_5 [dede14399] 2021-09-20 11:48:52 -0400 Branch: REL_12_STABLE Release: REL_12_9 [f230614da] 2021-09-20 11:48:52 -0400 Branch: REL_11_STABLE Release: REL_11_14 [914e54501] 2021-09-20 11:48:52 -0400 Branch: REL_10_STABLE Release: REL_10_19 [923b7efc2] 2021-09-20 11:48:52 -0400 Branch: REL9_6_STABLE Release: REL9_6_24 [183b3aced] 2021-09-20 11:48:52 -0400 Don't elide casting to typmod -1. That's in the right timeframe for these reports. A fairly direct connection to pg_stat_statements could be made: addition of RelabelType nodes that weren't there before would change the query ID hash of affected statements, so that after running for a little while the pg_stat_statements hash would have a bunch of duplicate entries (assuming you didn't bump up against pg_stat_statements.max). As long as nothing happened to age out the now-dead entries with the old query hashes, you'd have more rows in the pg_stat_statements view than before, which could explain performance decreases in queries on that view. The big hole in this theory is that I would not have expected this casting change to affect any large proportion of SQL commands, so that it's a bit hard to credit it causing a lot of bloat in pg_stat_statements. Still, some specific coding habit or DDL detail could maybe allow that to happen in a particular application. Another line of thought is that the extra RelabelType nodes could block a planner optimization that used to occur before. It's not apparent why that would manifest specifically in connection with pg_stat_statements queries though. If that is the problem, we'd need an example of a query whose plan changed in order to pin down the cause. Anyway, that's just a theory, and it might be hot air. regards, tom lane
В списке pgsql-bugs по дате отправления: