Re: pg_stat_statements fingerprinting logic and ArrayExpr
От | Robert Haas |
---|---|
Тема | Re: pg_stat_statements fingerprinting logic and ArrayExpr |
Дата | |
Msg-id | CA+Tgmob=1BJp-w3DobvT50WPGoG7_CdVXbZz_LY7=rkN0f_sCQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: pg_stat_statements fingerprinting logic and ArrayExpr (Andres Freund <andres@2ndquadrant.com>) |
Ответы |
Re: pg_stat_statements fingerprinting logic and ArrayExpr
Re: pg_stat_statements fingerprinting logic and ArrayExpr Re: pg_stat_statements fingerprinting logic and ArrayExpr |
Список | pgsql-hackers |
On Tue, Dec 10, 2013 at 5:38 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-12-10 14:30:36 -0800, Peter Geoghegan wrote: >> Did you really find pg_stat_statements to be almost useless in such >> situations? That seems worse than I thought. > > It's very hard to see where you should spend efforts when every "logical > query" is split into hundreds of pg_stat_statement entries. Suddenly > it's important whether a certain counts of parameters are more frequent > than others because in the equally distributed cases they fall out of > p_s_s again pretty soon. I think that's probably a worse than average > case, but certainly not something only I could have the bad fortune of > looking at. Right, but the flip side is that you could collapse things that people don't want collapsed. If you've got lots of query that differ only in that some of them say user_id IN (const1, const2) and others say user_id IN (const1, const2, const3) and the constants vary a lot, then of course this seems attractive. On the other hand if you have two queries and one of them looks like this: WHERE status IN ('active') AND user_id = ? and the other looks like this: WHERE status IN ('inactive', 'deleted') AND user_id = ? ...it might actually annoy you to have those two things conflated; it's easy to imagine one having much different performance characteristics than the other. Part of me wonders if the real solution here is to invent a way to support an arbitrarily large hash table of entries efficiently, and then let people do further roll-ups of the data in userland if they don't like our rollups. Part of the pain here is that when you overflow the hash table, you start losing information that can't be recaptured after the fact. If said hash table were by chance also suitable for use as part of the stats infrastructure, in place of the whole-file-rewrite technique we use today, massive win. Of course, even if we had all this, it necessarily make doing additional rollups *easy*; it's easy to construct cases that can be handled much better given access to the underlying parse tree representation than they can be with sed and awk. But it's a thought. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: