Re: pg_stat_statements fingerprinting logic and ArrayExpr
From | Andres Freund |
---|---|
Subject | Re: pg_stat_statements fingerprinting logic and ArrayExpr |
Date | |
Msg-id | 20131210225538.GC7730@awork2.anarazel.de |
In response to | Re: pg_stat_statements fingerprinting logic and ArrayExpr (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On 2013-12-10 17:46:56 -0500, Robert Haas wrote:
> On Tue, Dec 10, 2013 at 5:38 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-12-10 14:30:36 -0800, Peter Geoghegan wrote:
> >> Did you really find pg_stat_statements to be almost useless in such
> >> situations? That seems worse than I thought.
> >
> > It's very hard to see where you should spend efforts when every "logical
> > query" is split into hundreds of pg_stat_statements entries. Suddenly
> > it's important whether certain counts of parameters are more frequent
> > than others, because in the equally distributed cases they fall out of
> > p_s_s again pretty soon. I think that's probably a worse than average
> > case, but certainly not something only I could have the bad fortune of
> > looking at.
>
> Right, but the flip side is that you could collapse things that people
> don't want collapsed. If you've got lots of queries that differ only in
> that some of them say user_id IN (const1, const2) and others say
> user_id IN (const1, const2, const3) and the constants vary a lot, then
> of course this seems attractive.

Yea, completely agreed. It might also lead to users missing the fact
that their precious prepared-statement cache is just using up loads of
backend memory and individual prepared statements are seldom
re-executed because there are so many...

> On the other hand if you have two
> queries and one of them looks like this:
>
> WHERE status IN ('active') AND user_id = ?
>
> and the other looks like this:
>
> WHERE status IN ('inactive', 'deleted') AND user_id = ?

That too.

> Part of me wonders if the real solution here is to invent a way to
> support an arbitrarily large hash table of entries efficiently, and
> then let people do further roll-ups of the data in userland if they
> don't like our rollups. Part of the pain here is that when you
> overflow the hash table, you start losing information that can't be
> recaptured after the fact. If said hash table were by chance also
> suitable for use as part of the stats infrastructure, in place of the
> whole-file-rewrite technique we use today, massive win.
>
> Of course, even if we had all this, it wouldn't necessarily make doing
> additional rollups *easy*; it's easy to construct cases that can be
> handled much better given access to the underlying parse tree
> representation than they can be with sed and awk. But it's a thought.

That would obviously be neat, but I have roughly no clue how to achieve
that. Granular control over how such rollups would work sounds very hard
to achieve unless that granular control is just a matter of being passed
a tree and returning another.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
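To make the userland roll-up idea discussed above concrete, here is a minimal Python sketch. It assumes the query texts have already been normalized so that every constant appears as `?`, and that the rows are simple (query, calls, total_time) tuples. The function names `collapse_in_lists` and `roll_up` and the sample rows are hypothetical illustrations, not part of pg_stat_statements.

```python
import re
from collections import defaultdict

# Matches an IN list made up only of placeholders, e.g. "IN (?, ?, ?)".
_IN_LIST = re.compile(r"\bIN\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", re.IGNORECASE)


def collapse_in_lists(query):
    """Replace every placeholder-only IN list with a single 'IN (...)' token."""
    return _IN_LIST.sub("IN (...)", query)


def roll_up(entries):
    """Sum calls and total_time over entries that collapse to the same text.

    entries: iterable of (query, calls, total_time) tuples. These are
    hypothetical rows, loosely modelled on pg_stat_statements output.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for query, calls, total_time in entries:
        key = collapse_in_lists(query)
        totals[key][0] += calls
        totals[key][1] += total_time
    return {key: (calls, time) for key, (calls, time) in totals.items()}


# Made-up sample data, for illustration only.
sample = [
    ("SELECT * FROM t WHERE user_id IN (?, ?)", 10, 1.5),
    ("SELECT * FROM t WHERE user_id IN (?, ?, ?)", 5, 2.0),
    ("SELECT * FROM t WHERE status IN (?) AND user_id = ?", 7, 0.5),
    ("SELECT * FROM t WHERE status IN (?, ?) AND user_id = ?", 3, 0.25),
]
rolled = roll_up(sample)
for text, (calls, total_time) in rolled.items():
    print(calls, total_time, text)
```

The two `user_id IN` entries merge into one, which is the attractive case. The two `status IN` entries merge as well, even though they may be logically distinct queries; a text-level roll-up like this cannot tell them apart, which is the trade-off raised in the thread.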