Re: Hash id in pg_stat_statements
От | Daniel Farina |
---|---|
Тема | Re: Hash id in pg_stat_statements |
Дата | |
Msg-id | CAAZKuFYMos+X6+D0TCjBRyRFyZ6ECJTkxTWiPWgMdndvkZ9a+g@mail.gmail.com обсуждение исходный текст |
Ответ на | Hash id in pg_stat_statements (Magnus Hagander <magnus@hagander.net>) |
Ответы |
Re: Hash id in pg_stat_statements
|
Список | pgsql-hackers |
On Mon, Oct 1, 2012 at 12:57 AM, Magnus Hagander <magnus@hagander.net> wrote: > Can we please expose the internal hash id of the statements in > pg_stat_statements? > > I know there was discussions about it earlier, and it wasn't done with > an argument of it not being stable between releases (IIRC). I think we > can live with that drawback, assuming of course that we document this > properly. > > I've now run into multiple customer installations where it would be > very useful to have. The usecase is mainly storing snapshots of the > pg_stat_statements output over time and analyzing those. Weird things > happen for example when the query text is the same, but the hash is > different (which can happen for example when a table is dropped and > recreated). And even without that, in order to do anything useful with > it, you end up hashing the query text anyway - so using the already > existing hash would be easier and more useful. I have a similar problem, however, I am not sure if the hash generated is ideal. Putting aside the number of mechanical, versioning, shut-down/stats files issues, etc reasons given in the main branch of the thread, I also have this feeling that it is not what I want. Consider the following case: SELECT * FROM users WHERE id = ? <this query isn't seen for a while> SELECT * FROM users WHERE id = ? In the intervening time, an equivalent hash could still be evicted and reintroduced and the statistics silently reset, and that'll befuddle principled tools. This is worse than merely less-useful, because it can lead to drastic underestimations that otherwise pass inspection. Instead, I think it makes sense to assign a number -- arbitrarily, but uniquely -- to the generation of a new row in pg_stat_statements, and, on the flip side, whenever a row is retired its number should be eliminated, practically, for-ever. This way re-introductions between two samplings of pg_stat_statements cannot be confused for a contiguously maintained statistic on a query. -- fdr
В списке pgsql-hackers по дате отправления: