Show a human-readable n_distinct in pg_stats view
От | Maxence Ahlouche |
---|---|
Тема | Show a human-readable n_distinct in pg_stats view |
Дата | |
Msg-id | CAJeaomWnmAvyNZaJL9dByyOmr=h=b7azVBXwusF2t7hFiE8kbg@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: Show a human-readable n_distinct in pg_stats view
Re: Show a human-readable n_distinct in pg_stats view |
Список | pgsql-hackers |
Hi, It seems to me that since the pg_stats view is supposed to be human-readable, it would make sense to show a human-readable version of n_distinct. Currently, when the stats collector estimates that the number of distinct values is more than 10% of the total row count, what is stored in pg_statistic.stadistinct is -1 * n_distinct / totalrows, the rationale being that if new rows are inserted in the table, they are likely to introduce new values, and storing that value allows the stadistinct not to get stale too fast. You can find attached a simple WIP patch to show the proper n_distinct value in pg_stats. * Is this desired? * Would it make sense to add a column in the pg_stats view to display the information "lost", that is the fact that postgres will assume that inserting new rows means a higher n_distinct? * Am I right to assume that totalrows in the code (src/backend/commands/analyze.c:2170) actually corresponds to n_live_tup? That's what I gathered from glancing at the code, but I might be wrong. * Should the catalog version be changed for this kind of change? * Should I add this patch to the commitfest? If this patch is actually desired, I'll update the documentation as well. I'm guessing this patch would break scripts relying on the pg_stats view, but I do not know how much we want to avoid that, since they should rely on the base tables rather than on the views. Thanks in advance for your input! Regards, Maxence Ahlouche
Вложения
В списке pgsql-hackers по дате отправления: