Re: Whether to back-patch fix for aggregate transtype width estimates

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Whether to back-patch fix for aggregate transtype width estimates
Дата
Msg-id 14124.1466266457@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Whether to back-patch fix for aggregate transtype width estimates  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Whether to back-patch fix for aggregate transtype width estimates  (Greg Stark <stark@mit.edu>)
Re: Whether to back-patch fix for aggregate transtype width estimates  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jun 17, 2016 at 10:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ordinarily I'd just summarily back-patch a fix, but that commit shipped
>> in 9.0, which means it's been broken a long time.  I'm worried that
>> back-patching a change might be more likely to destabilize plan choices
>> than to do anything anybody's happy about.

> I suspect the consequences here aren't too bad, or someone would have
> noticed by now.  So I would be tempted to leave it alone in
> back-branches.  But I might change my mind if it's actually awful...

Well, you can construct scenarios where it would cause failures.
Consider "SELECT max(varchar_col) FROM tab GROUP BY foo".  The planner
will need to estimate the size of the hash table to decide whether
hash-style aggregation is OK.  In all 8.x releases, it would use the
varchar_col's typmod (max width) to determine the per-aggregate trans
value space requirement.  In 9.x, that's broken and it falls back to
get_typavgwidth's default guess of 32 bytes.  If what you've actually
got is, say, varchar(255) and most of the entries actually approach
that length, this could result in a drastic underestimate, possibly
leading to OOM from hash table growth.

However, I can't recall many field reports that seem to match that
theory, so in practice it's probably pretty rare.  It's certainly not
going to help people who declare their wide columns as "text"
not "varchar(n)".

Thinking about this more, I wonder why we delegate to get_typavgwidth
at all; if the input is a Var seems like we should go to pg_statistic
for a column width estimate.  But that's definitely not something to
back-patch.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Questionabl description in datatype.sgml
Следующее
От: Tom Lane
Дата:
Сообщение: Re: New design for FK-based join selectivity estimation