Re: Bad optimizer data for xml (WAS: xml data type implications of no =)
От | Mark Kirkwood |
---|---|
Тема | Re: Bad optimizer data for xml (WAS: xml data type implications of no =) |
Дата | |
Msg-id | 4C0837E0.7060706@catalyst.net.nz обсуждение исходный текст |
Ответ на | Re: xml data type implications of no = (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>) |
Ответы |
Re: Bad optimizer data for xml (WAS: xml data type
implications of no =)
|
Список | pgsql-bugs |
On 27/05/10 13:37, Mark Kirkwood wrote: > On 25/05/10 16:43, Mark Kirkwood wrote: >> Today I ran into some interesting consequences of the xml data type >> being without an "=" operator. One I thought I'd post here because it >> has a *possible* planner impact. I'm not sure it is actually a bug as >> such, but this seemed the best forum to post in initially: >> >> test=# \d bug >> Table "public.bug" >> Column | Type | Modifiers >> --------+---------+----------- >> id | integer | >> val | xml | >> >> test=# explain select val::text from bug; >> QUERY PLAN >> -------------------------------------------------------------- >> Seq Scan on bug (cost=0.00..58127.78 rows=1000278 width=32) >> >> >> Note the width estimate. However a more realistic estimate for width is: >> >> test=# select 8192/(reltuples/relpages) as width from pg_class where >> relname='bug'; >> width >> ------------------ >> 394.130431739976 >> >> So we are going to massively underestimate the "size" of such a >> dataset. Now this appears to be a consequence of no "=" operator >> (std_typanalyze in analyze.c bails if there isn't one), so the >> planner has no idea about how wide 'val' actually is. I'm wondering >> if it is worth having at least an "=" operator to enable some minimal >> stats to be available for xml columns. >> > > Adding a minimal = op (see attached) and an analyze results in: > > test=# explain select val::text from bug; > QUERY PLAN > --------------------------------------------------------------- > Seq Scan on bug (cost=0.00..62632.08 rows=1000008 width=385) > > which gives a much better indication of dataset size. > > > Maybe I gave this guy a bad title - is it a concern that the 'width' estimate is so far off for xml datatypes (because of no = op)? It seemed to me that this could result in some bad plan choices (e.g in subqueries etc). regards Mark
В списке pgsql-bugs по дате отправления: