Re: -HEAD planner issue wrt hash_joins on dbt3 ?
From | Stefan Kaltenbrunner |
---|---|
Subject | Re: -HEAD planner issue wrt hash_joins on dbt3 ? |
Date | |
Msg-id | 45068F46.9070603@kaltenbrunner.cc |
In reply to | Re: -HEAD planner issue wrt hash_joins on dbt3 ? (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> btw - the "hashjoin is bad" was more or less based on the observation
>> that nearly all of the cpu is burned in hash-related functions in the
>> profile (when profiling over a longer period of time those accumulate
>> even more % of the time than in the short profile I included in the
>> original report)
>
> [ shrug... ] Two out of the three functions you mentioned are not used
> by hash join, and anyway the other plan probably has a comparable
> execution density in sort-related functions; does that make it bad?

hmm sorry for that - I should have checked the source before I made that
assumption :-(

> It's possible that the large time for ExecScanHashBucket has something
> to do with skewed usage of the hash buckets due to an unfortunate data
> distribution, but that's theorizing far in advance of the data.

http://www.kaltenbrunner.cc/files/4/ has preliminary data from the
dbt3/scaling 10 run I did, which seems to imply we have at least 4
queries in there that take an excessive amount of time (query 5 is the
one I started the complaint with).

However, those results have to be taken with a grain of salt, since there
is an apparent bug in the dbt3 code, which seems to rely on
add_missing_from=on (as can be seen in some of the error logs of the
database), and towards the end of the throughput run I did some of the
EXPLAIN ANALYZE runs for the report (those are the small 100% spikes in
the graph, due to the box using the second CPU to run them).

I will redo those tests later this week though ...

Stefan