Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller
| From | Robert Haas |
|---|---|
| Subject | Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller |
| Date | |
| Msg-id | CA+TgmoZWzuX-KgKCa7hH0p2Fq9EZ6FMx5gX6z1NnnzwGFQ+sFg@mail.gmail.com |
| In response to | Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller (Tom Lane <tgl@sss.pgh.pa.us>) |
| List | pgsql-hackers |
On Tue, Jun 10, 2014 at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jun 10, 2014 at 10:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I don't really recall any hard numbers being provided.  I think if we
>>> looked at some results that said "here's the average gain, and here's
>>> the worst-case loss, and here's an estimate of how often you'd hit
>>> the worst case", then we could make a decision.
>
>> The worst case loss is that you have to rescan the entire inner
>> relation, so it's pretty darned bad.  I'm not sure how to construct an
>> optimal worst case for that being monumentally expensive, but making
>> the inner relation gigantic is probably a good start.
>
> "Rescan"?  I'm pretty sure there are no cases where nodeHash reads the
> inner relation more than once.  If you mean dumping to disk vs not dumping
> to disk, yeah, that's a big increment in the cost.

Sorry, that's what I meant.

>> If we could allow NTUP_PER_BUCKET to drop when the hashtable is
>> expected to fit in memory either way, perhaps with some safety margin
>> (e.g. we expect to use less than 75% of work_mem), I bet that would
>> make the people who have been complaining about this issue happy.
>
> Could be a good plan.  We still need some test results though.

Sounds reasonable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
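For readers following the thread, here is a minimal standalone sketch of the heuristic being proposed: keep the historical NTUP_PER_BUCKET of 10 by default, but use a smaller tuples-per-bucket target when the estimated hashtable would still stay under a 75% safety margin of work_mem. This is not PostgreSQL source; the lower target of 1, the 8-byte-per-bucket overhead, the work_mem value, and the size estimate are all illustrative assumptions.

```c
/*
 * Hypothetical sketch (not nodeHash.c): pick a tuples-per-bucket target
 * for a hash join, dropping below the historical NTUP_PER_BUCKET of 10
 * only when the estimated hashtable still fits comfortably in work_mem,
 * i.e. under the 75% safety margin mentioned in the thread.
 */
#include <stdio.h>

#define NTUP_PER_BUCKET      10    /* historical default */
#define NTUP_PER_BUCKET_LOW   1    /* hypothetical denser-bucket target */
#define WORK_MEM_KB        4096    /* stand-in for the work_mem setting */

/*
 * Estimate hashtable size in kilobytes for a given bucket count:
 * tuple storage plus an assumed 8 bytes of header per bucket.
 */
static double
est_hashtable_kb(double ntuples, double tuple_width, long nbuckets)
{
    return (ntuples * tuple_width + (double) nbuckets * 8.0) / 1024.0;
}

/*
 * Choose tuples-per-bucket: prefer the low value (fewer collisions,
 * cheaper probes) if the resulting table is still expected to use
 * less than 75% of work_mem; otherwise fall back to the default.
 */
static int
choose_ntup_per_bucket(double ntuples, double tuple_width)
{
    long   nbuckets_low = (long) (ntuples / NTUP_PER_BUCKET_LOW) + 1;
    double size_kb = est_hashtable_kb(ntuples, tuple_width, nbuckets_low);

    if (size_kb < 0.75 * WORK_MEM_KB)
        return NTUP_PER_BUCKET_LOW;
    return NTUP_PER_BUCKET;
}

int
main(void)
{
    /* Small inner relation: comfortably under the 75% margin. */
    printf("50k x 40B tuples -> NTUP_PER_BUCKET = %d\n",
           choose_ntup_per_bucket(50000.0, 40.0));

    /* Large inner relation: stick with the conservative default. */
    printf("10M x 40B tuples -> NTUP_PER_BUCKET = %d\n",
           choose_ntup_per_bucket(10000000.0, 40.0));
    return 0;
}
```

The point of the margin is the worst case discussed above: a denser bucket layout costs extra memory, and blowing past work_mem means dumping batches to disk, so the lower target is only attractive when the estimate says the table fits either way.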