Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
From | Thomas Munro
Subject | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Msg-id | CA+hUKGL0T7-Lsbp-dda5JWubAXEHMkL7j94-XW9ZhXDb4sn2+A@mail.gmail.com
In reply to | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash (James Coleman <jtc331@gmail.com>)
Responses | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
List | pgsql-bugs
On Sun, Nov 10, 2019 at 3:25 PM James Coleman <jtc331@gmail.com> wrote:
> So I should have run the earlier attached plan with VERBOSE, but
> here's the interesting thing: the parallel hash node's seq scan node
> outputs two columns: let's call them (from the redacted plan)
> items.system_id and items.account_id. The first (system_id) is both
> not null and unique; the second (account_id) definitely has massive
> skew. I'm not very up-to-speed on how the hash building works, but I
> would have (perhaps naïvely?) assumed that the first column being
> unique would make the hash keys very likely not to collide in any
> significantly skewed way. Am I missing something here?

Hrm. So the compound key is unique then. I was assuming up until now
that it had duplicates. The hashes of the individual keys are combined
(see ExecHashGetHashValue()), so assuming there is nothing funky about
the way citext gets hashed (and it's not jumping out at me), your
unique keys should give you uniform hash values and thus uniform
partition sizes, and repartitioning should be an effective way of
reducing hash table size.

So now it sounds like you have a simple case of underestimation, but
now I'm confused about how you got a 344MB hash table with work_mem =
150MB:

  Buckets: 4194304 (originally 4194304)  Batches: 32768 (originally 4096)  Memory Usage: 344448kB

And I'm confused about what was different when it wanted the crazy
number of batches.
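[Editor's note: a minimal sketch, not PostgreSQL source, of the behavior described above. ExecHashGetHashValue rotates the accumulated hash left one bit and XORs in each key column's hash; ExecHashGetBucketAndBatch takes the low bits for the bucket and the next bits for the batch. The column names, row counts, and the CRC32 stand-in hash function are assumptions for illustration only.]

```python
import zlib

def combine_hashes(column_hashes):
    """Rotate-left-1 then XOR per column, mirroring ExecHashGetHashValue."""
    hashkey = 0
    for h in column_hashes:
        hashkey = ((hashkey << 1) | (hashkey >> 31)) & 0xFFFFFFFF  # rotl 1
        hashkey ^= h & 0xFFFFFFFF
    return hashkey

def bucket_and_batch(hashvalue, log2_nbuckets, nbatch):
    """Low bits pick the bucket, next bits the batch (ExecHashGetBucketAndBatch)."""
    bucket = hashvalue & ((1 << log2_nbuckets) - 1)
    batch = (hashvalue >> log2_nbuckets) & (nbatch - 1) if nbatch > 1 else 0
    return bucket, batch

# Simulate a unique first key column with a massively skewed second one:
# because system_id never repeats, the combined hashes are uniform, so the
# batches stay balanced even though account_id is heavily skewed.
nbatch = 8
counts = [0] * nbatch
for system_id in range(50_000):
    account_id = 42 if system_id % 2 else system_id % 10  # heavy skew on 42
    h = combine_hashes([
        zlib.crc32(f"sys:{system_id}".encode()),    # stand-in column hash
        zlib.crc32(f"acct:{account_id}".encode()),  # stand-in column hash
    ])
    _, batch = bucket_and_batch(h, 22, nbatch)  # 2^22 = 4194304 buckets
    counts[batch] += 1

print(counts)  # roughly equal per-batch counts despite the skew
```

This is why repartitioning works for James's data: doubling nbatch halves each batch, which would not happen if all the rows hashed to the same value.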