Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
| From | Thomas Munro |
|---|---|
| Subject | Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker |
| Date | |
| Msg-id | CA+hUKGJvi2V294+r3LEDAV+DU_YD2J_79ZVGzKMkAKb3W_mw6w@mail.gmail.com |
| In reply to | Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
| Responses | Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker |
| List | pgsql-bugs |
On Mon, Mar 4, 2024 at 12:49 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> On 3/3/24 23:12, Thomas Munro wrote:
> > But the real long term question is what strategy we're going to use
> > to actually deal with this situation properly *without* giving up
> > our memory usage policies and hoping for the best, and that remains
> > an open question.  To summarise the two main ideas put forward so
> > far: (1) allow a very high number of batches, but process at most N
> > of M batches at a time, using a temporary "all-the-rest" batch to be
> > re-partitioned to feed the next N batches + the rest in a later
> > cycle, (2) fall back to looping over batches multiple times in order
> > to keep nbatches <= a small limit while also not exceeding a hash
> > table size limit.
>
> I'm not sure about (2), but (1) sounds very much like a recursive hash
> join, where we'd allow only a limited fan out at each stage. It's also
> a bit like the "spill" files in my other proposal - that worked fine,
> and it enforced the memory limit better than the memory balancing
> (which is more like best-effort heuristics).

Yeah, I was trying to describe your spill idea in a few words.  It's a
good idea.

> > Both have some tricky edge cases, especially with parallelism in the
> > picture but probably even without it.  I'm willing to work more on
> > exploring this some time after the 17 cycle.
>
> I haven't thought very much about parallel hash joins, but what would
> be the corner cases for non-parallel cases? Sure, it may increase the
> amount of I/O, but only for cases with unexpectedly many batches (and
> then it's just a natural trade-off I/O vs. enforcing memory limit).

For parallel-specific problems I'd have to swap that back into my
brain... but I do remember one general problem with idea #1: if you
have too many duplicate keys, then anything based only on partitioning
by hash bits won't help, which was one of my objections to that idea
and the reason why I was talking about idea #2, which doesn't suffer
from that problem (but has other problems).  This morning I've been
wondering about a highly targeted solution to that: could we recognise
this condition cheaply, and then activate a special case that spills a
*bucket* to disk?  That is, a problem bucket that is indivisible by
hashing because all the hash values are the same.  Or something like
that.
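To make idea (1) a little more concrete, here is a rough, self-contained
C sketch of the "window of N batches plus an all-the-rest spool" pass
structure being described.  It is not PostgreSQL code: the names
(BATCH_WINDOW, rest spool, etc.) are invented for the sketch, it works
on bare hash values instead of tuples and temp files, and it simply
rescans the rest spool each pass rather than re-partitioning it the way
a real nodeHash.c/nodeHashjoin.c implementation would.

    /*
     * Toy sketch of idea (1): cap how many batches are materialised at
     * once by processing at most BATCH_WINDOW batches per pass and
     * deferring everything else into a single "all-the-rest" spool that
     * is revisited on later passes.  Illustrative only.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NTUPLES       100000
    #define NBATCH        1024   /* total logical batches (power of two) */
    #define BATCH_WINDOW  8      /* batches actually built per pass */

    static uint32_t
    batchno(uint32_t hash)
    {
        return hash & (NBATCH - 1);
    }

    int
    main(void)
    {
        /* fake hash values standing in for hashed join keys */
        uint32_t   *rest = malloc(NTUPLES * sizeof(uint32_t));
        size_t      nrest = NTUPLES;

        for (size_t i = 0; i < nrest; i++)
            rest[i] = (uint32_t) rand();

        for (uint32_t base = 0; base < NBATCH; base += BATCH_WINDOW)
        {
            size_t      kept = 0;
            size_t      processed = 0;

            for (size_t i = 0; i < nrest; i++)
            {
                uint32_t    b = batchno(rest[i]);

                if (b >= base && b < base + BATCH_WINDOW)
                    processed++;        /* would go into one of the
                                         * BATCH_WINDOW in-memory hash
                                         * tables for this pass */
                else
                    rest[kept++] = rest[i]; /* defer to a later pass */
            }
            printf("pass for batches %u..%u: processed %zu, deferred %zu\n",
                   base, base + BATCH_WINDOW - 1, processed, kept);
            nrest = kept;
        }
        free(rest);
        return 0;
    }

The sketch also makes the duplicate-key weakness visible: if every
tuple produced the same hash value, batchno() would send them all to
one batch, so no amount of re-partitioning by hash bits shrinks it --
which is the case idea #2, or the bucket-spill special case above, is
meant to cover.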