Re: Parallel Full Hash Join
От | Melanie Plageman |
---|---|
Тема | Re: Parallel Full Hash Join |
Дата | |
Msg-id | CAAKRu_YX8PdxHbqV4wEk6v-ivMTwN+tcYG8AtTe3HT3gz=ewmA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Parallel Full Hash Join (Thomas Munro <thomas.munro@gmail.com>) |
Ответы |
Re: Parallel Full Hash Join
|
Список | pgsql-hackers |
On Fri, Nov 26, 2021 at 3:11 PM Thomas Munro <thomas.munro@gmail.com> wrote: > > On Sun, Nov 21, 2021 at 4:48 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Wed, Nov 17, 2021 at 01:45:06PM -0500, Melanie Plageman wrote: > > > Yes, this looks like that issue. > > > > > > I've attached a v8 set with the fix I suggested in [1] included. > > > (I added it to 0001). > > > > This is still crashing :( > > https://cirrus-ci.com/task/6738329224871936 > > https://cirrus-ci.com/task/4895130286030848 > > I added a core file backtrace to cfbot's CI recipe a few days ago, so > now we have: > > https://cirrus-ci.com/task/5676480098205696 > > #3 0x00000000009cf57e in ExceptionalCondition (conditionName=0x29cae8 > "BarrierParticipants(&accessor->shared->batch_barrier) == 1", > errorType=<optimized out>, fileName=0x2ae561 "nodeHash.c", > lineNumber=lineNumber@entry=2224) at assert.c:69 > No locals. > #4 0x000000000071575e in ExecParallelScanHashTableForUnmatched > (hjstate=hjstate@entry=0x80a60a3c8, > econtext=econtext@entry=0x80a60ae98) at nodeHash.c:2224 I believe this assert can be safely removed. It is possible for a worker to attach to the batch barrier after the "last" worker was elected to scan and emit unmatched inner tuples. This is safe because the batch barrier is already in phase PHJ_BATCH_SCAN and this newly attached worker will simply detach from the batch barrier and look for a new batch to work on. The order of events would be as follows: W1: advances batch to PHJ_BATCH_SCAN W2: attaches to batch barrier in ExecParallelHashJoinNewBatch() W1: calls ExecParallelScanHashTableForUnmatched() (2 workers attached to barrier at this point) W2: detaches from the batch barrier The attached v10 patch removes this assert and updates the comment in ExecParallelScanHashTableForUnmatched(). I'm not sure if I should add more detail about this scenario in ExecParallelHashJoinNewBatch() under PHJ_BATCH_SCAN or if the detail in ExecParallelScanHashTableForUnmatched() is sufficient. - Melanie
Вложения
В списке pgsql-hackers по дате отправления: