Re: Parallel Full Hash Join
From | Melanie Plageman |
---|---|
Subject | Re: Parallel Full Hash Join |
Date | |
Msg-id | CAAKRu_ZraYTHdfNA=sGqt9J+hsoKSas5wr4PBrtmVe_tc2+qbw@mail.gmail.com |
In response to | Re: Parallel Full Hash Join (Zhihong Yu <zyu@yugabyte.com>) |
Responses |
Re: Parallel Full Hash Join
|
List | pgsql-hackers |
On Fri, Apr 2, 2021 at 3:06 PM Zhihong Yu <zyu@yugabyte.com> wrote:
>
> Hi,
> For v6-0003-Parallel-Hash-Full-Right-Outer-Join.patch
>
> + * current_chunk_idx: index in current HashMemoryChunk
>
> The above comment seems to be better fit for ExecScanHashTableForUnmatched(), instead of ExecParallelPrepHashTableForUnmatched.
> I wonder where current_chunk_idx should belong (considering the above comment and what the code does).
>
> + while (hashtable->current_chunk_idx < hashtable->current_chunk->used)
> ...
> + next = hashtable->current_chunk->next.unshared;
> + hashtable->current_chunk = next;
> + hashtable->current_chunk_idx = 0;
>
> Each time we advance to the next chunk, current_chunk_idx is reset. It seems current_chunk_idx can be placed inside chunk.
> Maybe the consideration is that, with the current formation we save space by putting current_chunk_idx field at a higher level.
> If that is the case, a comment should be added.
>

Thank you for the review. I think that moving the current_chunk_idx into
the HashMemoryChunk would probably take up too much space.

Other places that we loop through the tuples in the chunk, we are able
to just keep a local idx, like here in
ExecParallelHashIncreaseNumBuckets():

    case PHJ_GROW_BUCKETS_REINSERTING:
    ...
        while ((chunk = ExecParallelHashPopChunkQueue(hashtable, &chunk_s)))
        {
            size_t idx = 0;

            while (idx < chunk->used)

but, since we cannot do that while also emitting tuples, I thought,
let's just stash the index in the hashtable for use in serial hash join
and the batch accessor for parallel hash join. A comment to this effect
sounds good to me.
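
To illustrate the point about emitting tuples, here is a minimal, simplified sketch in C. It is not the patch's code: `Chunk`, `Table`, `prep_scan()` and `scan_next_unmatched()` are hypothetical stand-ins, and entries are fixed-size integers rather than variable-length tuples. It only shows why a scan that returns one tuple per call cannot keep its position in a local variable, and why a single cursor on the table avoids adding a field to every chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CHUNK_CAPACITY 4

/* Hypothetical stand-in for HashMemoryChunk: it carries no cursor field. */
typedef struct Chunk
{
    struct Chunk *next;
    size_t      used;                       /* number of valid entries */
    int         values[CHUNK_CAPACITY];
    bool        matched[CHUNK_CAPACITY];    /* set if the entry found a join partner */
} Chunk;

/* Hypothetical stand-in for the hash table: one cursor for the whole scan. */
typedef struct Table
{
    Chunk      *current_chunk;
    size_t      current_chunk_idx;
} Table;

/* Position the cursor at the start of the chunk list. */
static void
prep_scan(Table *t, Chunk *head)
{
    assert(t != NULL);
    t->current_chunk = head;
    t->current_chunk_idx = 0;
}

/*
 * Store the next unmatched value in *out and return true, or return false
 * once every chunk has been scanned.  Because the function returns after
 * each emitted value, the position must outlive the call, so it is kept
 * in the table instead of in a local variable.
 */
static bool
scan_next_unmatched(Table *t, int *out)
{
    while (t->current_chunk != NULL)
    {
        Chunk      *chunk = t->current_chunk;

        while (t->current_chunk_idx < chunk->used)
        {
            size_t      idx = t->current_chunk_idx++;

            if (!chunk->matched[idx])
            {
                *out = chunk->values[idx];
                return true;    /* cursor already points past idx */
            }
        }

        /* Chunk exhausted: advance and reset the index, as in the patch hunk. */
        t->current_chunk = chunk->next;
        t->current_chunk_idx = 0;
    }
    return false;
}
```

A caller would invoke `prep_scan()` once and then call `scan_next_unmatched()` repeatedly until it returns false. In this sketch the cursor costs one `size_t` per table, whereas placing it in `Chunk` would cost one per chunk.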