Re: index prefetching
From | Melanie Plageman |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | CAAKRu_ZPDhNwwFxQwS8NdeTFkycM1c=tNLKdU0J-M6KxCjdEmQ@mail.gmail.com |
In reply to | Re: index prefetching (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses | Re: index prefetching; Re: index prefetching |
List | pgsql-hackers |
On Fri, Jan 12, 2024 at 11:42 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> On 1/9/24 21:31, Robert Haas wrote:
> > On Thu, Jan 4, 2024 at 9:55 AM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >> Here's a somewhat reworked version of the patch. My initial goal was to
> >> see if it could adopt the StreamingRead API proposed in [1], but that
> >> turned out to be less straight-forward than I hoped, for two reasons:
> >
> > I guess we need Thomas or Andres or maybe Melanie to comment on this.
> >
>
> Yeah. Or maybe Thomas if he has thoughts on how to combine this with the
> streaming I/O stuff.

I've been studying your patch with the intent of finding a way to change it and/or the streaming read API so that they work together. I've attached a very rough sketch of how I think it could work.

We fill a queue with blocks from TIDs that we fetched from the index. The queue is saved in a scan descriptor that is made available to the streaming read callback. Once the queue is full, we invoke the table AM specific index_fetch_tuple() function, which calls pg_streaming_read_buffer_get_next(). When the streaming read API invokes the callback we registered, it simply dequeues a block number for prefetching. The only change to the streaming read API is that now, even if the callback returns InvalidBlockNumber, we may not be finished, so the stream must be resumable.

Structurally, this changes the timing of when the heap blocks are prefetched. Your code would get a TID from the index and then prefetch the heap block -- doing this until it filled a queue that had the actual TIDs saved in it. With my approach and the streaming read API, you fetch TIDs from the index until you've filled up a queue of block numbers. Then the streaming read API will prefetch those heap blocks.

I didn't actually implement the block queue -- I just saved a single block number and pretended it was a block queue.
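
To make the queue-and-callback flow described above concrete, here is a minimal standalone sketch in C. It is not taken from the attached patches: BlockQueue, block_queue_init(), block_queue_push() and next_block_cb() are hypothetical names, the fixed queue size is an arbitrary assumption, and BlockNumber/InvalidBlockNumber are local stand-ins for the PostgreSQL definitions. It only illustrates the idea that the callback returning InvalidBlockNumber means "nothing queued right now" rather than "end of scan", so the consumer can be resumed after the index side refills the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the PostgreSQL types; assumed for illustration only. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

#define BLOCK_QUEUE_SIZE 8		/* arbitrary size chosen for this sketch */

/* Hypothetical block queue that would live in the scan descriptor. */
typedef struct BlockQueue
{
	BlockNumber blocks[BLOCK_QUEUE_SIZE];
	int			head;			/* index of the next block to dequeue */
	int			count;			/* number of blocks currently queued */
} BlockQueue;

static void
block_queue_init(BlockQueue *q)
{
	q->head = 0;
	q->count = 0;
}

static bool
block_queue_is_full(const BlockQueue *q)
{
	return q->count == BLOCK_QUEUE_SIZE;
}

/*
 * Index side: record the heap block of a TID fetched from the index.
 * Returns false when the queue is full, which is the point at which the
 * caller would start consuming blocks through the streaming read.
 */
static bool
block_queue_push(BlockQueue *q, BlockNumber blkno)
{
	assert(blkno != InvalidBlockNumber);

	if (block_queue_is_full(q))
		return false;

	q->blocks[(q->head + q->count) % BLOCK_QUEUE_SIZE] = blkno;
	q->count++;
	return true;
}

/*
 * Streaming read side: the registered callback simply dequeues a block.
 * InvalidBlockNumber here means the queue is empty for now; the caller may
 * push more blocks and call again, which is the "resumable" behavior.
 */
static BlockNumber
next_block_cb(void *callback_private)
{
	BlockQueue *q = (BlockQueue *) callback_private;
	BlockNumber blkno;

	if (q->count == 0)
		return InvalidBlockNumber;

	blkno = q->blocks[q->head];
	q->head = (q->head + 1) % BLOCK_QUEUE_SIZE;
	q->count--;
	return blkno;
}
```

In this sketch a caller would push block numbers until block_queue_push() returns false, drain them through next_block_cb(), and then go back to the index for more TIDs after seeing InvalidBlockNumber.
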
I was imagining we replace that placeholder with something like your IndexPrefetch->blockItems -- which has light deduplication. We'd probably have to flesh it out more than that.

There are also table AM layering violations in my sketch which would have to be worked out (not to mention some resource leakage I didn't bother investigating [which causes it to fail tests]).

0001 is all of Thomas' streaming read API code that isn't yet in master, and 0002 is my rough sketch of index prefetching using the streaming read API.

There are also numerous optimizations that your index prefetching patch set does that would need to be added in some way. I haven't thought much about it yet. I wanted to see what you thought of this approach first. Basically, is it workable?

- Melanie