Re: 9.3 Pre-proposal: Range Merge Join
От | Jeff Davis |
---|---|
Тема | Re: 9.3 Pre-proposal: Range Merge Join |
Дата | |
Msg-id | 1334823058.5487.84.camel@jdavis обсуждение исходный текст |
Ответ на | Re: 9.3 Pre-proposal: Range Merge Join (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: 9.3 Pre-proposal: Range Merge Join
|
Список | pgsql-hackers |
On Wed, 2012-04-18 at 01:21 -0400, Tom Lane wrote: > It would be a pretty weird implementation of mergejoin that could > "discard tuples from the middle" of an input stream. Or to be more > specific, it wouldn't be the mergejoin itself that could do that at all > --- you'd need the input plan node to be some sort of tweaked version of > tuplestore or tuplesort that could respond to a request like that. As I said in my reply to Robert, I think there are some ways we can make this idea work. > I can't escape the feeling that Jeff has chosen the wrong basis for his > thought experiment, and that what he really ought to be thinking about > is hashjoin, which keeps an in-memory table that it could easily modify > on the fly if it chose to. The multi-batch aspect of hybrid hashjoin > could be useful too (IOW, when you're out of better ideas, throw the > tuple back in the water to process later). Obviously hashing is not going to be much use for anything but equality. So I believe this approach is very similar to the temporary-index method, except with batching, and always keeping the index in memory. I don't think we would get the partitioning benefit of hashjoin, because we'd have to put the same tuple in multiple partitions, so it's probably better to just leave the outer side intact. But in-memory indexes and multiple passes of the outer seems like a reasonable alternative, particularly because an in-memory index might be very fast (to build as well as query). > This is just handwaving of course. I think some digging in the > spatial-join literature would likely find ideas better than any of > these. I will look in some more detail. The merge-like approach did seem to be represented in the paper referenced by Alexander (the external plane sweep), but it also refers to several methods based on partitioning. I'm beginning to think that more than one of these ideas has merit. Regards,Jeff Davis
В списке pgsql-hackers по дате отправления: