Re: slow self-join query
От | Scott Marlowe |
---|---|
Тема | Re: slow self-join query |
Дата | |
Msg-id | CAOR=d=3c2RYFCSR+TCLkuLWZ+kK5433SKqJtv_h2qdZpoER_yQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: slow self-join query (Robert Poor <rdpoor@gmail.com>) |
Ответы |
Re: slow self-join query
|
Список | pgsql-performance |
On Sun, Mar 18, 2012 at 8:37 AM, Robert Poor <rdpoor@gmail.com> wrote: > On Sat, Mar 17, 2012 at 23:12, Scott > Marlowe <scott.marlowe@gmail.com> wrote: > >> >> Also also this looks like it's the most expensive operation: >> >> Seq Scan on followings f2 (cost=0.00..93523.95 rows=5534395 width=8) >> (actual time=0.041..19365.834 rows=5535964 loops=1) >> >> I'm guessing the f2.follower_id isn't very selective? > > > Not 100% sure what you mean -- f2.follower_id is very sparse (compared to > f1.follower_id), but that's the point of this particular query. But since > upping work_mem makes it run really fast, I'm not overly concerned about > this one. Thanks for your help! Selectivity is how selective is a single value is likely to be. So if f2.follower_id has 5000 entries and there's only 2 values, it's not likely to be very selective, as most of the table will match one of two values. If it's 1M rows and 1M distinct follower_ids then it's selectivity is 1.0 because one value will get just one row ever. > One last thought: I could re-cast this as a subquery / query pair, each with > a single join. Am I correct in thinking that could make it really easy on > the planner (especially if the tables were properly indexed)? Why are you joining twice to the parent table? If you're trying to recurse without a with clause, then wouldn't you join the last table to the one before it?
В списке pgsql-performance по дате отправления: