Re: TB-sized databases
From | Matthew |
---|---|
Subject | Re: TB-sized databases |
Date | |
Msg-id | Pine.LNX.4.58.0712061758360.3731@aragorn.flymine.org |
In reply to | Re: TB-sized databases (Tom Lane <tgl@sss.pgh.pa.us>) |
Replies | Re: TB-sized databases |
List | pgsql-performance |
On Thu, 6 Dec 2007, Tom Lane wrote:
> Matthew <matthew@flymine.org> writes:
> > ... For this query, Postgres would perform a nested loop,
> > iterating over all rows in the small table, and doing a hundred index
> > lookups in the big table. This completed very quickly. However, adding the
> > LIMIT meant that suddenly a merge join was very attractive to the planner,
> > as it estimated the first row to be returned within milliseconds, without
> > needing to sort either table.
>
> > The problem is that Postgres didn't know that the first hit in the big
> > table would be about half-way through, after doing a index sequential scan
> > for half a bazillion rows.
>
> Hmm. IIRC, there are smarts in there about whether a mergejoin can
> terminate early because of disparate ranges of the two join variables.
> Seems like it should be straightforward to fix it to also consider
> whether the time-to-return-first-row will be bloated because of
> disparate ranges. I'll take a look --- but it's probably too late
> to consider this for 8.3.

Very cool. Would that be a planner cost estimate fix (so it avoids the
merge join), or a query execution fix (so it does the merge join on the
table subset)?

Matthew

--
I've run DOOM more in the last few days than I have the last few
months. I just love debugging ;-) -- Linus Torvalds
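
The effect described in the quoted text can be illustrated with a small sketch. This is not PostgreSQL's executor or planner code: the table sizes, the key ranges, and the function names `merge_join_first_match` and `nested_loop_first_match` are all made up for illustration, and a binary search over a sorted list stands in for an index lookup. It only counts how much work each strategy does before it can return its first row when the small table's keys begin half-way through the big table.

```python
from bisect import bisect_left


def merge_join_first_match(small, big):
    """Walk two sorted key lists in lockstep, as a merge join would.

    Returns (first matching key, number of big-list rows read so far).
    """
    i = j = 0
    while i < len(small) and j < len(big):
        if small[i] == big[j]:
            return small[i], j + 1
        if small[i] < big[j]:
            i += 1
        else:
            j += 1
    return None, j


def nested_loop_first_match(small, big):
    """Probe the sorted big list once per small row.

    The binary search is a stand-in for an index lookup (an assumption of
    this sketch). Returns (first matching key, number of probes made).
    """
    for probes, key in enumerate(small, start=1):
        pos = bisect_left(big, key)
        if pos < len(big) and big[pos] == key:
            return key, probes
    return None, len(small)


# Hypothetical data: a big table with keys 0..999999 and a small table
# of a hundred keys that start half-way through the big table's range.
big = list(range(1_000_000))
small = list(range(500_000, 500_100))

merge_key, big_rows_read = merge_join_first_match(small, big)
loop_key, index_probes = nested_loop_first_match(small, big)

print("merge join: big rows read before first match =", big_rows_read)
print("nested loop: index probes before first match =", index_probes)
```

With these invented sizes the merge walk reads 500,001 rows of the big list before its first match, while the probe-based loop finds one after a single lookup. A first-row cost estimate that ignores the disparate key ranges would miss exactly this difference.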