Re: Fix for seg picksplit function
От | Yeb Havinga |
---|---|
Тема | Re: Fix for seg picksplit function |
Дата | |
Msg-id | 4CE7C0C4.6070902@gmail.com обсуждение исходный текст |
Ответ на | Re: Fix for seg picksplit function (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: Fix for seg picksplit function
|
Список | pgsql-hackers |
On 2010-11-20 04:46, Robert Haas wrote: > On Tue, Nov 16, 2010 at 6:07 AM, Alexander Korotkov > <aekorotkov@gmail.com> wrote: >> On Tue, Nov 16, 2010 at 3:07 AM, Robert Haas<robertmhaas@gmail.com> wrote: >>> But on a broader note, I'm not very certain the sorting algorithm is >>> sensible. For example, suppose you have 10 segments that are exactly >>> '0' and 20 segments that are exactly '1'. Maybe I'm misunderstanding, >>> but it seems like this will result in a 15/15 split when we almost >>> certainly want a 10/20 split. I think there will be problems in more >>> complex cases as well. The documentation says about the less-than and >>> greater-than operators that "These operators do not make a lot of >>> sense for any practical purpose but sorting." >> In order to illustrate a real problem we should think about >> gist behavior with great enough amount of data. For example, I tried to >> extrapolate this case to 100000 of segs where 40% are (0,1) segs and 60% are >> (1,2) segs. And this case doesn't seem a problem for me. > Well, the problem with just comparing on< is that it takes very > little account of the upper bounds. I think the cases where a simple > split would hurt you the most are those where examining the upper > bound is necessary to to get a good split. With the current 8K default blocksize, I put my money on the sorting algorithm for any seg case. The r-tree algorithm's performance is probably more influenced by large buckets->low tree depth->generic keys on non leaf nodes. regards, Yeb Havinga
В списке pgsql-hackers по дате отправления: