Re: Performance on inserts

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Performance on inserts
Дата
Msg-id 9612.967244422@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Performance on inserts  (Alfred Perlstein <bright@wintelcom.net>)
Ответы Re: Performance on inserts  (Jules Bean <jules@jellybean.co.uk>)
Список pgsql-hackers
Alfred Perlstein <bright@wintelcom.net> writes:
> A slightly incorrect startscan point is better than starting
> from the beginning every time.

Not for lookups, it isn't ;-).

I did think about making the btree-descending code act differently
for inserts than lookups, but AFAICS that wouldn't buy anything for
this problem, since until you get down to the leaf level you can't
see where there's free space anyway.  You'd simply be exchanging the
problem of missing free space that's too far to the right for the
problem of missing free space that's to the left of where you chose
to start looking.

The algorithm does seem to work quite nicely just the way I described
it, although it turns out I was off base about a good probability
setting.  I find that something up around 0.99 seems to be good.
Using the same (perhaps overly simplistic) test case:

# tuples inserted        6.5        current+random hack @ 0.99        Time    index size    Time    index size
1536            <1sec    90112        <1sec    106496
3072            1.56    163840        <1sec    188416
6144            3.70    286720        1.40    376832
12288            9.73    532480        2.65    688128
24576            93.26    1024000        5.22    1368064
49152            363.23    2007040        10.34    2727936
98304                        22.07    5545984
196608                        45.60    11141120
393216                        92.53    22290432

I tried probabilities from 0.67 to 0.999 and found that runtimes didn't
vary a whole lot (though this is near the minimum), while index size
consistently got larger as the probability of moving right decreased.
The runtime is nicely linear throughout the range.

The index size increase might look like a bit of a jump, but actually
this is right where we want it to be.  The old code was effectively
packing each page as full as it could be under these conditions.  That's
not what you want for a btree.  Steady-state load factor for btrees is
usually quoted as somewhere around 70%, and this method manages to
approximate that pretty well with a move-right probability of 0.99 or
a bit less.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Vince Vielhaber
Дата:
Сообщение: Pure ODBMS (fwd)
Следующее
От: Thomas Lockhart
Дата:
Сообщение: Re: Performance on inserts