Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

From	Claudio Freire
Subject	Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Date	April 21, 2017, 03:24:13
Msg-id	CAGTBQpZ1g62XYU=xiFZxsYyqXHe+R2D5tkH9+La+WfcykLGEhw@mail.gmail.com
In reply to	Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Robert Haas <robertmhaas@gmail.com>)
	Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Masahiko Sawada <sawada.mshk@gmail.com>)
List	pgsql-hackers

On Wed, Apr 12, 2017 at 4:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 11, 2017 at 4:38 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> In essence, the patch as it is proposed, doesn't *need* a binary
>> search, because the segment list can only grow up to 15 segments at
>> its biggest, and that's a size small enough that linear search will
>> outperform (or at least perform as well as) binary search. Reducing
>> the initial segment size wouldn't change that. If the 12GB limit is
>> lifted, or the maximum segment size reduced (from 1GB to 128MB for
>> example), however, that would change.
>>
>> I'd be more in favor of lifting the 12GB limit than of reducing the
>> maximum segment size, for the reasons above. Raising the 12GB limit
>> has concrete and readily apparent benefits, whereas using bigger (or
>> smaller) segments is far more debatable. Yes, that will need a binary
>> search. But, I was hoping that could be a second (or third) patch, to
>> keep things simple, and benefits measurable.
>
> To me, it seems a bit short-sighted to say, OK, let's use a linear
> search because there's this 12GB limit so we can limit ourselves to 15
> segments.  Because somebody will want to remove that 12GB limit, and
> then we'll have to revisit the whole thing anyway.  I think, anyway.

OK, attached is an updated patch that implements the binary search.

> What's not clear to me is how sensitive the performance of vacuum is
> to the number of cycles used here.  For a large index, the number of
> searches will presumably be quite large, so it does seem worth
> worrying about performance.  But if we just always used a binary
> search, would that lose enough performance with small numbers of
> segments that anyone would care?  If so, maybe we need to use linear
> search for small numbers of segments and switch to binary search with
> larger numbers of segments.

I just went and tested.

I implemented the hybrid search (linear for small numbers of segments,
binary otherwise) and ran a few tests at small scales, with and without
the sequential code enabled.
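
For readers following along, here is a minimal sketch of what a hybrid
segment lookup of this kind could look like. It is not the code from the
patch: DeadTupleSegment, last_tid, find_segment and the threshold of 16
are all hypothetical names and values picked for illustration, and TIDs
are reduced to plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one segment of the dead-tuple storage. */
typedef struct DeadTupleSegment
{
	uint64_t	last_tid;	/* largest TID in this segment, as an integer */
} DeadTupleSegment;

/* Assumed cutoff for illustration only; not a value from the patch. */
#define LINEAR_SEARCH_THRESHOLD 16

/*
 * Return the index of the first segment whose last_tid >= key,
 * or nsegs if key lies beyond every segment.
 * Segments must be sorted by last_tid in ascending order.
 */
static size_t
find_segment(const DeadTupleSegment *segs, size_t nsegs, uint64_t key)
{
	size_t		lo = 0;
	size_t		hi = nsegs;

	assert(segs != NULL || nsegs == 0);

	/* Few segments: a sequential scan is simple and branch-friendly. */
	if (nsegs <= LINEAR_SEARCH_THRESHOLD)
	{
		for (lo = 0; lo < nsegs; lo++)
		{
			if (segs[lo].last_tid >= key)
				break;
		}
		return lo;
	}

	/* Many segments: lower-bound binary search over [lo, hi). */
	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (segs[mid].last_tid < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

Both branches return the same answer for the same input, so the
threshold only changes how the answer is found, which is what makes the
two variants comparable in a benchmark.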

The difference is statistically significant, but small (less than 3%).
With proper optimization of the binary search, however, the difference
flips:

claudiofreire@klaumpp:~/src/postgresql.vacuum> fgrep shufp80 fullbinary.s100.times
vacuum_bench_s100.1.shufp80.log:CPU: user: 6.20 s, system: 1.42 s, elapsed: 18.34 s.
vacuum_bench_s100.2.shufp80.log:CPU: user: 6.44 s, system: 1.40 s, elapsed: 19.75 s.
vacuum_bench_s100.3.shufp80.log:CPU: user: 6.28 s, system: 1.41 s, elapsed: 18.48 s.
vacuum_bench_s100.4.shufp80.log:CPU: user: 6.39 s, system: 1.51 s, elapsed: 20.60 s.
vacuum_bench_s100.5.shufp80.log:CPU: user: 6.26 s, system: 1.42 s, elapsed: 19.16 s.

claudiofreire@klaumpp:~/src/postgresql.vacuum> fgrep shufp80 hybridbinary.s100.times
vacuum_bench_s100.1.shufp80.log:CPU: user: 6.49 s, system: 1.39 s, elapsed: 19.15 s.
vacuum_bench_s100.2.shufp80.log:CPU: user: 6.36 s, system: 1.33 s, elapsed: 18.40 s.
vacuum_bench_s100.3.shufp80.log:CPU: user: 6.36 s, system: 1.31 s, elapsed: 18.87 s.
vacuum_bench_s100.4.shufp80.log:CPU: user: 6.59 s, system: 1.35 s, elapsed: 26.43 s.
vacuum_bench_s100.5.shufp80.log:CPU: user: 6.54 s, system: 1.28 s, elapsed: 20.02 s.

That's after inlining the compare in both the binary and sequential
code, which seems to let the compiler optimize the binary search to
the point where it outperforms the sequential search.

That's not the case when the compare isn't inlined.

That seems in line with [1], which shows the impact of various
optimizations on both algorithms. It's clearly a close enough race
that optimizations play a huge role.
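
To make "inlining the compare" concrete, here is a minimal sketch of
the two shapes being compared. It assumes a simplified TupleId of
(block, offset); TupleId, tid_cmp, tid_present_bsearch and
tid_present_inlined are hypothetical names for illustration, not the
identifiers used in the patch. The only library call is the standard C
bsearch().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified tuple identifier: (block, offset). */
typedef struct TupleId
{
	uint32_t	block;
	uint16_t	offset;
} TupleId;

/* Comparison the compiler can fold directly into a calling loop. */
static inline int
tid_cmp(const TupleId *a, const TupleId *b)
{
	if (a->block != b->block)
		return (a->block < b->block) ? -1 : 1;
	if (a->offset != b->offset)
		return (a->offset < b->offset) ? -1 : 1;
	return 0;
}

/* bsearch()-compatible wrapper; reached through a function pointer. */
static int
tid_cmp_callback(const void *a, const void *b)
{
	return tid_cmp((const TupleId *) a, (const TupleId *) b);
}

/* Non-inlined shape: every probe pays for an indirect call. */
static bool
tid_present_bsearch(const TupleId *tids, size_t n, const TupleId *key)
{
	assert(key != NULL);
	return bsearch(key, tids, n, sizeof(TupleId), tid_cmp_callback) != NULL;
}

/* Inlined shape: hand-written loop with the compare visible to the compiler. */
static bool
tid_present_inlined(const TupleId *tids, size_t n, const TupleId *key)
{
	size_t		lo = 0;
	size_t		hi = n;

	assert(key != NULL);
	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;
		int			c = tid_cmp(&tids[mid], key);

		if (c == 0)
			return true;
		if (c < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return false;
}
```

Both functions give the same answers on a sorted array; whether the
second is actually faster depends on the compiler and flags, which is
the point of the measurements above.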

Since we're not likely to go and implement SSE2-optimized versions, I
believe I'll keep only the binary search. That's the attached patch
set.

I'm running the full test suite, but that takes a very long while.
I'll post the results when they're done.

[1] https://schani.wordpress.com/2010/04/30/linear-vs-binary-search/
