Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
От | Claudio Freire |
---|---|
Тема | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem |
Дата | |
Msg-id | CAGTBQpZ1g62XYU=xiFZxsYyqXHe+R2D5tkH9+La+WfcykLGEhw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
(Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Masahiko Sawada <sawada.mshk@gmail.com>) |
Список | pgsql-hackers |
On Wed, Apr 12, 2017 at 4:35 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Apr 11, 2017 at 4:38 PM, Claudio Freire <klaussfreire@gmail.com> wrote: >> In essence, the patch as it is proposed, doesn't *need* a binary >> search, because the segment list can only grow up to 15 segments at >> its biggest, and that's a size small enough that linear search will >> outperform (or at least perform as well as) binary search. Reducing >> the initial segment size wouldn't change that. If the 12GB limit is >> lifted, or the maximum segment size reduced (from 1GB to 128MB for >> example), however, that would change. >> >> I'd be more in favor of lifting the 12GB limit than of reducing the >> maximum segment size, for the reasons above. Raising the 12GB limit >> has concrete and readily apparent benefits, whereas using bigger (or >> smaller) segments is far more debatable. Yes, that will need a binary >> search. But, I was hoping that could be a second (or third) patch, to >> keep things simple, and benefits measurable. > > To me, it seems a bit short-sighted to say, OK, let's use a linear > search because there's this 12GB limit so we can limit ourselves to 15 > segments. Because somebody will want to remove that 12GB limit, and > then we'll have to revisit the whole thing anyway. I think, anyway. Ok, attached an updated patch that implements the binary search > What's not clear to me is how sensitive the performance of vacuum is > to the number of cycles used here. For a large index, the number of > searches will presumably be quite large, so it does seem worth > worrying about performance. But if we just always used a binary > search, would that lose enough performance with small numbers of > segments that anyone would care? If so, maybe we need to use linear > search for small numbers of segments and switch to binary search with > larger numbers of segments. I just went and tested. I implemented the hybrid binary search attached, and ran a few tests with and without the sequential code enabled, at small scales. The difference is statistically significant, but small (less than 3%). With proper optimization of the binary search, however, the difference flips: claudiofreire@klaumpp:~/src/postgresql.vacuum> fgrep shufp80 fullbinary.s100.times vacuum_bench_s100.1.shufp80.log:CPU: user: 6.20 s, system: 1.42 s, elapsed: 18.34 s. vacuum_bench_s100.2.shufp80.log:CPU: user: 6.44 s, system: 1.40 s, elapsed: 19.75 s. vacuum_bench_s100.3.shufp80.log:CPU: user: 6.28 s, system: 1.41 s, elapsed: 18.48 s. vacuum_bench_s100.4.shufp80.log:CPU: user: 6.39 s, system: 1.51 s, elapsed: 20.60 s. vacuum_bench_s100.5.shufp80.log:CPU: user: 6.26 s, system: 1.42 s, elapsed: 19.16 s. claudiofreire@klaumpp:~/src/postgresql.vacuum> fgrep shufp80 hybridbinary.s100.times vacuum_bench_s100.1.shufp80.log:CPU: user: 6.49 s, system: 1.39 s, elapsed: 19.15 s. vacuum_bench_s100.2.shufp80.log:CPU: user: 6.36 s, system: 1.33 s, elapsed: 18.40 s. vacuum_bench_s100.3.shufp80.log:CPU: user: 6.36 s, system: 1.31 s, elapsed: 18.87 s. vacuum_bench_s100.4.shufp80.log:CPU: user: 6.59 s, system: 1.35 s, elapsed: 26.43 s. vacuum_bench_s100.5.shufp80.log:CPU: user: 6.54 s, system: 1.28 s, elapsed: 20.02 s. That's after inlining the compare on both the linear and sequential code, and it seems it lets the compiler optimize the binary search to the point where it outperforms the sequential search. That's not the case when the compare isn't inlined. That seems in line with [1], that show the impact of various optimizations on both algorithms. It's clearly a close enough race that optimizations play a huge role. Since we're not likely to go and implement SSE2-optimized versions, I believe I'll leave the binary search only. That's the attached patch set. I'm running the full test suite, but that takes a very long while. I'll post the results when they're done. [1] https://schani.wordpress.com/2010/04/30/linear-vs-binary-search/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
В списке pgsql-hackers по дате отправления: