Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
От | Rushabh Lathia |
---|---|
Тема | Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) |
Дата | |
Msg-id | CAGPqQf3a79OOVX=N3eqY37aqVVw3kQdKs+RQxv8D1ga8=XM2mg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) (Thomas Munro <thomas.munro@enterprisedb.com>) |
Ответы |
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
|
Список | pgsql-hackers |
On Wed, Mar 22, 2017 at 3:19 AM, Thomas Munro <thomas.munro@enterprisedb.com > wrote:
As per the earlier discussion in the thread, I did experiment using
BufFileSet interface from parallel-hash-v18.patchset. I took the reference
of parallel-hash other patches to understand the BufFileSet APIs, and
incorporate the changes to parallel create index.
In order to achieve the same:
- Applied 0007-Remove-BufFile-s-isTemp- flag.patch and
0008-Add-BufFileSet-for- sharing-temporary-files- between-b.patch from the
parallel-hash-v18.patchset.
- Removed the buffile.c/logtap.c/fd.c changes from the parallel CREATE
INDEX v10 patch.
- incorporate the BufFileSet API to the parallel tuple sort for CREATE INDEX.
- Changes into few existing functions as well as added few to support the
BufFileSet changes.
To check the performance, I used the similar test which Peter posted in
his earlier thread. which is:
Machine: power2 machine with 512GB of RAM
Setup:
CREATE TABLE parallel_sort_test AS
SELECT hashint8(i) randint,
md5(i::text) collate "C" padding1,
md5(i::text || '2') collate "C" padding2
FROM generate_series(0, 1e9::bigint) i;
vacuum ANALYZE parallel_sort_test;
postgres=# show max_parallel_workers_per_ gather;
max_parallel_workers_per_ gather
------------------------------ ---
8
(1 row)
postgres=# show maintenance_work_mem;
maintenance_work_mem
----------------------
8GB
(1 row)
postgres=# show max_wal_size ;
max_wal_size
--------------
4GB
(1 row)
CREATE INDEX serial_idx ON parallel_sort_test (randint);
Without patch:
Time: 3430054.220 ms (57:10.054)
With patch (max_parallel_workers_ maintenance = 8):
Time: 1163445.271 ms (19:23.445)
Thanks to my colleague Thomas Munro for his help and off-line discussion
for the patch.
On Wed, Mar 22, 2017 at 10:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 21, 2017 at 3:50 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> I disagree with that. It is a
>> trade-off, I suppose. I have now run out of time to work through it
>> with you or Thomas, though.
>
> Bummer.
I'm going to experiment with refactoring the v10 parallel CREATE INDEX
patch to use the SharedBufFileSet interface from
hj-shared-buf-file-v8.patch today and see what problems I run into.
As per the earlier discussion in the thread, I did experiment using
BufFileSet interface from parallel-hash-v18.patchset. I took the reference
of parallel-hash other patches to understand the BufFileSet APIs, and
incorporate the changes to parallel create index.
In order to achieve the same:
- Applied 0007-Remove-BufFile-s-isTemp-
0008-Add-BufFileSet-for-
parallel-hash-v18.patchset.
- Removed the buffile.c/logtap.c/fd.c changes from the parallel CREATE
INDEX v10 patch.
- incorporate the BufFileSet API to the parallel tuple sort for CREATE INDEX.
- Changes into few existing functions as well as added few to support the
BufFileSet changes.
To check the performance, I used the similar test which Peter posted in
his earlier thread. which is:
Machine: power2 machine with 512GB of RAM
Setup:
CREATE TABLE parallel_sort_test AS
SELECT hashint8(i) randint,
md5(i::text) collate "C" padding1,
md5(i::text || '2') collate "C" padding2
FROM generate_series(0, 1e9::bigint) i;
vacuum ANALYZE parallel_sort_test;
postgres=# show max_parallel_workers_per_
max_parallel_workers_per_
------------------------------
8
(1 row)
postgres=# show maintenance_work_mem;
maintenance_work_mem
----------------------
8GB
(1 row)
postgres=# show max_wal_size ;
max_wal_size
--------------
4GB
(1 row)
CREATE INDEX serial_idx ON parallel_sort_test (randint);
Without patch:
Time: 3430054.220 ms (57:10.054)
With patch (max_parallel_workers_
Time: 1163445.271 ms (19:23.445)
Thanks to my colleague Thomas Munro for his help and off-line discussion
for the patch.
Attaching v11 patch and trace_sort output for the test.
Thanks,
Rushabh Lathia
Вложения
В списке pgsql-hackers по дате отправления: