CLUSTER, reform_and_rewrite_tuple(), and parallelism
From | Peter Geoghegan |
---|---|
Subject | CLUSTER, reform_and_rewrite_tuple(), and parallelism |
Date | |
Msg-id | CAM3SWZTCU6DCgvMFzA1+=Os7NViiDM65Jkc36RCJqvp0ZEBAFw@mail.gmail.com |
Responses |
Re: CLUSTER, reform_and_rewrite_tuple(), and parallelism
Re: CLUSTER, reform_and_rewrite_tuple(), and parallelism Re: CLUSTER, reform_and_rewrite_tuple(), and parallelism |
List | pgsql-hackers |
During preliminary analysis of what it would take to produce a parallel CLUSTER patch analogous to what I came up with for CREATE INDEX, which in general seems quite possible, I identified reform_and_rewrite_tuple() as a major bottleneck for the current CLUSTER implementation.

Excluding the cost of the subsequent REINDEX of the clustered-on index, reform_and_rewrite_tuple() appears to account for roughly 25%-35% of both the cache misses and the instructions executed for my test case (this used a tuplesort, not an indexscan on the old heap relation, of course). Merging itself was far less expensive (with my optimization of how the heap is maintained during merging + 16 tapes/runs), so it would be reasonable to not parallelize that part, just as it was for parallel CREATE INDEX. I don't think that it's reasonable to not do anything about this reform_and_rewrite_tuple() bottleneck, though.

Does anyone have any ideas on how to:

1). Directly address the reform_and_rewrite_tuple() bottleneck.

and/or:

2). Push down some or all of the reform_and_rewrite_tuple() work so that it happens before tuples are passed to the tuplesort.

"2" would probably make it straightforward to have reform_and_rewrite_tuple() work occur in parallel workers instead, which buys us a lot.

--
Peter Geoghegan
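To make the shape of option 2 concrete, here is a toy sketch in Python rather than PostgreSQL C. It is not how CLUSTER or tuplesort is implemented. It assumes that the "reform" half of the work amounts to rebuilding each tuple with dropped columns set to NULL, and every name in it (`reform_tuple`, `worker_run`, `parallel_cluster`, `DROPPED_COLUMNS`) is hypothetical. Each worker reforms its share of the tuples before sorting them into a run, so the leader is left with only the serial merge.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical: positions of dropped columns in the toy table.
DROPPED_COLUMNS = {1}


def reform_tuple(raw):
    """Toy stand-in for the 'reform' step: rebuild the tuple with
    dropped columns set to None (NULL). Not the real PostgreSQL logic."""
    key, values = raw
    reformed = tuple(
        None if i in DROPPED_COLUMNS else v for i, v in enumerate(values)
    )
    return (key, reformed)


def worker_run(chunk):
    """One worker: reform first, then sort, producing one sorted run."""
    return sorted((reform_tuple(t) for t in chunk), key=lambda t: t[0])


def parallel_cluster(heap_tuples, nworkers=4):
    """Split the input among workers, then merge their sorted runs serially."""
    chunks = [heap_tuples[i::nworkers] for i in range(nworkers)]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        runs = list(pool.map(worker_run, chunks))
    # Leader: merge only; every tuple arriving here is already reformed.
    return list(heapq.merge(*runs, key=lambda t: t[0]))


if __name__ == "__main__":
    rows = [(3, ("c", "old")), (1, ("a", "old")), (2, ("b", "old"))]
    for row in parallel_cluster(rows, nworkers=2):
        print(row)
```

Threads are used here only to show the division of labor; in CPython they would not actually run the reform step in parallel, and the real design would rely on parallel worker processes. The sketch also leaves out the "rewrite" half entirely, since writing tuples into the new heap has to follow the final sorted order.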