Re: Parallel Sort
From | Tom Lane |
---|---|
Subject | Re: Parallel Sort |
Date | |
Msg-id | 3859.1368457059@sss.pgh.pa.us |
In response to | Parallel Sort (Noah Misch <noah@leadboat.com>) |
Responses |
Re: Parallel Sort
Re: Parallel Sort Re: Parallel Sort |
List | pgsql-hackers |
Noah Misch <noah@leadboat.com> writes:
> Each worker needs to make SnapshotNow visibility decisions coherent with the
> master. For sorting, this allows us to look up comparison functions, even
> when the current transaction created or modified those functions. This will
> also be an essential building block for any parallelism project that consults
> user tables. Implementing this means copying the subtransaction stack and the
> combocid hash to each worker.
> [ ... and GUC settings, and who knows what else ... ]

This approach seems to me to be likely to guarantee that the startup overhead for any parallel sort is so large that only fantastically enormous sorts will come out ahead.

I think you need to think in terms of restricting the problem space enough so that the worker startup cost can be trimmed to something reasonable. One obvious suggestion is to forbid the workers from doing any database access of their own at all --- the parent would have to do any required catalog lookups for sort functions etc. before forking the children.

I think we should also seriously think about relying on fork() and copy-on-write semantics to launch worker subprocesses, instead of explicitly copying so much state over to them. Yes, this would foreclose ever having parallel query on Windows, but that's okay with me (hm, now where did I put my asbestos longjohns ...)

Both of these lines of thought suggest that the workers should *not* be full-fledged backends.

regards, tom lane
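The fork() and copy-on-write idea above can be illustrated with a small standalone sketch. This is not PostgreSQL code and not a proposal from the thread: `fork_sort`, `merge_runs`, `sort_cmp`, and `cmp_int` are hypothetical names, and the comparator pointer chosen before the fork merely stands in for a catalog lookup done by the parent. It assumes a POSIX system that supports `MAP_ANONYMOUS`. The parent resolves everything up front, the children inherit that state through fork() without any explicit copying, sort their slice of a shared mapping, and exit without doing any lookups of their own.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for state the parent resolves once, before forking
 * (in a real backend: a comparison function found via catalog lookups). */
typedef int (*sort_cmp)(const void *, const void *);

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Merge the adjacent sorted runs [lo, mid) and [mid, hi) using scratch space. */
static void merge_runs(int *data, int *tmp, size_t lo, size_t mid, size_t hi,
                       sort_cmp cmp)
{
    size_t i = lo, j = mid, k = lo;

    while (i < mid && j < hi)
        tmp[k++] = (cmp(&data[j], &data[i]) < 0) ? data[j++] : data[i++];
    while (i < mid)
        tmp[k++] = data[i++];
    while (j < hi)
        tmp[k++] = data[j++];
    memcpy(data + lo, tmp + lo, (hi - lo) * sizeof(int));
}

/* Sort input[0..n) with up to nworkers forked children.
 * Returns 0 on success, -1 on failure (input is then left unchanged). */
int fork_sort(int *input, size_t n, int nworkers)
{
    assert(input != NULL || n == 0);
    if (n == 0)
        return 0;
    if (nworkers < 1)
        return -1;
    if ((size_t) nworkers > n)
        nworkers = (int) n;

    /* "Lookup" done once by the parent; children inherit it via fork(). */
    sort_cmp cmp = cmp_int;

    /* Shared mapping so the children's sorted output is visible to the parent. */
    int *shared = mmap(NULL, n * sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    memcpy(shared, input, n * sizeof(int));

    pid_t *pids = malloc((size_t) nworkers * sizeof(pid_t));
    int *tmp = malloc(n * sizeof(int));
    size_t chunk = (n + (size_t) nworkers - 1) / (size_t) nworkers;
    int started = 0;
    int ok = (pids != NULL && tmp != NULL);

    for (int w = 0; ok && w < nworkers; w++) {
        size_t lo = (size_t) w * chunk;
        if (lo >= n)
            break;
        size_t hi = (lo + chunk < n) ? lo + chunk : n;
        pid_t pid = fork();
        if (pid < 0) {
            ok = 0;
        } else if (pid == 0) {
            /* Worker: no lookups of its own, only the inherited comparator. */
            qsort(shared + lo, hi - lo, sizeof(int), cmp);
            _exit(0);
        } else {
            pids[started++] = pid;
        }
    }

    /* Reap every child that was started, even if a later fork() failed. */
    for (int w = 0; w < started; w++) {
        int status;
        if (waitpid(pids[w], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }

    if (ok) {
        /* Fold each worker's sorted run into the sorted prefix. */
        for (size_t mid = chunk; mid < n; mid += chunk) {
            size_t end = (mid + chunk < n) ? mid + chunk : n;
            merge_runs(shared, tmp, 0, mid, end, cmp);
        }
        memcpy(input, shared, n * sizeof(int));
    }

    free(tmp);
    free(pids);
    munmap(shared, n * sizeof(int));
    return ok ? 0 : -1;
}
```

A caller would pass an array, its length, and a worker count, for example `fork_sort(values, nvalues, 4)`. The point of the sketch is the cost model: starting a worker is one fork() call, and nothing resembling the subtransaction stack, combocid hash, or GUC settings is serialized to the child. The sequential merge in the parent is deliberately naive and would be the first thing to replace in anything beyond a demonstration.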