Re: Parallel copy
From | Robert Haas
Subject | Re: Parallel copy
Msg-id | CA+TgmoZw+F3y+oaxEsHEZBxdL1x1KAJ7pRMNgCqX0WjmjGNLrA@mail.gmail.com
In reply to | Re: Parallel copy (Andres Freund <andres@anarazel.de>)
Responses | Re: Parallel copy
List | pgsql-hackers
On Thu, Apr 9, 2020 at 2:55 PM Andres Freund <andres@anarazel.de> wrote:
> I'm fairly certain that we do *not* want to distribute input data
> between processes on a single tuple basis. Probably not even below a
> few hundred kb. If there's any sort of natural clustering in the
> loaded data - extremely common, think timestamps - splitting on a
> granular basis will make indexing much more expensive. And have a lot
> more contention.

That's a fair point. I think the solution ought to be that once any process starts finding line endings, it continues until it's grabbed at least a certain amount of data for itself. Then it stops and lets some other process grab a chunk of data. Or are you arguing that there should be only one process that's allowed to find line endings for the entire duration of the load?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
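[Editor's note: a minimal sketch, not part of the original thread, of the chunking idea Robert describes above: a process that starts finding line endings keeps scanning until it has claimed at least a minimum amount of data, so tuples are never split and work is handed out in large line-aligned chunks rather than per tuple. The function name, the `min_chunk` parameter, and the in-memory buffer are all illustrative assumptions; the real patch would operate on a shared input stream inside PostgreSQL, in C.]

```python
def split_at_line_boundaries(data: bytes, min_chunk: int) -> list[bytes]:
    """Illustrative sketch (not the actual patch): split `data` into
    chunks of at least `min_chunk` bytes, each ending on a newline, so
    no tuple straddles two workers' chunks."""
    chunks = []
    start = 0
    while start < len(data):
        # Tentatively claim at least min_chunk bytes for this worker...
        end = start + min_chunk
        if end >= len(data):
            chunks.append(data[start:])
            break
        # ...then keep scanning to the next line ending before stopping,
        # so the chunk boundary never falls mid-tuple.
        nl = data.find(b"\n", end)
        if nl == -1:
            chunks.append(data[start:])
            break
        chunks.append(data[start:nl + 1])
        start = nl + 1
    return chunks
```

With `min_chunk` set to a few hundred kilobytes, as suggested upthread, naturally clustered data (e.g. timestamp-ordered rows) stays together within a chunk, reducing index contention compared with tuple-at-a-time distribution.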