Re: An idea for parallelizing COPY within one backend
From | Florian G. Pflug
---|---
Subject | Re: An idea for parallelizing COPY within one backend
Date |
Msg-id | 47C597E6.5060609@phlo.org
In response to | Re: An idea for parallelizing COPY within one backend (Andrew Dunstan <andrew@dunslane.net>)
Responses | Re: An idea for parallelizing COPY within one backend
List | pgsql-hackers
Andrew Dunstan wrote:
> Florian G. Pflug wrote:
>>> Would it be possible to determine when the copy is starting that this
>>> case holds, and not use the parallel parsing idea in those cases?
>>
>> In theory, yes. In practice, I don't want to be the one who has to
>> answer to an angry user who just suffered a major drop in COPY
>> performance after adding an ENUM column to his table.
>>
> I am yet to be convinced that this is even theoretically a good path to
> follow. Any sufficiently large table could probably be partitioned and
> then we could use the parallelism that is being discussed for pg_restore
> without any modification to the backend at all. Similar tricks could be
> played by an external bulk loader for third party data sources.

That assumes that some specific bulk loader like pg_restore, pgloader or
similar is used to perform the load. Plain libpq users would either need to
duplicate the logic these loaders contain, or wouldn't be able to take
advantage of fast loads.

Plus, I'd see this as a kind of testbed for gently introducing parallelism
into postgres backends (especially thinking about sorting here). CPUs gain
more and more cores, so in the long run I fear that we will have to find
ways to utilize more than one of them to execute a single query.

But of course the architectural details need to be sorted out before any
credible judgement about the feasibility of this idea can be made...

regards, Florian Pflug
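[Editor's note: the following sketch is not from the original thread. It illustrates the client-side partitioning approach Andrew describes, and the loader logic Florian says plain libpq users would have to duplicate: split the input into row-aligned chunks and run one COPY FROM STDIN per connection, keeping all parallelism outside the backend. It assumes a psycopg2 client; the connection string, the target table "lineitem", and the input file "lineitem.tsv" are hypothetical.]

```python
# A minimal sketch of client-side parallel COPY: each worker gets its own
# connection and loads one chunk of rows with COPY FROM STDIN, so the
# backend itself needs no parallelism support.
import io
from concurrent.futures import ThreadPoolExecutor

import psycopg2

DSN = "dbname=test"   # hypothetical connection string
N_WORKERS = 4

def copy_chunk(lines):
    """Load one chunk of rows over a dedicated connection."""
    conn = psycopg2.connect(DSN)
    try:
        with conn.cursor() as cur:
            cur.copy_expert("COPY lineitem FROM STDIN",
                            io.StringIO("".join(lines)))
        conn.commit()
    finally:
        conn.close()

def chunked(path, n_chunks):
    """Split the input file into row-aligned chunks (one row per line)."""
    with open(path) as f:
        lines = f.readlines()
    size = (len(lines) + n_chunks - 1) // n_chunks
    return [lines[i:i + size] for i in range(0, len(lines), size)]

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
        list(pool.map(copy_chunk, chunked("lineitem.tsv", N_WORKERS)))
```

[Editor's note: the chunks land in the table in no guaranteed order, and each chunk commits independently; a real loader would also have to handle quoting, encodings, and error recovery, which is part of the logic Florian argues plain libpq users should not have to reimplement.]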