Re: Idea for improving speed of pg_restore
От | Christopher Browne |
---|---|
Тема | Re: Idea for improving speed of pg_restore |
Дата | |
Msg-id | m3wuc9saqn.fsf@wolfe.cbbrowne.com обсуждение исходный текст |
Ответ на | Idea for improving speed of pg_restore (Ron Johnson <ron.l.johnson@cox.net>) |
Ответы |
Re: Idea for improving speed of pg_restore
Re: Idea for improving speed of pg_restore |
Список | pgsql-general |
A long time ago, in a galaxy far, far away, ron.l.johnson@cox.net (Ron Johnson) wrote: > Pardon if this has already been suggested and shot down as impossible > or too difficult, etc. None of this is forcibly the problem. The _big_ problem is that this isn't necessarily going to improve the speed of the would-be restore. I have brought servers to their knees by trying to restore multiple databases at once. The reason why this isn't likely to work is that the "multitasking" that you need to maximize is not of the CPUs, but rather of the disk drives. Restoring a database involves, for each table: 1. Reading table data from the source file; 2. Writing data to the database file for the table; 3. After that, reading the database file data, and 4. Writing the sorted bits to the index file. 5. Along with all of this, HEFTY amounts of updates to WAL. Ideal performance for this involves having 1. and 2. take place concurrently, where a sequential set of reads from one disk lead to more-or-less sequential writes to the "database drive" and to the "WAL drive." Multitask that and it is easy for the I/O patterns to go bad, where one process is trying to sequence writes to one table whilst another is writing to another, so you thereby have the disk seeking back and forth between the one and the other. There might well be some merit to having the capability to work on several indices on the same table all at once, so that after reading through the table once, _three_ indices are generated, mostly in memory, and then consecutively spilled to disk. But it may be just as well to let the shared memory cache do the work so that the table gets read in once, to generate the first index, and then is already in cache, which will surely help performance for generating the subsequent indices. That's a longwinded way of saying that doing multiple concurrent mass data loads sucks pretty bad. -- "aa454","@","freenet.carleton.ca" http://www3.sympatico.ca/cbbrowne/spiritual.html "If it can't be abused, it's not freedom. A man may be in as just possession of truth as of a city, and yet be forced to surrender." -- Thomas Browne
В списке pgsql-general по дате отправления: