Re: Bulk Inserts

From: Jeff Janes
Subject: Re: Bulk Inserts
Date:
Message-ID: f67928030909141855y2ff8993epe4d967a769cebb56@mail.gmail.com
In reply to: Bulk Inserts  (Pierre Frédéric Caillaud <lists@peufeu.com>)
Responses: Re: Bulk Inserts  (Pierre Frédéric Caillaud <lists@peufeu.com>)
List: pgsql-hackers
2009/9/14 Pierre Frédéric Caillaud <lists@peufeu.com>

I've done a little experiment with bulk inserts.

=> heap_bulk_insert()

Behaves like heap_insert except it takes an array of tuples (HeapTuple *tups, int ntups).

- Grabs a page (same as heap_insert)

- While holding exclusive lock, inserts as many tuples as it can on the page.
       - Either the page gets full
       - Or we run out of tuples.

- Generate xlog : choice between
       - Full Xlog mode :
               - if we inserted more than 10 tuples (totally bogus heuristic), log the entire page
               - Else, log individual tuples as heap_insert does

Does that heuristic change the timings much?  If not, it seems like it would be better to keep it simple and always do the same thing, such as logging the individual tuples (provided that is done under a single WALInsertLock acquisition, which I am assuming it is).
 
       - Light log mode :
               - if page was empty, only xlog a "new empty page" record, not page contents
               - else, log fully
               - heap_sync() at the end

- Release the page
- If we still have tuples to insert, repeat.
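
The loop outlined above can be sketched as a toy model. This is not PostgreSQL code: the Page class, the bulk_insert() function, the 8192-byte capacity, and the tuple-based WAL records are all assumptions made for illustration, and every batch here starts on a fresh page. Only the "more than 10 tuples" threshold comes from the outline, and only the full xlog mode is modelled.

```python
from dataclasses import dataclass, field

PAGE_CAPACITY = 8192        # assumed usable bytes per page (illustrative only)
FULL_PAGE_THRESHOLD = 10    # the "more than 10 tuples" heuristic from the outline


@dataclass
class Page:
    free: int = PAGE_CAPACITY
    tuples: list = field(default_factory=list)


def bulk_insert(tuples, full_page_threshold=FULL_PAGE_THRESHOLD):
    """Toy model of the outlined loop; returns (pages, wal_records)."""
    pages, wal = [], []
    i = 0
    while i < len(tuples):
        page = Page()                      # "grab a page" (always a fresh one here)
        start = i
        # Fill the page until it is full or we run out of tuples.
        while i < len(tuples) and len(tuples[i]) <= page.free:
            page.tuples.append(tuples[i])
            page.free -= len(tuples[i])
            i += 1
        if i == start:
            raise ValueError("tuple larger than an empty page")
        inserted = i - start
        # Full xlog mode: one page image for big batches, else per-tuple records.
        if inserted > full_page_threshold:
            wal.append(("page_image", len(pages)))
        else:
            wal.extend(("tuple", len(pages), n) for n in range(inserted))
        pages.append(page)                 # "release the page"
    return pages, wal
```

Passing a very large full_page_threshold makes the model always log individual tuples, which corresponds to the simpler always-do-the-same-thing variant.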

Am I right in assuming that :

1)
- If the page was empty,
- and log archiving isn't used,
- and the table is heap_sync()'d at the end,
=> only a "new empty page" record needs to be created, then the page can be completely filled ?

Do you even need the new empty page record?  I think a zero page will be handled correctly next time it is read into shared buffers, won't it?  But I guess it is needed to avoid problems with partial page writes that would leave the page in a state that is neither all zeros nor consistent.



2)
- If the page isn't empty
- or log archiving is used,
=> logging either the inserted tuples or the entire page is OK to guarantee persistence ?

If the entire page is logged, would it have to be marked as not removable by the log compression tool?  Or can the tool recreate the needed delta?
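
For reference, the rule being asked about in cases 1) and 2) can be written as a single predicate. This is only a restatement of the question, not a confirmed answer: wal_action() and its return strings are hypothetical names, and sending the "no heap_sync() at the end" case to full logging is a conservative assumption that the question does not spell out.

```python
def wal_action(page_was_empty: bool, archiving: bool, synced_at_end: bool) -> str:
    """Restates the proposed rule; whether it is actually safe is the open question."""
    # Case 1: empty page, no WAL archiving, and heap_sync() at the end.
    if page_was_empty and not archiving and synced_at_end:
        return "new_page_record_only"
    # Case 2 (and, by assumption, any other combination): log the inserted
    # tuples or the entire page.
    return "log_tuples_or_page"
```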
 
Jeff
