Re: Inserting heap tuples in bulk in COPY
From | Heikki Linnakangas |
---|---|
Subject | Re: Inserting heap tuples in bulk in COPY |
Date | |
Msg-id | 4E8D9204.2010304@enterprisedb.com |
In response to | Re: Inserting heap tuples in bulk in COPY (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Inserting heap tuples in bulk in COPY
|
List | pgsql-hackers |
On 25.09.2011 19:01, Robert Haas wrote:
> On Wed, Sep 14, 2011 at 6:52 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>>> Why do you need new WAL replay routines? Can't you just use the existing
>>> XLOG_HEAP_NEWPAGE support?
>>>
>>> By and large, I think we should be avoiding special-purpose WAL entries
>>> as much as possible.
>>
>> I tried that, but most of the reduction in WAL-size melts away with that.
>> And if the page you're copying to is not empty, logging the whole page is
>> even more expensive. You'd need to fall back to retail inserts in that case
>> which complicates the logic.
>
> Where does it go? I understand why it'd be a problem for partially
> filled pages, but it seems like it ought to be efficient for pages
> that are initially empty.

A regular heap_insert record leaves out a lot of information that can be
deduced at replay time. It can leave out all the headers, including just
the null bitmap + data. In addition to that, there's just the location
of the tuple (RelFileNode+ItemPointer). At replay, xmin is taken from
the WAL record header.

For a multi-insert record, you don't even need to store the RelFileNode
and the block number for every tuple, just the offsets.

In comparison, a full-page image will include the full tuple header, and
also the line pointers. If I'm doing my math right, a full-page image
takes 25 bytes more data per tuple than the special-purpose
multi-insert record.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
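
The message does not show the arithmetic behind the 25-byte figure. The sketch below is one possible way to arrive at a number of that size, not necessarily the calculation the author did. The three sizes are assumptions that do not appear in the message: a 23-byte fixed tuple header, a 4-byte line pointer, and a 2-byte per-tuple offset in the multi-insert record. Alignment padding, the page header, and any per-record overhead are ignored, and the function names are hypothetical.

```python
# Assumed sizes in bytes. None of these numbers appear in the message itself.
HEAP_TUPLE_HEADER = 23  # fixed part of a heap tuple header, before the null bitmap
LINE_POINTER = 4        # one entry in the page's line pointer array
OFFSET_NUMBER = 2       # a 16-bit tuple offset stored per tuple in a multi-insert record


def full_page_image_overhead(ntuples: int) -> int:
    """Bookkeeping bytes a full-page image carries for ntuples tuples:
    every tuple's full header plus its line pointer."""
    return ntuples * (HEAP_TUPLE_HEADER + LINE_POINTER)


def multi_insert_overhead(ntuples: int) -> int:
    """Bookkeeping bytes under the assumption that the multi-insert
    record stores only an offset per tuple."""
    return ntuples * OFFSET_NUMBER


def extra_bytes_per_tuple() -> int:
    """Per-tuple difference between the two representations."""
    return (HEAP_TUPLE_HEADER + LINE_POINTER) - OFFSET_NUMBER


if __name__ == "__main__":
    print("extra bytes per tuple:", extra_bytes_per_tuple())
    print("extra bytes for 100 tuples:",
          full_page_image_overhead(100) - multi_insert_overhead(100))
```

Under these assumptions the difference grows linearly with the number of tuples on the page, which is consistent with the claim that most of the WAL-size reduction disappears when XLOG_HEAP_NEWPAGE is used instead.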