Re: COPY with hints, rebirth
From | Heikki Linnakangas |
---|---|
Subject | Re: COPY with hints, rebirth |
Date | |
Msg-id | 4F4A851E.3080501@enterprisedb.com |
In reply to | COPY with hints, rebirth (Simon Riggs <simon@2ndQuadrant.com>) |
Responses | Re: COPY with hints, rebirth |
List | pgsql-hackers |

On 24.02.2012 22:55, Simon Riggs wrote:
> A long time ago, in a galaxy far away, we discussed ways to speed up
> data loads/COPY.
> http://archives.postgresql.org/pgsql-hackers/2007-01/msg00470.php
>
> In particular, the idea that we could mark tuples as committed while
> we are still loading them, to avoid negative behaviour for the first
> reader.
>
> Simple patch to implement this is attached, together with test case.
>
> ...
>
> What exactly does it do? Previously, we optimised COPY when it was
> loading data into a newly created table or a freshly truncated table.
> This patch extends that and actually sets the tuple header flag as
> HEAP_XMIN_COMMITTED during the load. Doing so is simple 2 lines of
> code. The patch also adds some tests for corner cases that would make
> that action break MVCC - though those cases are minor and typical data
> loads will benefit fully from this.

This doesn't work with subtransactions:

    postgres=# create table a as select 1 as id;
    SELECT 1
    postgres=# copy a to '/tmp/a';
    COPY 1
    postgres=# begin;
    BEGIN
    postgres=# truncate a;
    TRUNCATE TABLE
    postgres=# savepoint sp1;
    SAVEPOINT
    postgres=# copy a from '/tmp/a';
    COPY 1
    postgres=# select * from a;
     id
    ----
    (0 rows)

The query should return the row copied in the same subtransaction.

> In the link above, Tom suggested reworking HeapTupleSatisfiesMVCC()
> and adding current xid to snapshots. That is an invasive change that I
> would wish to avoid at any time and explains the long delay in
> tackling this. The way I've implemented it, is just as a short test
> during XidInMVCCSnapshot() so that we trap the case when the xid ==
> xmax and so would appear to be running. This is much less invasive and
> just as performant as Tom's original suggestion.

TransactionIdIsCurrentTransactionId() can be fairly expensive if you
have a lot of subtransactions open...

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
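
Both points in the reply above (rows written under a savepoint, and the cost of asking "is this xid mine?") can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: it assumes that each open subtransaction carries its own xid and that the current-transaction check walks the stack of open (sub)transactions from the innermost level outward. The names `TxnState` and `is_current_xid` are hypothetical and exist only for this example.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TxnState:
    """One level of a toy transaction stack: the top level or a savepoint."""
    xid: int
    parent: Optional["TxnState"] = None
    # Sorted xids of child subtransactions that already committed into this level.
    child_xids: List[int] = field(default_factory=list)


def is_current_xid(xid: int, current: TxnState) -> Tuple[bool, int]:
    """Walk from the innermost level to the top; return (found, levels_visited)."""
    levels = 0
    state: Optional[TxnState] = current
    while state is not None:
        levels += 1
        if state.xid == xid:
            return True, levels
        # Binary search among this level's committed children.
        i = bisect_left(state.child_xids, xid)
        if i < len(state.child_xids) and state.child_xids[i] == xid:
            return True, levels
        state = state.parent
    return False, levels


# BEGIN; SAVEPOINT sp1;  (rows written now carry the savepoint's xid)
top = TxnState(xid=100)
sp1 = TxnState(xid=101, parent=top)
row_xmin = sp1.xid

# A shortcut that compares only against the top-level xid misses the row.
naive_is_mine = (row_xmin == top.xid)

# Walking the stack does find it.
found, levels = is_current_xid(row_xmin, sp1)

# With 1000 nested open savepoints, looking up the top-level xid
# has to visit every level of the stack.
deep = top
for n in range(1, 1001):
    deep = TxnState(xid=100 + n, parent=deep)
_, deep_levels = is_current_xid(top.xid, deep)

print(naive_is_mine, found, levels, deep_levels)
```

In this model a check against the top-level xid alone does not recognise the savepoint's row as the session's own, while the full walk does, at a cost proportional to the number of open nesting levels.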