Re: COPY FROM performance improvements

From: Andrew Dunstan
Subject: Re: COPY FROM performance improvements
Msg-id: 1688.24.211.165.134.1119696313.squirrel@www.dunslane.net
In reply to: Re: COPY FROM performance improvements  ("Luke Lonergan" <llonergan@greenplum.com>)
Replies: Re: COPY FROM performance improvements  ("Luke Lonergan" <llonergan@greenplum.com>)
         Re: COPY FROM performance improvements  ("Alon Goldshuv" <agoldshuv@greenplum.com>)
List: pgsql-patches
Luke Lonergan said:
> I've attached Alon's patch ported to the CVS trunk.  It applies cleanly
> and passes the regressions.  With fsync=false it is 40% faster loading
> a sample dataset with 15 columns of varied type.  It's 19% faster with
> fsync=true.
>
> This patch separates the CopyFrom code into two pieces, the new logic
> for delimited data and the existing logic for CSV and Binary.
>


A few quick comments - I will probably have many more later when I have
time to review this in depth.

1. Postgres does context diffs for patches, not unidiffs.

2. This comment raises a flag in my mind:

+ * each attribute begins. If a specific attribute is not used for this
+ * COPY command (ommitted from the column list), a value of 0 will be assigned.
+ * For example: for table foo(a,b,c,d,e) and COPY foo(a,b,e)
+ * attr_offsets may look something like this after this routine
+ * returns: [0,20,0,0,55]. That means that column "a" value starts
+ * at byte offset 0, "b" in 20 and "e" in 55, in attr_bytebuf.

Would it not be better to mark missing attributes with something that can't
be a valid offset, like -1?
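
To illustrate the concern: in the example above, column "a" legitimately
starts at offset 0, so a 0 in attr_offsets cannot tell "a" apart from an
omitted column. The following is only a rough sketch of the -1 idea, not
code from the patch; ATTR_UNUSED, fill_attr_offsets and attr_is_used are
hypothetical names made up for this illustration.

```c
/* Hypothetical sentinel: -1 can never be a valid byte offset. */
#define ATTR_UNUSED (-1)

/*
 * Hypothetical helper, not taken from the patch.
 * used[i] is nonzero when column i appears in the COPY column list;
 * starts[] holds the byte offsets of the used columns, in table order.
 * Columns left out of the column list get ATTR_UNUSED instead of 0.
 */
static void
fill_attr_offsets(int *attr_offsets, const int *used, int natts,
                  const int *starts)
{
    int next = 0;
    int i;

    for (i = 0; i < natts; i++)
        attr_offsets[i] = used[i] ? starts[next++] : ATTR_UNUSED;
}

/* Returns nonzero if the attribute was part of the COPY column list. */
static int
attr_is_used(const int *attr_offsets, int attno)
{
    return attr_offsets[attno] != ATTR_UNUSED;
}
```

For table foo(a,b,c,d,e) and COPY foo(a,b,e) this would give
[0,20,-1,-1,55] rather than [0,20,0,0,55], so offset 0 for "a" is no
longer ambiguous.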


3. This comment needs improving:

+/*
+ * Copy FROM file to relation with faster processing.
+ */

4. We should indeed do this for CSV, especially since a lot of the relevant
logic for detecting attribute starts is already there for CSV in
CopyReadLine. I'm prepared to help you do that if necessary, since I'm
guilty of perpetrating that code.

cheers

andrew


