Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

Поиск
Список
Период
Сортировка
От jian he
Тема Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Дата
Msg-id CACJufxGBb-WE7R1fM4kJxvmsgsyKL_4vce3s_KxLbo_ZEnFnWw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Ответы Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Alena Rybakina <lena.ribackina@yandex.ru>)
Список pgsql-hackers
hi.
here is my implementation based on previous discussions

add a new COPY FROM flag save_error.
save_error only works with non-BINARY flags.
save_error is easier for me to implement, if using "save error" I
worry, 2 words, gram.y will not work.
save_error also works other flag like {csv mode, force_null, force_not_null}

overall logic is:
if  save_error is specified then
   if error_holding table not exists then create one
   if error_holding table exists set error_firsttime to false.
if  save_error is not specified then work as master branch.

if errors happen then insert error info to error_holding table.
if errors do not exist and error_firsttime is true then drop the table.
if errors do not exist and error_firsttime is false then raise a
notice: All the past error holding saved at %s.%s

error holding table:
schema will be the same as COPY destination table.
the table name will be: COPY destination name concatenate with "_error".

error_holding table definition:
CREATE TABLE err_nsp.error_rel (LINENO BIGINT, LINE TEXT,
FIELD TEXT, SOURCE TEXT, ERR_MESSAGE TEXT,
ERR_DETAIL TEXT, ERRORCODE TEXT);

the following field is not implemented.
FIELDS  text[], separated, de-escaped string fields (the data that was
or would be fed to input functions)

because imagine following case:
create type test as (a int, b text);
create table copy_comp (c1 int, c2 test default '(11,test)', c3 date);
copy copy_comp from stdin with (default '\D');
1 \D '2022-07-04'
\.
table copy_comp;

I feel it's hard from textual '\D'  to get text[] `(11,test)` via SPI.
--------------------------------------
demo:

create table copy_default_error_save (
id integer,
text_value text not null default 'test',
ts_value timestamp without time zone not null default '2022-07-05'
);
copy copy_default_error_save from stdin with (save_error, default '\D');
k value '2022-07-04'
z \D '2022-07-03ASKL'
s \D \D
\.

NOTICE:  3 rows were skipped because of error. skipped row saved to
table public.copy_default_error_save_error
select  * from copy_default_error_save_error;
 lineno |               line               |  field   |      source
  |                         err_message                         |
err_detail | errorcode

--------+----------------------------------+----------+------------------+-------------------------------------------------------------+------------+-----------
      1 | k       value   '2022-07-04'     | id       | k
  | invalid input syntax for type integer: "k"                  |
      | 22P02
      2 | z       \D      '2022-07-03ASKL' | id       | z
  | invalid input syntax for type integer: "z"                  |
      | 22P02
      2 | z       \D      '2022-07-03ASKL' | ts_value |
'2022-07-03ASKL' | invalid input syntax for type timestamp:
"'2022-07-03ASKL'" |            | 22007
      3 | s       \D      \D               | id       | s
  | invalid input syntax for type integer: "s"                  |
      | 22P02
(4 rows)

The doc is not so good.

COPY FROM (save_error),  it will not be as fast as COPY FROM (save_error false).
With save_error, we can only use InputFunctionCallSafe, which I
believe is not as fast as InputFunctionCall.
If any conversion error happens, we need to call the SPI interface,
that would add more overhead. also we can only insert error cases row
by row. (maybe we can insert to error_save values(error1), (error2).
(I will try later)...

The main code is about constructing SPI query, and test and test output.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: li jie
Дата:
Сообщение: Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding
Следующее
От: Jeff Janes
Дата:
Сообщение: connection timeout hint