Re: Add tuples_skipped to pg_stat_progress_copy
От | torikoshia |
---|---|
Тема | Re: Add tuples_skipped to pg_stat_progress_copy |
Дата | |
Msg-id | bfd3fedea94e564a45547ad854b533f1@oss.nttdata.com обсуждение исходный текст |
Ответ на | Re: Add tuples_skipped to pg_stat_progress_copy (Masahiko Sawada <sawada.mshk@gmail.com>) |
Ответы |
Re: Add tuples_skipped to pg_stat_progress_copy
|
Список | pgsql-hackers |
On 2024-01-24 17:05, Masahiko Sawada wrote: > On Tue, Jan 23, 2024 at 1:02 AM torikoshia <torikoshia@oss.nttdata.com> > wrote: >> >> On 2024-01-17 14:47, Masahiko Sawada wrote: >> > On Wed, Jan 17, 2024 at 2:22 PM torikoshia <torikoshia@oss.nttdata.com> >> > wrote: >> >> >> >> Hi, >> >> >> >> 132de9968840c introduced SAVE_ERROR_TO option to COPY and enabled to >> >> skip malformed data, but there is no way to watch the number of >> >> skipped >> >> rows during COPY. >> >> >> >> Attached patch adds tuples_skipped to pg_stat_progress_copy, which >> >> counts the number of skipped tuples because source data is malformed. >> >> If SAVE_ERROR_TO is not specified, this column remains zero. >> >> >> >> The advantage would be that users can quickly notice and stop COPYing >> >> when there is a larger amount of skipped data than expected, for >> >> example. >> >> >> >> As described in commit log, it is expected to add more choices for >> >> SAVE_ERROR_TO like 'log' and using such options may enable us to know >> >> the number of skipped tuples during COPY, but exposed in >> >> pg_stat_progress_copy would be easier to monitor. >> >> >> >> >> >> What do you think? >> > >> > +1 >> > >> > The patch is pretty simple. Here is a comment: >> > >> > + (if <literal>SAVE_ERROR_TO</literal> is specified, otherwise >> > zero). >> > + </para></entry> >> > + </row> >> > >> > To be precise, this counter only advances when a value other than >> > 'ERROR' is specified to SAVE_ERROR_TO option. >> >> Thanks for your comment and review! >> >> Updated the patch according to your comment and option name change by >> b725b7eec. > > Thanks! The patch looks good to me. I'm going to push it tomorrow, > barring any objections. Thanks! >> >> BTW, based on this patch, I think we can add another option which >> specifies the maximum tolerable number of malformed rows. >> I remember this was discussed in [1], and feel it would be useful when >> loading 'dirty' data but there is a limit to how dirty it can be. >> Attached 0002 is WIP patch for this(I haven't added doc yet). > > Yeah, it could be a good option. > >> This may be better discussed in another thread, but any comments(e.g. >> necessity of this option, option name) are welcome. > > I'd recommend forking a new thread for this option. As far as I > remember, there also was an opinion that "reject limit" stuff is not > very useful. OK, I'll make another thread for this. -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
В списке pgsql-hackers по дате отправления: