Re: confusing / inefficient "need_transcoding" handling in copy
От | Andres Freund |
---|---|
Тема | Re: confusing / inefficient "need_transcoding" handling in copy |
Дата | |
Msg-id | 20240206222445.hzq22pb2nye7rm67@awork3.anarazel.de обсуждение исходный текст |
Ответ на | Re: confusing / inefficient "need_transcoding" handling in copy (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: confusing / inefficient "need_transcoding" handling in copy
Re: confusing / inefficient "need_transcoding" handling in copy |
Список | pgsql-hackers |
Hi, On 2024-02-06 12:51:48 -0500, Tom Lane wrote: > Michael Paquier <michael@paquier.xyz> writes: > > On Mon, Feb 05, 2024 at 06:05:04PM -0800, Andres Freund wrote: > >> I haven't yet dug into the code history. One guess is that this should only > >> have been set this way for COPY FROM. > > > Looking the git history, this looks like an oversight of c61a2f58418e > > that has added the condition on pg_database_encoding_max_length(), no? > > Adding Tom and Ishii-san, even if this comes from 2005. > > Yeah, back in 8.1 that code was applied for both directions, but > probably it should have enforced validation for same-encoding > cases only for COPY FROM. > > It looks like now we have a mess, because the condition was copied > verbatim into copyto.c but not copyfrom.c. Aren't we failing to > validate encoding in this case in COPY FROM, which is where we > actually need to? I think the code is just very confusing - there actually *is* verification of the encoding, it just happens at a different, earlier, layer, namely in copyfromparse.c: CopyConvertBuf() which says: /* * If the file and server encoding are the same, no encoding conversion is * required. However, we still need to verify that the input is valid for * the encoding. */ And does indeed verify that. One unfortunate issue: We don't have any tests verifying that COPY FROM catches encoding issues. Regards, Andres
В списке pgsql-hackers по дате отправления: