Re: Extend COPY FROM with HEADER to skip multiple lines
From | Fujii Masao |
---|---|
Subject | Re: Extend COPY FROM with HEADER |
Date | |
Msg-id | 608b08b4-b708-49c4-a186-2524b83246dc@oss.nttdata.com |
In reply to | Extend COPY FROM with HEADER |
Responses | Re: Extend COPY FROM with HEADER; Re: Extend COPY FROM with HEADER |
List | pgsql-hackers |
On 2025/06/09 16:10, Shinya Kato wrote:
> Hi hackers,
>
> I'd like to propose a new feature for the COPY FROM command to allow
> skipping multiple header lines when loading data. This enhancement
> would enable files with multi-line headers to be loaded without any
> preprocessing, which would significantly improve usability.
>
> In real-world scenarios, it's common for data files to contain
> multiple header lines, such as file descriptions or column
> explanations. Currently, the COPY command cannot load these files
> directly, which requires users to preprocess them with tools like sed
> or tail.
>
> Although you can use "COPY t FROM PROGRAM 'tail -n +3 /path/to/file'",
> some environments do not have the tail command available.
> Additionally, this approach requires superuser privileges or
> membership in the pg_execute_server_program role.
>
> This feature also has precedent in other major RDBMS:
> - MySQL: LOAD DATA ... IGNORE N LINES [1]
> - SQL Server: BULK INSERT … WITH (FIRST ROW=N) [2]
> - Oracle SQL*Loader: sqlldr … SKIP=N [3]
>
> I have not yet created a patch, but I am willing to implement an
> extension for the HEADER option. I would like to discuss the
> specification first.
>
> The specification I have in mind is as follows:
> - Command: COPY FROM
> - Formats: text and csv
> - Option syntax: HEADER [ boolean | integer | MATCH] (Extend the
>   HEADER option to accept an integer value in addition to the existing
>   boolean and MATCH keywords.)
> - Behavior: Let N be the specified integer.
>   - If N < 0, raise an error.
>   - If N = 0 or 1, same behavior when boolean is specified.
>   - If N > 1, skip the first N rows.
>
> Thoughts?

I generally like the idea.

However, a similar proposal was made earlier [1], and seemingly some
hackers weren't in favor of it. It's probably worth reading that thread
to understand the previous concerns.

Regards,

[1] https://postgr.es/m/CALAY4q8nGSXp0P5uf56vn-mD7reWqZP5k6PS1CGUm26X4FsYJA@mail.gmail.com

--
Fujii Masao
NTT DATA Japan Corporation
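
For readers skimming the thread, the HEADER semantics in the quoted proposal can be sketched roughly as below. This is only an illustration in Python, not PostgreSQL code and not a patch: the function name `skip_header_lines` is hypothetical, HEADER MATCH is not modeled, and the behavior when the file has fewer than N lines is an assumption, since the proposal does not specify it.

```python
import io
from typing import List, TextIO, Union


def skip_header_lines(source: TextIO, header: Union[bool, int]) -> List[str]:
    """Return the data lines of `source` after skipping header lines.

    `header` stands in for the proposed option value: a boolean or an
    integer N. The name and signature are hypothetical.
    """
    # In Python, bool is a subclass of int, so True -> 1 and False -> 0.
    # That lines up with "N = 0 or 1 behaves like the boolean form".
    n = int(header)
    if n < 0:
        # Proposed behavior: a negative count is an error.
        raise ValueError("HEADER line count must not be negative")

    for _ in range(n):
        # Assumption: if the input ends before n lines are skipped,
        # stop quietly and return no data rows.
        if source.readline() == "":
            break

    # Everything that remains is treated as data.
    return [line.rstrip("\n") for line in source]


# Two header lines (a description and a column list), then two data rows.
sample = io.StringIO("# file description\n# id,name\n1,alice\n2,bob\n")
print(skip_header_lines(sample, 2))  # ['1,alice', '2,bob']
```

With this reading, `HEADER 1` and `HEADER true` give the same result, and `HEADER 0` and `HEADER false` both skip nothing.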