Re: Problem with COPY CSV
От | Andrew Dunstan |
---|---|
Тема | Re: Problem with COPY CSV |
Дата | |
Msg-id | 438E7A24.9020106@dunslane.net обсуждение исходный текст |
Ответ на | Problem with COPY CSV (Christopher Kings-Lynne <chriskl@familyhealth.com.au>) |
Список | pgsql-hackers |
Christopher Kings-Lynne wrote: > Attached is a small test extract from the USDA nutrient database. > > The problem is that the script won't load the COPY data correctly. > This is with CVS HEAD (and 8.1). > > It is the 4th column in the table that gives the problem (nutr_no > integer). Each of the 3 COPY rows has a different way of specifying a > number: > > * As an unquoted empty value > * As a quoted number > * As a quoted empty value > > Now, I'm specifying "NULL AS ''" in the COPY command but it refuses to > recognize the quoted empty value as NULL. > > Anyone have any ideas? Is this a bug? The manual even says that > using "NULL AS ''" lets you make no distinction between empty and > quoted empty values. It is not a bug. It is working as designed and as documented. The rule is basically that a null value is never quoted and a quoted value is never null. The docs say: "The CSV format has no standard way to distinguish a NULL value from an empty string. PostgreSQL's COPY handles this by quoting. A NULL is output as the NULL string and is not quoted, while a data value matching the NULL string is quoted. Therefore, using the default settings, a NULL is written as an unquoted empty string, while an empty string is written with double quotes (""). Reading values follows similar rules. You can use FORCE NOT NULL to prevent NULL input comparisons for specific columns." The corollary of this is that you can't make a quoted value be accepted as null (FORCE NOT NULL does the converse, i.e. it makes an unquoted value be considered as not null). The handling of nulls in CSV mode was discussed in almost nauseous detail at the time we did CSV about 18 months ago, so there should be no surprise here. For data as irregular as this I suggest that you import it into a text field and then update the value of that field to NULL where it's empty - you could then do ALTER TYPE ... USING ... . The other possibility would be to preprocess the data. I at least am not inclined to tinker too much more with CSV mode - we could end up catering for every weird output format in the world if we're not careful. cheers andrew
В списке pgsql-hackers по дате отправления: