Thread: importing 0xe3809c character, aka wave dash

importing 0xe3809c character, aka wave dash

From
Jorg Heymans
Date:
Hi,

I am having problems importing SQL files containing the WAVE DASH or
MINUS SIGN characters:

WARNING:  ignoring unconvertible UTF-8 character 0xe3809c
WARNING:  ignoring unconvertible UTF-8 character 0xe28892

The data contains Japanese characters and is imported correctly apart
from these two. It seems that PostgreSQL is filtering out these characters
during import, resulting in incorrect data.
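
For reference, the two byte sequences in the warnings can be identified with a short script. This is only a sketch using Python's standard library, not anything PostgreSQL-specific; the helper name `describe_utf8` is made up for illustration.

```python
import unicodedata


def describe_utf8(hex_bytes):
    """Decode a hex-encoded UTF-8 sequence holding one character.

    Returns a (code point, Unicode name) pair.
    """
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return "U+%04X" % ord(ch), unicodedata.name(ch)


# The two sequences reported in the warnings above.
for seq in ("e3809c", "e28892"):
    code_point, name = describe_utf8(seq)
    print(seq, code_point, name)
```

The first sequence is U+301C WAVE DASH and the second is U+2212 MINUS SIGN, which matches the characters that go missing after the import.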


I have isolated a small (<1 kB) test case for the wave dash:
www.domek.be/testwavedash.sql

Note that the SQL files are produced from ESRI shape files using
PostGIS's shp2pgsql tool.

The database is set up with EUC_JP encoding.

Regards
Jorg

Re: importing 0xe3809c character, aka wave dash

From
Jorg Heymans
Date:
FWIW, I edited the mappings under
src/backend/utils/mb/Unicode, added the character mappings and rebuilt
postgres so that the characters are imported correctly.

Learning more about the problem, it seems that there is no 100%
standard for mapping certain characters, and everyone maps them
as they see fit.

Regards
Jorg

Jorg Heymans wrote:
> Hi,
>
> I am having problems importing sql files containing WAVE DASH or MINUS
> character :
>
> WARNING:  ignoring unconvertible UTF-8 character 0xe3809c
> WARNING:  ignoring unconvertible UTF-8 character 0xe28892
>
> The data contains japanese characters and is imported correctly apart
> from these two. It seems that postgres is filtering out these characters
> during import, resulting in incorrect data.
>
>
> I have isolated a small (<1kb) testcase for the wavedash,
> www.domek.be/testwavedash.sql.
>
> Note that the sql files are produced using postgis' shp2pgsql tool from
> ESRI shape files.
>
> The database is setup using EUC_JP encoding.
>
> Regards
> Jorg