Re: Mac OS: invalid byte sequence for encoding "UTF8"
From | Teodor Sigaev |
---|---|
Subject | Re: Mac OS: invalid byte sequence for encoding "UTF8" |
Date | |
Msg-id | 56BB5C84.8060106@sigaev.ru |
In reply to | Re: Mac OS: invalid byte sequence for encoding "UTF8" (Artur Zakirov <a.zakirov@postgrespro.ru>) |
Replies | Re: Mac OS: invalid byte sequence for encoding "UTF8" |
List | pgsql-hackers |
> It seems that *scanf() with %s format occurs only here:
> - check.c - get_bin_version()
> - server.c - get_major_server_version()
> - filemap.c - isRelDataFile()
> - pg_backup_directory.c - _LoadBlobs()
> - xlog.c - do_pg_stop_backup()
> - mac.c - macaddr_in()
> I think here sscanf() does not work with the UTF-8 characters. And probably this
> is only a spell.c issue.

Hmm. In src/backend/access/transam/xlog.c, read_tablespace_map() using %s in scanf looks suspicious. I don't fully understand it, but it looks like it tries to read an OID as a string, so it should be safe in the usual case.

Next, _LoadBlobs() reads a file name (fname) with the help of sscanf(). Could the file name be in UTF-8 encoding here?

>
> I agree that previous patch is wrong. Instead of using new parse_ooaffentry()
> function maybe better to use sscanf() with %ls format. The %ls format is used to
> read a wide character string.

Does the %ls modifier exist everywhere? The Apple docs say (https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/sscanf.3.html):

    s ... If an l qualifier is present, the next pointer must be a pointer
    to wchar_t, into which the input will be placed after conversion by
    mbrtowc

Actually, this means that a wchar2char() call would have to be used, but it uses wcstombs[_l], which may not be present on some platforms. Does that mean the l modifier for strings is present there too, or not? What do we need to do if %l exists but wcstombs[_l] does not?

I'm going a bit crazy with locale problems, and it seems to me that Artur's patch is a good idea. Actually, I don't remember exactly, but it seems that commit 7ac8a4be8946c11d5a6bf91bb971b9750c1c60e5 introduced parse_affentry() instead of the corresponding sscanf() to avoid problems with encoding and scanf.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
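
To illustrate the hand-written parsing approach discussed above, here is a minimal sketch of a byte-oriented field splitter that does not rely on sscanf() or on the locale at all. It is not the code of parse_affentry() or of Artur's patch; the name next_field() and its interface are hypothetical. The assumption is that fields are separated only by ASCII space, tab, CR or LF, so bytes of multibyte UTF-8 characters are never treated as separators.

```c
#include <stddef.h>

/*
 * next_field: hypothetical helper for illustration, not PostgreSQL code.
 *
 * Copies the next field from *src into dst and advances *src past it.
 * Only ASCII space, tab, CR and LF are separators; every other byte,
 * including bytes >= 0x80 of multibyte UTF-8 characters, is field data.
 * A field longer than dstlen - 1 bytes is silently truncated.
 * Returns 1 if a field was read, 0 at end of input or if dstlen is 0.
 */
static int
next_field(const char **src, char *dst, size_t dstlen)
{
    const char *p = *src;
    size_t      n = 0;

    if (dstlen == 0)
        return 0;

    /* skip leading separators */
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
        p++;

    if (*p == '\0')
    {
        *src = p;
        return 0;
    }

    /* copy bytes up to the next separator or the end of the string */
    while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\r' && *p != '\n')
    {
        if (n + 1 < dstlen)
            dst[n++] = *p;
        p++;
    }
    dst[n] = '\0';

    *src = p;
    return 1;
}
```

A caller would invoke next_field() repeatedly on the same line pointer until it returns 0. Because the separator test compares raw bytes against four ASCII values, the result does not depend on the current locale or on whether the platform provides %ls or wcstombs[_l].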