Bug with UTF-8 character
From | Hans-Jürgen Schönig |
---|---|
Subject | Bug with UTF-8 character |
Date | |
Msg-id | 44769E84.7000006@cybertec.at |
Responses |
Re: Bug with UTF-8 character
Re: Bug with UTF-8 character |
List | pgsql-hackers |
Good morning,

I got a bug report for the following Unicode character in PostgreSQL 8.1.4: 0xedaeb8

    ERROR: invalid byte sequence for encoding "UTF8": 0xedaeb8

This one seemed to work properly in PostgreSQL 8.0.3.

I think the following code in PostgreSQL 8.1.4 has a bug in it.

File: postgresql-8.1.4/src/backend/utils/mb/wchar.c

The entry values to the function are:

    source = ed ae b8 20 20 20 20 20 20 20 20 20 20 20 20
    length = 3 (the length of the current UTF-8 character)

But the code checks that the second byte is not greater than 0x9F when the first byte is 0xED. As far as I can tell, this is not according to the UTF-8 standard in RFC 3629, so I believe it is not a valid test. The test fails on our string when it shouldn’t.

I believe this is a bug. Could you please confirm, or let me know what I am doing wrong?

Many thanks,

Hans

--
Cybertec Geschwinde & Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at
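To make the check described above concrete, here is a minimal sketch in C. It is not the code from wchar.c: the function names decode_utf8_3 and rejected_by_ed_rule are hypothetical, and the second function only restates the rule as described in this message (first byte 0xED, second byte greater than 0x9F). The first function decodes a three-byte sequence to its code point, which may help in comparing the rule against RFC 3629.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from wchar.c): decode one 3-byte UTF-8
 * sequence into a code point. The asserts only check the byte shapes:
 * a 1110xxxx lead byte followed by two 10xxxxxx continuation bytes.
 */
uint32_t decode_utf8_3(const unsigned char *s)
{
    assert((s[0] & 0xF0) == 0xE0);
    assert((s[1] & 0xC0) == 0x80);
    assert((s[2] & 0xC0) == 0x80);

    return ((uint32_t) (s[0] & 0x0F) << 12) |
           ((uint32_t) (s[1] & 0x3F) << 6) |
           (uint32_t) (s[2] & 0x3F);
}

/*
 * Hypothetical restatement of the rule described in the message:
 * a sequence is rejected when the first byte is 0xED and the second
 * byte is greater than 0x9F.
 */
bool rejected_by_ed_rule(const unsigned char *s)
{
    return s[0] == 0xED && s[1] > 0x9F;
}
```

For the bytes ed ae b8 from the report, decode_utf8_3 yields 0xDBB8 and rejected_by_ed_rule returns true. For ed 9f bf it yields 0xD7FF and the rule does not fire. Under this sketch the rule therefore matches exactly the code points U+D800 through U+DFFF, so that is the range to look up in RFC 3629 when judging whether the test is valid.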