Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
From | Alex Hunsaker |
---|---|
Subject | Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. |
Date | |
Msg-id | CAFaPBrT9EpWWcHYrvUv5OQQz6Vgg+xQX0mE1ZaQguVUuXo6-Rg@mail.gmail.com |
In response to | Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. (Amit Khandekar <amit.khandekar@enterprisedb.com>) |
Responses | Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. |
List | pgsql-hackers |
On Tue, Oct 4, 2011 at 03:09, Amit Khandekar <amit.khandekar@enterprisedb.com> wrote:
> On 4 October 2011 14:04, Alex Hunsaker <badalex@gmail.com> wrote:
>> On Mon, Oct 3, 2011 at 23:35, Amit Khandekar
>> <amit.khandekar@enterprisedb.com> wrote:
>>
>>> In the GetDatabaseEncoding() != PG_UTF8 case, ret will not be equal to
>>> utf8_str, so pg_verify_mbstr_len() will not get called. [...]
>>
>> Consider a latin1 database where utf8_str was a string of ascii
>> characters. [...]
>> [Patch] Look ok to you?
>>
>
> +       if(GetDatabaseEncoding() == PG_UTF8)
> +               pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>
> In your patch, the above will again skip mb-validation if the database
> encoding is SQL_ASCII. Note that pg_do_encoding_conversion returns
> the un-converted string even if *one* of the src and dest encodings is
> SQL_ASCII.

*scratches head* I thought the point of SQL_ASCII was that no encoding
conversion was done, and so there would be nothing to verify.

Ahh, I see: it looks like pg_verify_mbstr_len() will make sure there are
no NUL bytes in the string when we are a single-byte encoding.

> I think :
>         if (ret == utf8_str)
> +       {
> +               pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>                 ret = pstrdup(ret);
> +       }
>
> This (ret == utf8_str) condition would be a reliable way for knowing
> whether pg_do_encoding_conversion() has done the conversion at all.

Yes. However (and maybe I'm nitpicking here), I don't see any reason to
verify certain strings twice if we can avoid it.

What do you think about:

+       /*
+        * when we are a PG_UTF8 or SQL_ASCII database pg_do_encoding_conversion()
+        * will not do any conversion or verification. we need to do it manually instead.
+        */
+       if (GetDatabaseEncoding() == PG_UTF8 || GetDatabaseEncoding() == PG_SQL_ASCII)
+               pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
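
For anyone reading this thread later, the rule being proposed can be modeled outside the backend. The following is only a rough sketch, not the plperl code: `toy_encoding`, `conversion_is_skipped()`, `toy_utf8_is_valid()` and `toy_accept_from_perl()` are hypothetical names invented for illustration, the validator is a simplified structural check rather than `pg_verify_mbstr_len()`, and the sketch simply assumes that a real conversion validates its own input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for encoding IDs; NOT PostgreSQL's real enum. */
typedef enum { ENC_SQL_ASCII, ENC_UTF8, ENC_LATIN1 } toy_encoding;

/*
 * Toy model of the behavior described in the thread: no conversion happens
 * when both encodings match, or when either one is SQL_ASCII.
 */
static bool
conversion_is_skipped(toy_encoding src, toy_encoding dest)
{
    return src == dest || src == ENC_SQL_ASCII || dest == ENC_SQL_ASCII;
}

/*
 * Simplified UTF-8 check: rejects NUL bytes, bad lead bytes, truncated
 * sequences and bad continuation bytes. It does not catch every overlong
 * form or surrogate, so it is weaker than a production validator.
 */
static bool
toy_utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL);
    while (i < len)
    {
        unsigned char c = s[i];
        size_t need;            /* continuation bytes expected after c */

        if (c == 0x00)
            return false;
        else if (c < 0x80)
            need = 0;
        else if (c >= 0xC2 && c <= 0xDF)
            need = 1;
        else if (c >= 0xE0 && c <= 0xEF)
            need = 2;
        else if (c >= 0xF0 && c <= 0xF4)
            need = 3;
        else
            return false;

        if (need > len - i - 1)
            return false;       /* sequence runs past the end */
        for (size_t k = 1; k <= need; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
        i += need + 1;
    }
    return true;
}

/*
 * Verify by hand only when the conversion step would be skipped
 * (UTF8 or SQL_ASCII database), so no string is checked twice.
 */
static bool
toy_accept_from_perl(const unsigned char *s, size_t len, toy_encoding db)
{
    if (conversion_is_skipped(ENC_UTF8, db))
        return toy_utf8_is_valid(s, len);
    return true;                /* assumption: the conversion validates */
}
```

With this model, a lone `0xFF` byte is rejected for both a UTF8 and a SQL_ASCII database, while for a latin1 database the decision is left to the conversion step, which the sketch does not model.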