Re: daitch_mokotoff module
From | Tom Lane |
---|---|
Subject | Re: daitch_mokotoff module |
Date | |
Msg-id | 3563190.1641227676@sss.pgh.pa.us |
In reply to | Re: daitch_mokotoff module (Dag Lem <dag@nimrod.no>) |
Responses | [PATCH] Run UTF8-dependent tests for citext [Re: daitch_mokotoff module] |
List | pgsql-hackers |
Dag Lem <dag@nimrod.no> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> (We do have methods for dealing with non-ASCII test cases, but
>> I can't see that this patch is using any of them.)

> I naively assumed that tests would be run in an UTF8 environment.

Nope, not necessarily.

Our current best practice for this is to separate out encoding-dependent
test cases into their own test script, and guard the script with an
initial test on database encoding.  You can see an example in
src/test/modules/test_regex/sql/test_regex_utf8.sql
and the two associated expected-files.  It's a good idea to also cover
as much as you can with pure-ASCII test cases that will run regardless
of the prevailing encoding.

> Running "ack -l '[\x80-\xff]'" in the contrib/ directory reveals that
> two other modules are using UTF8 characters in tests - citext and
> unaccent.

Yeah, neither of those have been upgraded to said best practice.
(If you feel like doing the legwork to improve that situation,
that'd be great.)

> Looking into the unaccent module, I don't quite understand how it will
> work with various encodings, since it doesn't seem to decode its input -
> will it fail if run under anything but ASCII or UTF8?

Its Makefile seems to be forcing the test database to use UTF8.
I think this is a less-than-best-practice choice, because then
we have zero test coverage for other encodings; but it does
prevent test failures.

			regards, tom lane
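
For readers unfamiliar with the guard described above, here is a minimal sketch of what the top of an encoding-dependent test script might look like. It is modeled on the general pattern used in test_regex_utf8.sql, not copied from it. The final query is a hypothetical placeholder and is not taken from the daitch_mokotoff patch. The sketch also assumes the script file itself is saved as UTF-8.

```sql
/*
 * Sketch only: this script is meant to run in a UTF8-encoded database,
 * because other encodings cannot represent all the characters used.
 */

-- Guard: store the result of the encoding check in a psql variable and
-- leave the script early when the database is not UTF8.
SELECT getdatabaseencoding() <> 'UTF8'
       AS skip_test \gset
\if :skip_test
\quit
\endif

-- Assumption: the script file is stored as UTF-8, so tell the server how
-- to interpret the non-ASCII literals that follow.
SET client_encoding = utf8;

-- Hypothetical placeholder for the real non-ASCII test cases.
SELECT length('Müller');
```

With a guard like this, one expected file would hold the full output for UTF8 databases, and an alternative expected file would hold only the output up to the \quit, which is presumably what the "two associated expected-files" refers to.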