Discussion: pgsql: Unicode case mapping tables and functions.
Unicode case mapping tables and functions.

Implements Unicode simple case mapping, in which all code points map
to exactly one other code point unconditionally.

These tables are generated from UnicodeData.txt, which is already
being used by other infrastructure in src/common/unicode. The tables
are checked into the source tree, so they only need to be regenerated
when we update the Unicode version.

In preparation for the builtin collation provider, and possibly useful
for other callers.

Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com
Reviewed-by: Peter Eisentraut, Daniel Verite, Jeremy Schneider

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5c40364dd6d9c6a260c8965dffe2e066642d6f79

Modified Files
--------------
src/common/Makefile                               |    1 +
src/common/meson.build                            |    1 +
src/common/unicode/Makefile                       |   15 +-
src/common/unicode/case_test.c                    |  100 +
src/common/unicode/generate-unicode_case_table.pl |  134 +
src/common/unicode/meson.build                    |   31 +
src/common/unicode_case.c                         |  174 ++
src/common/wchar.c                                |    4 +-
src/include/common/unicode_case.h                 |   27 +
src/include/common/unicode_case_table.h           | 3001 +++++++++++++++++++++
src/include/mb/pg_wchar.h                         |   15 +
11 files changed, 3498 insertions(+), 5 deletions(-)
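As a rough illustration of what "simple case mapping" means in the commit message above, here is a minimal sketch of a table-driven lookup. Everything in it is hypothetical: the `demo_` names, the struct layout, and the six hand-written entries are assumptions for illustration only and are not taken from unicode_case.c or the generated unicode_case_table.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout: one code point and its simple mappings. */
typedef struct
{
	uint32_t	codepoint;
	uint32_t	lower;			/* simple lowercase mapping */
	uint32_t	upper;			/* simple uppercase mapping */
} demo_case_map;

/* Hand-written excerpt; must stay sorted by codepoint for the search below. */
static const demo_case_map demo_table[] = {
	{0x0041, 0x0061, 0x0041},	/* LATIN CAPITAL LETTER A */
	{0x0061, 0x0061, 0x0041},	/* LATIN SMALL LETTER A */
	{0x00C9, 0x00E9, 0x00C9},	/* LATIN CAPITAL LETTER E WITH ACUTE */
	{0x00E9, 0x00E9, 0x00C9},	/* LATIN SMALL LETTER E WITH ACUTE */
	{0x0394, 0x03B4, 0x0394},	/* GREEK CAPITAL LETTER DELTA */
	{0x03B4, 0x03B4, 0x0394},	/* GREEK SMALL LETTER DELTA */
};

/* Binary search over the sorted table; returns NULL if cp has no entry. */
const demo_case_map *
demo_find(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = sizeof(demo_table) / sizeof(demo_table[0]);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (demo_table[mid].codepoint == cp)
			return &demo_table[mid];
		if (demo_table[mid].codepoint < cp)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Code points without an entry map to themselves. */
uint32_t
demo_tolower(uint32_t cp)
{
	const demo_case_map *m = demo_find(cp);

	return m ? m->lower : cp;
}

uint32_t
demo_toupper(uint32_t cp)
{
	const demo_case_map *m = demo_find(cp);

	return m ? m->upper : cp;
}
```

Each input code point yields exactly one output code point, with no dependence on context or locale, which is what makes this mapping "simple". A real implementation would use the generated tables rather than a hand-maintained array.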
On 07/03/2024 21:18, Jeff Davis wrote:
> Unicode case mapping tables and functions.

With -Wtype-limits, I'm seeing this warning:

unicode_case.c: In function ‘convert_case’:
unicode_case.c:107:47: warning: comparison of unsigned expression in ‘< 0’ is always false [-Wtype-limits]
  107 |     while (src[srcoff] != '\0' && (srclen < 0 || srcoff < srclen))
      |                                           ^

That seems like a legit issue. The comment in unicode_strlower/upper() says:

> * String src must be encoded in UTF-8. If srclen < 0, src must be
> * NUL-terminated.

But srclen is of type size_t, which is unsigned.

--
Heikki Linnakangas
Neon (https://neon.tech)
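The mismatch described above can be shown with a minimal sketch. The thread does not say how the problem was fixed, so the following is only one possible approach under stated assumptions: `bytes_to_scan` is a hypothetical helper (not a PostgreSQL function), and `ptrdiff_t` stands in for whatever signed length type a real fix might use.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the PostgreSQL code: returns how many bytes a
 * convert_case()-style loop would visit.
 *
 * Because srclen is signed here, "srclen < 0 means src is NUL-terminated"
 * is a test that can actually be true. With an unsigned size_t parameter,
 * a caller's -1 would silently become SIZE_MAX and "srclen < 0" would be
 * dead code, which is what -Wtype-limits reports.
 */
size_t
bytes_to_scan(const char *src, ptrdiff_t srclen)
{
	size_t		srcoff = 0;

	/*
	 * Check the length bound before reading src[srcoff], so that a
	 * length-limited buffer is never read past srclen bytes.
	 */
	while ((srclen < 0 || srcoff < (size_t) srclen) &&
		   src[srcoff] != '\0')
		srcoff++;

	return srcoff;
}
```

For example, `bytes_to_scan("abcdef", 2)` stops after two bytes, while `bytes_to_scan("abc", -1)` runs to the terminating NUL. The cast to `size_t` is only evaluated when `srclen` is non-negative, so it does not reintroduce a signed/unsigned comparison problem.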
On Fri, 2024-03-08 at 10:24 +0200, Heikki Linnakangas wrote:
> On 07/03/2024 21:18, Jeff Davis wrote:
> > Unicode case mapping tables and functions.
>
> With -Wtype-limits, I'm seeing this warning:

Thank you, fixed. Somehow I lost that flag from my script.

Can you please add some recommended compiler warning flags here:

https://wiki.postgresql.org/wiki/Committing_checklist

?

Regards,
Jeff Davis