pgsql: Add SQL functions for Unicode normalization
From | Peter Eisentraut |
---|---|
Subject | pgsql: Add SQL functions for Unicode normalization |
Date | |
Msg-id | E1jJtwj-0000LS-M9@gemulon.postgresql.org |
List | pgsql-committers |
Add SQL functions for Unicode normalization

This adds SQL expressions NORMALIZE() and IS NORMALIZED to convert and check Unicode normal forms, per SQL standard.

To support fast IS NORMALIZED tests, we pull in a new data file DerivedNormalizationProps.txt from Unicode and build a lookup table from that, using techniques similar to ones already used for other Unicode data. make update-unicode will keep it up to date. We only build and use these tables for the NFC and NFKC forms, because they are too big for NFD and NFKD and the improvement is not significant enough there.

Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/c1909f27-c269-2ed9-12f8-3ab72c8caf7a@2ndquadrant.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/2991ac5fc9b3904ca4582be6d323497d7c3d17c9

Modified Files
--------------
doc/src/sgml/charset.sgml | 10 +
doc/src/sgml/func.sgml | 48 +
src/backend/catalog/sql_features.txt | 2 +-
src/backend/catalog/system_views.sql | 15 +
src/backend/parser/gram.y | 41 +-
src/backend/utils/adt/varlena.c | 150 +
src/common/unicode/.gitignore | 1 +
src/common/unicode/Makefile | 9 +-
.../unicode/generate-unicode_normprops_table.pl | 86 +
src/common/unicode_norm.c | 110 +
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_proc.dat | 8 +
src/include/common/unicode_norm.h | 10 +
src/include/common/unicode_normprops_table.h | 6154 ++++++++++++++++++++
src/include/parser/kwlist.h | 6 +
src/test/regress/expected/unicode.out | 81 +
src/test/regress/expected/unicode_1.out | 3 +
src/test/regress/parallel_schedule | 2 +-
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/unicode.sql | 32 +
20 files changed, 6764 insertions(+), 7 deletions(-)
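
To illustrate what converting to and checking a Unicode normal form means, here is a minimal sketch using Python's standard `unicodedata` module. It is not the PostgreSQL implementation from this commit: the helper names `normalize` and `is_normalized` are hypothetical, and the check is done by the slow method of normalizing and comparing, whereas the commit describes a lookup table built from DerivedNormalizationProps.txt to make the check fast for NFC and NFKC.

```python
import unicodedata

# Hypothetical helpers mirroring the two operations named in the commit
# message. They only illustrate the semantics, not PostgreSQL's C code.

def normalize(text: str, form: str = "NFC") -> str:
    """Convert text to the given Unicode normal form (NFC, NFD, NFKC, NFKD)."""
    return unicodedata.normalize(form, text)

def is_normalized(text: str, form: str = "NFC") -> bool:
    """Check the normal form by normalizing and comparing (the slow path).

    A quick-check table, as described in the commit, lets an implementation
    skip the full normalization for most inputs.
    """
    return unicodedata.normalize(form, text) == text

# 'a' followed by COMBINING DIAERESIS (U+0308): a decomposed spelling of "ä".
decomposed = "a\u0308bc"
composed = normalize(decomposed, "NFC")

print(len(decomposed), len(composed))
print(is_normalized(decomposed, "NFC"), is_normalized(decomposed, "NFD"))
```

The decomposed string is already in NFD but not in NFC; after conversion to NFC the base letter and the combining mark are merged into the single code point U+00E4.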