Re: Unicode normalization SQL functions

Поиск

Список

Период

Сортировка

От	Andreas Karlsson
Тема	Re: Unicode normalization SQL functions
Дата	13 февраля 2020 г. 00:23:41
Msg-id	26150b35-240f-941c-e5a7-24f2d489b316@proxel.se обсуждение исходный текст
Ответ на	Re: Unicode normalization SQL functions (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Ответы	Re: Unicode normalization SQL functions Re: Unicode normalization SQL functions
Список	pgsql-hackers

Дерево обсуждения

On 1/28/20 9:21 PM, Peter Eisentraut wrote:
> You're right, this didn't make any sense.  Here is a new patch set with 
> that fixed.

Thanks for this patch. This is a feature which has been on my personal 
todo list for a while and something which I have wished to have a couple 
of times.

I took a quick look at the patch and here is some feedback:

A possible concern is increased binary size from the new tables for the 
quickcheck but personally I think they are worth it.

A potential optimization would be to merge utf8_to_unicode() and 
pg_utf_mblen() into one function in unicode_normalize_func() since 
utf8_to_unicode() already knows length of the character. Probably not 
worth it though.

It feels a bit wasteful to measure output_size in 
unicode_is_normalized() since unicode_normalize() actually already knows 
the length of the buffer, it just does not return it.

A potential optimization for the normalized case would be to abort the 
quick check on the first maybe and normalize from that point on only. If 
I can find the time I might try this out and benchmark it.

Nitpick: "split/\s*;\s*/, $line" in generate-unicode_normprops_table.pl 
should be "split /\s*;\s*/, $line".

What about using else if in the code below for clarity?

+        if (check == UNICODE_NORM_QC_NO)
+            return UNICODE_NORM_QC_NO;
+        if (check == UNICODE_NORM_QC_MAYBE)
+            result = UNICODE_NORM_QC_MAYBE;

Remove extra space in the line below.

+    else if (quickcheck == UNICODE_NORM_QC_NO )

Andreas

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Unicode normalization SQL functions