Re: Support LIKE with nondeterministic collations
From | Peter Eisentraut |
---|---|
Subject | Re: Support LIKE with nondeterministic collations |
Date | |
Msg-id | b32cefe2-b9e2-499e-b919-fe8f21c5bc22@eisentraut.org |
In response to | Re: Support LIKE with nondeterministic collations ("Daniel Verite" <daniel@manitou-mail.org>) |
List | pgsql-hackers |
On 03.05.24 16:58, Daniel Verite wrote:
> * Generating bounds for a sort key (prefix matching)
>
> Having sort keys for strings allows for easy creation of bounds -
> sort keys that are guaranteed to be smaller or larger than any sort
> key from a given range. For example, if bounds are produced for a
> sortkey of string “smith”, strings between upper and lower bounds
> with one level would include “Smith”, “SMITH”, “sMiTh”. Two kinds
> of upper bounds can be generated - the first one will match only
> strings of equal length, while the second one will match all the
> strings with the same initial prefix.
>
> CLDR 1.9/ICU 4.6 and later map U+FFFF to a collation element with
> the maximum primary weight, so that for example the string
> “smith\uFFFF” can be used as the upper bound rather than modifying
> the sort key for “smith”.
>
> In other words it says that
>
> col LIKE 'smith%' collate "nd"
>
> is equivalent to:
>
> col >= 'smith' collate "nd" AND col < U&'smith\ffff' collate "nd"
>
> which could be obtained from an index scan, assuming a btree
> index on "col" collate "nd".
>
> U+FFFF is a valid code point but a "non-character" [1] so it's
> not supposed to be present in normal strings.

Thanks, this could be very useful!
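
To make the quoted bounds idea concrete, here is a minimal sketch of the range scan in Python. It is a toy model, not ICU and not PostgreSQL code: `sort_key` uses `str.casefold()` as a stand-in for a primary-strength sort key, and a sorted list searched with `bisect` stands in for a btree index. The names `sort_key`, `prefix_bounds`, and `prefix_scan` are made up for illustration. The toy compares by code point, so U+FFFF only acts as a maximum for characters in the Basic Multilingual Plane, whereas the quoted text says ICU gives it the maximum primary weight.

```python
import bisect


def sort_key(s):
    # Toy stand-in for a primary-strength sort key: ignores case only.
    # A real nondeterministic collation would use ICU sort keys instead.
    return s.casefold()


def prefix_bounds(prefix):
    # Lower bound: the key of the prefix itself.
    # Upper bound: the key of prefix + U+FFFF, as in the quoted ICU text.
    return sort_key(prefix), sort_key(prefix + "\uffff")


def prefix_scan(index, prefix):
    # index is a sorted list of (sort_key, original_value) pairs,
    # playing the role of a btree index on the column.
    lower, upper = prefix_bounds(prefix)
    lo = bisect.bisect_left(index, (lower,))  # first key >= lower
    hi = bisect.bisect_left(index, (upper,))  # first key >= upper
    return [original for _, original in index[lo:hi]]


values = ["Smith", "SMITH", "sMiTh", "smithers", "Smithson",
          "smit", "smyth", "Jones"]
index = sorted((sort_key(v), v) for v in values)

matches = prefix_scan(index, "smith")
print(matches)
```

The scan returns every value whose key falls in the half-open range [key('smith'), key('smith\uffff')), which corresponds to the `col >= 'smith' AND col < U&'smith\ffff'` condition above: all case variants of "smith" plus longer strings sharing that prefix, while "smit" and "smyth" fall outside the range.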