Re: Unicode grapheme clusters
From | Bruce Momjian |
---|---|
Subject | Re: Unicode grapheme clusters |
Date | |
Msg-id | Y8nktYFVf21NmmU+@momjian.us |
In reply to | Re: Unicode grapheme clusters (Greg Stark <stark@mit.edu>) |
Responses | Re: Unicode grapheme clusters |
List | pgsql-hackers |
On Thu, Jan 19, 2023 at 07:37:48PM -0500, Greg Stark wrote:
> This is how we've always documented it. Postgres treats code points as
> "characters" not graphemes.
>
> You don't need to go to anything as esoteric as emojis to see this either.
> Accented characters like é have canonical forms that are multiple code
> points and in some character sets some accented characters can only be
> represented that way.
>
> But I don't think there's any reason to consider changing the existing functions.
> They have to be consistent with substr and the other string manipulation
> functions.
>
> We could add new functions to work with graphemes but it might bring more pain
> keeping it up to date....

I am not sure what you are referring to above? character_length? I was
talking about display length, and psql uses that --- at some point, our
lack of support for graphemes will cause psql to not align columns.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.
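
To make the distinction in this exchange concrete, here is a rough Python sketch of how a code-point count and a display width can differ for the same visible character. The `display_width` helper is hypothetical and only approximates terminal width (combining marks count as zero columns, East Asian wide characters as two); it is not how psql or any Postgres function computes width.

```python
import unicodedata


def display_width(text):
    """Approximate terminal column width (illustrative helper, not psql's logic)."""
    width = 0
    for ch in text:
        # Combining marks are drawn on top of the previous character.
        if unicodedata.combining(ch):
            continue
        # East Asian wide and fullwidth characters occupy two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


composed = "\u00e9"                                  # é as one code point
decomposed = unicodedata.normalize("NFD", composed)  # "e" followed by U+0301

print(len(composed), len(decomposed))                      # code points: 1 2
print(display_width(composed), display_width(decomposed))  # columns: 1 1
```

Both strings render as a single é, yet a function that counts code points reports a different length for each. A per-code-point width rule like the one above copes with simple combining marks, but it has no notion of grapheme cluster boundaries, which is the gap that could eventually show up as misaligned columns.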