Re: Unicode grapheme clusters

From Greg Stark
Subject Re: Unicode grapheme clusters
Date
Msg-id CAM-w4HNoonCZW3p=D9J2ev7LpOKXiAsgaH-XOUV=3gL_OJMwOA@mail.gmail.com
In reply to Re: Unicode grapheme clusters  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Unicode grapheme clusters  (Isaac Morland <isaac.morland@gmail.com>)
Re: Unicode grapheme clusters  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Sat, 21 Jan 2023 at 13:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Probably our long-term answer is to avoid depending on wcwidth
> and use wcswidth instead.  But it's hard to get excited about
> doing the legwork for that until popular libc implementations
> get it right.

Here's an interesting blog post about trying to do this in Rust:

https://tomdebruijn.com/posts/rust-string-length-width-calculations/

TL;DR... Even counting the number of graphemes isn't enough because
terminals typically (but not always) display emoji graphemes using two
columns.
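
To make the mismatch concrete, here is a minimal Python sketch. It is
not taken from the blog post or from any PostgreSQL code; the helper
names (codepoint_width, naive_width) are made up for illustration, and
the width rule is only a rough wcwidth()-style guess built from the
standard unicodedata module. It shows that the code point count and a
per-code-point width sum can both disagree with what a terminal draws.

```python
import unicodedata


def codepoint_width(ch: str) -> int:
    """Rough wcwidth()-style guess for one code point (an assumption,
    not what any particular libc does)."""
    # Combining marks and format characters (e.g. ZWJ) take no column.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide / Fullwidth characters, which include most emoji,
    # are usually drawn in two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def naive_width(s: str) -> int:
    """Sum per-code-point widths, ignoring how code points cluster."""
    return sum(codepoint_width(ch) for ch in s)


samples = {
    "ascii": "abc",
    "combining accent": "e\u0301",          # e + COMBINING ACUTE ACCENT
    "single emoji": "\U0001F44D",           # THUMBS UP SIGN
    "ZWJ family": "\U0001F468\u200D\U0001F469\u200D\U0001F467",
}

for name, text in samples.items():
    print(f"{name}: code points={len(text)}, naive width={naive_width(text)}")
```

The ZWJ family sequence is one grapheme cluster made of five code
points, and the per-code-point sum comes out as six columns, while a
terminal that renders the sequence as a single glyph will typically use
two. That gap is the reason a string-level function in the spirit of
wcswidth has more to work with than a per-character wcwidth.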

At the end of the day Unicode kind of assumes a variable-width display
where the rendering is handled by something that has access to the
actual font metrics. So anything trying to line things up in columns
in a way that works with any rendering system down the line using any
font is going to be making a best guess.

-- 
greg


