Re: Doc: typo in config.sgml
From | Tatsuo Ishii |
---|---|
Subject | Re: Doc: typo in config.sgml |
Date | |
Msg-id | 20241001.103350.1086523034528885049.ishii@postgresql.org |
In reply to | Doc: typo in config.sgml (Tatsuo Ishii <ishii@postgresql.org>) |
List | pgsql-hackers |
>> That's because non-breaking space (nbsp) is not encoded as 0xa0 in
>> UTF-8. nbsp in UTF-8 is "0xc2 0xa0" (2 bytes) (A 0xa0 is a nbsp's code
>> point in Unicode. i.e. U+00A0).
>> So grep -P "[\xC2\xA0]" should work to detect nbsp.
>
> `LC_ALL=C grep -P "\xC2\xA0"` works for my environment.
> ([ and ] were not necessary.)
>
> When LC_ALL is null, `grep -P "\xA0"` could not detect any characters in charset.sgml,
> but I think it is better to specify both LC_ALL=C and "\xC2\xA0" for making sure detecting
> nbsp.
>
> One problem is that -P option can be used in only GNU grep, and grep in mac doesn't support it.
>
> On bash, we can also use `grep $'\xc2\xa0'`, but I am not sure we can assume the shell is bash.
>
> Maybe, better way is use perl itself rather than grep as following.
>
> `perl -ne '/\xC2\xA0/ and print' `
>
> I attached a patch fixed in this way.

GNU sed can also be used without setting LC_ALL:

sed -n /"\xC2\xA0"/p

However, I am not sure whether non-GNU sed can do this too...

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp