What is the maximum encoding-conversion growth rate, anyway?
From | Tom Lane |
---|---|
Subject | What is the maximum encoding-conversion growth rate, anyway? |
Date | |
Msg-id | 29182.1180371229@sss.pgh.pa.us |
Responses | Re: What is the maximum encoding-conversion growth rate, anyway? Re: What is the maximum encoding-conversion growth rate, anyway? |
List | pgsql-hackers |
I just rearranged the code in mbutils.c a little bit to make it more
robust if conversion of an over-length string is attempted, and noted
this comment:

/*
 * When converting strings between different encodings, we assume that space
 * for converted result is 4-to-1 growth in the worst case. The rate for
 * currently supported encoding pairs are within 3 (SJIS JIS X0201 half width
 * kanna -> UTF8 is the worst case). So "4" should be enough for the moment.
 *
 * Note that this is not the same as the maximum character width in any
 * particular encoding.
 */
#define MAX_CONVERSION_GROWTH 4

It strikes me that this is overly pessimistic, since we do not support
5- or 6-byte UTF8 characters, and AFAICS there are no 1-byte characters
in any supported encoding that require 4 bytes in another. Could we
reduce the multiplier to 3? Or even 2? This has a direct impact on the
longest COPY lines we can support, so I'd like it not to be larger than
necessary.

regards, tom lane
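
To make the arithmetic concrete, here is a rough sketch (not the mbutils.c code) of how the multiplier caps the longest convertible input. It assumes an allocation ceiling of 1 GB - 1, modeled on PostgreSQL's MaxAllocSize; the names ALLOC_LIMIT, utf8_len, worst_kana_growth and max_source_len are hypothetical and exist only for this illustration. It also checks the 1-byte to 3-byte case named in the quoted comment, using the fact that SJIS single bytes 0xA1..0xDF (JIS X0201 half-width kana) correspond to U+FF61..U+FF9F.

```c
#include <stddef.h>

/* ASSUMPTION: allocation ceiling of 1 GB - 1, modeled on MaxAllocSize. */
#define ALLOC_LIMIT ((size_t) 0x3fffffff)

/* Bytes UTF-8 needs for one code point (forms of up to 4 bytes only). */
int
utf8_len(unsigned int cp)
{
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return 3;
    return 4;
}

/*
 * Hypothetical helper: worst growth when a 1-byte SJIS half-width kana
 * (0xA1..0xDF) is converted to UTF-8.  Adding 0xFEC0 to the byte value
 * gives the corresponding code point in U+FF61..U+FF9F.
 */
int
worst_kana_growth(void)
{
    int worst = 0;

    for (unsigned int b = 0xA1; b <= 0xDF; b++)
    {
        int n = utf8_len(b + 0xFEC0);

        if (n > worst)
            worst = n;
    }
    return worst;   /* source characters are 1 byte, so this is the ratio */
}

/*
 * Hypothetical helper: longest source string whose worst-case converted
 * result still fits under ALLOC_LIMIT for a given growth multiplier.
 */
size_t
max_source_len(int growth)
{
    return ALLOC_LIMIT / (size_t) growth;
}
```

Under these assumptions the half-width kana case grows by a factor of 3, and dropping the multiplier from 4 to 3 raises the longest convertible input from 268435455 to 357913941 bytes; a multiplier of 2 would allow 536870911 bytes.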