Re: OCTET_LENGTH is wrong
From | Tatsuo Ishii |
---|---|
Subject | Re: OCTET_LENGTH is wrong |
Date | |
Msg-id | 20011118150828R.t-ishii@sra.co.jp |
In response to | Re: OCTET_LENGTH is wrong (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: OCTET_LENGTH is wrong
Re: OCTET_LENGTH is wrong |
List | pgsql-hackers |
> Peter Eisentraut <peter_e@gmx.net> writes:
> > I noticed OCTET_LENGTH will return the size of the data after TOAST may
> > have compressed it. While this could be useful information, this
> > behaviour has no basis in the SQL standard and it's not what is
> > documented. Moreover, it eliminates the standard useful behaviour of
> > OCTET_LENGTH, which is to show the length in bytes of a multibyte string.
>
> I wondered about that too, the first time I noticed it. On the other
> hand, knowing the compressed length is kinda useful too, at least for
> hacking and DBA purposes. (One might also like to know whether a value
> has been moved out of line, which is not currently determinable.)

It seems the behavior of OCTET_LENGTH varies according to the
corresponding data type:

TEXT: returns the size of data AFTER TOAST
VARCHAR and CHAR: returns the size of data BEFORE TOAST

I think we should fix at least these inconsistencies, but I am not sure
that it's totally wrong for OCTET_LENGTH to return the length AFTER
TOAST. The SQL standard does not have any idea about TOAST, of course.
Also, I tend to agree with Tom's point about hackers and DBAs.

> I don't want to force an initdb at this stage, at least not without
> compelling reason, so adding more functions right now is not feasible.
> Maybe a TODO item for next time.
>
> That leaves us with the question whether to change OCTET_LENGTH now
> or leave it for later. Anyone?

My opinion is to leave it for 7.3, together with the idea of adding new
functions.

> BTW, I noticed that textlength() is absolutely unreasonably slow when
> MULTIBYTE is enabled --- yesterday I was trying to profile TOAST
> overhead, and soon discovered that what I was looking at was nothing
> but pg_mblen() calls. It really needs a short-circuit path for
> single-byte encodings.

It's easy to optimize that. However, I cannot access CVS anymore after
the IP address change. Will post patches later...
--
Tatsuo Ishii
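
For readers of the archive, the short-circuit path for single-byte encodings discussed above can be sketched roughly as follows. This is only an illustration of the idea, not the patch that was later posted and not PostgreSQL's actual code: `char_length()`, `utf8_char_bytes()` and the `max_char_bytes` parameter are hypothetical names, and UTF-8 stands in for whatever multibyte encoding the server would really be using in place of pg_mblen().

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a per-character length routine such as
 * pg_mblen(): returns the byte length of the UTF-8 character whose
 * first byte is at s.
 */
static int
utf8_char_bytes(const unsigned char *s)
{
    if (*s < 0x80)
        return 1;
    if ((*s & 0xE0) == 0xC0)
        return 2;
    if ((*s & 0xF0) == 0xE0)
        return 3;
    if ((*s & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: count it as one */
}

/*
 * Count characters in the nbytes-long buffer s.  max_char_bytes is an
 * assumed parameter giving the largest character width of the current
 * encoding; when it is 1, bytes and characters coincide and the
 * per-character loop is skipped entirely.
 */
static size_t
char_length(const unsigned char *s, size_t nbytes, int max_char_bytes)
{
    size_t      nchars = 0;

    /* Short-circuit: single-byte encoding, no scanning needed. */
    if (max_char_bytes == 1)
        return nbytes;

    while (nbytes > 0)
    {
        size_t      step = (size_t) utf8_char_bytes(s);

        if (step > nbytes)      /* truncated trailing character */
            step = nbytes;
        s += step;
        nbytes -= step;
        nchars++;
    }
    return nchars;
}
```

With a single-byte encoding the call returns in constant time, so the cost Tom saw in his profile would only be paid when the encoding can actually contain multibyte characters.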