Re: Collation rules and multi-lingual databases
From | Tom Lane |
---|---|
Subject | Re: Collation rules and multi-lingual databases |
Date | |
Msg-id | 27478.1061757135@sss.pgh.pa.us |
In response to | Re: Collation rules and multi-lingual databases (Greg Stark <gsstark@mit.edu>) |
List | pgsql-hackers |
Greg Stark <gsstark@mit.edu> writes:
> The glibc docs sample code suggests using 2x the original string
> length for the initial buffer. My testing showed that *always*
> triggered the exceptional case. A bit of experimentation led to the
> 3x+4 which eliminates it except for 0 and 1 byte strings. I'm still
> tweaking it. But on another OS, or in a more complex collation locale
> maybe you would still trigger it a lot.

On HPUX it seems you always need 4x. Also, *there are bugs* in some
platforms' implementations of strxfrm, such that an undersized buffer
may get overrun anyway. I had originally tried to optimize the buffer
size like this in src/backend/utils/adt/selfuncs.c's use of strxfrm,
and eventually was forced to give it up as hopeless. I strongly suggest
using the same code now seen there:

    char       *xfrmstr;
    size_t      xfrmlen;
    size_t      xfrmlen2;

    /*
     * Note: originally we guessed at a suitable output buffer size,
     * and only needed to call strxfrm twice if our guess was too
     * small. However, it seems that some versions of Solaris have
     * buggy strxfrm that can write past the specified buffer length
     * in that scenario. So, do it the dumb way for portability.
     *
     * Yet other systems (e.g., glibc) sometimes return a smaller value
     * from the second call than the first; thus the Assert must be <=
     * not == as you'd expect. Can't any of these people program
     * their way out of a paper bag?
     */
    xfrmlen = strxfrm(NULL, val, 0);
    xfrmstr = (char *) palloc(xfrmlen + 1);
    xfrmlen2 = strxfrm(xfrmstr, val, xfrmlen + 1);
    Assert(xfrmlen2 <= xfrmlen);

            regards, tom lane
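For readers outside the backend, the same two-call pattern can be sketched in plain C. This is only an illustration, not the selfuncs.c code: the helper name xfrm_dup is hypothetical, malloc stands in for palloc, and the standard assert macro stands in for Assert.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): return a malloc'd copy of
 * the strxfrm() transformation of val under the current LC_COLLATE locale,
 * or NULL if the allocation fails. The caller must free() the result.
 */
char *
xfrm_dup(const char *val)
{
    size_t      xfrmlen;
    size_t      xfrmlen2;
    char       *xfrmstr;

    /* First call: with size 0 nothing is written, only the length is returned. */
    xfrmlen = strxfrm(NULL, val, 0);

    xfrmstr = malloc(xfrmlen + 1);
    if (xfrmstr == NULL)
        return NULL;

    /* Second call: the buffer is never undersized, so no guessing is involved. */
    xfrmlen2 = strxfrm(xfrmstr, val, xfrmlen + 1);

    /* <= rather than ==, since the second call may report a smaller length. */
    assert(xfrmlen2 <= xfrmlen);
    (void) xfrmlen2;            /* quiet unused-variable warnings under NDEBUG */

    return xfrmstr;
}
```

Comparing two such results with strcmp should order them the same way strcoll orders the original strings. The cost is that every input is transformed twice, which is the price paid here for never handing strxfrm a buffer that might be too small.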