Re: An idea on faster CHAR field indexing
From | Randall Parker |
---|---|
Subject | Re: An idea on faster CHAR field indexing |
Date | |
Msg-id | 21130920118861@mail.nls.net |
In reply to | An idea on faster CHAR field indexing ("Randall Parker" <randall@nls.net>) |
Responses | Re: An idea on faster CHAR field indexing |
List | pgsql-hackers |
Giles,

I'm curious as to why the need for multiple passes. Is that true even in Latin 1 code pages? If not, this optimization could at least be used for code pages that don't require multiple passes.

As for memory usage: I don't see the issue here. The translation to some collation sequence has to be done anyhow. Writing one's own routine to do look-ups into a collation sequence table is a fairly trivial exercise.

One would have the option with SBCS code pages to either translate to 8 bit collation values or to translate them into master Unicode collation values. Not sure what the advantage would be of doing the latter. I only see it as useful if you have different rows storing text in different code pages and then only if the RDBMS can know for a given field on a per row basis what its code page is.

On Thu, 22 Jun 2000 06:59:06 +1000, Giles Lean wrote:

>> So let me cut to the chase: I'm thinking that rather than store the
>> actual character sequence of each field (or some subset of a field)
>> in an index why not translate the characters into their collation
>> sequence values and store _those_ in the index?
>
>This is not an obvious win, since:
>
>1. some collations rules require multiple passes over the data
>
>2. POSIX strxfrm() will convert strings of characters to a form that
>   can be compared by strcmp() [i.e. single pass] but tends to greatly
>   increase memory requirements
>
>   I've only data for one implementation of strxfrm(), but the memory
>   usage startled me. In my application it was faster to use
>   strcoll() directly for collation than to pre-expand the data with
>   strxfrm().
>
>Regards,
>
>Giles
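
To make the table look-up idea in the message above concrete, here is a minimal sketch in Python, not anything taken from PostgreSQL. It assumes a single-byte (Latin 1) code page and a one-level collation. The function names `build_collation_table` and `sort_key` are hypothetical, and the weights are invented for illustration: letters get case-insensitive weights 1 to 26, and every other byte sorts after them in code-point order. No real locale is claimed to behave this way.

```python
def build_collation_table() -> bytes:
    """Return a 256-entry table mapping each byte to an 8-bit collation weight.

    The weights are made up for this example; a real table would come
    from the locale definition of the code page in use.
    """
    table = bytearray(256)
    next_weight = 27  # weights 1..26 are reserved for the letters a..z
    for byte in range(256):
        ch = chr(byte)
        if "a" <= ch <= "z":
            table[byte] = byte - ord("a") + 1
        elif "A" <= ch <= "Z":
            # Upper case shares the weight of its lower-case letter.
            table[byte] = byte - ord("A") + 1
        else:
            table[byte] = next_weight
            next_weight += 1
    return bytes(table)


def sort_key(text: str, table: bytes, encoding: str = "latin-1") -> bytes:
    """Translate text into collation weights, one weight per input byte.

    The result can be compared bytewise, which is the kind of value
    that could be stored in an index instead of the raw characters.
    """
    return text.encode(encoding).translate(table)


if __name__ == "__main__":
    table = build_collation_table()
    words = ["banana", "Cherry", "apple"]
    # Raw code-point order versus order by translated keys.
    print(sorted(words))
    print(sorted(words, key=lambda w: sort_key(w, table)))
```

With a one-level table like this, the key is exactly as long as the input, so there is no memory expansion. The cost is that "Apple" and "apple" translate to identical keys. A collation that must break such ties on a second level (case, accents) needs either another pass over the data or a longer key, which may be where the multiple passes and the strxfrm() growth mentioned in the quoted message come from.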