Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing?
From | Stuart Woolford |
---|---|
Subject | Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing? |
Date | |
Msg-id | 99110613104200.00731@test.macmillan.co.nz |
In response to | Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing? (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-general |
Firstly, damn, you guys are good; please accept my strongest compliments for the response time on this issue!

On Sat, 06 Nov 1999, Tom Lane wrote:
> "Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> > Reviewing my email logs from June, most of the work on this has to do with
> > people who needs locales, and potentially multibyte character sets. Tom
> > Lane is of the opinion that this particular optimization needs to be moved
> > out of the parser, and deeper into the planner or optimizer/rewriter,
> > so a good fix may be some ways out.
>
> Actually, that part is already done: addition of the index-enabling
> comparisons is gone from the parser and is now done in the optimizer,
> which has a whole bunch of benefits (one being that the comparison
> clauses don't get added to the query unless they are actually used
> with an index!).
>
> But the underlying LOCALE problem still remains: I don't know a good
> character-set-independent method for generating a "just a little bit
> larger" string to use as the righthand limit. If anyone out there is
> an expert on foreign and multibyte character sets, some help would
> be appreciated. Basically, given that we know the LIKE or regex
> pattern can only match values beginning with FOO, we want to generate
> string comparisons that select out the range of values that begin with
> FOO (or, at worst, a slightly larger range). In USASCII locale it's not
> hard: you can do
> field >= 'FOO' AND field < 'FOP'
> but it's not immediately obvious how to make this idea work reliably
> in the presence of odd collation orders or multibyte characters...

How about something along the lines of:

field >= 'FOO' and field = 'FOO.*'

i.e., start the index scan at 'FOO' and terminate it once a value fails to match the static left-hand side followed by anything (although I have the feeling this does not fit into your execution system...), with a simple regex-type check added to the scan validation code?
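To make the two approaches concrete, here is a minimal Python sketch. It is only an illustration, not PostgreSQL optimizer code: the function names `ascii_upper_bound` and `prefix_scan` are hypothetical, a sorted Python list stands in for the index, and Python's code-point string ordering stands in for the USASCII collation. The first function builds the "just a little bit larger" right-hand limit (FOO becomes FOP); the second starts at the prefix and stops at the first value that no longer begins with it, so it needs no upper bound at all.

```python
from bisect import bisect_left


def ascii_upper_bound(prefix):
    """Return a string greater than every string starting with `prefix`.

    Assumes plain character-code ordering (USASCII); this is exactly the
    assumption that breaks under other collations. Returns None when no
    finite bound can be built this way.
    """
    chars = list(prefix)
    while chars:
        if ord(chars[-1]) < 0x7F:
            # Bump the last character: 'FOO' -> 'FOP'.
            chars[-1] = chr(ord(chars[-1]) + 1)
            return "".join(chars)
        # Last character is already the ASCII maximum; drop it and retry.
        chars.pop()
    return None


def prefix_scan(sorted_values, prefix):
    """Scan a sorted list from `prefix` until the prefix stops matching.

    `sorted_values` plays the role of the index. No right-hand limit is
    generated; the scan ends at the first non-matching value.
    """
    start = bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break
        matches.append(value)
    return matches
```

Under the ASCII assumption both approaches select the same rows: filtering with `prefix <= v < ascii_upper_bound(prefix)` and calling `prefix_scan` return the same values. The scan variant only relies on matching values being adjacent in index order, which holds for code-point ordering but is itself an assumption for other collations.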
>
> BTW: the \377 hack is actually wrong for USASCII too, since it'll
> exclude a data value like 'FOO\377x' which should be included.

That's why I pointed out that in my particular case I only have alphabetic and numeric data in the database, so it is safe; it's certainly no general solution.

--
------------------------------------------------------------
Stuart Woolford, stuartw@newmail.net
Unix Consultant.
Software Developer.
Supra Club of New Zealand.
------------------------------------------------------------