similarity and operator '%'
От | Volker Boehm |
---|---|
Тема | similarity and operator '%' |
Дата | |
Msg-id | 574C7E37.7080102@vboehm.de обсуждение исходный текст |
Ответы |
Re: similarity and operator '%'
Re: similarity and operator '%' |
Список | pgsql-performance |
Hello, I'm trying to find persons in an address database where I have built trgm-indexes on name, street, zip and city. When I search for all four parts of the address (name, street, zip and city) select name, street, zip, city from addresses where name % $1 and street % $2 and (zip % $3 or city % $4) everything works fine: It takes less than a second to get some (5 - 500) proposed addresses out of 500,000 addresses and the query plan shows Bitmap Heap Scan on addresses (cost=168.31..1993.38 rows=524 ... Recheck Cond: ... -> Bitmap Index Scan on ... Index Cond: ... The same happens when I search only by name with select name, street, zip, city from addresses where name % $1 But when I rewrite this query to select name, street, zip, city from addresses where similarity(name, $1) > 0.3 which means exactly then same as the second example, the query plan changes to Seq Scan on addresses (cost=0.00..149714.42 rows=174675 width=60) Filter: ... and the query lasts about a minute. The reason for using the similarity function in place of the '%'-operator is that I want to use different similarity values in one query: select name, street, zip, city from addresses where name % $1 and street % $2 and (zip % $3 or city % $4) or similarity(name, $1) > 0.8 which means: take all addresses where name, street, zip and city have little similarity _plus_ all addresses where the name matches very good. The only way I found, was to create a temporary table from the first query, change the similarity value with set_limit() and then select the second query UNION the temporary table. Is there a more elegant and straight forward way to achieve this result? regards Volker -- Volker Böhm Tel.: +49 4141 981155 Voßkuhl 5 mailto:volker@vboehm.de 21682 Stade http://www.vboehm.de
В списке pgsql-performance по дате отправления: