Re: Full text search tsv column aproach vs concat confusion
От | Artur Zakirov |
---|---|
Тема | Re: Full text search tsv column aproach vs concat confusion |
Дата | |
Msg-id | d9a43710-b53d-9ef9-4a52-ade354223fda@postgrespro.ru обсуждение исходный текст |
Ответ на | Full text search tsv column aproach vs concat confusion (cen <imbacen@gmail.com>) |
Список | pgsql-general |
Hi, On 16.11.2016 11:54, cen wrote: > Hi > > I am seeking some clarification in regard to full text search across > multiple tables and what the best approach is. Documentation talks about > two approaches when it comes to building a document: on-the-fly concat > of columns and a dedicated tsv column approach. Let's say I want to > perform a search for |"txt1 & txt2 & txt3" on columns table1.col1, > table1.col2 and table2.col1. I see the following solutions:| > > |1. Concat all three into a document and perform a FTS.| > > |SELECT * FROM ( > | > > |SELECTto_tsvector(table1.col1)||to_tsvector(table1.col2)||to_tsvector(table2.col1)asdocument > F|||ROM table1 LEFTJOINtable2 ONtable1.table2_id=table2.id| ) subquery |||WHEREsubquery.document@@to_tsquery(unaccent(?));|| > > |2. Create a tsv column in each table, concat tsv columns and perform > FTS on that.| > > |SELECT*FROMtable1 LEFTJOINtable2 ONtable1.table2_id=table2.id > WHEREtable1.tsv ||tale2.tsv @@to_tsquery(unaccent(?));| > > |3. Have a tsv column only in table1 and insert table2.col1 to the tsv > via triggers. Works but seems very hacky.| > > | > | > > |It seems to me that option #2 is fast and easy to implement but I am > not sure what the concat of tsvs really means from index usage and > performance standpoint. Option #1 is the most flexible and I'd use that > all the time if it was not THAT much slower than tsv column approacj. > Documentation on TSV columns states: "||Another advantage is that > searches will be faster, since it will not be necessary to redo the > to_tsvector calls to verify index matches." > | > > The question is, how much faster are tsv columns really? Are there any > benchmarks about this? If the performance difference is negligible I'd > advocate that using tsv columns is a waste of time and space in most > general cases. But since there is no information on how much faster it's > hard to decide. > I haven't any such benchmarks. But if you have a real database, you can perform tests using it on your solutions. Because it depends on your task and what you need. By the way, I suppose it is better to use COALESCE() function if your columns could have NULL value: SELECT * FROM ( SELECT to_tsvector(coalesce(table1.col1,'')) || to_tsvector(coalesce(table1.col2,'')) || to_tsvector(coalesce(table2.col1,'')) as document FROM table1 LEFT JOIN table2 ON table1.table2_id=table2.id ) subquery WHERE subquery.document @@ to_tsquery(unaccent(?)); And specifying a text search configuration makes queries a little bit faster: ... to_tsvector('english', coalesce(table1.col1,'')) ... -- Artur Zakirov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
В списке pgsql-general по дате отправления: