Re: gsoc, text search selectivity and dllist enhancments
From | Tom Lane |
---|---|
Subject | Re: gsoc, text search selectivity and dllist enhancments |
Date | |
Msg-id | 18435.1215996859@sss.pgh.pa.us |
In response to | Re: gsoc, text search selectivity and dllist enhancments (Jan Urbański <j.urbanski@students.mimuw.edu.pl>) |
Responses | Re: gsoc, text search selectivity and dllist enhancments |
List | pgsql-hackers |
Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
> OK, here's the (hopefully final) version of the typanalyze function for
> tsvectors. It applies to HEAD and passes regression tests.

> I now plan to move towards a selectivity function that'll use the
> gathered statistics.

Applied with some revisions.

Rather than making pg_statistic stakind 4 be specific to tsvector, I thought it'd be better to define it as "most common elements", with the idea that it could be used for array and array-like types as well as tsvector. (I'm not actually planning to go off and make that happen right now, but it seems like a pretty obvious extension.)

I thought it was a bit schizophrenic to repurpose pg_stats.most_common_freqs for element frequencies while creating a separate column for the elements themselves. What I've done for the moment is to define both most_common_vals and most_common_freqs as referring to the elements in the case of tsvector (or anything else that has stakind 4 in place of stakind 1). You could make an argument for inventing *two* new pg_stats columns instead, but I think that is probably overkill; I doubt it'll be useful to have both MCV and MCELEM stats for the same column. This could easily be changed though.

I removed the prune step after the last tsvector. I'm not convinced that the LC algorithm's guarantees still hold if we prune partway through a bucket, and anyway it's far from clear that we'd save enough in the sort step to compensate for more HASH_REMOVE operations. I'm open to being convinced otherwise.

I made some other cosmetic changes, but those were the substantive ones.

regards, tom lane
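
For readers unfamiliar with the "LC algorithm" mentioned above, the following is a minimal sketch of textbook Lossy Counting, assuming the usual formulation in which entries are pruned only when a full bucket has been consumed. It is not the PostgreSQL typanalyze code; the function name `lossy_count`, its parameters, and the dictionary layout are hypothetical and chosen only for illustration. Note that the sketch performs no extra prune after the final, possibly partial, bucket, which mirrors the behaviour described in the message.

```python
import math


def lossy_count(stream, epsilon):
    """Textbook Lossy Counting sketch (illustrative, not PostgreSQL code).

    Returns (counts, n), where counts maps element -> [frequency, delta]
    and n is the number of elements seen.
    """
    bucket_width = math.ceil(1 / epsilon)
    counts = {}        # element -> [observed frequency, maximum undercount]
    n = 0              # elements processed so far
    b_current = 1      # number of the bucket currently being filled

    for elem in stream:
        n += 1
        entry = counts.get(elem)
        if entry is not None:
            entry[0] += 1
        else:
            # A new entry may have been missed in up to b_current - 1
            # earlier buckets, so record that as its delta.
            counts[elem] = [1, b_current - 1]

        # Prune only at a bucket boundary, never partway through a bucket.
        if n % bucket_width == 0:
            stale = [k for k, (f, d) in counts.items() if f + d <= b_current]
            for k in stale:
                del counts[k]
            b_current += 1

    # Deliberately no final prune here: the last bucket may be incomplete.
    return counts, n


counts, n = lossy_count("abacadaeaf", epsilon=0.25)
```

In this run the bucket width is 4, so pruning happens after the 4th and 8th elements. The trailing `f` arrives in an incomplete third bucket and therefore stays in the table until the caller sorts and truncates the results.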