Re: gsoc, text search selectivity and dllist enhancments
From | Tom Lane |
---|---|
Subject | Re: gsoc, text search selectivity and dllist enhancments |
Date | |
Msg-id | 19287.1215728376@sss.pgh.pa.us |
In response to | Re: gsoc, text search selectivity and dllist enhancments (Jan Urbański <j.urbanski@students.mimuw.edu.pl>) |
Responses | Re: gsoc, text search selectivity and dllist enhancments |
List | pgsql-hackers |
Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
> Tom Lane wrote:
>> The way I think it ought to work is that the number of lexemes stored in
>> the final pg_statistic entry is statistics_target times a constant
>> (perhaps 100). I don't like having it vary depending on tsvector width

> I think the existing code puts at most statistics_target elements in a
> pg_statistic tuple. In compute_minimal_stats() num_mcv starts with
> stats->attr->attstattarget and is adjusted only downwards.
> My original thought was to keep that property for tsvectors (i.e. store
> at most statistics_target lexemes) and advise people to set it high for
> their tsvector columns (e.g. 100x their default).

Well, (1) the normal measure would be statistics_target *tsvectors*,
and we'd have to translate that to lexemes somehow; my proposal is just
to use a fixed constant instead of tsvector width as in your original
patch. And (2) storing only statistics_target lexemes would be
uselessly small and would guarantee that people *have to* set a custom
target on tsvector columns to get useful results. Obviously broken
defaults are not my bag.

> Also, the existing code decides which elements are worth storing as most
> common ones by discarding those that are not frequent enough (that's
> where num_mcv can get adjusted downwards). I mimicked that for lexemes
> but maybe it just doesn't make sense?

Well, that's not unreasonable either, if you can come up with a
reasonable definition of "not frequent enough"; but that adds another
variable to the discussion.

regards, tom lane
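
To make the two ideas in this message concrete, here is a rough Python sketch of how the limit and the cutoff could interact. It is not the PostgreSQL implementation (that lives in C, in the ANALYZE code). The function name `most_common_lexemes`, the constant `LEXEMES_PER_TARGET`, and the `min_count` cutoff are all hypothetical; in particular, `min_count` is only one possible reading of "not frequent enough", which the thread leaves undefined.

```python
from collections import Counter

# The "perhaps 100" multiplier floated in the message; an assumption, not a settled value.
LEXEMES_PER_TARGET = 100


def most_common_lexemes(tsvectors, statistics_target, min_count=2):
    """Return (lexeme, frequency) pairs, most frequent first.

    tsvectors: iterable of iterables of lexeme strings, one per sampled row.
    statistics_target: the column's statistics target.
    min_count: hypothetical cutoff; lexemes seen in fewer rows are dropped.
    """
    # A tsvector holds each lexeme at most once, so count rows, not occurrences.
    rows = [set(v) for v in tsvectors]
    if not rows:
        return []

    counts = Counter(lexeme for row in rows for lexeme in row)

    # Upper bound on stored lexemes: target times a fixed constant,
    # independent of how wide the individual tsvectors are.
    limit = statistics_target * LEXEMES_PER_TARGET

    # Sort by descending count, then by lexeme, so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    # The limit can only be adjusted downwards: rare lexemes are discarded.
    kept = [(lex, n) for lex, n in ranked[:limit] if n >= min_count]

    # Frequency is the fraction of sampled rows containing the lexeme.
    return [(lex, n / len(rows)) for lex, n in kept]
```

For example, calling `most_common_lexemes([["cat", "dog"], ["cat", "fish"]], statistics_target=10)` allows up to 1000 lexemes but keeps only `cat`, because the other lexemes appear in a single row each.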