Re: Collect frequency statistics for arrays
From | Nathan Boley |
---|---|
Subject | Re: Collect frequency statistics for arrays |
Date | |
Msg-id | CAHetpQSc-fyp6PyvAXGFgWLEDYswsG6ydAJ9bUdrrd-WBepgsw@mail.gmail.com |
In response to | Re: Collect frequency statistics for arrays (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
[ sorry Tom, reply all this time... ]

> What do you mean by "storing sequences as arrays"?

So, a simple example is, for transcripts ( sequences of DNA that are turned into proteins ), we store each of the connected components as an array of the form:

exon_type in [1,6]
splice_type = [1,3]

and then the array elements are

[ exon_type, splice_type, exon_type ]

~ 99% of the elements are of the form [ [1,3], 1, [1,3] ], so I almost always get a hash or merge join ( correctly ) but for the rare junction types ( which are usually more interesting as well ) I correctly get nest loops with an index scan.

> Can you demonstrate
> that the existing stats are relevant at all to the query you're worried
> about?

Well, if we didn't have mcv's and just relied on ndistinct to estimate the '=' selectivities, either my low selectivity quals would use the index, or my high selectivity quals would use a table scan, either of which would be wrong.

I guess I could wipe out the stats and get some real numbers tonight, but I can't see how the planner would be able to distinguish *without* mcv's...

Best,
Nathan
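
The argument about mcv's versus ndistinct can be made concrete with a rough sketch. The Python below is a simplified model of '=' selectivity estimation, not PostgreSQL's actual code: the function name `eq_selectivity`, the value labels, and the numbers (one value covering 99% of rows, 50 distinct values) are all assumptions chosen to mirror the skew described in the message.

```python
def eq_selectivity(value, mcv_freqs, ndistinct):
    """Estimate the fraction of rows matching `col = value`.

    mcv_freqs: dict mapping most-common values to their observed
               frequencies (hypothetical stand-in for an MCV list).
    ndistinct: estimated number of distinct values in the column.
    """
    # A value found in the MCV list gets its recorded frequency.
    if value in mcv_freqs:
        return mcv_freqs[value]
    # Otherwise spread the remaining row mass evenly over the
    # distinct values that are not in the MCV list.
    rest_mass = 1.0 - sum(mcv_freqs.values())
    rest_distinct = ndistinct - len(mcv_freqs)
    return rest_mass / rest_distinct if rest_distinct > 0 else 0.0


# Assumed numbers for illustration only.
NDISTINCT = 50
MCVS = {"common_junction": 0.99}

# Without mcv's every value looks the same: 1 / ndistinct.
no_mcv_common = eq_selectivity("common_junction", {}, NDISTINCT)
no_mcv_rare = eq_selectivity("rare_junction", {}, NDISTINCT)

# With mcv's the common value and the rare value are told apart.
with_mcv_common = eq_selectivity("common_junction", MCVS, NDISTINCT)
with_mcv_rare = eq_selectivity("rare_junction", MCVS, NDISTINCT)
```

Under these assumptions, the estimate without mcv's is identical for the common and the rare junction type, so a planner working from it would pick the same plan shape for both. With the MCV entry, the two estimates differ by several orders of magnitude, which is what would allow a hash or merge join for the common case and an index scan for the rare one.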