Re: [SQL] querying with index on jsonb slower than standard column. Why?
От | Adrian Klaver |
---|---|
Тема | Re: [SQL] querying with index on jsonb slower than standard column. Why? |
Дата | |
Msg-id | 548614C8.90203@aklaver.com обсуждение исходный текст |
Ответ на | Re: [SQL] querying with index on jsonb slower than standard column. Why? (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: [SQL] querying with index on jsonb slower than standard column. Why?
Re: [SQL] querying with index on jsonb slower than standard column. Why? |
Список | pgsql-performance |
On 12/08/2014 12:53 PM, Tom Lane wrote: > I wrote: >> Adrian Klaver <adrian.klaver@aklaver.com> writes: >>> Seems work_mem is the key: > >> Fascinating. So there's some bad behavior in the lossy-bitmap stuff >> that's exposed by one case but not the other. > > Meh. I was overthinking it. A bit of investigation with oprofile exposed > the true cause of the problem: whenever the bitmap goes lossy, we have to > execute the "recheck" condition for each tuple in the page(s) that the > bitmap has a lossy reference to. So in the fast case we are talking about > > Recheck Cond: ((assay1_ic50 > 90::double precision) AND (assay2_ic50 < 10::double precision)) > > which involves little except pulling the float8 values out of the tuple > and executing float8gt and float8lt. In the slow case we have got > > Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double precision > 90::double precision) AND (((data ->> 'assay2_ic50'::text))::doubleprecision < 10::double precision)) > > which means we have to pull the JSONB value out of the tuple, search > it to find the 'assay1_ic50' key, convert the associated value to text > (which is not exactly cheap because *the value is stored as a numeric*), > then reparse that text string into a float8, after which we can use > float8gt. And then probably do an equivalent amount of work on the way > to making the other comparison. > > So this says nothing much about the lossy-bitmap code, and a lot about > how the JSONB code isn't very well optimized yet. In particular, the > decision not to provide an operator that could extract a numeric field > without conversion to text is looking pretty bad here. > I think I understand the above. I redid the test on my 32-bit machine, setting work_mem=16MB, and I got comparable results to what I saw on the 64-bit machine. So, what I am still am puzzled by is why work_mem seems to make the two paths equivalent in time?: Fast case, assay1_ic50 > 90 and assay2_ic50 < 10: 1183.997 ms Slow case, (data->>'assay1_ic50')::float > 90 and (data->>'assay2_ic50')::float < 10;: 1190.187 ms > > regards, tom lane > > -- Adrian Klaver adrian.klaver@aklaver.com
В списке pgsql-performance по дате отправления: