Re: tablesample performance
From | Francisco Olarte |
---|---|
Subject | Re: tablesample performance |
Date | |
Msg-id | CA+bJJbxfQym-n8N5w5N56GF6QCmB2SoT2ySnysPFAad-D2EBcg@mail.gmail.com |
In reply to | Re: tablesample performance (Andy Colson <andy@squeakycode.net>) |
List | pgsql-general |
Andy:

On Tue, Oct 18, 2016 at 7:17 PM, Andy Colson <andy@squeakycode.net> wrote:
> Ah, yes, you're right, there is a bit of a difference there.
>
> Speed wise:
> 1) select one from ones order by random() limit 1;
>> about 360ms
> 2) select one from ones tablesample bernoulli(1) limit 1 ;
>> about 4ms
> 3) select one from ones tablesample bernoulli(1) order by random() limit 1;
>> about 80ms

Expected. It would have been nice if you had provided some table structure / size data.

>
> Using the third option in batch, I'm getting about 15 transactions a second.
>
> Oddly:
> select one from ones tablesample bernoulli(0.25) order by random()
> takes almost 80ms also.

Hmm, it depends a lot on your total rows and average rows per page.

> bernoulli(0.25) returns 3k rows
> bernoulli(1) returns 14k rows

This hints at about 1.4M rows (14k / 1%). If your rows are small and you have more than 400 rows per page, I would expect that, as a 0.25% sample would hit every page.

Tom hinted at an extension. Also, if you are in a function (which can loop) you can do a little trick: instead of bernoulli(1), use bernoulli(N/table_size). This way you will select very few rows and speed up the last phase.

Anyway, I fear bernoulli must read the whole table too, to be able to discard rows randomly, so you may not gain anything. (I would compare the query time against a simple 'count(one)' query, to get a benchmark of how much time the server spends reading the table. I would bet on 'about 80 ms'.)

Francisco Olarte.
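
The arithmetic behind the estimates above can be sketched as follows. This is only an illustration: the function names and the target of 10 rows are assumptions made for the example and do not come from the thread. The row counts (14k rows at 1%) and the 400 rows per page figure are the ones quoted above.

```python
def estimate_table_rows(sample_rows, sample_percent):
    """Estimate total rows from the size of a Bernoulli sample.

    sample_percent is a percentage (1 means 1%), as in tablesample bernoulli(1).
    """
    return sample_rows * 100 / sample_percent


def percent_for_target_rows(target_rows, table_rows):
    """Sampling percentage expected to return about target_rows rows.

    Hypothetical helper: clamps the result to 100 because a sample
    cannot exceed the whole table.
    """
    return min(100.0, 100.0 * target_rows / table_rows)


def expected_sampled_rows_per_page(sample_percent, rows_per_page):
    """Average number of sampled rows on each page."""
    return rows_per_page * sample_percent / 100


# 14k rows returned by bernoulli(1) suggests roughly 1.4M rows in the table.
table_rows = estimate_table_rows(14_000, 1.0)

# Percentage to request if only about 10 rows are wanted (assumed target).
small_percent = percent_for_target_rows(10, table_rows)

# With 400 rows per page, a 0.25% sample averages one sampled row per page.
rows_per_page_hit = expected_sampled_rows_per_page(0.25, 400)

print(table_rows, small_percent, rows_per_page_hit)
```

Note that the argument to bernoulli() is a percentage, so N/table_size has to be multiplied by 100 before it is passed in. Whether the smaller sample actually runs faster would still need to be measured, since the table may have to be read in full either way.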