Re: [GENERAL] Large DB
From | Tom Lane |
---|---|
Subject | Re: [GENERAL] Large DB |
Date | |
Msg-id | 10036.1080953867@sss.pgh.pa.us |
In response to | Re: [GENERAL] Large DB (Manfred Koizar <mkoi-pg@aon.at>) |
Responses | Re: [GENERAL] Large DB |
List | pgsql-hackers |
Manfred Koizar <mkoi-pg@aon.at> writes:
>> You'd run the Vitter
>> algorithm separately to decide whether to keep or discard each live row
>> you find in the blocks you read.

> You mean once a block is sampled we inspect it in any case? This was
> not the way I had planned to do it, but I'll keep this idea in mind.

Well, once we've gone to the trouble of reading in a block we definitely want to count the tuples in it, for the purposes of extrapolating the total number of tuples in the relation. Given that, I think the most painless route is simply to use the Vitter algorithm with the number-of-tuples-scanned as the count variable. You could dump the logic in acquire_sample_rows that tries to estimate where to read the N'th tuple from.

If you like I can send you the Vitter paper off-list (I have a PDF of it). The comments in the code are not really intended to teach someone what it's good for ...

regards, tom lane
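
For readers following the thread, here is a rough Python sketch of the idea discussed above: every live tuple in every block that was read is counted, and a reservoir decides whether each one is kept. It is not the PostgreSQL code. The names `sample_live_rows` and `estimate_total_rows` and the `(row, is_live)` block layout are assumptions made for illustration, and the sketch uses plain one-at-a-time reservoir replacement (often called Algorithm R) rather than the faster skip-based variant, with the number of live tuples scanned so far as the count variable.

```python
import random


def sample_live_rows(blocks, target_rows, rng=None):
    """Reservoir-sample live rows from blocks that have already been read.

    blocks: iterable of blocks; each block is a list of (row, is_live) pairs
            (a hypothetical layout, chosen only for this sketch).
    Returns (sample, live_rows_seen, blocks_read).
    """
    rng = rng or random.Random()
    sample = []
    live_rows_seen = 0  # the count variable: live tuples scanned so far
    blocks_read = 0

    for block in blocks:
        blocks_read += 1
        for row, is_live in block:
            if not is_live:
                continue  # dead rows are neither counted nor sampled
            live_rows_seen += 1
            if len(sample) < target_rows:
                # Reservoir not yet full: keep every live row.
                sample.append(row)
            else:
                # Keep this row with probability target_rows / live_rows_seen,
                # replacing a uniformly chosen reservoir slot.
                j = rng.randrange(live_rows_seen)
                if j < target_rows:
                    sample[j] = row

    return sample, live_rows_seen, blocks_read


def estimate_total_rows(live_rows_seen, blocks_read, total_blocks):
    """Extrapolate the relation's live row count from the blocks read."""
    if blocks_read == 0:
        return 0.0
    return live_rows_seen / blocks_read * total_blocks
```

Because every live tuple in a sampled block passes through the loop, the same pass yields both the row sample and the per-block tuple density used for the extrapolation, so there is no need to guess in advance where the N'th tuple lives.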