Re: PoC Refactor AM analyse API
From | Andrey Borodin |
---|---|
Subject | Re: PoC Refactor AM analyse API |
Date | |
Msg-id | DC63AF76-FF80-4A5A-8AD4-89D807DB1E9B@yandex-team.ru |
In reply to | Re: PoC Refactor AM analyse API (Denis Smirnov <sd@arenadata.io>) |
Responses | Re: PoC Refactor AM analyse API |
List | pgsql-hackers |
> On 8 Dec 2020, at 16:44, Denis Smirnov <sd@arenadata.io> wrote:
>
> Andrey, thanks for your feedback!
>
> I agree that AMs with fix sized blocks can have much alike code in acquire_sample_rows() (though it is not a rule). But there are several points about current master sampling.
>
> * It is not perfect - AM developers may want to improve it with other sampling algorithms.
> * It is designed with a big influence of heap AM - for example, RelationGetNumberOfBlocks() returns uint32 while other AMs can have a bigger amount of blocks.
> * heapam_acquire_sample_rows() is a small function - I don't think it is not a big trouble to write something alike for any AM developer.
> * Some AMs may have a single level sampling (only algorithm Z from Vitter for example) - why not?
>
> As a result we get a single and clear method to acquire rows for statistics. If we don’t modify but rather extend current API (for example in a manner it is done for FDW) the code becomes more complicated and difficult to understand.

This makes sense. The purpose of the API is to provide a flexible abstraction. The current table_scan_analyze_next_block()/table_scan_analyze_next_tuple() API assumes too much about the AM implementation.

But why do you pass int natts and VacAttrStats **stats to acquire_sample_rows()? Is it of any use? It seems to break the abstraction too.

Best regards, Andrey Borodin.