Re: Super Optimizing Postgres
| From | Hannu Krosing |
|---|---|
| Subject | Re: Super Optimizing Postgres |
| Date | |
| Msg-id | 3BF7578F.7000108@sid.tm.ee |
| In reply to | Re: Super Optimizing Postgres (Bruce Momjian <pgman@candle.pha.pa.us>) |
| List | pgsql-hackers |
Justin Clift wrote:

>> In my example, two computers with exactly the same hardware, except one
>> has a 5400 RPM IDE drive, the other has a 10,000 RPM IDE drive. These
>> machines should not use the same settings; it is obvious that a
>> sequential scan block read on one will be faster than the other.
>
> If we're going to do this bit properly, then we'll have to take into
> consideration that many database objects will need their own individual
> statistics. For example, let's say we have a database with a bunch of
> 10k rpm SCSI drives on which the tables are kept, and the system also
> has one or more 15k rpm SCSI drives (say, Seagate Cheetah II drives) on
> which the indices have been placed. With the 10k rpm drives, the tables
> needing the fastest throughput or having the highest usage are put on
> the outer edges of the disk media, and the rest of the tables are placed
> in the available space.
>
> On this theoretical system, we will be better off measuring the
> performance of each table and index in turn, then generating and storing
> costs for each one which are as "accurate as possible at this point in
> time".

That would mean that these statistics must be stored in pg_class and not
be SET variables at all.

This will probably have the added benefit that some caching effects of
small/big tables will be accounted for automatically, so you don't have
to do that in the optimizer.

> A model like this would probably have these costs re-calculated each
> time the ANALYZE command is run to ensure their accuracy through
> database growth and changes.

Then ANALYZE should be run on both tables and indexes. AFAIK we currently
analyze only the real data.

> I think this would be decently accurate, and RAID systems would be
> accurately analysed. Don't know how to take into account large cache
> sizes though. :)

Maybe some volatile statistics on how much of a table "may be" cached in
the disk/fs caches, assuming that we currently know how much of each is
in shared memory.

-----------
Hannu
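To make the idea concrete, here is a minimal SQL sketch of what per-relation cost statistics could look like. The table and column names (relation_io_costs, seq_cost, random_cost, cached_frac) are hypothetical illustrations, not existing pg_class columns; a real implementation would extend the system catalog in the backend source rather than use plain DDL.

```sql
-- Hypothetical sketch only: a side table keyed by relation OID, standing in
-- for the extra per-relation cost columns discussed above (pg_class itself
-- cannot be extended with plain DDL; a real patch would change the catalog).
CREATE TABLE relation_io_costs (
    relid       oid PRIMARY KEY,  -- pg_class.oid of the table or index
    seq_cost    float4,           -- measured cost of a sequential block read
    random_cost float4,           -- measured cost of a random block read
    cached_frac float4            -- volatile guess: fraction already in the disk/fs cache
);

-- A timing pass run alongside ANALYZE could refresh the numbers for every
-- table *and* index, instead of relying on one global random_page_cost:
UPDATE relation_io_costs r
   SET random_cost = 2.5          -- e.g. measured on the 15k rpm drive holding the indices
  FROM pg_class c
 WHERE c.oid = r.relid
   AND c.relkind = 'i';
```

With costs keyed per relation like this, the cheaper sequential reads of the drive the indices live on would be reflected automatically, without a single SET variable having to describe both kinds of disk at once.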