New/Revised TODO? Gathering actual read performance data for use by planner

From	Michael Nolan
Subject	New/Revised TODO? Gathering actual read performance data for use by planner
Date	May 24, 2011, 17:34:32
Msg-id	BANLkTi=tNr6EBAObv_t-KLTwREZWKAhTYw@mail.gmail.com
Responses	Re: New/Revised TODO? Gathering actual read performance data for use by planner
List	pgsql-hackers


In the TODO list is this item:

    Modify the planner to better estimate caching effects

Tom mentioned this in his presentation at PGCON, and I also chatted with Tom about it briefly afterwards.

Based on last year's discussion of this TODO item, it seems thoughts have been focused on estimating how much data is being satisfied from PG's shared buffers. However, I think that's only part of the problem.

Specifically, read performance is going to be affected by:

1. Reads fulfilled from shared buffers.
2. Reads fulfilled from system cache.
3. Reads fulfilled from disk controller cache.
4. Reads from physical media.

#4 is further complicated by the type of physical media for that specific block. For example, reads that can be fulfilled from an SSD are going to be much faster than ones that access hard drives (or even slower types of media).

System load is going to impact all of these as well.

Therefore, I suggest that an alternative to the above TODO may be to gather performance data without knowing (or, more importantly, without needing to know) which of the above sources fulfilled the read.

This data would probably need to be kept separately for each table or index, as some tables or indexes may be mostly or fully in cache or on faster physical media than others, although in the absence of other data about a specific table or index, data about other relations in the same tablespace might be of some use.

Tom mentioned that the cost of doing multiple system time-of-day calls for each block read might be prohibitive. It may also be that the data is too coarse on some systems to be truly useful (e.g., the epoch time in seconds).

If this data were available, successive plans for the same query could differ significantly (and thus so could actual performance), based on what has happened recently, so these statistics would have to be relatively short term and updated frequently, but without becoming computational bottlenecks.

The problem is one I'm interested in working on.
--
Mike Nolan
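
To make the idea above concrete, here is a rough sketch in Python rather than PostgreSQL's C. It is only an illustration of the bookkeeping, not a proposed implementation: every name in it (ReadTimingStats, timed_read, the alpha smoothing factor) is hypothetical, and the exponentially weighted average is just one assumed way to keep the statistics short term and cheap to update. It times each read without caring which cache or device satisfied it, keeps a running figure per relation, and falls back to the figure for the tablespace when a relation has no data yet.

```python
import time


class ReadTimingStats:
    """Hypothetical per-relation read-latency tracker (not PostgreSQL code)."""

    def __init__(self, alpha=0.2):
        # alpha is an assumed smoothing factor: larger values weight
        # recent reads more heavily, so old history fades faster.
        self.alpha = alpha
        self.by_relation = {}
        self.by_tablespace = {}

    def _update(self, table, key, elapsed_ns):
        # Exponentially weighted moving average: O(1) work per read.
        old = table.get(key)
        if old is None:
            table[key] = elapsed_ns
        else:
            table[key] = old + self.alpha * (elapsed_ns - old)

    def record(self, tablespace, relation, elapsed_ns):
        # The source of the read (shared buffers, OS cache, controller
        # cache, physical media) is deliberately not recorded.
        self._update(self.by_relation, relation, elapsed_ns)
        self._update(self.by_tablespace, tablespace, elapsed_ns)

    def estimate(self, tablespace, relation):
        # Prefer data for the relation itself; otherwise fall back to
        # other relations in the same tablespace; None if nothing is known.
        if relation in self.by_relation:
            return self.by_relation[relation]
        return self.by_tablespace.get(tablespace)


def timed_read(stats, tablespace, relation, read_block):
    # read_block is any zero-argument callable that performs the read.
    start = time.perf_counter_ns()
    data = read_block()
    stats.record(tablespace, relation, time.perf_counter_ns() - start)
    return data
```

The sketch uses a nanosecond monotonic clock, which sidesteps the coarse-timer concern but not the per-read cost Tom raised; sampling only a fraction of reads would be one way to trade accuracy for overhead.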
