Re: Questions about indexes?
From | Daniel Kalchev
---|---
Subject | Re: Questions about indexes?
Date |
Msg-id | 200302170954.h1H9seY14536@dcave.digsys.bg
In reply to | Re: Questions about indexes? (Ryan Bradetich <rbradetich@uswest.net>)
Responses | Re: Questions about indexes? (Christopher Kings-Lynne <chriskl@familyhealth.com.au>)
List | pgsql-hackers
>>> Ryan Bradetich said:
> the table would look like:
> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
> 2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.
> etc...
>
> So I do not need the anomaly to be part of the index, I only need it to
>
> I agree with you, that I would not normally add the anomaly to the
> index, except for the unique row requirement. Thinking about it now,
> maybe I should guarantee unique rows via a check constraint...
>
> Thanks for making me think about this in a different way!

(sorry this is a bit long)

Ryan,

I use somewhat similarly structured data (archived records of various events), and when the database was set up (back when this baby was called postgres95), I too used indexes on all possible fields.

My database consists of an 'operations' table, which holds data for the last x days (for example), and several tables with archived records (per month, or per year - see later). The operations table can have frequent updates, which add new data. Data is never modified, but lookups are made often. The archived tables are generated once and forever from the operations table (possibly merging them in the future, but I haven't yet made up my mind on this) - after that, access is read-only, although sufficiently frequent.

What I found over the many years of operating this database on different PostgreSQL versions and hardware is that indexes have considerable cost. :) So does the need to not miss anything from the operations table (that is, collect data from many places and have it all in there). I ended up with only a few indexes on the operations table, because the processes that fill it up do minimal lookups to see if the data is already in the table, and if not, do inserts. Then at regular intervals the table is cleaned up - that is, a process to remove the duplicates is run.
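The periodic duplicate-removal pass Daniel describes might look something like the following sketch. The table and column names are assumptions based on Ryan's example layout, not anything from the original thread; the OID comparison reflects the pre-8.1 era Daniel is writing in (on modern PostgreSQL, where user tables no longer have OIDs, `ctid` can play the same tie-breaking role):

```sql
-- Hypothetical schema matching Ryan's example rows:
--   host_id | logged_at | catalog | anomaly
-- Delete every duplicate row, keeping the copy with the lowest oid.
DELETE FROM operations o
WHERE EXISTS (
    SELECT 1
    FROM operations d
    WHERE d.host_id   = o.host_id
      AND d.logged_at = o.logged_at
      AND d.catalog   = o.catalog
      AND d.anomaly   = o.anomaly
      AND d.oid       < o.oid       -- on OID-less tables, use d.ctid < o.ctid
);
```

Because the cleanup runs as a single statement, it can be scheduled at off-peak intervals without slowing down the insert path at all.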
This unfortunately costs OIDs, but I found no other reasonable way to do the fast inserts. Perhaps the best way is to create the table WITHOUT OIDS (but wouldn't this still waste OIDs?), use COPY, and then clean up afterwards?

The archived tables are generated, then cleaned up. Then, as Tom suggested, indexes are put on the archived tables, only for the fields that are used in queries. Once such a table is created, no duplicated data can appear in it, since nothing is ever inserted into it again. Therefore there is no need for UNIQUE index enforcement.

If you need to have one large 'history' table, then perhaps you will just have to do (slow :) selects for each record before each insert, or just insert the data and then run the cleanup process.

Daniel
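The archive-building step above could be sketched like this. Again, the table and column names are assumptions for illustration; the `WITHOUT OIDS` clause matches the era's syntax (it became the default in 8.1 and is a no-op or removed on modern versions):

```sql
-- Build one month's archive table with no OIDs and no indexes yet;
-- SELECT DISTINCT performs the deduplication in a single pass.
CREATE TABLE archive_2003_02 (
    host_id   integer,
    logged_at timestamp,
    catalog   text,
    anomaly   text
) WITHOUT OIDS;

INSERT INTO archive_2003_02
SELECT DISTINCT host_id, logged_at, catalog, anomaly
FROM operations
WHERE logged_at >= '2003-02-01' AND logged_at < '2003-03-01';

-- Indexes go on only after the table is fully populated, and only
-- on the columns queries actually filter by - no UNIQUE needed,
-- since the table is never written to again.
CREATE INDEX archive_2003_02_logged_at_idx ON archive_2003_02 (logged_at);
CREATE INDEX archive_2003_02_catalog_idx   ON archive_2003_02 (catalog);
```

Building the indexes after the bulk load is substantially cheaper than maintaining them during it, which is the core of the approach described here.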