Invalid indexes should not consume update overhead
From | Tomasz Ostrowski |
---|---|
Subject | Invalid indexes should not consume update overhead |
Date | |
Msg-id | 8b57a306-7951-8442-df16-8c581d61b4e6@ato.waw.pl |
Responses |
Re: Invalid indexes should not consume update overhead
Re: Invalid indexes should not consume update overhead |
List | pgsql-bugs |
When an index is built concurrently and the build fails, the index is left in an invalid state. It's basically unusable for anything, but, according to the documentation:

> If a problem arises while scanning the table, such as a deadlock or
> a uniqueness violation in a unique index, the CREATE INDEX command
> will fail but leave behind an "invalid" index. This index will be
> ignored for querying purposes because it might be incomplete; however
> it *will* *still* *consume* *update* *overhead*. The psql \d command
> will report such an index as INVALID

I think this update overhead is actually wasted - there's no way to make use of it, as the only way to make the index usable again is to reindex it or to drop and recreate it.

On the other hand, if invalid indexes had no update overhead, they could actually be useful. Please consider the following situation (taken from the real world):

- I have a very large table with a large number of indexes on a server with a large number of CPUs.
- I need to add a new not null column, and I need to do this in a fairly small maintenance window.
- I tried to simply "alter table tablename add column columnname int not null default 0", but it did not finish in 24 hours - it updates these tens of indexes in a single process, using 100% of a single CPU - unacceptable.
- I mark all the indexes as invalid (currently I'm just saving and dropping them).
- This time adding the column takes an hour.
- I vacuum full or cluster the table, as by then it has bloated to at least 200% - this is also fast, as indexes are not updated.
- I reindex (currently recreate) all indexes, but instead of calculating them on a single CPU I use all available CPUs (I can have like 40 of them on Amazon AWS RDS cheaply if I need them only for a few hours). This also finishes in an hour.
- The world is saved and everybody celebrates.
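The workaround described above can be sketched in SQL. This is only an illustration of the sequence of steps; the table name "bigtable", the index names, and the column name are hypothetical:

```sql
-- 1. Save the index definitions somewhere safe before dropping them;
--    pg_indexes.indexdef holds the full CREATE INDEX statement for each.
SELECT indexdef FROM pg_indexes WHERE tablename = 'bigtable';

-- 2. Drop the indexes so the ALTER TABLE doesn't have to maintain them.
DROP INDEX bigtable_col1_idx, bigtable_col2_idx;

-- 3. Add the column; only the heap is rewritten, with no index maintenance.
ALTER TABLE bigtable ADD COLUMN newcol int NOT NULL DEFAULT 0;

-- 4. Remove the bloat left by the full-table rewrite.
VACUUM FULL bigtable;

-- 5. Recreate the indexes from the saved definitions, running several
--    CREATE INDEX commands in parallel sessions to use all CPUs.
CREATE INDEX bigtable_col1_idx ON bigtable (col1);
CREATE INDEX bigtable_col2_idx ON bigtable (col2);
```

The risk the author points out is in step 1: if the saved definitions are lost between steps 2 and 5, the indexes cannot be easily restored.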
Dropping and recreating indexes is not a very safe operation - their definitions need to be saved somewhere outside the database, and if anything goes wrong it would not be easy to restore them. If it were possible to just set them invalid (feature request in passing), and if invalid indexes added no cost to updates, this would be much easier and safer.

-- 
Tomasz "Tometzky" Ostrowski
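For illustration only: the state the feature request talks about already exists in the pg_index catalog, where indisvalid controls whether an index is used for queries and indisready controls whether updates maintain it. Editing system catalogs directly is unsupported and unsafe - this sketch only shows where such a flag would live, not a recommended procedure:

```sql
-- UNSUPPORTED catalog hack, shown purely to illustrate the requested
-- "mark index invalid" operation; do not run this in production.
UPDATE pg_index
SET indisvalid = false, indisready = false
WHERE indexrelid = 'bigtable_col1_idx'::regclass;

-- After the maintenance work, rebuilding the index makes it valid again.
REINDEX INDEX bigtable_col1_idx;
```

A supported ALTER INDEX ... form of this, paired with skipping update overhead while the flag is off, is essentially what the message asks for.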