Re: Invalid indexes should not consume update overhead
From | Amit Kapila
---|---
Subject | Re: Invalid indexes should not consume update overhead
Date | 
Msg-id | CAA4eK1JNmxL=kKB210GfA14q4T4yAC_U6XPo6yhi9VuBhxEPog@mail.gmail.com
In response to | Invalid indexes should not consume update overhead (Tomasz Ostrowski <tometzky+pg@ato.waw.pl>)
List | pgsql-bugs
On Wed, Jul 13, 2016 at 4:40 PM, Tomasz Ostrowski <tometzky+pg@ato.waw.pl> wrote:
> When an index is built concurrently and the build fails, the index is left
> in an invalid state. It is basically unusable for anything, but, according
> to the documentation:
>
>> If a problem arises while scanning the table, such as a deadlock or
>> a uniqueness violation in a unique index, the CREATE INDEX command
>> will fail but leave behind an "invalid" index. This index will be
>> ignored for querying purposes because it might be incomplete; however
>> it *will* *still* *consume* *update* *overhead*. The psql \d command
>> will report such an index as INVALID
>
> I think this update overhead is actually wasted: there is no way to make
> use of it, as the only way to make the index usable again is to reindex it
> or to drop and recreate it.
>
> On the other hand, if invalid indexes had no update overhead, they could
> actually be useful. Please consider the following situation (taken from
> the real world):
>
> - I have a very large table with a large number of indexes on a server
>   with a large number of CPUs.
> - I need to add a new NOT NULL column, and I need to do this in a fairly
>   small maintenance window.
> - I tried to simply "alter table tablename add column columnname int not
>   null default 0", but it did not finish in 24 hours: it updates these
>   tens of indexes in a single process, using 100% of a single CPU.
>   Unacceptable.
> - I mark all the indexes as invalid (currently I'm just saving and
>   dropping them).
> - This time adding the column takes an hour.
> - I VACUUM FULL or CLUSTER the table, as by now it has bloated to at
>   least 200%. This is also fast, as the indexes are not updated.
> - I reindex (currently recreate) all the indexes, but instead of
>   calculating them on a single CPU I use all available CPUs (I can have,
>   say, 40 of them on Amazon AWS RDS cheaply if I need them for only a few
>   hours). This also finishes in an hour.
> - The world is saved and everybody celebrates.
>
> Dropping and recreating indexes is not a very safe operation: their
> definitions need to be saved somewhere outside the database, and if
> anything goes wrong it would not be easy to restore them. If it were
> possible to just set them invalid (feature request in passing), and if
> invalid indexes added no cost on updates, this would be much easier and
> safer.

How can you consider marking an invalid index as valid without reindexing it? It is quite possible that in the meantime the table has been updated.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
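[Editorial note: the drop-and-recreate workaround described in the quoted report can be sketched roughly as below. Table, column, and index names are hypothetical; the "mark invalid without dropping" step the poster asks for has no supported SQL command, which is the point of the request.]

```sql
-- Hypothetical names throughout; a sketch of the current workaround,
-- not the requested "set index invalid" feature.

-- 1. Save the index definitions somewhere outside the database,
--    then drop the indexes.
SELECT indexdef FROM pg_indexes WHERE tablename = 'big_table';
DROP INDEX big_table_col1_idx, big_table_col2_idx;

-- 2. Add the column; with no indexes to maintain, the table rewrite
--    is far faster than with tens of indexes updated in one backend.
ALTER TABLE big_table ADD COLUMN new_col int NOT NULL DEFAULT 0;

-- 3. Rewrite the bloated table.
VACUUM FULL big_table;

-- 4. Recreate the indexes, issuing each CREATE INDEX from a separate
--    session so the builds run on separate CPUs in parallel.
CREATE INDEX CONCURRENTLY big_table_col1_idx ON big_table (col1);
CREATE INDEX CONCURRENTLY big_table_col2_idx ON big_table (col2);
```

The fragility the poster objects to is in step 1: if the saved `indexdef` output is lost or a recreate fails mid-way, the original index set is not easily restored.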