Обсуждение: Optimization of partial index creation for a new column
Hello hackers, I have a feature suggestion to optimize the creation of a partial index. In some cases, it is possible to skip a full table scan. For example: BEGIN; ALTER TABLE test ADD COLUMN (newcol int); CREATE INDEX newindex ON test(newcol) WHERE newcol IS NOT NULL; COMMIT; In this case, the values of all columns in the new table will be NULL, and there is no point in checking the conditions for each row. The same optimization can be done for default values in NOT NULL columns. How complex would a patch with this optimization be? Will it be necessary to add any attributes to the table metadata, or can we check that the previous command was CREATE TABLE when creating an index?
Hi, On 2025-12-14 23:13:49 +0300, Александр Асафов wrote: > I have a feature suggestion to optimize the creation of a partial > index. In some cases, it is possible to skip a full table scan. For > example: > > BEGIN; > ALTER TABLE test ADD COLUMN (newcol int); > CREATE INDEX newindex ON test(newcol) WHERE newcol IS NOT NULL; > COMMIT; > > In this case, the values of all columns in the new table will be NULL, > and there is no point in checking the conditions for each row. The > same optimization can be done for default values in NOT NULL columns. > > How complex would a patch with this optimization be? Will it be > necessary to add any attributes to the table metadata, or can we check > that the previous command was CREATE TABLE when creating an index? I doubt this is feasible, or at least that it's going to be worth the complexity. You'd have to make sure that the transaction didn't actually insert any rows, and update some in-memory or catalog state if it did. With subtransactions etc that could get somewhat complicated. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2025-12-14 23:13:49 +0300, Александр Асафов wrote:
>> I have a feature suggestion to optimize the creation of a partial
>> index. In some cases, it is possible to skip a full table scan. For
>> example:
>> ...
>> How complex would a patch with this optimization be? Will it be
>> necessary to add any attributes to the table metadata, or can we check
>> that the previous command was CREATE TABLE when creating an index?
> I doubt this is feasible, or at least that it's going to be worth the
> complexity. You'd have to make sure that the transaction didn't actually
> insert any rows, and update some in-memory or catalog state if it did. With
> subtransactions etc that could get somewhat complicated.
Even if it could be made to work, I think it likely would be a net
negative for performance. You're suggesting that we add mechanism to
track whether any insertion/update has happened since an ALTER ADD
COLUMN, which means new overhead for every INSERT and UPDATE. Maybe
not a huge amount of overhead, but probably not trivial. That
overhead would only be repaid in the rather unlikely case where a
suitable index gets added soon after the ALTER. Unless you can find
some substantially-more-common scenarios where the overhead would
produce a benefit, it's hard to believe we'd accept such a patch.
regards, tom lane