Обсуждение: Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)
Hello! One more thing (maybe I missed it in the patch, but anyway) - should we add some migration to ensure what old databases will get enabled=true by default after upgrade? Best regards, Mikhail.
> Should this not behave like if you drop (or create) an index > during a prepared statement? I have not yet looked closely at > this code to see what could be done. > > Regards, I looked at this a bit more and ATExecEnableDisableIndex needs some tweaks. What should be getting invalidated in the heap relation that the index is on and not the index relation as it is in the current patch. You can retrieve the heap relation oid IndexGetRelation(indexOid, false) and the CacheInvalidateRelcache should be on the heap relation. The planner needs to only care about the heap relation invalidation to re-plan across multiple executions of a prepared statement. There should be a test for this scenario as well. Regards, Sami
On Mon, Dec 30, 2024 at 3:48 PM Michail Nikolaev <michail.nikolaev@gmail.com> wrote:
Hello!
One more thing (maybe I missed it in the patch, but anyway) - should we
add some migration to ensure what old databases will get enabled=true by
default after upgrade?
Hi!
Thank you! I tested this by manually upgrading (using pg_upgrade) from master to the build from the branch, which ensures that post-upgrade the column for indisenabled is true by default. I also backed it up with bool indisenabled BKI_DEFAULT(t); in pg_index.h. Additionally, I tested upgrading from an old data directory to the new one (both on this patch) to ensure indexes with DISABLE properties are carried over as well on the new data directory/upgrade. For reference the latest patch now is in [1].
Given this is working as expected, would we still need a migration step? (Let me know if I missed something ofc).
For reference here is the setup from my local testing (for reference)
rm -Rf /tmp/pg_data && rm -Rf /tmp/pg_data_new
./configure --prefix=/tmp/pg_install_old && make clean && make -j8 && make install
# Create and init old cluster
/tmp/pg_install_old/bin/initdb -D /tmp/pg_data
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data start
# Create test data
/tmp/pg_install_old/bin/createdb test
/tmp/pg_install_old/bin/psql test -c "CREATE TABLE foo (id int); CREATE INDEX idx_foo ON foo(id) DISABLE;"
# Stop old cluster
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data stop
# Switch branch and build new version
git checkout s/enable-disable-index
./configure --prefix=/tmp/pg_install_new && make clean && make -j8 && make install
# Create new cluster directory
/tmp/pg_install_new/bin/initdb -D /tmp/pg_data_new
# Now run upgrade with different binary locations
/tmp/pg_install_new/bin/pg_upgrade \
-b /tmp/pg_install_old/bin \
-B /tmp/pg_install_new/bin \
-d /tmp/pg_data \
-D /tmp/pg_data_new \
-p 5432 \
-P 5433
rm -Rf /tmp/pg_data && rm -Rf /tmp/pg_data_new
./configure --prefix=/tmp/pg_install_old && make clean && make -j8 && make install
# Create and init old cluster
/tmp/pg_install_old/bin/initdb -D /tmp/pg_data
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data start
# Create test data
/tmp/pg_install_old/bin/createdb test
/tmp/pg_install_old/bin/psql test -c "CREATE TABLE foo (id int); CREATE INDEX idx_foo ON foo(id) DISABLE;"
# Stop old cluster
/tmp/pg_install_old/bin/pg_ctl -D /tmp/pg_data stop
# Switch branch and build new version
git checkout s/enable-disable-index
./configure --prefix=/tmp/pg_install_new && make clean && make -j8 && make install
# Create new cluster directory
/tmp/pg_install_new/bin/initdb -D /tmp/pg_data_new
# Now run upgrade with different binary locations
/tmp/pg_install_new/bin/pg_upgrade \
-b /tmp/pg_install_old/bin \
-B /tmp/pg_install_new/bin \
-d /tmp/pg_data \
-D /tmp/pg_data_new \
-p 5432 \
-P 5433
/tmp/pg_install_new/bin/pg_ctl -D /tmp/pg_data_new start
$ SELECT * FROM pg_index WHERE indexrelid = 'idx_foo'::regclass;
$ SELECT * FROM pg_index WHERE indexrelid = 'idx_foo'::regclass;
Thank you
Shayon
Hello!
> Given this is working as expected, would we still need a migration step?
No, it is clear now. Thanks for explaining.
Best regards,
Mikhail.
+ This is the + default state for newly created indexes. This is not needed in the ALTER INDEX docs, IMO.ss + <para> + Disable the specified index. A disabled index is not used for queries, but it + is still updated when the underlying table data changes and will still be + used to enforce constraints (such as UNIQUE, or PRIMARY KEY constraints). + This can be useful for testing query performance with and without specific + indexes. If performance degrades after disabling an index, it can be easily + re-enabled using <literal>ENABLE</literal>. Before disabling, it's recommended + to check <structname>pg_stat_user_indexes</structname>.<structfield>idx_scan</structfield> + to identify potentially unused indexes. + </para> This got me thinking if dropping the index is the only use case we really care about. For example, you may want to prevent an index that is enforcing a constraint from being used by the planner, but you probably don't want to drop it. In fact, I also think that you may want the index from being used in one part of your application but could potentially benefit other parts of your application. In that case, I can see a GUC that allows you to force the use of a an index that has been CREATED or ALTERED as DISABLED. UNlike the GUC suggested earlier in the thread, this GUC can simply be a boolean to allow the force usage of a DISABLED index. FWIW, Oracle has a similar parameter called OPTIMIZER_USE_INVISIBLE_INDEXES. + underlying table data changes. This can be useful when you want to create + an index without immediately impacting query performance, allowing you to c/performance/planning ?? I have also been thinking about DISABLE as the keyword, and I really don't like it. DISABLE indicates, at least ot me, that the index is not available for either reads or writes. Looking at other engines, Sqlserver uses DISABLE to drop the index data, but keeps the index metadata around. Oracle uses INVISIBLE and MariabDB uses IGNORABLE to provide similar functionality to that being discussed here. I find those keywords to be more appropriate for this purpose. What about if we use HIDDEN instead of DISABLE as the keyword? Regards, Sami
Thanks for the updates patch! >> This got me thinking if dropping the index is the only >> use case we really care about. For example, you may want >> to prevent an index that is enforcing a constraint from >> being used by the planner, but you probably don't want to >> drop it. In fact, I also think that you may want the index >> from being used in one part of your application but could >> potentially benefit other parts of your application. In that >> case, I can see a GUC that allows you to force the use of a >> an index that has been CREATED or ALTERED as DISABLED. >> UNlike the GUC suggested earlier in the thread, this GUC >> can simply be a boolean to allow the force usage of a >> DISABLED index. FWIW, Oracle has a similar parameter called >> OPTIMIZER_USE_INVISIBLE_INDEXES. > > > I totally see where you are coming from. Some rough thoughts/notes: > > - The patch/proposed feature today doesn't disable constraints, like uniqueness. It only impacts query planning. Maybeit should ? > - I was imagining this feature as being short-lived in production - that is, you disable a potential index to collect dataon query performance and then make a decision on whether you need the index permanently. However, yes, one can alwayskeep an index disabled for longer, and conditionally use it in another part of an application in which case a GUC tobypass the disabled/invisible index would come in handy as you mentioned. > - I don't have a strong opinion either way, but I do wonder - considering that this GUC is an additive feature - if it'ssomething worth implementing once we have more feedback from the usage (in v18 pre release, alpha, ec) of marking anindex as disabled/invisible first? Or perhaps as a follow-up patch? > > If we do go with a GUC - is FORCE_INVISIBLE_INDEX a good name? > >> Here is a use-case where the GUC may be useful. I can see a user wanting to try out the index before committing to using it across the board. They can create the index as invisible and force using it in a specific part of the application. If they are happy with the results, they can make it visible. This is similar to but not exactly what HypoPG [1] does. HypoPG does not actually create the index and can only be used with EXPLAIN ( not EXPLAIN ANALYZE ) in a specific session. I see the ability to test on a real index may be more useful. Maybe others have other thoughts on this? > I agree. DISABLE doesn't sit right. I noticed INVISIBLE in MariaDB. I like HIDDEN/VISIBLE or ACTIVE/INACTIVE as well, sinceit impacts query planning. Let's see if other have an opinion on this, but VISIBLE/INVISIBLE seem the best way to indicate that the indexes are visible or invisible from the optimizer. ACTIVE/INACTIVE sound a lot like ENABLE/DISABLE. [1] https://github.com/HypoPG/hypopg Regards, Sami
Hi, Thank you for the patch! I've had a need for this feature several times, so I appreciate the work you’ve put into it. I like the new name VISIBLE/INVISIBLE and the fact that it's a separate flag in pg_index (it's easy to monitor). I don’t feel qualified to provide an opinion on the code itself just yet. I did notice something in the command prototype: +ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> VISIBLE +ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> INVISIBLE it would probably be better as: +ALTER INDEX [ IF EXISTS ] <replaceable class="parameter">name</replaceable> {VISIBLE|INVISIBLE} The completion for the INVISIBLE / VISIBLE keyword is missing in psql. I also tested the ALTER command within a transaction, and it worked as I expected: the changes are transactional (possibly because you didn’t use systable_inplace_update_begin?). Additionally, I tried using the ALTER command on an index that supports a foreign key. As expected, delete and update operations on the referenced table became significantly slower. I was wondering if this behavior should be documented here. + Make the specified index invisible. The index will not be used for queries. + This can be useful for testing query performance with and without specific + indexes. Maybe something like : The index will not be used for user or system queries (e.g., an index supporting foreign keys). I noticed that you mentionned checking pg_stat_user_indexes before using the query but it might not be enough?
On Fri, Jan 24, 2025 at 11:32 AM Benoit Lobréau <benoit.lobreau@gmail.com> wrote: > The completion for the INVISIBLE / VISIBLE keyword is missing in psql. I think this should to the trick ? diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c index 81cbf10aa28..43ea8e55fd0 100644 --- a/src/bin/psql/tab-complete.in.c +++ b/src/bin/psql/tab-complete.in.c @@ -2393,7 +2393,8 @@ match_previous_words(int pattern_id, else if (Matches("ALTER", "INDEX", MatchAny)) COMPLETE_WITH("ALTER COLUMN", "OWNER TO", "RENAME TO", "SET", "RESET", "ATTACH PARTITION", - "DEPENDS ON EXTENSION", "NO DEPENDS ON EXTENSION"); + "DEPENDS ON EXTENSION", "NO DEPENDS ON EXTENSION", + "INVISIBLE", "VISIBLE"); else if (Matches("ALTER", "INDEX", MatchAny, "ATTACH")) COMPLETE_WITH("PARTITION"); else if (Matches("ALTER", "INDEX", MatchAny, "ATTACH", "PARTITION"))
I did some additional testing with this command within transactions. I had the "BEGIN; ALTER INDEX; EXPLAIN; ROLLBACK;" scenario in mind, but didn't realise we acquire an AccessExclusiveLock on the index. Therefore, it's not possible to change the visibility within a single transaction.... unless you don’t mind blocking all access to the relation. I read the comments at the top of "AlterTableGetLockLevel" and in the documentation and I understand that this behavior seems unavoidable. I suppose this is what was meant by the "globally visible effects" of an ALTER INDEX in the old discussion ? [1] Being able to rollback the changes is nice, but in this case there is not much to alter back anyway. This is probably not the intended use case (hence the discussion about GUCs and hints). [1] https://www.postgresql.org/message-id/30558.1529359929%40sss.pgh.pa.us
> I had the "BEGIN; ALTER INDEX; EXPLAIN; ROLLBACK;" scenario in mind, but > didn't realise we acquire an AccessExclusiveLock on the index. Therefore, it's > not possible to change the visibility within a single transaction.... > unless you don’t mind blocking all access to the relation. > > I read the comments at the top of "AlterTableGetLockLevel" and in the > documentation and I understand that this behavior seems unavoidable. > I suppose this is what was meant by the "globally visible effects" of an ALTER > INDEX in the old discussion ? [1] What is being discussed here is different from what I can tell. This is referring to the index changing status ( visible/invisible ) and those changes being visible by another transaction. However, the current patch may be too restrictive. Why do we need an AccessExclusiveLock on the table to perform the change. We are only changing the catalog and not the underlying data. This is a lot like ALTER INDEX RENAME, which only takes a ShareUpdateExclusiveLock. Can we do the same here? I am also still reviewing and have a few comments on v9 1/ Missing ATSimpleRecursion call in PrepCmd for case AT_SetIndexVisible: case AT_SetIndexInvisible: Without the recursion call, the visibility changes on a parent will not apply to the partitions. We are also missing tests for partitions. 2/ In ATExecSetIndexVisibility Change: elog(ERROR, "could not find tuple for index %u", indexOid); To: elog(ERROR, "cache lookup failed for index %u", indexOid); I see both message formats are used all over the place, but in tablecmds.c, the "cache lookup" variant is the one used, so let's do that for consistency. 3/ In ATExecSetIndexVisibility if (indexForm->indcheckxmin) + { + heap_freetuple(indexTuple); + table_close(pg_index, RowExclusiveLock); + ereport(ERROR, There is no need to close the table or free the tuple, as these are "cleaned up" when the transaction aborts. 4/ In ATExecSetIndexVisibility I have a suggestion below: What about we open both heapOid and IndexRelationId at the start and close them at the end. Also, inside the block for indexForm->indisvisible != visible, do the work to invalidate the cache, invoke the post alter hook and increment the command counter. This will also allow us to get rid of the update boolean as well. What do you think? Regards, Sami
On Fri, Jan 24, 2025 at 4:03 PM Benoit Lobréau <benoit.lobreau@gmail.com> wrote:
I did notice something in the command prototype:
+ALTER INDEX [ IF EXISTS ] <replaceable
class="parameter">name</replaceable> VISIBLE
+ALTER INDEX [ IF EXISTS ] <replaceable
class="parameter">name</replaceable> INVISIBLE
it would probably be better as:
+ALTER INDEX [ IF EXISTS ] <replaceable
class="parameter">name</replaceable> {VISIBLE|INVISIBLE}
Thank you for your review, really appreciate it! I have updated with your feedback in v10 patch [1]
The completion for the INVISIBLE / VISIBLE keyword is missing in psql.
Also updated in v10 patch [1]
Additionally, I tried using the ALTER command on an index that supports
a foreign key. As expected, delete and update operations on the referenced
table became significantly slower. I was wondering if this behavior should
be documented here.
+ Make the specified index invisible. The index will not be used
for queries.
+ This can be useful for testing query performance with and
without specific
+ indexes.
Maybe something like :
The index will not be used for user or system queries (e.g., an index
supporting foreign keys).
I noticed that you mentionned checking pg_stat_user_indexes before using
the query but it might not be enough?
This part of the documentation has gone through some changes, and I have sensed it's hard to convey the details without complicating or breaking precedence. By saying "The index will not be used for queries", I (as a PostgreSQL user) was assuming this would apply to both user and system queries, and hence the distinction was implicit. However, I don't have a strong opinion and am happy to make the changes. I am also curious if anyone else has thoughts on this as well?
Thank you
Shayon
On Fri, Jan 24, 2025 at 8:56 PM Benoit Lobréau <benoit.lobreau@gmail.com> wrote:
I also noticed that \d on an index doesn't warn about the invisible state
whereas \d on a table does:
Thank you for the review + patch (v9-002) [1]. Your patch looks good to me. I have not incorporated this in my v10 patch [2]. Mostly to make sure you are credited and also being new here and not knowing whether or not I should :) (to help with the reviewing process, etc). Open to suggestions and feedback.
Thank you
Shayon
hi. the following reviews based on v10-0001-Introduce-the-ability-to-set-index-visibility-us.patch. in src/test/regress/sql/create_index.sql seems there are no sql tests for "create index ... invisible"? <varlistentry> <term><literal>VISIBLE</literal></term> <listitem> <para> Make the specified index visible. The index will be used for queries. </para> </listitem> </varlistentry> here it should be "Make the specified index visible. The index can be used for query planning" ? Do we need to add GUC use_invisible_index to postgresql.conf.sample? CREATE TABLE t(id INT PRIMARY KEY, data TEXT,num INT, vector INT[], range INT4RANGE); ALTER INDEX t_pkey INVISIBLE; alter table t alter column id set data type bigint; \d t after ALTER TABLE SET DATA TYPE, the "visible" status should not change? but here it changed. you may check ATPostAlterTypeParse to make the "visible" status not change. @@ -3449,6 +3451,7 @@ typedef struct IndexStmt bool if_not_exists; /* just do nothing if index already exists? */ bool reset_default_tblspc; /* reset default_tablespace prior to * executing */ + bool isvisible; /* true if VISIBLE (default), false if INVISIBLE */ } IndexStmt; the indentation level is not right? +opt_index_visibility: + VISIBLE_P { $$ = true; } + | INVISIBLE_P { $$ = false; } + | /*EMPTY*/ { $$ = true; } + ; + the indentation level seems also not right? + createFlags = INDEX_CREATE_SKIP_BUILD | INDEX_CREATE_CONCURRENT; + if (indexForm->indisvisible) + createFlags |= INDEX_CREATE_VISIBLE; the indentation level seems also not right? INVISIBLE, VISIBLE is not special words, in gram.y, you don't need "VISIBLE_P", "INVISIBLE_P", you can just use "INVISIBLE", "VISIBLE" ? \d t3 Table "public.t3" Column | Type | Collation | Nullable | Default --------+-----------+-----------+----------+--------- id | integer | | not null | data | text | | | num | integer | | | vector | integer[] | | | range | int4range | | | a | box | | | Indexes: "t3_pkey" PRIMARY KEY, btree (id) INVISIBLE "grect2ind" gist (a) INVISIBLE "t3_1" gist (a) INVISIBLE "t3_2" gin (vector) WITH (fastupdate='on', gin_pending_list_limit='128') INVISIBLE "t3_4" spgist (data) INVISIBLE "t3_6" hash (id) INVISIBLE pg_dump will dump as -- -- Name: t3 t3_pkey; Type: CONSTRAINT; Schema: public; Owner: jian -- ALTER TABLE ONLY public.t3 ADD CONSTRAINT t3_pkey PRIMARY KEY (id); after dump, restore index (primary key: t3_pkey) INVISIBLE will not be restored. We need extra work for restoring the INVISIBLE flag for the primary key index. I am not sure if we need to change index_concurrently_swap or not. but many other pg_index columns changed.
On Sat, Feb 8, 2025 at 12:41 AM jian he <jian.universality@gmail.com> wrote:
hi.
```
drop table if exists idxpart;
create table idxpart (a int, b int, c text) partition by range (a);
create table idxpart1 (like idxpart);
alter table idxpart attach partition idxpart1 for values from (0) to (10);
create index idxpart_c on only idxpart (c) invisible;
create index idxpart1_c on idxpart1 (c);
alter index idxpart_c attach partition idxpart1_c;
```
In this case, should ALTER INDEX ATTACH PARTITION change the attached
partition(idxpart1_c)'s "visible" status?
Hi,
That is a great question and I have really gone back and forth on this one and here's my reasoning so far
1. When you don't use ONLY:
- The index of child table inherits the visibility of the parent table's index
- This applies whether the parent index is set as INVISIBLE or VISIBLE
- This automatic inheritance is expected behavior and feels natural
2. When you use ONLY:
- You as a user/developer are explicitly taking control of index management
- Creating an index for parent as INVISIBLE and another for child as VISIBLE represents conscious, deliberate choices
- When attaching these indexes, it makes sense to respect these explicit visibility settings
- Silently overriding the child index's visibility could violate the Principle of Least Surprise
- Lastly, this model also allows more granular control over index visibility for each partition
I am not strongly tied to either of these options and very much open to changing my mind. Also happy to try and document this for more clarity.
I have rebased the patch on top of master (resolving some merge conflicts), along with the meson changes (thank you for that).
Thanks,
Shayon
That is a great question and I have really gone back and forth on this one and here's my reasoning so far
1. When you don't use ONLY:
- The index of child table inherits the visibility of the parent table's index
- This applies whether the parent index is set as INVISIBLE or VISIBLE
- This automatic inheritance is expected behavior and feels natural
2. When you use ONLY:
- You as a user/developer are explicitly taking control of index management
- Creating an index for parent as INVISIBLE and another for child as VISIBLE represents conscious, deliberate choices
- When attaching these indexes, it makes sense to respect these explicit visibility settings
- Silently overriding the child index's visibility could violate the Principle of Least Surprise
- Lastly, this model also allows more granular control over index visibility for each partition
I am not strongly tied to either of these options and very much open to changing my mind. Also happy to try and document this for more clarity.
I have rebased the patch on top of master (resolving some merge conflicts), along with the meson changes (thank you for that).
Thanks,
Shayon
Вложения
On Sun, Feb 23, 2025 at 3:41 PM Shayon Mukherjee <shayonj@gmail.com> wrote:
I have rebased the patch on top of master (resolving some merge conflicts), along with the meson changes (thank you for that).
Thank you
Shayon
Вложения
Hi Hackers,
Just leaving a quick note: I know this patch has through a lot of variations. I am keen on shipping this for v18 if possible, and while it’s a learning process for me, I am more than happy to iterate on any feedback. Let me know if there is anything I can do to help to keep the momentum going on this (absolutely no rush however). Just seeking feedback :).
Thank you
Shayon
On Mar 7, 2025, at 2:21 AM, Shayon Mukherjee <shayonj@gmail.com> wrote:<v13-0001-Introduce-the-ability-to-enable-disable-indexes-.patch>On Sun, Feb 23, 2025 at 3:41 PM Shayon Mukherjee <shayonj@gmail.com> wrote:I have rebased the patch on top of master (resolving some merge conflicts), along with the meson changes (thank you for that).Rebased against the latest master and attaching the v13 patch.
Thank youShayon
I went back to look at this patch and a few things. I noticed it did not have correct indentation, so I ran pgindent. I also removed some extra lines added and made some slight adjustments to the docs. Attached my edited patch as a txt. If you agree, please revise into a v14. I also noticed that between v12 and v13, the GUC use_invisible_index was removed, but I don't see a discussion as to why. I feel it's a good GUC to have [0], and we should at least have it as a separate patch as part of this set. I will continue reviewing the patch, but i feel this may be close to be marked RFC, although not sure if it will get a committer review before code freeze. [0] https://www.postgresql.org/message-id/flat/CAA5RZ0udzydObMDi65C59-oq54B9ZmjSZ1wVH3h%2Bv4XiVm6QDA%40mail.gmail.com#cfea240ffd73e947f9edd1ef1c762dae -- Sami Imseih Amazon Web Services (AWS)
Вложения
On Tue, Apr 1, 2025 at 2:41 PM Sami Imseih <samimseih@gmail.com> wrote:
I went back to look at this patch and a few things. I noticed it did
not have correct
indentation, so I ran pgindent. I also removed some extra lines added and made
some slight adjustments to the docs. Attached my edited patch as a txt. If you
agree, please revise into a v14.
I also noticed that between v12 and v13, the GUC use_invisible_index
was removed,
but I don't see a discussion as to why. I feel it's a good GUC to have
[0], and we should
at least have it as a separate patch as part of this set.
My apologies, I rebased off an old commit and that's how we lost the GUC change and also the rename from ENABLE/DISABLE to VISIBLE/INVISIBLE. I have brought it all back, in addition to the specs and other changes mentioned in v12. Let me know if you have any feedback, more than happy to incorporate them.
Also, thank you for letting me know of pgindent, very handy!
I will continue reviewing the patch, but i feel this may be close to
be marked RFC, although
not sure if it will get a committer review before code freeze.
I really appreciate the reviews and guidance, thank you! Would love if this can become part of v18 release. No worries if not of course. I am definitely around to help get this production ready and also if any issues arise after.
Thank you
Shayon