Re: Add parallelism and glibc dependent only options to reindexdb
From | Julien Rouhaud |
---|---|
Subject | Re: Add parallelism and glibc dependent only options to reindexdb |
Date | |
Msg-id | CAOBaU_ZuLr8YY=Uso+q3k7pOtTYfSLMM+JVsCxGQCnOp=aK=aQ@mail.gmail.com |
In response to | Re: Add parallelism and glibc dependent only options to reindexdb (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Add parallelism and glibc dependent only options to reindexdb
Re: Add parallelism and glibc dependent only options to reindexdb |
List | pgsql-hackers |
On Mon, Jul 1, 2019 at 4:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Michael Paquier <michael@paquier.xyz> writes:
> > - 0003 begins to be the actual fancy thing with the addition of a
> > --jobs option into reindexdb. The main issue here which should be
> > discussed is that when it comes to reindex of tables, you basically
> > are not going to have any conflicts between the objects manipulated.
> > However if you wish to do a reindex on a set of indexes then things
> > get more tricky as it is necessary to list items per-table so as
> > multiple connections do not conflict with each other if attempting to
> > work on multiple indexes of the same table. What this patch does is
> > to select the set of indexes which need to be worked on (see the
> > addition of cell in ParallelSlot), and then does a kind of
> > pre-planning of each item into the connection slots so as each
> > connection knows from the beginning which items it needs to process.
> > This is quite different from vacuumdb where a new item is distributed
> > only on a free connection from a unique list. I'd personally prefer
> > if we keep the facility in parallel.c so as it is only
> > execution-dependent and that we have no pre-planning. This would
> > require keeping within reindexdb.c an array of lists, with one list
> > corresponding to one connection instead which feels more natural.
>
> Couldn't we make this enormously simpler and less bug-prone by just
> dictating that --jobs applies only to reindex-table operations?

That would also mean that we'd have to fall back to reindexing at the
table level, even if we only want to reindex the indexes that depend on
glibc. I'm afraid that this will often add a huge penalty.

> > - 0004 is the part where the concurrent additions really matter as
> > this consists in applying an extra filter to the indexes selected so
> > as only the glibc-sensitive indexes are chosen for the processing.
>
> I think you'd be better off to define and document this as "reindex
> only collation-sensitive indexes", without any particular reference
> to a reason why somebody might want to do that.

Shouldn't we still document that indexes based on ICU would be excluded?

I also realize that I totally forgot to update reindexdb.sgml. Sorry
about that, I'll fix it in the next version.
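
To make the per-table constraint discussed above concrete, here is a rough sketch of one way to pre-plan index-level work so that no two connections ever touch indexes of the same table. It is illustrative only and is not what the patch does in reindexdb.c or parallel.c: the function name plan_index_jobs, the (table, index) tuple layout, and the "largest group to the least-loaded slot" heuristic are all assumptions made for this example, and Python is used only for brevity.

```python
from collections import defaultdict


def plan_index_jobs(indexes, jobs):
    """Assign (table, index) pairs to `jobs` slots.

    Hypothetical helper: all indexes of a given table land in the same
    slot, so two connections never reindex indexes of one table at once.
    """
    # Group the requested indexes by their parent table.
    by_table = defaultdict(list)
    for table, index in indexes:
        by_table[table].append(index)

    slots = [[] for _ in range(jobs)]
    # Place the largest groups first, each into the least-loaded slot.
    ordered = sorted(by_table.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for table, idxs in ordered:
        target = min(slots, key=len)
        target.extend((table, idx) for idx in idxs)
    return slots


if __name__ == "__main__":
    work = [("t1", "t1_a"), ("t1", "t1_b"), ("t2", "t2_a"), ("t3", "t3_a")]
    for n, slot in enumerate(plan_index_jobs(work, jobs=2)):
        print(n, slot)
```

With this kind of grouping, the achievable parallelism is bounded by the number of distinct tables rather than the number of indexes, which is the cost of avoiding same-table conflicts at the index level.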