On Wed, Aug 24, 2011 at 03:38:09PM -0400, Tom Lane wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
> > On Wed, Aug 24, 2011 at 1:24 PM, Daniel Farina <daniel@heroku.com> wrote:
> >> At Heroku we use CREATE INDEX CONCURRENTLY with great success, but
> >> recently when frobbing around some indexes I realized that there is no
> >> equivalent for DROP INDEX, and this is a similar but lesser problem
> >> (as CREATE INDEX takes much longer), as DROP INDEX takes an ACCESS
> >> EXCLUSIVE lock on the parent table while doing the work to unlink
> >> files, which nominally one would think to be trivial, but I assure you
> >> it is not at times for even indexes that are a handful of gigabytes
> >> (let's say ~=< a dozen).
>
> > Are you sure that you are really waiting on the time to unlink the
> > file? there's other stuff going on in there like waiting for lock,
> > plan invalidation, etc. Point being, maybe the time consuming stuff
> > can't really be deferred which would make the proposal moot.
>
> Assuming the issue really is the physical unlinks (which I agree I'd
> like to see some evidence for), I wonder whether the problem could be
I'd believe it. With a cold cache (sudo sysctl -w vm.drop_caches=3), my
desktop takes 2.6s to "rm" a 985 MiB file.
> addressed by moving smgrDoPendingDeletes() to after locks are released,
> instead of before, in CommitTransaction/AbortTransaction. There does
> not seem to be any strong reason why we have to do that before lock
> release, since incoming potential users of a table should not be trying
> to access the old physical storage after that anyway.
Agreed. We now have $OLD_SUBJECT, but this is a win independently. I have
reviewed the code that runs between the old and new call sites, and I did not
identify a hazard of moving it as you describe.
nm