Re: Eliminating VACUUM FULL WAS: remove flatfiles.c
От | Tom Lane |
---|---|
Тема | Re: Eliminating VACUUM FULL WAS: remove flatfiles.c |
Дата | |
Msg-id | 19750.1252094460@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: Eliminating VACUUM FULL WAS: remove flatfiles.c (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Eliminating VACUUM FULL WAS: remove flatfiles.c
Re: Eliminating VACUUM FULL WAS: remove flatfiles.c |
Список | pgsql-hackers |
I wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> No problem, just CLUSTER that table same as today. > Uh, no, that was Josh's point: you can't CLUSTER pg_class, because you > can't change its relfilenode. If you do, backends won't know where to > read pg_class to find out its relfilenode. > I was wondering whether maintenance operations like "vacuum rewrite" > could get away with filling a new table file and then moving it into > place with rename(2), which is guaranteed atomic (at least on sane > filesystems). The idea doesn't work right off because (1) you need > to atomically install the updated indexes too, and (2) the table > might span more than one segment file. But maybe we could think of > something. Hmm ... reading that over again, it seems like there is a pretty obvious solution. The sticking point --- not only for pg_class, but for shared catalogs such as pg_database --- is the lack of a way to track relfilenode if it ever changes. What if we keep the relfilenode of these critical tables someplace else? For instance, we could have a "map" file in each database holding the relfilenode of pg_class, and one in $PGDATA/global holding the relfilenodes of the shared catalogs and indexes. It'd be possible to update a map file atomically via the rename(2) trick. Then we teach relcache or some similar place to believe the map files over the contents of pg_class. This looks sort of like a reversion to flat files, but it has a couple of saving graces: 1. The set of entries is fixed and small, so there's no performance issue looming with database growth. 2. We could not synchronize updates with transaction commit, which was always the real Achilles' heel of the flat files. But I think we don't need to, if we restrict the set of operations that can change the relfilenodes of critical tables to "maintenance" operations that only rewrite the table contents and don't make any logical changes in the contents. Given that restriction, transaction commit isn't actually important; it will be the act of moving the updated map file into place that effectively is commit for these operations. If you crash after that, your change is still effective. Thoughts? regards, tom lane
В списке pgsql-hackers по дате отправления: