Re: Re: We have got a serious problem with pg_clog/WAL synchronization
From | Kenneth Marshall |
---|---|
Subject | Re: Re: We have got a serious problem with pg_clog/WAL synchronization |
Date | |
Msg-id | 20040812132117.GB16756@it.is.rice.edu |
In response to | Why hash indexes suck (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
> "Min Xu (Hsu)" <xu@cs.wisc.edu> writes:
> > It seems to me this is an interesting phenomena of interactions between
> > frequent events of transaction commits and infrequent events of system
> > checkpoints. A potential alternative solution to adding a new shared
> > lock to the frequent commit operation is to let the infrequent
> > checkpoint operation take more overhead. I suppose acquiring/releasing
> > an extra lock for each commit would incur extra performance overhead,
> > even when the lock is not contented. On the other hand, let the
> > checkpoing operation acquire some existing locks (exclusively) to
> > effectively disallowing committing transactions to interfere with the
> > checkpoint process might be a better solution since it incur higher
> > overhead only when necessary.
>
> Unfortunately, there isn't any pre-existing lock that will serve.
> A transaction that is between XLogInsert'ing its COMMIT record and
> updating the shared pg_clog data area does not hold any lock that
> could be used to prevent a checkpoint from starting. (Or it didn't
> until yesterday's patch, anyway.)
>
> I looked briefly at reorganizing the existing code so that we'd do the
> COMMIT XLogInsert while we're holding lock on the shared pg_clog data,
> which would solve the problem without adding any new lock acquisition.
> But this seemed extremely messy to do. Also it would be optimizing
> transaction commit at the cost of pessimizing other uses of pg_clog,
> which might have to wait longer to get at the shared data. Adding the
> new lock has the advantage that we can be sure it's not blocking
> anything we don't want it to block.
>
> Thanks for thinking about the problem though ...
>
> regards, tom lane
>

One problem with a high-traffic LWLock is that acquiring it requires a
write to shared memory, whether the lock is taken in shared or in
exclusive mode. On the increasingly prevalent SMP machines, each such
write invalidates the cache line containing the lock in the other CPUs'
caches, and those CPUs must then reload the line, with the inherent
delay. Would it be possible to use a latch + version number in this
case to minimize the problem, by allowing everything except the
checkpoint to perform a read-only action on the latch? That should
eliminate the cache-line shenanigans on SMP machines.

Ken Marshall
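
For illustration only, one possible reading of the "latch + version number" idea is a seqlock-style counter: the checkpointer bumps a version word when it starts and again when it finishes, and everyone else only loads that word. The sketch below is not PostgreSQL code. `VersionLatch` and every `latch_*` function are hypothetical names, it uses plain C11 atomics rather than PostgreSQL's own primitives, and it assumes a single writer (the checkpointer). It shows only the read-side protocol; it does not show how a checkpoint would wait for commits that are already in flight.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical latch: an odd version means a checkpoint is in progress. */
typedef struct {
    atomic_uint_fast32_t version;
} VersionLatch;

static inline void latch_init(VersionLatch *l)
{
    atomic_init(&l->version, 0);
}

/* Reader side: a pure load, so no store is issued to the latch's cache line. */
static inline uint_fast32_t latch_read_begin(VersionLatch *l)
{
    return atomic_load(&l->version);
}

/* True if the observed version says the single writer is active. */
static inline bool latch_is_busy(uint_fast32_t seen)
{
    return (seen & 1u) != 0;
}

/* Reader side: true if no writer started or finished since latch_read_begin. */
static inline bool latch_read_validate(VersionLatch *l, uint_fast32_t seen)
{
    return atomic_load(&l->version) == seen;
}

/* Writer side (assumed to be the checkpointer only): make the version odd. */
static inline void latch_write_begin(VersionLatch *l)
{
    atomic_fetch_add(&l->version, 1);
}

/* Writer side: make the version even again. */
static inline void latch_write_end(VersionLatch *l)
{
    uint_fast32_t prev = atomic_fetch_add(&l->version, 1);
    assert(prev & 1u);  /* must pair with a preceding latch_write_begin */
    (void) prev;
}
```

A reader would call `latch_read_begin`, back off if `latch_is_busy` reports an odd version, do its work, and then call `latch_read_validate` to learn whether a checkpoint began in the meantime. Because readers never store to the version word, the cache line should be able to stay shared across CPUs between checkpoints, which is the effect the message is after.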