Re: LogwrtResult contended spinlock
От | Bharath Rupireddy |
---|---|
Тема | Re: LogwrtResult contended spinlock |
Дата | |
Msg-id | CALj2ACVM6GaqqNNB5mHmryNnnHjXkB7UyRXWmytkm1dXw7PU3A@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: LogwrtResult contended spinlock (Jeff Davis <pgsql@j-davis.com>) |
Ответы |
Re: LogwrtResult contended spinlock
(Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
|
Список | pgsql-hackers |
Thanks for looking into this. On Thu, Feb 22, 2024 at 1:54 AM Jeff Davis <pgsql@j-davis.com> wrote: > > > 3. > > @@ -6371,7 +6373,9 @@ GetFlushRecPtr(TimeLineID *insertTLI) > > If at all, a read > > barrier is warranted here, we can use atomic read with full barrier > > I don't think we need a full barrier but I'm fine with using > pg_atomic_read_membarrier_u64() if it's better for whatever reason. For the sake of clarity and correctness, I've used pg_atomic_read_membarrier_u64 everywhere for reading XLogCtl->LogwrtResult.Write and XLogCtl->LogwrtResult.Flush. > > 5. I guess we'd better use pg_atomic_read_u64_impl and > > pg_atomic_compare_exchange_u64_impl in > > pg_atomic_monotonic_advance_u64 > > to reduce one level of function indirections. > > Aren't they inlined? Yes, all of them are inlined. But, it seems like XXX_impl functions are being used in implementing exposed functions as a convention. Therefore, having pg_atomic_read_u64_impl and pg_atomic_compare_exchange_u64_impl doesn't sound bad IMV. > > 6. > > + * Full barrier semantics (even when value is unchanged). > > > > + currval = pg_atomic_read_u64(ptr); > > + if (currval >= target_) > > + { > > + pg_memory_barrier(); > > I don't think they are exactly equivalent: in the current patch, the > first pg_atomic_read_u64() could be reordered with earlier reads; > whereas that wouldn't work if using pg_atomic_read_membarrier_u64() it > could not be. I'm not sure whether that could create a performance > problem or not. I left it as-is. > > 9. > > + copyptr = pg_atomic_read_u64(&XLogCtl->LogwrtResult.Copy); > > + if (startptr + count > copyptr) > > + ereport(WARNING, > > + (errmsg("request to read past end of generated WAL; > > request %X/%X, current position %X/%X", > > + LSN_FORMAT_ARGS(startptr + count), > > LSN_FORMAT_ARGS(copyptr)))); > > > > Any specific reason for this to be a WARNING rather than an ERROR? > > Good question. WaitXLogInsertionsToFinish() uses a LOG level message > for the same situation. They should probably be the same log level, and > I would think it would be either PANIC or WARNING. I have no idea why > LOG was chosen. WaitXLogInsertionsToFinish adjusts upto after LOG so that the wait is never past the current insert position even if a caller asks for reading WAL that doesn't yet exist. And the comment there says "Here we just assume that to mean that all WAL that has been reserved needs to be finished." In contrast, WALReadFromBuffers kind of enforces callers to do WaitXLogInsertionsToFinish (IOW asks callers to send in the WAL that exists in the server). Therefore, an ERROR seems a reasonable choice to me, if PANIC sounds rather strong affecting all the postgres processes. FWIW, a PANIC when requested to flush past the end of WAL in WaitXLogInsertionsToFinish instead of LOG seems to be good. CF bot animals don't complain - https://github.com/BRupireddy2/postgres/tree/be_harsh_when_request_to_flush_past_end_of_WAL_WIP. > 0001: > > * The comments on the two versions of the functions are redundant, and > the style in that header seems to be to omit the comment from the u64 > version. Removed comments atop 64-bit version. > * I'm not sure the AssertPointerAlignment is needed in the u32 version? Borrowed them from pg_atomic_read_u32 and pg_atomic_compare_exchange_u32, just like how they assert before calling XXX_impl versions. I don't see any problem with them. > 0002: > > * All updates to the non-shared LogwrtResult should update both values. > It's confusing to update those local values independently, because it > violates the invariant that LogwrtResult.Flush <= LogwrtResult.Write. > > > 2. I guess we need to update both the Write and Flush local copies in > > AdvanceXLInsertBuffer. > > I agree. Whenever we update the non-shared LogwrtResult, let's update > the whole thing. Otherwise it's confusing. Yes, it's done that way now with a macro XLogUpdateLocalLogWrtResult using pg_atomic_read_membarrier_u64 to read both Write and Flush ptrs. > * pg_memory_barrier() is not needed right before a spinlock Got rid of it as we read both Flush and Write local copies with pg_atomic_read_membarrier_u64. > * As mentioned above, I think GetFlushRecPtr() needs two read barriers. > Also, I think the same for GetXLogWriteRecPtr(). Yes, it's done that way now with a macro XLogUpdateLocalLogWrtResult using pg_atomic_read_membarrier_u64 to read both Write and Flush ptrs. > * In general, for any place using both Write and Flush, I think Flush > should be loaded first, followed by a read barrier, followed by a load > of the Write pointer. Why read Flush first rather than Write? I think it's enough to do {read Write, read barrier, read Flush}. This works because Write is monotonically advanced first before Flush using full barriers and we don't get reordering issues between the readers and writers no? Am I missing anything here? > And I think in most of those places there should > be a read barrier before the load of Flush, too, to avoid a stale value > in places that might matter. Yes, using pg_atomic_read_membarrier_u64 for both Write and Flush makes it easy. > 0003: > > * We need to maintain the invariant that Copy >= Write >= Flush. I > believe that's always satisfied, because the > XLogWaitInsertionsToFinish() is always called before XLogWrite(). But > we should add an assert or runtime check of this invariant somewhere. Yes, that invariant is already maintained by the server. Although, I'm not fully agree, I added an assertion to WaitXLogInsertionsToFinish after updating XLogCtl->LogwrtResult.Copy. CF bot is happy with it - https://github.com/BRupireddy2/postgres/tree/atomic_LogwrtResult_v13. Please see the attached v13 patch set for further review. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
В списке pgsql-hackers по дате отправления: