Thread: Re: [COMMITTERS] pgsql: Make group commit more effective.

Re: [COMMITTERS] pgsql: Make group commit more effective.

From
Robert Haas
Date:
On Mon, Jan 30, 2012 at 9:55 AM, Heikki Linnakangas
<heikki.linnakangas@iki.fi> wrote:
> Make group commit more effective.
>
> When a backend needs to flush the WAL, and someone else is already flushing
> the WAL, wait until it releases the WALInsertLock and check if we still need
> to do the flush or if the other backend already did the work for us, before
> acquiring WALInsertLock. This helps group commit, because when the WAL flush
> finishes, all the backends that were waiting for it can be woken up in one
> go, and they can all concurrently observe that they're done, rather than
> waking them up one by one in a cascading fashion.
>
> This is based on a new LWLock function, LWLockWaitUntilFree(), which has
> peculiar semantics. If the lock is immediately free, it grabs the lock and
> returns true. If it's not free, it waits until it is released, but then
> returns false without grabbing the lock. This is used in XLogFlush(), so
> that when the lock is acquired, the backend flushes the WAL, but if it's
> not, the backend first checks the current flush location before retrying.
>
> Original patch and benchmarking by Peter Geoghegan and Simon Riggs, although
> this patch as committed ended up being very different from that.
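
For readers following the thread, the "wait until free" semantics described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the committed PostgreSQL code: ToyLock, ToyWal and every toy_* function are hypothetical names, the lock is built on plain pthreads rather than PostgreSQL's LWLock machinery, and the "flush" is just a counter update standing in for the real write and fsync.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy exclusive lock (hypothetical; NOT PostgreSQL's LWLock). */
typedef struct ToyLock {
    pthread_mutex_t mutex;     /* protects held and releases */
    pthread_cond_t  released;  /* broadcast on every release */
    bool            held;
    unsigned long   releases;  /* number of releases so far */
} ToyLock;

/* Toy WAL state: the lock plus the position flushed so far. */
typedef struct ToyWal {
    ToyLock              lock;
    _Atomic unsigned long flushed_upto;
} ToyWal;

static void toy_wal_init(ToyWal *wal)
{
    pthread_mutex_init(&wal->lock.mutex, NULL);
    pthread_cond_init(&wal->lock.released, NULL);
    wal->lock.held = false;
    wal->lock.releases = 0;
    atomic_store(&wal->flushed_upto, 0);
}

/*
 * If the lock is free, take it and return true.
 * If it is held, sleep until the holder releases it, then return
 * false WITHOUT taking the lock.
 */
static bool toy_lock_wait_until_free(ToyLock *lock)
{
    bool acquired;

    pthread_mutex_lock(&lock->mutex);
    if (!lock->held) {
        lock->held = true;
        acquired = true;
    } else {
        /* Wait for the release counter to move; this also copes
         * with spurious wakeups. */
        unsigned long seen = lock->releases;
        while (lock->releases == seen)
            pthread_cond_wait(&lock->released, &lock->mutex);
        acquired = false;
    }
    pthread_mutex_unlock(&lock->mutex);
    return acquired;
}

static void toy_lock_release(ToyLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    assert(lock->held);
    lock->held = false;
    lock->releases++;
    /* Wake every waiter in one go, so they can all recheck
     * the flush position concurrently. */
    pthread_cond_broadcast(&lock->released);
    pthread_mutex_unlock(&lock->mutex);
}

/*
 * Make sure everything up to target is flushed.
 * Returns true if this caller performed the flush, false if another
 * caller had already flushed far enough.
 */
static bool toy_flush(ToyWal *wal, unsigned long target)
{
    for (;;) {
        /* Did someone else already do the work for us? */
        if (atomic_load(&wal->flushed_upto) >= target)
            return false;

        /* Lock was busy and has now been released: recheck first. */
        if (!toy_lock_wait_until_free(&wal->lock))
            continue;

        /* We hold the lock; the counter update stands in for the
         * real write and fsync. */
        bool did_flush = false;
        if (atomic_load(&wal->flushed_upto) < target) {
            atomic_store(&wal->flushed_upto, target);
            did_flush = true;
        }
        toy_lock_release(&wal->lock);
        return did_flush;
    }
}
```

In this sketch a caller that finds the lock busy never queues up to take it after the holder finishes. It wakes along with all other waiters, rechecks flushed_upto, and in the common case returns immediately because the holder's flush already covered its record.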

Either this patch, or something else committed this morning, is
causing "make check" to hang or run extremely slowly for me.  I think
it's this patch, because I attached to a backend and stopped it a few
times, and all the backtraces look like this:

#0  0x00007fff8a545b22 in semop ()
#1  0x00000001001ff8df in PGSemaphoreLock (sema=0x103d7de70,
interruptOK=0 '\0') at pg_sema.c:418
#2  0x000000010024d7dd in LWLockWaitUntilFree (lockid=<value
temporarily unavailable, due to optimizations>, mode=<value
temporarily unavailable, due to optimizations>) at lwlock.c:666
#3  0x000000010005d3b3 in XLogFlush (record=<value temporarily
unavailable, due to optimizations>) at xlog.c:2148
#4  0x00000001000506bb in CommitTransaction () at xact.c:1113
#5  0x0000000100050b35 in CommitTransactionCommand () at xact.c:2613
#6  0x000000010025a403 in finish_xact_command () at postgres.c:2388
#7  0x000000010025d525 in exec_simple_query (query_string=0x101055638
"CREATE INDEX wowidx ON test_tsvector USING gin (a);") at
postgres.c:1052
#8  0x000000010025dfc1 in PostgresMain (argc=2, argv=<value
temporarily unavailable, due to optimizations>, username=<value
temporarily unavailable, due to optimizations>) at postgres.c:3881
#9  0x000000010020c258 in ServerLoop () at postmaster.c:3587
#10 0x000000010020d167 in PostmasterMain (argc=6, argv=0x100d08f40) at
postmaster.c:1110
#11 0x000000010019e745 in main (argc=6, argv=0x100d08f40) at main.c:199

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [COMMITTERS] pgsql: Make group commit more effective.

From
Heikki Linnakangas
Date:
On 30.01.2012 20:27, Robert Haas wrote:
> Either this patch, or something else committed this morning, is
> causing "make check" to hang or run extremely slowly for me.  I think
> it's this patch, because I attached to a backend and stopped it a few
> times, and all the backtraces look like this:

Yeah, sure looks like it's the group commit commit. It works for me, and
staring at the code, I have no idea what could be causing it. The
buildfarm seems happy too, so this is pretty mysterious.

I did find one bug, see attached, but AFAICS it should only cause
unnecessary wakeups in some corner cases, which is harmless.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachments

Re: [COMMITTERS] pgsql: Make group commit more effective.

From
Heikki Linnakangas
Date:
On 30.01.2012 22:50, Heikki Linnakangas wrote:
> On 30.01.2012 20:27, Robert Haas wrote:
>> Either this patch, or something else committed this morning, is
>> causing "make check" to hang or run extremely slowly for me. I think
>> it's this patch, because I attached to a backend and stopped it a few
>> times, and all the backtraces look like this:
>
> Yeah, sure looks like it's the group commit commit. It works for me, and
> staring at the code, I have no idea what could be causing it. The
> buildfarm seems happy too, so this is pretty mysterious.

And just after sending that, I managed to reproduce this. I had to
lower wal_buffers to a small value to make it happen. I'm debugging this
now.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


Re: [COMMITTERS] pgsql: Make group commit more effective.

From
Heikki Linnakangas
Date:
On 30.01.2012 23:06, Heikki Linnakangas wrote:
> On 30.01.2012 22:50, Heikki Linnakangas wrote:
>> On 30.01.2012 20:27, Robert Haas wrote:
>>> Either this patch, or something else committed this morning, is
>>> causing "make check" to hang or run extremely slowly for me. I think
>>> it's this patch, because I attached to a backend and stopped it a few
>>> times, and all the backtraces look like this:
>>
>> Yeah, sure looks like it's the group commit commit. It works for me, and
>> staring at the code, I have no idea what could be causing it. The
>> buildfarm seems happy too, so this is pretty mysterious.
>
> And just after sending that, I managed to reproduce this. I had to
> lower wal_buffers to a small value to make it happen. I'm debugging this
> now.

It was a bug in the LWLockRelease code, after all. Fixed. Unfortunately
this added a couple more instructions to that critical codepath, but I
think it should still go unnoticed. Let me know if this doesn't fix
the hang on your laptop.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com