Обсуждение: Only first XLogRecData is visible to rm_desc with WAL_DEBUG
As we all know, when you compile with WAL_DEBUG, and enable wal_debug GUC, you get output like this in the log for every inserted WAL record: LOG: INSERT @ 0/5407E578: prev 0/5407E4D0; xid 0; len 32: Standby - running xacts: nextXid 774 latestCompletedXid 771 oldestRunningXid 772; 2 xacts: 7877929 0 What I did *not* realize before is that the redo routine only gets passed the first XLogRecData struct in the chain. Because of that, the "xacts: 7877929 0" information above is garbage. That's probably not the only rm_desc routine that didn't get the memo. There are a few alternatives on how to fix that: 1. Do nothing. Add a comment somewhere explaining that rm_redo cannot safely look at data beyond what's inserted in the first XLogRecData. 2. Reconstruct the WAL data from the XLogRecData entries in a palloc'd buffer, and pass that to rm_redo. That would be fairly expensive, but you probably don't care about that if you've enabled wal_debug. 3. Remove the feature altogether, so that enabling wal_debug doesn't cause all insertions to be logged anymore (no changes to the logging during replay). It's a lot less interesting now that we have pg_xlogdump. Thoughts? - Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > As we all know, when you compile with WAL_DEBUG, and enable wal_debug > GUC, you get output like this in the log for every inserted WAL record: > LOG: INSERT @ 0/5407E578: prev 0/5407E4D0; xid 0; len 32: Standby - > running xacts: nextXid 774 latestCompletedXid 771 oldestRunningXid 772; > 2 xacts: 7877929 0 > What I did *not* realize before is that the redo routine only gets > passed the first XLogRecData struct in the chain. Because of that, the > "xacts: 7877929 0" information above is garbage. That's probably not the > only rm_desc routine that didn't get the memo. Yeah, I recall having fixed at least one such bug in the past. > There are a few alternatives on how to fix that: > 1. Do nothing. Add a comment somewhere explaining that rm_redo cannot > safely look at data beyond what's inserted in the first XLogRecData. We'll probably just make the same mistake again :-(. And I think there already is such a comment someplace. > 2. Reconstruct the WAL data from the XLogRecData entries in a palloc'd > buffer, and pass that to rm_redo. That would be fairly expensive, but > you probably don't care about that if you've enabled wal_debug. > 3. Remove the feature altogether, so that enabling wal_debug doesn't > cause all insertions to be logged anymore (no changes to the logging > during replay). It's a lot less interesting now that we have pg_xlogdump. I think the main use-case for rm_desc anymore is making CONTEXT lines for errors reported during WAL replay. I guess that situation does not have the same problem, since we've already loaded the complete WAL record. However, I'm not sure how easy it's going to be for WAL_DEBUG to make the data look the same as the replay case: in particular, substitution of full-page-images for data would be tough to predict in advance (and moving the printout into the critical section seems like a bad answer). I'd be okay with removing WAL_DEBUG, I think, particularly in view of the fact that there have been no requests to make it a compiled-by-default feature. regards, tom lane
On Mon, Mar 24, 2014 at 7:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: >> As we all know, when you compile with WAL_DEBUG, and enable wal_debug >> GUC, you get output like this in the log for every inserted WAL record: > >> LOG: INSERT @ 0/5407E578: prev 0/5407E4D0; xid 0; len 32: Standby - >> running xacts: nextXid 774 latestCompletedXid 771 oldestRunningXid 772; >> 2 xacts: 7877929 0 > >> What I did *not* realize before is that the redo routine only gets >> passed the first XLogRecData struct in the chain. Because of that, the >> "xacts: 7877929 0" information above is garbage. That's probably not the >> only rm_desc routine that didn't get the memo. > > Yeah, I recall having fixed at least one such bug in the past. > >> There are a few alternatives on how to fix that: > >> 1. Do nothing. Add a comment somewhere explaining that rm_redo cannot >> safely look at data beyond what's inserted in the first XLogRecData. > > We'll probably just make the same mistake again :-(. And I think there > already is such a comment someplace. > >> 2. Reconstruct the WAL data from the XLogRecData entries in a palloc'd >> buffer, and pass that to rm_redo. That would be fairly expensive, but >> you probably don't care about that if you've enabled wal_debug. > >> 3. Remove the feature altogether, so that enabling wal_debug doesn't >> cause all insertions to be logged anymore (no changes to the logging >> during replay). It's a lot less interesting now that we have pg_xlogdump. > > I think the main use-case for rm_desc anymore is making CONTEXT lines > for errors reported during WAL replay. I guess that situation does not > have the same problem, since we've already loaded the complete WAL record. > > However, I'm not sure how easy it's going to be for WAL_DEBUG to make the > data look the same as the replay case: in particular, substitution of > full-page-images for data would be tough to predict in advance (and moving > the printout into the critical section seems like a bad answer). > > I'd be okay with removing WAL_DEBUG, I think, particularly in view of the > fact that there have been no requests to make it a compiled-by-default > feature. I've found WAL_DEBUG quite useful in the past, when working on scalability, and have indeed wished for it to be compiled-in-by-default. I don't know whether I'm the only one, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Mon, Mar 24, 2014 at 7:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> 3. Remove the feature altogether, so that enabling wal_debug doesn't > >> cause all insertions to be logged anymore (no changes to the logging > >> during replay). It's a lot less interesting now that we have pg_xlogdump. > > > > I think the main use-case for rm_desc anymore is making CONTEXT lines > > for errors reported during WAL replay. I guess that situation does not > > have the same problem, since we've already loaded the complete WAL record. > > > > However, I'm not sure how easy it's going to be for WAL_DEBUG to make the > > data look the same as the replay case: in particular, substitution of > > full-page-images for data would be tough to predict in advance (and moving > > the printout into the critical section seems like a bad answer). > > > > I'd be okay with removing WAL_DEBUG, I think, particularly in view of the > > fact that there have been no requests to make it a compiled-by-default > > feature. > > I've found WAL_DEBUG quite useful in the past, when working on > scalability, and have indeed wished for it to be > compiled-in-by-default. > > I don't know whether I'm the only one, though. You are not. I would rather have it fixed than removed, if possible. I don't really care too much about getting a performance hit to palloc the records, really; being able to actually read what's happening is much more useful. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 03/25/2014 02:13 AM, Alvaro Herrera wrote: > Robert Haas wrote: >> On Mon, Mar 24, 2014 at 7:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>>> 3. Remove the feature altogether, so that enabling wal_debug doesn't >>>> cause all insertions to be logged anymore (no changes to the logging >>>> during replay). It's a lot less interesting now that we have pg_xlogdump. >>> >>> I think the main use-case for rm_desc anymore is making CONTEXT lines >>> for errors reported during WAL replay. I guess that situation does not >>> have the same problem, since we've already loaded the complete WAL record. >>> >>> However, I'm not sure how easy it's going to be for WAL_DEBUG to make the >>> data look the same as the replay case: in particular, substitution of >>> full-page-images for data would be tough to predict in advance (and moving >>> the printout into the critical section seems like a bad answer). >>> >>> I'd be okay with removing WAL_DEBUG, I think, particularly in view of the >>> fact that there have been no requests to make it a compiled-by-default >>> feature. >> >> I've found WAL_DEBUG quite useful in the past, when working on >> scalability, and have indeed wished for it to be >> compiled-in-by-default. >> >> I don't know whether I'm the only one, though. > > You are not. I would rather have it fixed than removed, if possible. I > don't really care too much about getting a performance hit to palloc the > records, really; being able to actually read what's happening is much > more useful. I find it useful too, but I think pg_xlogdump can serve the same purpose. One thing missing from pg_xlogdump is the capability to keep tracking the inserted WAL, instead of dumping to the end of WAL and stopping there. If we add an option to pg_xlogdump to poll the WAL instead of bailing out at an error, I think it's a good replacement. - Heikki
Heikki Linnakangas wrote: > On 03/25/2014 02:13 AM, Alvaro Herrera wrote: > >You are not. I would rather have it fixed than removed, if possible. I > >don't really care too much about getting a performance hit to palloc the > >records, really; being able to actually read what's happening is much > >more useful. > > I find it useful too, but I think pg_xlogdump can serve the same purpose. > > One thing missing from pg_xlogdump is the capability to keep > tracking the inserted WAL, instead of dumping to the end of WAL and > stopping there. If we add an option to pg_xlogdump to poll the WAL > instead of bailing out at an error, I think it's a good replacement. "tail -f"-style? If we can make that work I'm okay with removing it from the backend. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Mar 25, 2014 at 12:30 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: >>> I've found WAL_DEBUG quite useful in the past, when working on >>> scalability, and have indeed wished for it to be >>> compiled-in-by-default. >>> >>> I don't know whether I'm the only one, though. >> >> You are not. I would rather have it fixed than removed, if possible. I >> don't really care too much about getting a performance hit to palloc the >> records, really; being able to actually read what's happening is much >> more useful. > > I find it useful too, but I think pg_xlogdump can serve the same purpose. > > One thing missing from pg_xlogdump is the capability to keep tracking the > inserted WAL, instead of dumping to the end of WAL and stopping there. If we > add an option to pg_xlogdump to poll the WAL instead of bailing out at an > error, I think it's a good replacement. Well, one nice thing about wal_debug is that the WAL is at that point still associated with the session that generated it. But I grant that's not a huge issue. How much work are we talking about to fix this, though? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-03-25 10:49:54 -0700, Robert Haas wrote: > On Tue, Mar 25, 2014 at 12:30 AM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: > >>> I've found WAL_DEBUG quite useful in the past, when working on > >>> scalability, and have indeed wished for it to be > >>> compiled-in-by-default. > >>> > >>> I don't know whether I'm the only one, though. > >> > >> You are not. I would rather have it fixed than removed, if possible. I > >> don't really care too much about getting a performance hit to palloc the > >> records, really; being able to actually read what's happening is much > >> more useful. > > > > I find it useful too, but I think pg_xlogdump can serve the same purpose. > > > > One thing missing from pg_xlogdump is the capability to keep tracking the > > inserted WAL, instead of dumping to the end of WAL and stopping there. If we > > add an option to pg_xlogdump to poll the WAL instead of bailing out at an > > error, I think it's a good replacement. > > Well, one nice thing about wal_debug is that the WAL is at that point > still associated with the session that generated it. But I grant > that's not a huge issue. How much work are we talking about to fix > this, though? It's not entirely trivial, we'd essentially need to copy the loop in CopyXLogRecordToWAL(). And do so while still holding the WALInsertLock(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 03/25/2014 08:05 PM, Andres Freund wrote: > On 2014-03-25 10:49:54 -0700, Robert Haas wrote: >> On Tue, Mar 25, 2014 at 12:30 AM, Heikki Linnakangas >> <hlinnakangas@vmware.com> wrote: >>>>> I've found WAL_DEBUG quite useful in the past, when working on >>>>> scalability, and have indeed wished for it to be >>>>> compiled-in-by-default. >>>>> >>>>> I don't know whether I'm the only one, though. >>>> >>>> You are not. I would rather have it fixed than removed, if possible. I >>>> don't really care too much about getting a performance hit to palloc the >>>> records, really; being able to actually read what's happening is much >>>> more useful. >>> >>> I find it useful too, but I think pg_xlogdump can serve the same purpose. >>> >>> One thing missing from pg_xlogdump is the capability to keep tracking the >>> inserted WAL, instead of dumping to the end of WAL and stopping there. If we >>> add an option to pg_xlogdump to poll the WAL instead of bailing out at an >>> error, I think it's a good replacement. >> >> Well, one nice thing about wal_debug is that the WAL is at that point >> still associated with the session that generated it. But I grant >> that's not a huge issue. How much work are we talking about to fix >> this, though? > > It's not entirely trivial, we'd essentially need to copy the loop in > CopyXLogRecordToWAL(). And do so while still holding the > WALInsertLock(). Oh, no, there's no need to do it while holding WALInsertLock. It's quite trivial, actually, see attached. So it's not difficult to fix this if we want to. I just committed a patch to add a -f/--follow flag to pg_xlogdump. That seems very handy, even if we decide to fix the wal_debug code. It doesn't require compiling with wal_debug, and pg_xlogdump allows filtering by rmgr id etc. - Heikki
Вложения
On Wed, Mar 26, 2014 at 5:08 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > Oh, no, there's no need to do it while holding WALInsertLock. It's quite > trivial, actually, see attached. So it's not difficult to fix this if we > want to. Well is there any reason not to just commit that patch? I mean, it seems odd to rip this out if the fix is straightforward and already written. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 03/26/2014 04:51 PM, Robert Haas wrote: > On Wed, Mar 26, 2014 at 5:08 AM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> Oh, no, there's no need to do it while holding WALInsertLock. It's quite >> trivial, actually, see attached. So it's not difficult to fix this if we >> want to. > > Well is there any reason not to just commit that patch? I mean, it > seems odd to rip this out if the fix is straightforward and already > written. I guess so. Committed.. - Heikki