Обсуждение: Pg-Upgrade standbys via rsync... and avoid sending UNlogged data?
Has anyone found a clever way to $subject that doesn't involved calculating which are all of the files unnecessary to send and then running rsync with an exclude list? I'm pondering this right now. ISTM for each unlogged table/index/toast in pg_class where relpersistence='u'... rsync --exclude-from... (and exclude-from-file contains) */base/$dbid/$relfilenode */base/$dbid/$relfilenode.* */base/$dbid/$relfilenode_* ... The leading * will match the old/new data directoryes being sent to the standby. MAster upgrade is done using hard-links and the rsync method generally as per the pg docs. I have among others, a 5TB system w/about 40G of unlogged data. The rsync method of upgrade takes only several seconds during a snapshot based test however as the snap was off a live system, the crash recovery startup trunc'd all the unlogged data. This will not be the case during the real upgrade and I just as soon avoid sending the unlogged data only to have it trunc'd away as soon as the standby is started after the upgrade. Thanks -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@comcast.net p: 312.241.7800
On Fri, Jul 22, 2016 at 06:29:05PM -0500, Jerry Sievers wrote: > Has anyone found a clever way to $subject that doesn't involved > calculating which are all of the files unnecessary to send and then > running rsync with an exclude list? > > I'm pondering this right now. > > ISTM for each unlogged table/index/toast in pg_class where > relpersistence='u'... > > rsync --exclude-from... (and exclude-from-file contains) > > */base/$dbid/$relfilenode > */base/$dbid/$relfilenode.* > */base/$dbid/$relfilenode_* > ... > > The leading * will match the old/new data directoryes being sent to the > standby. > > MAster upgrade is done using hard-links and the rsync method generally > as per the pg docs. > > I have among others, a 5TB system w/about 40G of unlogged data. > > The rsync method of upgrade takes only several seconds during a snapshot > based test however as the snap was off a live system, the crash recovery > startup trunc'd all the unlogged data. > > This will not be the case during the real upgrade and I just as soon > avoid sending the unlogged data only to have it trunc'd away as soon as > the standby is started after the upgrade. Good question --- I don't think there is an easy answer except to truncate the unlogged tables before the upgrade. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes: > On Fri, Jul 22, 2016 at 06:29:05PM -0500, Jerry Sievers wrote: > >> Has anyone found a clever way to $subject that doesn't involved >> calculating which are all of the files unnecessary to send and then >> running rsync with an exclude list? >> >> I'm pondering this right now. >> >> ISTM for each unlogged table/index/toast in pg_class where >> relpersistence='u'... >> >> rsync --exclude-from... (and exclude-from-file contains) >> >> */base/$dbid/$relfilenode >> */base/$dbid/$relfilenode.* >> */base/$dbid/$relfilenode_* >> ... >> >> The leading * will match the old/new data directoryes being sent to the >> standby. >> >> MAster upgrade is done using hard-links and the rsync method generally >> as per the pg docs. >> >> I have among others, a 5TB system w/about 40G of unlogged data. >> >> The rsync method of upgrade takes only several seconds during a snapshot >> based test however as the snap was off a live system, the crash recovery >> startup trunc'd all the unlogged data. >> >> This will not be the case during the real upgrade and I just as soon >> avoid sending the unlogged data only to have it trunc'd away as soon as >> the standby is started after the upgrade. > > Good question --- I don't think there is an easy answer except to > truncate the unlogged tables before the upgrade. Yup, as I suspected. My data warehouse team *rather* not have to dirive the unlogged data again since it takes time... and I rather not needlessly replicate it during upgrade (er, because that takes time.) At my company, if we choose to not avoid replicating unlogged data *and* run for a short interval in production without a standby, snapshot the halted master and take the rsync off of that meanwhile the master is back online. I'll figure something out. Thanks! -- Jerry Sievers e: jerry.sievers@comcast.net p: 312.241.7800
* Jerry Sievers (gsievers19@comcast.net) wrote: > Has anyone found a clever way to $subject that doesn't involved > calculating which are all of the files unnecessary to send and then > running rsync with an exclude list? > > I'm pondering this right now. Certainly an interesting issue to consider. I have to admit that I hadn't considered unlogged tables when working out how to do the hard-link-based approach. > ISTM for each unlogged table/index/toast in pg_class where > relpersistence='u'... > > rsync --exclude-from... (and exclude-from-file contains) > > */base/$dbid/$relfilenode > */base/$dbid/$relfilenode.* > */base/$dbid/$relfilenode_* > ... > > The leading * will match the old/new data directoryes being sent to the > standby. You would have to address tablespaces too, if you have any. > I have among others, a 5TB system w/about 40G of unlogged data. Honestly, sending an extra 40G out of 5TB (we're talking less than 1% here...) might be worth it to keep the process simple. Thanks! Stephen
Вложения
* Stephen Frost (sfrost@snowman.net) wrote: > > I have among others, a 5TB system w/about 40G of unlogged data. > > Honestly, sending an extra 40G out of 5TB (we're talking less than 1% > here...) might be worth it to keep the process simple. Of course, I realize right after sending this that you aren't sending the 5TB. :) Depends on your network speed if the 40G would be an issue or not then. In some cases, it might only add a bit of time but keep the process simple, other cases, it might take longer. I'm not really thrilled with the idea of complicating the rsync in that way, but I can understand why you're considering it. Thanks! Stephen
Вложения
Stephen Frost <sfrost@snowman.net> writes: > * Stephen Frost (sfrost@snowman.net) wrote: > >> > I have among others, a 5TB system w/about 40G of unlogged data. >> >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1% >> here...) might be worth it to keep the process simple. > > Of course, I realize right after sending this that you aren't sending > the 5TB. :) Roger that! Really, the hard link pg-upgrade option and rsync method for standbys is pure beauty :-) We have only the post-analyze and this unlogged data problem to deal with. The warehouse is our only system with any substantial amount of unlogged data. Reducing post-analyze delay was solved by me here with our groovy threaded AnalyzMaTron. I can get a >5TB system done in about 45 minutes w/20 threads. Thanks! > Depends on your network speed if the 40G would be an issue or not then. > In some cases, it might only add a bit of time but keep the process > simple, other cases, it might take longer. I'm not really thrilled with > the idea of complicating the rsync in that way, but I can understand why > you're considering it. > > Thanks! > > Stephen -- Jerry Sievers e: jerry.sievers@comcast.net p: 312.241.7800
On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote: > Stephen Frost <sfrost@snowman.net> writes: > > > * Stephen Frost (sfrost@snowman.net) wrote: > > > >> > I have among others, a 5TB system w/about 40G of unlogged data. > >> > >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1% > >> here...) might be worth it to keep the process simple. > > > > Of course, I realize right after sending this that you aren't sending > > the 5TB. :) > > Roger that! Really, the hard link pg-upgrade option and rsync method > for standbys is pure beauty :-) > > We have only the post-analyze and this unlogged data problem to deal > with. > > The warehouse is our only system with any substantial amount of unlogged > data. > > Reducing post-analyze delay was solved by me here with our groovy > threaded AnalyzMaTron. I can get a >5TB system done in about 45 minutes > w/20 threads. The only thing I can suggest is to document this limitation, if people think it is worth it. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote: > >> Stephen Frost <sfrost@snowman.net> writes: >> >> > * Stephen Frost (sfrost@snowman.net) wrote: >> > >> >> > I have among others, a 5TB system w/about 40G of unlogged data. >> >> >> >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1% >> >> here...) might be worth it to keep the process simple. >> > >> > Of course, I realize right after sending this that you aren't sending >> > the 5TB. :) >> >> Roger that! Really, the hard link pg-upgrade option and rsync method >> for standbys is pure beauty :-) >> >> We have only the post-analyze and this unlogged data problem to deal >> with. >> >> The warehouse is our only system with any substantial amount of unlogged >> data. >> >> Reducing post-analyze delay was solved by me here with our groovy >> threaded AnalyzMaTron. I can get a >5TB system done in about 45 minutes >> w/20 threads. > > The only thing I can suggest is to document this limitation, if people > think it is worth it. +1 A couple sentences right near the rsync piece in pg_upgrade docs would do it. Caveat:UNNLogged data will be unnecessarily transfered via rsync and then immediately discarded from standbys when the standbys are started unless those tables are truncated on the master before the upgrade. -- Jerry Sievers e: jerry.sievers@comcast.net p: 312.241.7800
* Bruce Momjian (bruce@momjian.us) wrote: > On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote: > > Stephen Frost <sfrost@snowman.net> writes: > > > > > * Stephen Frost (sfrost@snowman.net) wrote: > > > > > >> > I have among others, a 5TB system w/about 40G of unlogged data. > > >> > > >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1% > > >> here...) might be worth it to keep the process simple. > > > > > > Of course, I realize right after sending this that you aren't sending > > > the 5TB. :) > > > > Roger that! Really, the hard link pg-upgrade option and rsync method > > for standbys is pure beauty :-) > > > > We have only the post-analyze and this unlogged data problem to deal > > with. > > > > The warehouse is our only system with any substantial amount of unlogged > > data. > > > > Reducing post-analyze delay was solved by me here with our groovy > > threaded AnalyzMaTron. I can get a >5TB system done in about 45 minutes > > w/20 threads. > > The only thing I can suggest is to document this limitation, if people > think it is worth it. It'd probably be good to document what happens with unlogged tables. Thanks! Stephen
Вложения
On Tue, Jul 26, 2016 at 07:11:29PM -0400, Stephen Frost wrote: > * Bruce Momjian (bruce@momjian.us) wrote: > > On Mon, Jul 25, 2016 at 12:34:45PM -0500, Jerry Sievers wrote: > > > Stephen Frost <sfrost@snowman.net> writes: > > > > > > > * Stephen Frost (sfrost@snowman.net) wrote: > > > > > > > >> > I have among others, a 5TB system w/about 40G of unlogged data. > > > >> > > > >> Honestly, sending an extra 40G out of 5TB (we're talking less than 1% > > > >> here...) might be worth it to keep the process simple. > > > > > > > > Of course, I realize right after sending this that you aren't sending > > > > the 5TB. :) > > > > > > Roger that! Really, the hard link pg-upgrade option and rsync method > > > for standbys is pure beauty :-) > > > > > > We have only the post-analyze and this unlogged data problem to deal > > > with. > > > > > > The warehouse is our only system with any substantial amount of unlogged > > > data. > > > > > > Reducing post-analyze delay was solved by me here with our groovy > > > threaded AnalyzMaTron. I can get a >5TB system done in about 45 minutes > > > w/20 threads. > > > > The only thing I can suggest is to document this limitation, if people > > think it is worth it. > > It'd probably be good to document what happens with unlogged tables. Attached patch applied to head. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Вложения
On 8/4/16 6:55 PM, Bruce Momjian wrote: > <para> > + Unfortunately, <application>rsync</> also needlessly copies the > + files associated with temporary and unlogged tables. > If you have tablespaces, you will need to run a similar > <application>rsync</> command for each tablespace directory. If you > have relocated <filename>pg_xlog</> outside the data directories, That seems like an odd location to insert that. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Aug 5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote: > On 8/4/16 6:55 PM, Bruce Momjian wrote: > > <para> > > + Unfortunately, <application>rsync</> also needlessly copies the > > + files associated with temporary and unlogged tables. > > If you have tablespaces, you will need to run a similar > > <application>rsync</> command for each tablespace directory. If you > > have relocated <filename>pg_xlog</> outside the data directories, > > That seems like an odd location to insert that. Would you like to suggest another location as I could not find out. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Fri, Aug 5, 2016 at 10:05:21AM -0400, Bruce Momjian wrote: > On Fri, Aug 5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote: > > On 8/4/16 6:55 PM, Bruce Momjian wrote: > > > <para> > > > + Unfortunately, <application>rsync</> also needlessly copies the > > > + files associated with temporary and unlogged tables. > > > If you have tablespaces, you will need to run a similar > > > <application>rsync</> command for each tablespace directory. If you > > > have relocated <filename>pg_xlog</> outside the data directories, > > > > That seems like an odd location to insert that. > > Would you like to suggest another location as I could not find out. Sorry, typo: :-( Would you like to suggest another location as I could not find one. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On 8/5/16 10:06 AM, Bruce Momjian wrote: > On Fri, Aug 5, 2016 at 10:05:21AM -0400, Bruce Momjian wrote: >> On Fri, Aug 5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote: >>> On 8/4/16 6:55 PM, Bruce Momjian wrote: >>>> <para> >>>> + Unfortunately, <application>rsync</> also needlessly copies the >>>> + files associated with temporary and unlogged tables. >>>> If you have tablespaces, you will need to run a similar >>>> <application>rsync</> command for each tablespace directory. If you >>>> have relocated <filename>pg_xlog</> outside the data directories, >>> >>> That seems like an odd location to insert that. >> >> Would you like to suggest another location as I could not find out. > > Sorry, typo: :-( > > Would you like to suggest another location as I could not find one. I think it belongs to the previous paragraph, so I moved it there. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Aug 7, 2016 at 09:29:20PM -0400, Peter Eisentraut wrote: > On 8/5/16 10:06 AM, Bruce Momjian wrote: > > On Fri, Aug 5, 2016 at 10:05:21AM -0400, Bruce Momjian wrote: > >> On Fri, Aug 5, 2016 at 09:04:19AM -0400, Peter Eisentraut wrote: > >>> On 8/4/16 6:55 PM, Bruce Momjian wrote: > >>>> <para> > >>>> + Unfortunately, <application>rsync</> also needlessly copies the > >>>> + files associated with temporary and unlogged tables. > >>>> If you have tablespaces, you will need to run a similar > >>>> <application>rsync</> command for each tablespace directory. If you > >>>> have relocated <filename>pg_xlog</> outside the data directories, > >>> > >>> That seems like an odd location to insert that. > >> > >> Would you like to suggest another location as I could not find out. > > > > Sorry, typo: :-( > > > > Would you like to suggest another location as I could not find one. > > I think it belongs to the previous paragraph, so I moved it there. OK, fine. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +