Обсуждение: pg_basebackup vs. Windows and tablespaces
A "pg_basebackup -Fp" running on the same system as the target cluster will fail in the presence of tablespaces; it would backup each tablespace to its original path, and those paths are in use locally for the very originals we're copying. "pg_basebackup -Ft" does not exhibit that hazard, and I typically recommend it for folks using tablespaces. On Windows, we populate pg_tblspc with NTFS junction points. "pg_basebackup -Fp" reproduces them, and "pg_basebackup -Ft" stores them in the tar archive as symbolic links. Trouble arises for -Ft backups: no Windows tar expander that I've found will recreate the junction points. While -Fp backups are basically usable, commands that copy files on Windows are inconsistent about their support for junction points; duplicating a base backup after the fact is error-prone. Windows users of tablespaces are left with limited options: use "pg_basebackup -Fp" on a different system, or use -Ft but manually recreate the junction points. We can do better; I see a few options: 1. Include in the base backup a file listing symbolic links/junction points, then have archive recovery recreate them. This file would be managed like the backup label file; exclusive backups would actually write it to the master data directory, and non-exclusive backups would incorporate it on the fly. pg_basebackup could also omit the actual links from its backup. Nearly any tar or file copy utility would then suffice. 2. Add a pg_basebackup option like "--destdir" or "--sysroot", meaningful only with -Fp; tablespace backups will be stored relative to it. So if the actual tablespace path is c:/foo, --destdir=c:/backups/today would backup that tablespace to c:/backups/today/c/foo. This facilitates same-server use of -Fp on all platforms. 3. Use path concatenation instead of symbolic links/junction points for tablespaces. More invasive, no doubt. For example, we would need to devise a way for recovery to get the tablespace path. I think #1 is a good bet; it's self-contained and fully heals the situation for Windows users. By itself, #2 helps less than #1 on Windows. It may have independent value. Other ideas, opinions? Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah Misch <noah@leadboat.com> writes: > A "pg_basebackup -Fp" running on the same system as the target cluster will > fail in the presence of tablespaces; it would backup each tablespace to its I'd like to see that fixed, +1. > 1. Include in the base backup a file listing symbolic links/junction points, > then have archive recovery recreate them. This file would be managed like the > backup label file; exclusive backups would actually write it to the master > data directory, and non-exclusive backups would incorporate it on the fly. > pg_basebackup could also omit the actual links from its backup. Nearly any > tar or file copy utility would then suffice. > > 2. Add a pg_basebackup option like "--destdir" or "--sysroot", meaningful only > with -Fp; tablespace backups will be stored relative to it. So if the actual > tablespace path is c:/foo, --destdir=c:/backups/today would backup that > tablespace to c:/backups/today/c/foo. This facilitates same-server use of -Fp > on all platforms. My understanding is that the second option here would be useful also when you want to create a standby with a different file layout than the master, which in some cases is what you want to do (not HA strictly). Another defect of pg_basebackup is its lack of shandling of tablespaces mounted within $PGDATA, which happens often enough at customers sites, whatever we think about that option. Would your work be extended to cover that too? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 08/01/2013 12:15 PM, Noah Misch wrote: > A "pg_basebackup -Fp" running on the same system as the target cluster will > fail in the presence of tablespaces; it would backup each tablespace to its > original path, and those paths are in use locally for the very originals we're > copying. "pg_basebackup -Ft" does not exhibit that hazard, and I typically > recommend it for folks using tablespaces. > > On Windows, we populate pg_tblspc with NTFS junction points. "pg_basebackup > -Fp" reproduces them, and "pg_basebackup -Ft" stores them in the tar archive > as symbolic links. Trouble arises for -Ft backups: no Windows tar expander > that I've found will recreate the junction points. While -Fp backups are > basically usable, commands that copy files on Windows are inconsistent about > their support for junction points; duplicating a base backup after the fact is > error-prone. Windows users of tablespaces are left with limited options: use > "pg_basebackup -Fp" on a different system, or use -Ft but manually recreate > the junction points. We can do better; I see a few options: > > 1. Include in the base backup a file listing symbolic links/junction points, > then have archive recovery recreate them. This file would be managed like the > backup label file; exclusive backups would actually write it to the master > data directory, and non-exclusive backups would incorporate it on the fly. > pg_basebackup could also omit the actual links from its backup. Nearly any > tar or file copy utility would then suffice. > > 2. Add a pg_basebackup option like "--destdir" or "--sysroot", meaningful only > with -Fp; tablespace backups will be stored relative to it. So if the actual > tablespace path is c:/foo, --destdir=c:/backups/today would backup that > tablespace to c:/backups/today/c/foo. This facilitates same-server use of -Fp > on all platforms. > > 3. Use path concatenation instead of symbolic links/junction points for > tablespaces. More invasive, no doubt. For example, we would need to devise a > way for recovery to get the tablespace path. > > I think #1 is a good bet; it's self-contained and fully heals the situation > for Windows users. By itself, #2 helps less than #1 on Windows. It may have > independent value. Other ideas, opinions? > Thanks for raising this. I agree it's an area that needs work. I like #1, it seems nice and workable. I also like the concept of #2, but I think we need to think about it a bit more. One of the things I like about barman backups is that on recovery you can map where tablespaces go, on a per tablespace basis (it's not very well documented, or wasn't when I last looked, but it does work). I think something like that would be awesome to have for pg_basebackup. So allowing multiple options of the form --map-tablespace c:/foo/bar=d:/baz/blurfl or some such would be great. cheers andrew
On Thu, Aug 01, 2013 at 06:44:41PM +0200, Dimitri Fontaine wrote: > Noah Misch <noah@leadboat.com> writes: > > 2. Add a pg_basebackup option like "--destdir" or "--sysroot", meaningful only > > with -Fp; tablespace backups will be stored relative to it. So if the actual > > tablespace path is c:/foo, --destdir=c:/backups/today would backup that > > tablespace to c:/backups/today/c/foo. This facilitates same-server use of -Fp > > on all platforms. > > My understanding is that the second option here would be useful also > when you want to create a standby with a different file layout than the > master, which in some cases is what you want to do (not HA strictly). The way I was envisioning it, you would still need to place the tablespace directories in their ordinary locations before recovering the base backup. This was just a way to relocate the backup itself. I can see value in both capabilities, though. > Another defect of pg_basebackup is its lack of shandling of tablespaces > mounted within $PGDATA, which happens often enough at customers sites, > whatever we think about that option. Would your work be extended to > cover that too? Not that I had in mind. My latest thinking on that topic is along the lines of helping folks stop doing it, not making it work better: http://www.postgresql.org/message-id/flat/20121205010442.GA16472@tornado.leadboat.com Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
On Thu, Aug 01, 2013 at 01:04:42PM -0400, Andrew Dunstan wrote: > On 08/01/2013 12:15 PM, Noah Misch wrote: >> 1. Include in the base backup a file listing symbolic links/junction points, >> then have archive recovery recreate them. This file would be managed like the >> backup label file; exclusive backups would actually write it to the master >> data directory, and non-exclusive backups would incorporate it on the fly. >> pg_basebackup could also omit the actual links from its backup. Nearly any >> tar or file copy utility would then suffice. >> >> 2. Add a pg_basebackup option like "--destdir" or "--sysroot", meaningful only >> with -Fp; tablespace backups will be stored relative to it. So if the actual >> tablespace path is c:/foo, --destdir=c:/backups/today would backup that >> tablespace to c:/backups/today/c/foo. This facilitates same-server use of -Fp >> on all platforms. > I like #1, it seems nice and workable. Agreed. I'll lean in that direction for resolving the proximate problem. > I also like the concept of #2, but I think we need to think about it a > bit more. One of the things I like about barman backups is that on > recovery you can map where tablespaces go, on a per tablespace basis > (it's not very well documented, or wasn't when I last looked, but it > does work). I think something like that would be awesome to have for > pg_basebackup. So allowing multiple options of the form > > --map-tablespace c:/foo/bar=d:/baz/blurfl > > or some such would be great. Good point. I see now that the syntax I floated covered just one slice of a whole range of things folks might want in that area. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
On Mon, Aug 5, 2013 at 10:03 PM, Noah Misch <noah@leadboat.com> wrote: > On Thu, Aug 01, 2013 at 01:04:42PM -0400, Andrew Dunstan wrote: >> On 08/01/2013 12:15 PM, Noah Misch wrote: >>> 1. Include in the base backup a file listing symbolic links/junction points, >>> then have archive recovery recreate them. This file would be managed like the >>> backup label file; exclusive backups would actually write it to the master >>> data directory, and non-exclusive backups would incorporate it on the fly. >>> pg_basebackup could also omit the actual links from its backup. Nearly any >>> tar or file copy utility would then suffice. >>> >>> 2. Add a pg_basebackup option like "--destdir" or "--sysroot", meaningful only >>> with -Fp; tablespace backups will be stored relative to it. So if the actual >>> tablespace path is c:/foo, --destdir=c:/backups/today would backup that >>> tablespace to c:/backups/today/c/foo. This facilitates same-server use of -Fp >>> on all platforms. > >> I like #1, it seems nice and workable. > > Agreed. I'll lean in that direction for resolving the proximate problem. +1. >> I also like the concept of #2, but I think we need to think about it a >> bit more. One of the things I like about barman backups is that on >> recovery you can map where tablespaces go, on a per tablespace basis >> (it's not very well documented, or wasn't when I last looked, but it >> does work). I think something like that would be awesome to have for >> pg_basebackup. So allowing multiple options of the form >> >> --map-tablespace c:/foo/bar=d:/baz/blurfl >> >> or some such would be great. > > Good point. I see now that the syntax I floated covered just one slice of a > whole range of things folks might want in that area. I think I have an old patch around for doing just the map-tablespace thing that I never quite finished. That one was mostly useful for setting up replicas really, since that's when you know at backup time where you want to the new tablespaces to go. For a regular backup, you want it to happen at restore time. And in this case, you're quite likely working off the tarfiles, and we don't really have a program dealing with the restore of those at all - you're just supposed to do it manually. A trivial tool that worked off the directory of tarfiles and allowed remapping of the tablespace locations (by updating the symlinks/junctions restored out of the base.tar flie) might be an easier and less invasive way to do it than to put it in the actual recovery code? -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On 08/12/2013 01:40 PM, Magnus Hagander wrote: >>> I also like the concept of #2, but I think we need to think about it a >>> bit more. One of the things I like about barman backups is that on >>> recovery you can map where tablespaces go, on a per tablespace basis >>> (it's not very well documented, or wasn't when I last looked, but it >>> does work). I think something like that would be awesome to have for >>> pg_basebackup. So allowing multiple options of the form >>> >>> --map-tablespace c:/foo/bar=d:/baz/blurfl >>> >>> or some such would be great. >> Good point. I see now that the syntax I floated covered just one slice of a >> whole range of things folks might want in that area. > I think I have an old patch around for doing just the map-tablespace > thing that I never quite finished. That one was mostly useful for > setting up replicas really, since that's when you know at backup time > where you want to the new tablespaces to go. For a regular backup, you > want it to happen at restore time. And in this case, you're quite > likely working off the tarfiles, and we don't really have a program > dealing with the restore of those at all - you're just supposed to do > it manually. > > A trivial tool that worked off the directory of tarfiles and allowed > remapping of the tablespace locations (by updating the > symlinks/junctions restored out of the base.tar flie) might be an > easier and less invasive way to do it than to put it in the actual > recovery code? > > What barman does is to dissolve the symlink when making its backup (i.e pg_tblspc/12345 becomes a directory in the backup instead of a symlink), and store the info relating to the source symlink in its metadata file. On restore it recreates the symlink. It's at that stage that you can modify its default behaviour by specifying where the link should go. cheers andrew
On Mon, Aug 12, 2013 at 8:14 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > On 08/12/2013 01:40 PM, Magnus Hagander wrote: >>>> >>>> I also like the concept of #2, but I think we need to think about it a >>>> bit more. One of the things I like about barman backups is that on >>>> recovery you can map where tablespaces go, on a per tablespace basis >>>> (it's not very well documented, or wasn't when I last looked, but it >>>> does work). I think something like that would be awesome to have for >>>> pg_basebackup. So allowing multiple options of the form >>>> >>>> --map-tablespace c:/foo/bar=d:/baz/blurfl >>>> >>>> or some such would be great. >>> >>> Good point. I see now that the syntax I floated covered just one slice >>> of a >>> whole range of things folks might want in that area. >> >> I think I have an old patch around for doing just the map-tablespace >> thing that I never quite finished. That one was mostly useful for >> setting up replicas really, since that's when you know at backup time >> where you want to the new tablespaces to go. For a regular backup, you >> want it to happen at restore time. And in this case, you're quite >> likely working off the tarfiles, and we don't really have a program >> dealing with the restore of those at all - you're just supposed to do >> it manually. >> >> A trivial tool that worked off the directory of tarfiles and allowed >> remapping of the tablespace locations (by updating the >> symlinks/junctions restored out of the base.tar flie) might be an >> easier and less invasive way to do it than to put it in the actual >> recovery code? >> >> > > > What barman does is to dissolve the symlink when making its backup (i.e > pg_tblspc/12345 becomes a directory in the backup instead of a symlink), and > store the info relating to the source symlink in its metadata file. On > restore it recreates the symlink. It's at that stage that you can modify its > default behaviour by specifying where the link should go. Something like that makes sense for a plain format dump - but maybe not for a tar one. And in either case, that also requires there to be a pg_baserestore (or whatever you'd want to call it, please come up with a better name :D) to do the restore process, to make sure it's properly mapped. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On 08/12/2013 02:22 PM, Magnus Hagander wrote: > On Mon, Aug 12, 2013 at 8:14 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> On 08/12/2013 01:40 PM, Magnus Hagander wrote: >>>>> I also like the concept of #2, but I think we need to think about it a >>>>> bit more. One of the things I like about barman backups is that on >>>>> recovery you can map where tablespaces go, on a per tablespace basis >>>>> (it's not very well documented, or wasn't when I last looked, but it >>>>> does work). I think something like that would be awesome to have for >>>>> pg_basebackup. So allowing multiple options of the form >>>>> >>>>> --map-tablespace c:/foo/bar=d:/baz/blurfl >>>>> >>>>> or some such would be great. >>>> Good point. I see now that the syntax I floated covered just one slice >>>> of a >>>> whole range of things folks might want in that area. >>> I think I have an old patch around for doing just the map-tablespace >>> thing that I never quite finished. That one was mostly useful for >>> setting up replicas really, since that's when you know at backup time >>> where you want to the new tablespaces to go. For a regular backup, you >>> want it to happen at restore time. And in this case, you're quite >>> likely working off the tarfiles, and we don't really have a program >>> dealing with the restore of those at all - you're just supposed to do >>> it manually. >>> >>> A trivial tool that worked off the directory of tarfiles and allowed >>> remapping of the tablespace locations (by updating the >>> symlinks/junctions restored out of the base.tar flie) might be an >>> easier and less invasive way to do it than to put it in the actual >>> recovery code? >>> >>> >> >> What barman does is to dissolve the symlink when making its backup (i.e >> pg_tblspc/12345 becomes a directory in the backup instead of a symlink), and >> store the info relating to the source symlink in its metadata file. On >> restore it recreates the symlink. It's at that stage that you can modify its >> default behaviour by specifying where the link should go. > Something like that makes sense for a plain format dump - but maybe > not for a tar one. And in either case, that also requires there to be > a pg_baserestore (or whatever you'd want to call it, please come up > with a better name :D) to do the restore process, to make sure it's > properly mapped. > I'm not saying to do it exactly as barman does. I'm just describing how they do it. I'd be quite happy if we just provided, at least to begin with, tablespace mapping for plain format pg_basebackup dumps. cheers andrew
>On Mon, Aug 5, 2013 at 10:03 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> On Thu, Aug 01, 2013 at 01:04:42PM -0400, Andrew Dunstan wrote:
>>> On 08/01/2013 12:15 PM, Noah Misch wrote:
>>>> 1. Include in the base backup a file listing symbolic links/junction points,
>>>> then have archive recovery recreate them. This file would be managed like the
>>>> backup label file; exclusive backups would actually write it to the master
>>>> data directory, and non-exclusive backups would incorporate it on the fly.
>>>> pg_basebackup could also omit the actual links from its backup. Nearly any
>>>> tar or file copy utility would then suffice.
>
>>> I like #1, it seems nice and workable.
>
>> Agreed. I'll lean in that direction for resolving the proximate problem.
>+1.
Вложения
<div class="WordSection1"><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">15 July 201419:29 Amit Kapila Wrote,</span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><pclass="MsoNormal">>Implementation details:<p class="MsoNormal">-----------------------------------<pclass="MsoNormal">>1. This feature is implemented only for tarformat in windows<p class="MsoNormal">>as native windows utilites are not able to create symlinks while<p class="MsoNormal">>extractingfiles from tar (It might be possible to create symlinks<p class="MsoNormal">>if cygwinis installed on your system, however I feel we need this<p class="MsoNormal">>feature to work for native windowsas well). Another reason to not<p class="MsoNormal">>create it for non-tar (plain) format is that plain formatcan update the<p class="MsoNormal">>symlinks via -T option and backing up symlink file during that<p class="MsoNormal">>operationcan lead to spurious symlinks after archive recovery. <p class="MsoNormal"> <p class="MsoNormal">Ihave reviewed the patch and did not find any major comments.<p class="MsoNormal"> <p class="MsoNormal">Thereare some comments I would like to share with you<p class="MsoNormal"> <p class="MsoListParagraph"style="text-indent:-18.0pt;mso-list:l0 level1 lfo1"><span style="mso-list:Ignore">1.<span style="font:7.0pt"Times New Roman""> </span></span>Rebase the patch to current GIT head.<p class="MsoListParagraph"> <pclass="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo1"><span style="mso-list:Ignore">2.<spanstyle="font:7.0pt "Times New Roman""> </span></span>+ * Constructsymlink file<p class="MsoListParagraph">+ */<p class="MsoListParagraph">+ initStringInfo(&symlinkfbuf);<p class="MsoNormal"> I thinkdeclaration and initialization of symlinkfbuf string can be moved under #ifdef WIN32 compile time macro, <p class="MsoNormal"style="text-indent:36.0pt">for other platform it’s simply allocated and freed which can be avoided. <p class="MsoNormal"style="text-indent:36.0pt"> <p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo1"><spanstyle="mso-list:Ignore">3.<span style="font:7.0pt "Times New Roman""> </span></span>+ /*<p class="MsoListParagraph">+ * nativewindows utilites are not able create symlinks while<p class="MsoListParagraph">+ *extracting files from tar.<p class="MsoListParagraph">+ */<p class="MsoNormal" style="text-indent:36.0pt"> <pclass="MsoNormal"> Rephrase the above sentence and fix spelling mistake (<b>utilities</b>are not able <b>to</b> create)<p class="MsoNormal"> <p class="MsoNormal">I haven’t done the testing yet,once I finish with testing i will share the result with you.<p class="MsoNormal"> <p class="MsoNormal"> <p class="MsoNormal">Regards,<pclass="MsoNormal">Dilip<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span></div>
>
> I have reviewed the patch and did not find any major comments.
Thanks for the review.
> There are some comments I would like to share with you
>
>
>
> 1. Rebase the patch to current GIT head.
Done.
> 2. + * Construct symlink file
>
> + */
>
> + initStringInfo(&symlinkfbuf);
>
> I think declaration and initialization of symlinkfbuf string can be moved under #ifdef WIN32 compile time macro,
>
> for other platform it’s simply allocated and freed which can be avoided.
>
> 3. + /*
>
> + * native windows utilites are not able create symlinks while
>
> + * extracting files from tar.
>
> + */
>
>
>
> Rephrase the above sentence and fix spelling mistake (utilities are not able to create)
Вложения
<div class="WordSection1"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">On</span><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">20August 2014 19:49, Amit Kapila Wrote</span><p class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><pclass="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><pclass="MsoNormal">> There are some comments I wouldlike to share with you<br /> ><br /> > <br /> ><br /> > 1. Rebase the patch to current GIT head.<pclass="MsoNormal">>Done.<p class="MsoNormal"> <p class="MsoNormal" style="margin-bottom:12.0pt">>> + initStringInfo(&symlinkfbuf);<br /> >><br /> >> I think declaration and initializationof symlinkfbuf string can be moved under #ifdef WIN32 compile time macro,<br /> >><br /> >> forother platform it’s simply allocated and freed which can be avoided.<p class="MsoNormal">>Agreed, I have changed thepatch as per your suggestion.<p class="MsoNormal" style="margin-bottom:12.0pt"> <br /><span style="color:#1F497D">I havedone the testing and behavior is as per expectation, </span><p class="MsoNormal" style="margin-bottom:12.0pt"><span style="color:#1F497D">Dowe need to do some document change? I mean is this limitation on windows is mentioned anywhere ?</span><pclass="MsoNormal" style="margin-bottom:12.0pt"><span style="color:#1F497D">If no change then i will move the patchto “Ready For Committer”.</span><p class="MsoNormal" style="margin-bottom:12.0pt"><span style="color:#1F497D"> </span><pclass="MsoNormal" style="margin-bottom:12.0pt"><span style="color:#1F497D">Thanks & Regards,</span><pclass="MsoNormal" style="margin-bottom:12.0pt"><span style="color:#1F497D">Dilip</span><p class="MsoNormal"style="margin-bottom:12.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span></div>
>
>
> I have done the testing and behavior is as per expectation,
>
> Do we need to do some document change? I mean is this limitation on windows is mentioned anywhere ?
I don't think currently such a limitation is mentioned in docs,
With Regards,
Amit Kapila.
On 11 September 2014 10:21, Amit kapila Wrote,
>I don't think currently such a limitation is mentioned in docs,
>however I think we can update the docs at below locations:
>1. In description of pg_start_backup in below page:
>2. In Explanation of Base Backup, basically under heading
>Making a Base Backup Using the Low Level API at below
>page:
>In general, we can explain about symlink_label file where ever
>we are explaining about backup_label file.
>If you think it is sufficient to explain about symlink_label in
>above places, then I can update the patch.
I think this will be sufficient….
Regards,
Dilip
> On 11 September 2014 10:21, Amit kapila Wrote,
> >I don't think currently such a limitation is mentioned in docs,
>
> >however I think we can update the docs at below locations:
> >1. In description of pg_start_backup in below page:
> >http://www.postgresql.org/docs/devel/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP
>
> >2. In Explanation of Base Backup, basically under heading
> >Making a Base Backup Using the Low Level API at below
> >page:
> >http://www.postgresql.org/docs/devel/static/continuous-archiving.html#BACKUP-BASE-BACKUP
> >In general, we can explain about symlink_label file where ever
> >we are explaining about backup_label file.
>
> >If you think it is sufficient to explain about symlink_label if
> >above places, then I can update the patch.
>
> I think this will be sufficient….
Вложения
On 12 September 2014 14:34, Amit Kapila Wrote
>Please find updated patch to include those documentation changes.
Looks fine, Moved to Ready for committer.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
>
> On 12 September 2014 14:34, Amit Kapila Wrote
> >Please find updated patch to include those documentation changes.
>
>
>
> Looks fine, Moved to Ready for committer.
Thanks a lot for the review.
>
> On Fri, Sep 12, 2014 at 5:07 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> >
> > On 12 September 2014 14:34, Amit Kapila Wrote
>
> > >Please find updated patch to include those documentation changes.
> >
> >
> >
> > Looks fine, Moved to Ready for committer.
>
> Thanks a lot for the review.
This patch is in "Ready for committer" stage for more than 1.5 months.
Amit Kapila wrote: > This patch is in "Ready for committer" stage for more than 1.5 months. > I believe this is an important functionality such that without this tar > format of pg_basebackup is not usable on Windows. I feel this > will add a value to pg_basebackup utility and moreover the need > and design has been agreed upon the list before development. > > Can any Committer please have a look at this patch? Is this still relevant after this commit? commit fb05f3ce83d225dd0f39f8860ce04082753e9e98 Author: Peter Eisentraut <peter_e@gmx.net> Date: Sat Feb 22 13:38:06 2014 -0500 pg_basebackup: Add support for relocating tablespaces -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 11/13/14 11:52 AM, Alvaro Herrera wrote: > Amit Kapila wrote: > >> This patch is in "Ready for committer" stage for more than 1.5 months. >> I believe this is an important functionality such that without this tar >> format of pg_basebackup is not usable on Windows. I feel this >> will add a value to pg_basebackup utility and moreover the need >> and design has been agreed upon the list before development. >> >> Can any Committer please have a look at this patch? > > Is this still relevant after this commit? > > commit fb05f3ce83d225dd0f39f8860ce04082753e9e98 > Author: Peter Eisentraut <peter_e@gmx.net> > Date: Sat Feb 22 13:38:06 2014 -0500 > > pg_basebackup: Add support for relocating tablespaces I believe so. The commit only applies to "plain" output. Amit's complaint is that tar utilities on Windows don't unpack symlinks, so the "tar" format isn't useful on Windows when tablespaces are used. So he wants the recovery mechanism to restore the symlinks. I'm not fully on board with that premise. (Get a better tar tool. Submit a patch.) But this also ties in with the recent discovery that the tar format cannot handle symlinks longer than 99 bytes. So this patch could also fix that problem by putting the untruncated name of the symlink in the WAL data.
On Thu, Nov 13, 2014 at 4:33 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > I'm not fully on board with that premise. (Get a better tar tool. > Submit a patch.) Noah was unable to find one that works: http://www.postgresql.org/message-id/20130801161519.GA334956@tornado.leadboat.com If most tar tools worked, and there was one that didn't, I think that'd be a reasonable argument. But telling people to get a better tool when they'd have to write it first seems rather unfriendly. > But this also ties in with the recent discovery that the tar format > cannot handle symlinks longer than 99 bytes. So this patch could also > fix that problem by putting the untruncated name of the symlink in the > WAL data. Yeah, seems like a chance to kill two birds with one stone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
>
> On 11/13/14 11:52 AM, Alvaro Herrera wrote:
> > Amit Kapila wrote:
> >
> >> This patch is in "Ready for committer" stage for more than 1.5 months.
> >> I believe this is an important functionality such that without this tar
> >> format of pg_basebackup is not usable on Windows. I feel this
> >> will add a value to pg_basebackup utility and moreover the need
> >> and design has been agreed upon the list before development.
> >>
> >> Can any Committer please have a look at this patch?
> >
> > Is this still relevant after this commit?
> >
> > commit fb05f3ce83d225dd0f39f8860ce04082753e9e98
> > Author: Peter Eisentraut <peter_e@gmx.net>
> > Date: Sat Feb 22 13:38:06 2014 -0500
> >
> > pg_basebackup: Add support for relocating tablespaces
>
> I believe so.
>
> The commit only applies to "plain" output. Amit's complaint is that tar
> utilities on Windows don't unpack symlinks, so the "tar" format isn't
> useful on Windows when tablespaces are used. So he wants the recovery
> mechanism to restore the symlinks.
>
> I'm not fully on board with that premise. (Get a better tar tool.
> Submit a patch.)
>
> But this also ties in with the recent discovery that the tar format
> cannot handle symlinks longer than 99 bytes. So this patch could also
> fix that problem by putting the untruncated name of the symlink in the
> WAL data.
>
On Thu, Nov 13, 2014 at 10:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Nov 14, 2014 at 3:03 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> >> On 11/13/14 11:52 AM, Alvaro Herrera wrote: >> > Amit Kapila wrote: >> > >> >> This patch is in "Ready for committer" stage for more than 1.5 months. >> >> I believe this is an important functionality such that without this tar >> >> format of pg_basebackup is not usable on Windows. I feel this >> >> will add a value to pg_basebackup utility and moreover the need >> >> and design has been agreed upon the list before development. >> >> >> >> Can any Committer please have a look at this patch? >> > >> > Is this still relevant after this commit? >> > >> > commit fb05f3ce83d225dd0f39f8860ce04082753e9e98 >> > Author: Peter Eisentraut <peter_e@gmx.net> >> > Date: Sat Feb 22 13:38:06 2014 -0500 >> > >> > pg_basebackup: Add support for relocating tablespaces >> >> I believe so. >> >> The commit only applies to "plain" output. Amit's complaint is that tar >> utilities on Windows don't unpack symlinks, so the "tar" format isn't >> useful on Windows when tablespaces are used. So he wants the recovery >> mechanism to restore the symlinks. >> >> I'm not fully on board with that premise. (Get a better tar tool. >> Submit a patch.) >> > > For native Windows environment, I have checked all the tools I could find > (Winrar, tar, 7-zip, etc...) and none of them is working and even checked > a lot on google to try to find some workaround for this, but it seems there > is no way to reliably handle this issue. Refer link : > http://sourceforge.net/p/mingw/bugs/2002/ > > Then I started discussion in tar community to see if they can suggest > some way, but there also I could not find a reliable solution except that > it might work in some cases if cygwin is installed. You can refer below > thread: > https://lists.gnu.org/archive/html/bug-tar/2014-07/msg00007.html > > After spending good amount of time for finding a workaround or alternative, > only I decided that it is important to write this patch to make tar format > for pg_basebackup usable for Windows users. > >> But this also ties in with the recent discovery that the tar format >> cannot handle symlinks longer than 99 bytes. So this patch could also >> fix that problem by putting the untruncated name of the symlink in the >> WAL data. >> > > I have mentioned that this can be usable for Linux users as well on that > thread, however I think we might want to provide it with an option for > linux users. In general, I think it is good to have this patch for Windows > users and later if we find that Linux users can also get the benefit with > this functionality, we can expose the same with an additional option. Why make it an option instead of just always doing it this way? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On Thu, Nov 13, 2014 at 10:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >
> > I have mentioned that this can be usable for Linux users as well on that
> > thread, however I think we might want to provide it with an option for
> > linux users. In general, I think it is good to have this patch for Windows
> > users and later if we find that Linux users can also get the benefit with
> > this functionality, we can expose the same with an additional option.
>
> Why make it an option instead of just always doing it this way?
>
To avoid extra work during archive recovery if it is not required. I
On Fri, Nov 14, 2014 at 2:55 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Nov 14, 2014 at 9:11 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Thu, Nov 13, 2014 at 10:37 PM, Amit Kapila <amit.kapila16@gmail.com> >> wrote: >> >> >> > >> > I have mentioned that this can be usable for Linux users as well on that >> > thread, however I think we might want to provide it with an option for >> > linux users. In general, I think it is good to have this patch for >> > Windows >> > users and later if we find that Linux users can also get the benefit >> > with >> > this functionality, we can expose the same with an additional option. >> >> Why make it an option instead of just always doing it this way? >> > To avoid extra work during archive recovery if it is not required. I > understand that this might not create any measurable difference, but > still there is addition I/O involved (read from file) which can be avoided. Yeah, but it's trivial. We're not going create enough tablespaces on one cluster for the cost of dropping a few extra symlinks in place to matter. > OTOH, if that is okay, then I think we can avoid few #ifdef WIN32 that > this patch introduces and can have consistency for this operation on > both linux and Windows. Having one code path for everything seems appealing to me, but what do others think? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Nov 14, 2014 at 2:55 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> OTOH, if that is okay, then I think we can avoid few #ifdef WIN32 that >> this patch introduces and can have consistency for this operation on >> both linux and Windows. > Having one code path for everything seems appealing to me, but what do > others think? Generally I'd be in favor of avoiding platform-dependent code where possible, but that doesn't represent a YES vote for this particular patch. It looks pretty messy in a quick look, even granting that the #ifdef WIN32's would all go away. A larger question here is about forward/backward compatibility of the basebackup files. Changing the representation of symlinks like this would break that. Maybe we don't care, not sure (is there already a catversion check for these things?). Changing the file format for only some platforms seems like definitely a bad idea though. regards, tom lane
On Fri, Nov 14, 2014 at 1:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Generally I'd be in favor of avoiding platform-dependent code where > possible, but that doesn't represent a YES vote for this particular > patch. It looks pretty messy in a quick look, even granting that the > #ifdef WIN32's would all go away. Hmm, OK. I have not read the patch. Hopefully that's something that could be fixed. > A larger question here is about forward/backward compatibility of the > basebackup files. Changing the representation of symlinks like this > would break that. Maybe we don't care, not sure (is there already a > catversion check for these things?). Changing the file format for only > some platforms seems like definitely a bad idea though. What are the practical consequences of changing the file format? I think that an old backup containing symlinks could be made to work on a new server that knows how to create them, and we should probably design it that way, but a physical backup isn't compatible across major versions anyway, so it doesn't have the same kinds of repercussions as changing something like the pg_dump file format. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Amit Kapila wrote: > 2. Symlink file format: > <oid> <linkpath> > 16387 E:\PostgreSQL\tbs > > Symlink file will contain entries for all the tablspaces > under pg_tblspc directory. I have kept the file name as > symlink_label (suggestion are welcome if you want some > different name for this file). I think symlink_label isn't a very good name. This file is not a label in the sense that backup_label is; it seems more a "catalog" to me. And it's not, in essence, about symlinks either, but rather about tablespaces. I would name it following the term "tablespace catalog" or some variation thereof. I know we don't expect that users would have to look at the file or edit it in normal cases, but it seems better to make it be human-readable. I would think that the file needs to have tablespace names too, then, not just OIDs. Maybe we don't use the tablespace name for anything other than "documentation" purposes if someone decides to look at the file, so perhaps it should look like a comment: <oid> <link path> ; <tablespace name> We already do this in pg_restore -l output IIRC. One use case mentioned upthread is having the clone be created in the same machine as the source server. Does your proposal help with it? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 11/13/14 4:33 PM, Peter Eisentraut wrote: >> Is this still relevant after this commit? >> > >> > commit fb05f3ce83d225dd0f39f8860ce04082753e9e98 >> > Author: Peter Eisentraut <peter_e@gmx.net> >> > Date: Sat Feb 22 13:38:06 2014 -0500 >> > >> > pg_basebackup: Add support for relocating tablespaces > I believe so. > > The commit only applies to "plain" output. Amit's complaint is that tar > utilities on Windows don't unpack symlinks, so the "tar" format isn't > useful on Windows when tablespaces are used. So he wants the recovery > mechanism to restore the symlinks. Um, wouldn't accepting this patch break the above-mentioned tablespace-relocation feature, because pg_basebackup wouldn't see any more symlinks sent down?
>
> On 11/13/14 4:33 PM, Peter Eisentraut wrote:
> >> Is this still relevant after this commit?
> >> >
> >> > commit fb05f3ce83d225dd0f39f8860ce04082753e9e98
> >> > Author: Peter Eisentraut <peter_e@gmx.net>
> >> > Date: Sat Feb 22 13:38:06 2014 -0500
> >> >
> >> > pg_basebackup: Add support for relocating tablespaces
> > I believe so.
> >
> > The commit only applies to "plain" output. Amit's complaint is that tar
> > utilities on Windows don't unpack symlinks, so the "tar" format isn't
> > useful on Windows when tablespaces are used. So he wants the recovery
> > mechanism to restore the symlinks.
>
> Um, wouldn't accepting this patch break the above-mentioned
> tablespace-relocation feature, because pg_basebackup wouldn't see any
> more symlinks sent down?
>
> Amit Kapila wrote:
>
> > 2. Symlink file format:
> > <oid> <linkpath>
> > 16387 E:\PostgreSQL\tbs
> >
> > Symlink file will contain entries for all the tablspaces
> > under pg_tblspc directory. I have kept the file name as
> > symlink_label (suggestion are welcome if you want some
> > different name for this file).
>
> I think symlink_label isn't a very good name. This file is not a label
> in the sense that backup_label is; it seems more a "catalog" to me. And
> it's not, in essence, about symlinks either, but rather about
> tablespaces. I would name it following the term "tablespace catalog" or
> some variation thereof.
>
> I know we don't expect that users would have to look at the file or edit
> it in normal cases, but it seems better to make it be human-readable. I
> would think that the file needs to have tablespace names too, then, not
> just OIDs. Maybe we don't use the tablespace name for anything other
> than "documentation" purposes if someone decides to look at the file, so
> perhaps it should look like a comment:
>
> <oid> <link path> ; <tablespace name>
>
> We already do this in pg_restore -l output IIRC.
>
> One use case mentioned upthread is having the clone be created in the
> same machine as the source server. Does your proposal help with it?
>
Sorry, but I am not getting which proposal exactly you are referring here,
>
> On Fri, Nov 14, 2014 at 1:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Generally I'd be in favor of avoiding platform-dependent code where
> > possible, but that doesn't represent a YES vote for this particular
> > patch. It looks pretty messy in a quick look, even granting that the
> > #ifdef WIN32's would all go away.
>
> Hmm, OK. I have not read the patch. Hopefully that's something that
> could be fixed.
>
> > A larger question here is about forward/backward compatibility of the
> > basebackup files. Changing the representation of symlinks like this
> > would break that. Maybe we don't care, not sure (is there already a
> > catversion check for these things?). Changing the file format for only
> > some platforms seems like definitely a bad idea though.
>
> What are the practical consequences of changing the file format? I
> think that an old backup containing symlinks could be made to work on
> a new server that knows how to create them,
Amit Kapila wrote: > On Sat, Nov 15, 2014 at 12:03 AM, Alvaro Herrera <alvherre@2ndquadrant.com> > wrote: > > > > Amit Kapila wrote: > > I think symlink_label isn't a very good name. This file is not a label > > in the sense that backup_label is; it seems more a "catalog" to me. And > > it's not, in essence, about symlinks either, but rather about > > tablespaces. I would name it following the term "tablespace catalog" or > > some variation thereof. > > This file is going to provide the symlink path for each tablespace, so > it not be bad to have that in file name. I agree with you that it's more > about tablespaces. So how about: > > tablespace_symlink > symlink_tablespace > tablespace_info I think the fact that we use symlinks is an implementation detail; aren't them actually junction points, not symlinks, in some Windows cases? The The pg_tablespace catalog uses (or used to use) "spclocation" for this, not "spcsymlink". > > One use case mentioned upthread is having the clone be created in the > > same machine as the source server. Does your proposal help with it? > > Sorry, but I am not getting which proposal exactly you are referring here, > Could you explain in more detail? In the first message of this thread[1], Noah said: : A "pg_basebackup -Fp" running on the same system as the target cluster will : fail in the presence of tablespaces; it would backup each tablespace to its : original path, and those paths are in use locally for the very originals we're : copying. > In general, if user took the backup (in tar format) using pg_basebackup, > this > patch will be able to restore such a backup even on the same server. I must be misunderstanding either you or Noah. [1] http://www.postgresql.org/message-id/20130801161519.GA334956@tornado.leadboat.com -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
> Amit Kapila wrote:
> > On Sat, Nov 15, 2014 at 12:03 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> > wrote:
> > >
> > > Amit Kapila wrote:
>
> > > I think symlink_label isn't a very good name. This file is not a label
> > > in the sense that backup_label is; it seems more a "catalog" to me. And
> > > it's not, in essence, about symlinks either, but rather about
> > > tablespaces. I would name it following the term "tablespace catalog" or
> > > some variation thereof.
> >
> > This file is going to provide the symlink path for each tablespace, so
> > it not be bad to have that in file name. I agree with you that it's more
> > about tablespaces. So how about:
> >
> > tablespace_symlink
> > symlink_tablespace
> > tablespace_info
>
> I think the fact that we use symlinks is an implementation detail;
> aren't them actually junction points, not symlinks, in some Windows
> cases?
> > > One use case mentioned upthread is having the clone be created in the
> > > same machine as the source server. Does your proposal help with it?
> >
> > Sorry, but I am not getting which proposal exactly you are referring here,
> > Could you explain in more detail?
>
> In the first message of this thread[1], Noah said:
>
> : A "pg_basebackup -Fp" running on the same system as the target cluster will
> : fail in the presence of tablespaces; it would backup each tablespace to its
> : original path, and those paths are in use locally for the very originals we're
> : copying.
>
> > In general, if user took the backup (in tar format) using pg_basebackup,
> > this
> > patch will be able to restore such a backup even on the same server.
>
> I must be misunderstanding either you or Noah.
>
Does the above information addressed your question?
Amit Kapila wrote: > On Sun, Nov 16, 2014 at 6:15 AM, Alvaro Herrera <alvherre@2ndquadrant.com> > wrote: > > I think the fact that we use symlinks is an implementation detail; > > aren't them actually junction points, not symlinks, in some Windows > > cases? > > Right, but they provide same functionality as symlinks and now we > are even planing to provide this feature for both linux and windows as > both Tom and Robert seems to feel, it's better that way. Anyhow, > I think naming any entity generally differs based on individual's > perspective, so we can go with the name which appeals to more people. > In case, nobody else has any preference, I will change it to what both > of us can agree upon (either 'tablespace catalog', 'tablespace_info' ...). Well, I have made my argument. Since you're the submitter, feel free to select what you think is the best name. > > > > One use case mentioned upthread is having the clone be created in the > > > > same machine as the source server. Does your proposal help with it? > That use case got addressed with -T option with which user can relocate > tablespace directory (Commit: fb05f3c; pg_basebackup: Add support for > relocating tablespaces) Okay. As far as I know, -T only works for plain mode, right? I wonder if we should make -T modify the tablespace catalog, so that the resulting file in tar output outputs names mangled per the map; that would make -T work in tar mode too. Does that make sense? (Maybe it already works that way? I didn't research.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
>
> Amit Kapila wrote:
> > On Sun, Nov 16, 2014 at 6:15 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> > wrote:
>
>
> > > > > One use case mentioned upthread is having the clone be created in the
> > > > > same machine as the source server. Does your proposal help with it?
>
> > That use case got addressed with -T option with which user can relocate
> > tablespace directory (Commit: fb05f3c; pg_basebackup: Add support for
> > relocating tablespaces)
>
> Okay. As far as I know, -T only works for plain mode, right?
> if we should make -T modify the tablespace catalog, so that the
> resulting file in tar output outputs names mangled per the map; that
> would make -T work in tar mode too. Does that make sense?
On Tue, Nov 18, 2014 at 9:19 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> Right, but they provide same functionality as symlinks and now we >> are even planing to provide this feature for both linux and windows as >> both Tom and Robert seems to feel, it's better that way. Anyhow, >> I think naming any entity generally differs based on individual's >> perspective, so we can go with the name which appeals to more people. >> In case, nobody else has any preference, I will change it to what both >> of us can agree upon (either 'tablespace catalog', 'tablespace_info' ...). > > Well, I have made my argument. Since you're the submitter, feel free to > select what you think is the best name. For what it's worth, I, too, dislike having symlink in the name. Maybe "tablespace_map"? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On Tue, Nov 18, 2014 at 9:19 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> >> Right, but they provide same functionality as symlinks and now we
> >> are even planing to provide this feature for both linux and windows as
> >> both Tom and Robert seems to feel, it's better that way. Anyhow,
> >> I think naming any entity generally differs based on individual's
> >> perspective, so we can go with the name which appeals to more people.
> >> In case, nobody else has any preference, I will change it to what both
> >> of us can agree upon (either 'tablespace catalog', 'tablespace_info' ...).
> >
> > Well, I have made my argument. Since you're the submitter, feel free to
> > select what you think is the best name.
>
> For what it's worth, I, too, dislike having symlink in the name.
> Maybe "tablespace_map"?
Вложения
On 11/20/2014 02:27 AM, Amit Kapila wrote: > On Wed, Nov 19, 2014 at 11:46 PM, Robert Haas <robertmhaas@gmail.com > <mailto:robertmhaas@gmail.com>> wrote: > > On Tue, Nov 18, 2014 at 9:19 AM, Alvaro Herrera > > <alvherre@2ndquadrant.com <mailto:alvherre@2ndquadrant.com>> wrote: > > >> Right, but they provide same functionality as symlinks and now we > > >> are even planing to provide this feature for both linux and > windows as > > >> both Tom and Robert seems to feel, it's better that way. Anyhow, > > >> I think naming any entity generally differs based on individual's > > >> perspective, so we can go with the name which appeals to more people. > > >> In case, nobody else has any preference, I will change it to what > both > > >> of us can agree upon (either 'tablespace catalog', > 'tablespace_info' ...). > > > > > > Well, I have made my argument. Since you're the submitter, feel > free to > > > select what you think is the best name. > > > > For what it's worth, I, too, dislike having symlink in the name. > > Maybe "tablespace_map"? > > Sounds good to me as well. > > To summarize the situation of this patch, I have received below comments > on which I am planning to work: > > 1. Change the name of file containing tablespace path information. > 2. Store tablespace name as well along with oid and path to make the > information Human readable. > 3. Make the code generic (Remove #ifdef Win32 macro's and change > comments referring this functionality for windows and see if any more > changes are required to make it work on linux.) > > Now the part where I would like to receive feedback before revising the > patch is on the coding style. It seems to me from Tom's comments that > he is not happy with the code, now I am not sure which part of the patch > he thinks needs change. Tom if possible, could you be slightly more > specific about your concern w.r.t code? > > I have attached a rebased (on top of commit-8d7af8f) patch, just incase > some one wants to apply and check it. > In view of the request above for comments from Tom, I have moved this back to "Needs Review". cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/20/2014 02:27 AM, Amit Kapila wrote: >> Now the part where I would like to receive feedback before revising the >> patch is on the coding style. It seems to me from Tom's comments that >> he is not happy with the code, now I am not sure which part of the patch >> he thinks needs change. Tom if possible, could you be slightly more >> specific about your concern w.r.t code? > In view of the request above for comments from Tom, I have moved this > back to "Needs Review". Sorry, I was not paying very close attention to this thread and missed the request for comments. A few such: 1. The patch is completely naive about what might be in the symlink path string; eg embedded spaces in the path would break it. On at least some platforms, newlines could be in the path as well. I'm not sure about how to guard against this while maintaining human readability of the file. 2. There seems to be more going on here than what is advertised, eg why do we need to add an option to the BASE_BACKUP command (and if we do need it, doesn't it need to be documented in protocol.sgml)? And why is the RelationCacheInitFileRemove call relocated? 3. Not terribly happy with the changes made to the API of do_pg_start_backup, eg having to be able to parse "DIR *" in its arguments seems like a lot of #include creep. xlog.h is pretty central so I'm not happy about plastering more #includes in it. 4. In the same vein, publicly declaring a struct with a name as generic as "tablespaceinfo" doesn't seem like a great idea, when its usage is far from generic. regards, tom lane
>
> Andrew Dunstan <andrew@dunslane.net> writes:
> > On 11/20/2014 02:27 AM, Amit Kapila wrote:
> >> Now the part where I would like to receive feedback before revising the
> >> patch is on the coding style. It seems to me from Tom's comments that
> >> he is not happy with the code, now I am not sure which part of the patch
> >> he thinks needs change. Tom if possible, could you be slightly more
> >> specific about your concern w.r.t code?
>
> > In view of the request above for comments from Tom, I have moved this
> > back to "Needs Review".
>
> Sorry, I was not paying very close attention to this thread and missed
> the request for comments. A few such:
>
> 1. The patch is completely naive about what might be in the symlink
> path string; eg embedded spaces in the path would break it. On at
> least some platforms, newlines could be in the path as well. I'm not
> sure about how to guard against this while maintaining human readability
> of the file.
>
> 2. There seems to be more going on here than what is advertised, eg
> why do we need to add an option to the BASE_BACKUP command
> we do need it, doesn't it need to be documented in protocol.sgml)?
> And why is the RelationCacheInitFileRemove call relocated?
>
> 3. Not terribly happy with the changes made to the API of
> do_pg_start_backup, eg having to be able to parse "DIR *" in its
> arguments seems like a lot of #include creep. xlog.h is pretty
> central so I'm not happy about plastering more #includes in it.
>
> 4. In the same vein, publicly declaring a struct with a name as
> generic as "tablespaceinfo" doesn't seem like a great idea, when
> its usage is far from generic.
>
>
> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Andrew Dunstan <andrew@dunslane.net> writes:
> > > On 11/20/2014 02:27 AM, Amit Kapila wrote:
> > >> Now the part where I would like to receive feedback before revising the
> > >> patch is on the coding style. It seems to me from Tom's comments that
> > >> he is not happy with the code, now I am not sure which part of the patch
> > >> he thinks needs change. Tom if possible, could you be slightly more
> > >> specific about your concern w.r.t code?
> >
> > > In view of the request above for comments from Tom, I have moved this
> > > back to "Needs Review".
> >
> > Sorry, I was not paying very close attention to this thread and missed
> > the request for comments. A few such:
> >
> > 1. The patch is completely naive about what might be in the symlink
> > path string; eg embedded spaces in the path would break it. On at
> > least some platforms, newlines could be in the path as well. I'm not
> > sure about how to guard against this while maintaining human readability
> > of the file.
>
> I will look into this and see what best can be done.
>
Amit Kapila wrote: > One way to deal with this could be to append a delimiter(which is not > allowed > in tablespace path like quote (\')) at the end of tablespace path while > writing the same to symlink label file and then use that as end marker while > reading it from file. Some GNU tools such as xargs and find use a null char as item delimiter; see find -print0 and xargs -0. IIRC one of our tools also allow that (psql?). Doing the same here would make human reading a bit more difficult, but not completely impossible. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Amit Kapila <amit.kapila16@gmail.com> writes: >> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> 1. The patch is completely naive about what might be in the symlink >>> path string; eg embedded spaces in the path would break it. On at >>> least some platforms, newlines could be in the path as well. I'm not >>> sure about how to guard against this while maintaining human readability >>> of the file. > One way to deal with this could be to append a delimiter(which is not > allowed > in tablespace path like quote (\')) at the end of tablespace path while > writing the same to symlink label file and then use that as end marker while > reading it from file. What makes you think quote isn't allowed in tablespace paths? Even if we were to disallow it at the SQL level, there'd be nothing stopping a DBA from changing the path after the fact by redefining the symlink outside SQL --- something I believe we specifically meant to allow, considering we went to the trouble of getting rid of the pg_tablespace.spclocation column. Pretty much the only character we can be entirely certain is not in a symlink's value is \0. As Alvaro mentioned, using that in the file is a possible alternative, although it could easily confuse some users and/or text editors. The only other alternatives I can see are: * Go over to a byte-count-then-value format. Also possible, also rather unfriendly from a user's standpoint. * Establish an escaping convention, eg backslash before any funny characters. Unfortunately backslash wouldn't be too nicefrom the viewpoint of Windows users. * Make pg_basebackup check for and fail on symlinks containing characters it can't handle. Pretty icky, though I supposethere's some argument that things like newlines wouldn't be in any rational tablespace path. But I doubt you can makethat argument for spaces, quotes, or backslashes. regards, tom lane
On 12/14/2014 07:09 PM, Tom Lane wrote: > Amit Kapila <amit.kapila16@gmail.com> writes: >>> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> 1. The patch is completely naive about what might be in the symlink >>>> path string; eg embedded spaces in the path would break it. On at >>>> least some platforms, newlines could be in the path as well. I'm not >>>> sure about how to guard against this while maintaining human readability >>>> of the file. >> One way to deal with this could be to append a delimiter(which is not >> allowed >> in tablespace path like quote (\')) at the end of tablespace path while >> writing the same to symlink label file and then use that as end marker while >> reading it from file. > What makes you think quote isn't allowed in tablespace paths? Even if we > were to disallow it at the SQL level, there'd be nothing stopping a DBA > from changing the path after the fact by redefining the symlink outside > SQL --- something I believe we specifically meant to allow, considering > we went to the trouble of getting rid of the pg_tablespace.spclocation > column. > > Pretty much the only character we can be entirely certain is not in a > symlink's value is \0. As Alvaro mentioned, using that in the file > is a possible alternative, although it could easily confuse some users > and/or text editors. The only other alternatives I can see are: > > * Go over to a byte-count-then-value format. Also possible, also rather > unfriendly from a user's standpoint. > > * Establish an escaping convention, eg backslash before any funny > characters. Unfortunately backslash wouldn't be too nice from the > viewpoint of Windows users. > > * Make pg_basebackup check for and fail on symlinks containing characters > it can't handle. Pretty icky, though I suppose there's some argument > that things like newlines wouldn't be in any rational tablespace path. > But I doubt you can make that argument for spaces, quotes, or backslashes. > > Using an escaping convention makes by far the most sense to me. It's what occurred to me earlier today even before I read the above. We could adopt the URL convention of %xx for escapable characters - that would avoid \ nastiness. cheers andrew
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> >> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> 1. The patch is completely naive about what might be in the symlink
> >>> path string; eg embedded spaces in the path would break it. On at
> >>> least some platforms, newlines could be in the path as well. I'm not
> >>> sure about how to guard against this while maintaining human readability
> >>> of the file.
>
> > One way to deal with this could be to append a delimiter(which is not
> > allowed
> > in tablespace path like quote (\')) at the end of tablespace path while
> > writing the same to symlink label file and then use that as end marker while
> > reading it from file.
>
> What makes you think quote isn't allowed in tablespace paths?
CreateTableSpace(CreateTableSpaceStmt *stmt)
{
..
/* disallow quotes, else CREATE DATABASE would be at risk */
if (strchr(location, '\''))
ereport(ERROR,
(errcode(ERRCODE_INVALID_NAME),
errmsg("tablespace location cannot contain single quotes")));
}
> were to disallow it at the SQL level, there'd be nothing stopping a DBA
> from changing the path after the fact by redefining the symlink outside
> SQL --- something I believe we specifically meant to allow, considering
> we went to the trouble of getting rid of the pg_tablespace.spclocation
> column.
>
> Pretty much the only character we can be entirely certain is not in a
> symlink's value is \0. As Alvaro mentioned, using that in the file
> is a possible alternative, although it could easily confuse some users
> and/or text editors. The only other alternatives I can see are:
>
> * Go over to a byte-count-then-value format. Also possible, also rather
> unfriendly from a user's standpoint.
>
> * Establish an escaping convention, eg backslash before any funny
> characters. Unfortunately backslash wouldn't be too nice from the
> viewpoint of Windows users.
>
> * Make pg_basebackup check for and fail on symlinks containing characters
> it can't handle. Pretty icky, though I suppose there's some argument
> that things like newlines wouldn't be in any rational tablespace path.
{
..
location_with_version_dir = psprintf("%s/%s", location,
TABLESPACE_VERSION_DIRECTORY);
..
}
> But I doubt you can make that argument for spaces, quotes, or backslashes.
>
Amit Kapila <amit.kapila16@gmail.com> writes: > On Mon, Dec 15, 2014 at 5:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> What makes you think quote isn't allowed in tablespace paths? > Below part of code makes me think that quote is not allowed. > Oid > CreateTableSpace(CreateTableSpaceStmt *stmt) > { > .. > /* disallow quotes, else CREATE DATABASE would be at risk */ > if (strchr(location, '\'')) > ereport(ERROR, > (errcode(ERRCODE_INVALID_NAME), > errmsg("tablespace location cannot contain single quotes"))); > } Hm, I think that's left over from defending a *very* ancient version of CREATE DATABASE. In any case, as I mentioned, any limitations we might be putting on tablespace paths during SQL-level operation are pretty much a dead letter. regards, tom lane
> On 11/20/2014 02:27 AM, Amit Kapila wrote:
>>
>> On Wed, Nov 19, 2014 at 11:46 PM, Robert Haas <robertmhaas@gmail.com <mailto:robertmhaas@gmail.com>> wrote:
>> > On Tue, Nov 18, 2014 at 9:19 AM, Alvaro Herrera
>> > <alvherre@2ndquadrant.com <mailto:alvherre@2ndquadrant.com>> wrote:
>> > >> Right, but they provide same functionality as symlinks and now we
>> > >> are even planing to provide this feature for both linux and windows as
>> > >> both Tom and Robert seems to feel, it's better that way. Anyhow,
>> > >> I think naming any entity generally differs based on individual's
>> > >> perspective, so we can go with the name which appeals to more people.
>> > >> In case, nobody else has any preference, I will change it to what both
>> > >> of us can agree upon (either 'tablespace catalog', 'tablespace_info' ...).
>> > >
>> > > Well, I have made my argument. Since you're the submitter, feel free to
>> > > select what you think is the best name.
>> >
>> > For what it's worth, I, too, dislike having symlink in the name.
>> > Maybe "tablespace_map"?
>>
>> Sounds good to me as well.
>>
>> To summarize the situation of this patch, I have received below comments
>> on which I am planning to work:
>>
>> 1. Change the name of file containing tablespace path information.
>> 2. Store tablespace name as well along with oid and path to make the
>> information Human readable.
>> 3. Make the code generic (Remove #ifdef Win32 macro's and change
>> comments referring this functionality for windows and see if any more
>> changes are required to make it work on linux.)
>>
>> Now the part where I would like to receive feedback before revising the
>> patch is on the coding style. It seems to me from Tom's comments that
>> he is not happy with the code, now I am not sure which part of the patch
>> he thinks needs change. Tom if possible, could you be slightly more
>> specific about your concern w.r.t code?
>>
>> I have attached a rebased (on top of commit-8d7af8f) patch, just incase
>> some one wants to apply and check it.
>>
>
>
> In view of the request above for comments from Tom, I have moved this back to "Needs Review".
>
I am working on fixing the review comments, but I think I won't be
So I have moved this patch to CF (2014-12).
> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> > > On 11/20/2014 02:27 AM, Amit Kapila wrote:
> > >> Now the part where I would like to receive feedback before revising the
> > >> patch is on the coding style. It seems to me from Tom's comments that
> > >> he is not happy with the code, now I am not sure which part of the patch
> > >> he thinks needs change. Tom if possible, could you be slightly more
> > >> specific about your concern w.r.t code?
> >
> > > In view of the request above for comments from Tom, I have moved this
> > > back to "Needs Review".
I have updated the patch and handled the review comments as below:
> > Sorry, I was not paying very close attention to this thread and missed
> > the request for comments. A few such:
> >
> > 1. The patch is completely naive about what might be in the symlink
> > path string; eg embedded spaces in the path would break it. On at
> > least some platforms, newlines could be in the path as well. I'm not
> > sure about how to guard against this while maintaining human readability
> > of the file.
> >
>
> I will look into this and see what best can be done.
>
> > 2. There seems to be more going on here than what is advertised, eg
> > why do we need to add an option to the BASE_BACKUP command
>
> This is to ensure that symlink file is generated only for tar format;
> server is not aware of whether the backup is generated for plain format
> or tar format. We don't want to do it for plain format as for that
> client (pg_basebackup) can update the symlinks via -T option and backing
> up symlink file during that operation can lead to spurious symlinks after
> archive recovery. I have given the reason why we want to accomplish it
> only for tar format in my initial mail.
>
> > (and if
> > we do need it, doesn't it need to be documented in protocol.sgml)?
>
> I shall take care of it in next version of patch.
>
> > And why is the RelationCacheInitFileRemove call relocated?
> >
>
> Because it assumes that tablespace directory pg_tblspc is in
> place and it tries to remove the files by reading pg_tblspc
> directory as well. Now as we setup the symlinks in pg_tblspc
> after reading symlink file, so we should remove relcache init
> file once the symlinks are setup in pg_tblspc directory.
>
> > arguments seems like a lot of #include creep. xlog.h is pretty
> > central so I'm not happy about plastering more #includes in it.
> >
>
> The reason of adding new include in xlog.c is for use of tablespaceinfo
> structure which I have now kept in basebackup.h.
>
> The reason why I have done this way is because do_pg_start_backup has
> some functionality common to both non-exclusive and exclusive backups and
> for this feature we have to do some work common for both non-exclusive
> and exclusive backup which is to generate the symlink label file for
> non-exclusive backups and write the symlink label file for exclusive
> backups using that information. Doing this way seems right to me
> as we are already doing something like that for backup label file.
>
> Another possible way could be to write a new function in xlogutils.c
> to do the symlink label stuff and then use the same in xlog.c, I think
> that way we could avoid any new include in xlog.c. However for this we
> need to have include in xlogutils.c to make it aware of tablespaceinfo
> structure.
>
Вложения
>
> On Sun, Dec 14, 2014 at 11:54 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Sorry, I was not paying very close attention to this thread and missed
> > > the request for comments. A few such:
> > >
> > > 1. The patch is completely naive about what might be in the symlink
> > > path string; eg embedded spaces in the path would break it. On at
> > > least some platforms, newlines could be in the path as well. I'm not
> > > sure about how to guard against this while maintaining human readability
> > > of the file.
> > >
> >
> > I will look into this and see what best can be done.
> >
>
> I have chosen #3 (Make pg_basebackup check for and fail on symlinks
> containing characters (currently newline only) it can't handle) from the
> different options suggested by Tom. This keeps the format same as
> previous and human readable.
>
On 12/16/2014 04:34 AM, Amit Kapila wrote: > On Tue, Dec 16, 2014 at 12:58 PM, Amit Kapila <amit.kapila16@gmail.com > <mailto:amit.kapila16@gmail.com>> wrote: > > > > On Sun, Dec 14, 2014 at 11:54 AM, Amit Kapila > <amit.kapila16@gmail.com <mailto:amit.kapila16@gmail.com>> wrote: > > > On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us > <mailto:tgl@sss.pgh.pa.us>> wrote: > > > > Sorry, I was not paying very close attention to this thread and > missed > > > > the request for comments. A few such: > > > > > > > > 1. The patch is completely naive about what might be in the symlink > > > > path string; eg embedded spaces in the path would break it. On at > > > > least some platforms, newlines could be in the path as well. > I'm not > > > > sure about how to guard against this while maintaining human > readability > > > > of the file. > > > > > > > > > > I will look into this and see what best can be done. > > > > > > > I have chosen #3 (Make pg_basebackup check for and fail on symlinks > > containing characters (currently newline only) it can't handle) from the > > different options suggested by Tom. This keeps the format same as > > previous and human readable. > > > > Actually, here instead of an error a warning is issued and that particular > path (containing new line) will be skipped. This is similar to what > is already done for the cases when there is any problem in reading > link paths. > I'm not clear why human readability is the major criterion here. As for that, it will be quite difficult for a human to distinguish a name with a space at the end from one without. I really think a simple encoding scheme would be much the best. For normal cases it will preserve readability completely, and for special cases it will preserve lack of any ambiguity. cheers andrew
On 12/16/2014 06:30 PM, Andrew Dunstan wrote: > I'm not clear why human readability is the major criterion here. As for > that, it will be quite difficult for a human to distinguish a name with > a space at the end from one without. I really think a simple encoding > scheme would be much the best. For normal cases it will preserve > readability completely, and for special cases it will preserve lack of > any ambiguity. Agreed. Besides, this: 16387 E:\\Program\ Files\\PostgreSQL\\tbs is almost as human-readable as this: 16387 E:\Program Files\PostgreSQL\tbs It's obvious how the escaping works, just by looking at the file. - Heikki
>
> On 12/16/2014 06:30 PM, Andrew Dunstan wrote:
>>
>> I'm not clear why human readability is the major criterion here. As for
>> that, it will be quite difficult for a human to distinguish a name with
>> a space at the end from one without. I really think a simple encoding
>> scheme would be much the best.
> Agreed. Besides, this:
>
> 16387 E:\\Program\ Files\\PostgreSQL\\tbs
>
> is almost as human-readable as this:
>
> 16387 E:\Program Files\PostgreSQL\tbs
>
> On Tue, Dec 16, 2014 at 10:11 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> >
> > On 12/16/2014 06:30 PM, Andrew Dunstan wrote:
> >>
> >> I'm not clear why human readability is the major criterion here. As for
> >> that, it will be quite difficult for a human to distinguish a name with
> >> a space at the end from one without. I really think a simple encoding
> >> scheme would be much the best.
>
> Yeah that could work, but we need the special encoding mainly for newline,
> other's would work with current patch. However it might be worth to do
> it for all kind of spaces. Currently it just reads the line upto newline using
> fscanf, but if we use special encoding, we might need to read the file
> character by character and check for newline without backslash(or other
> special encoding character); do you have something like that in mind?
>
> Another thing is that we need to take care that we encode/decode link
> path for tar format, as plain format might already be working.
>
4. Updation of docs.
Вложения
<div class="WordSection1"><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">On 20 December2014 16:30, Amit Kapila Wrote,</span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><pclass="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><pclass="MsoNormal">>Summarization oflatest changes:<p class="MsoNormal">>1. Change file name from symlink_label to tablespace_map and changed<p class="MsoNormal">>thesame every where in comments and variable names.<p class="MsoNormal">>2. This feature will besupportted for both windows and linux; tablespace_map<p class="MsoNormal">>file will be generated on both windows andlinux to restore tablespace links<p class="MsoNormal">>during archive recovery.<p class="MsoNormal">>3. Handlingfor special characters in tablesapce path name.<br /> >4. Updation of docs.<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><pclass="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Idid not followed this patch for quite some time, I have seenall the threads regarding this patch and reviewed from those perspective.</span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><pclass="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0level1 lfo1"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""><spanstyle="mso-list:Ignore">1.<span style="font:7.0pt "TimesNew Roman""> </span></span></span><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">I have donethe testing and behavior is fine</span><p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo1"><spanstyle="font-size:10.0pt;font-family:"Tahoma","sans-serif""><span style="mso-list:Ignore">2.<span style="font:7.0pt"Times New Roman""> </span></span></span><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Forhandling special character like new line character, I saw discussionmostly for two option, </span><p class="MsoListParagraph" style="margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0level2 lfo1"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""><spanstyle="mso-list:Ignore">a.<span style="font:7.0pt "TimesNew Roman""> </span></span></span><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Don’t supportsuch table space name in tablespace map file and skip those tablespace.</span><p class="MsoListParagraph" style="margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0level2 lfo1"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""><spanstyle="mso-list:Ignore">b.<span style="font:7.0pt "TimesNew Roman""> </span></span></span><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Add ‘\’ characterwhen there is new line in the tablespace name.</span><p class="MsoListParagraph" style="margin-left:54.0pt"><spanstyle="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><p class="MsoListParagraph"style="margin-left:54.0pt"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">And youhave selected option 2, I don’t see any problem in this because it is maintaining human readability, I just want ask isthis as per the consensus ?</span><p class="MsoListParagraph" style="margin-left:54.0pt"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><pclass="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Otherthan that patch seems fine to me..</span><p class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Regards,</span><pclass="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Dilip</span><pclass="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span></div>
> On 20 December 2014 16:30, Amit Kapila Wrote,
> >Summarization of latest changes:
>
> >1. Change file name from symlink_label to tablespace_map and changed
>
> >the same every where in comments and variable names.
>
> >2. This feature will be supportted for both windows and linux; tablespace_map
>
> >file will be generated on both windows and linux to restore tablespace links
>
> >during archive recovery.
>
> >3. Handling for special characters in tablesapce path name.
> >4. Updation of docs.
>
> I did not followed this patch for quite some time, I have seen all the threads regarding this patch and reviewed from those perspective.
>
> 1. I have done the testing and behavior is fine
> 2. For handling special character like new line character, I saw discussion mostly for two option,
> a. Don’t support such table space name in tablespace map file and skip those tablespace.
>
> b. Add ‘\’ character when there is new line in the tablespace name.
>
> And you have selected option 2, I don’t see any problem in this because it is maintaining human readability, I just want ask is this as per the consensus ?
>
> Other than that patch seems fine to me..
>
Thanks for reviewing it.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
<div class="WordSection1"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">On</span><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">07January 2015 11:21, Amit Kapila Wrote,</span><p class="MsoNormal"> <pclass="MsoNormal">>Tom has spotted this problem and suggested 3 different options<p class="MsoNormal">>tohandle this issue, apart from above 2, third one is "<span style="font-size:10.0pt">Go over to</span><pclass="MsoNormal"><span style="font-size:10.0pt">>a byte-count-then-value format". Then Andrew and Heikki</span><pclass="MsoNormal"><span style="font-size:10.0pt">>supported/asked to follow option 2 (as is followed inpatch) and no</span><p class="MsoNormal"><span style="font-size:10.0pt">>one objected, so I used the same to fix theissue.</span><p class="MsoNormal"> <p class="MsoNormal">>Based on above, I would say we have a consensus to followthis<p class="MsoNormal">>approach.<p class="MsoNormal"> <p class="MsoNormal">Moved to Ready For Committer<p class="MsoNormal"> <pclass="MsoNormal">Regards,<p class="MsoNormal">Dilip<p class="MsoNormal"> <p class="MsoNormal"> </div>
On 12/20/2014 05:59 AM, Amit Kapila wrote: > On Wed, Dec 17, 2014 at 11:32 AM, Amit Kapila <amit.kapila16@gmail.com > <mailto:amit.kapila16@gmail.com>> wrote: > > On Tue, Dec 16, 2014 at 10:11 PM, Heikki Linnakangas > <hlinnakangas@vmware.com <mailto:hlinnakangas@vmware.com>> wrote: > > > > > > On 12/16/2014 06:30 PM, Andrew Dunstan wrote: > > >> > > >> I'm not clear why human readability is the major criterion here. > As for > > >> that, it will be quite difficult for a human to distinguish a > name with > > >> a space at the end from one without. I really think a simple encoding > > >> scheme would be much the best. > > > > Yeah that could work, but we need the special encoding mainly for > newline, > > other's would work with current patch. However it might be worth to do > > it for all kind of spaces. Currently it just reads the line upto > newline using > > fscanf, but if we use special encoding, we might need to read the file > > character by character and check for newline without backslash(or other > > special encoding character); do you have something like that in mind? > > > > Another thing is that we need to take care that we encode/decode link > > path for tar format, as plain format might already be working. > > > > Attached patch handles the newline and other characters that are allowed > in tablespace path, as we need escape character only for newline, I have > added the same only for newline. So after patch, the tablespace_map > file will look like below for different kind of paths, as you can see for > tablespace id 16393 which contains newline, there is additional escape > sequence "\" before each newline where as other paths containing space > works as it is. > > 16391 /home/akapila/mywork/workspace_pg/master/tbs1 > 16393 /home/akapila/mywork/workspace_pg/master/tbs\ > a\ > b\ > > 16392 /home/akapila/mywork/workspace_pg/master/tbs 2 > > So with this, I have handled all review comments raised for this patch > and it is ready for review, as the status of this patch is changed from > "Ready for Committer" to "Waiting on Author", so ideally I think it > should go back to "Ready for Committer", however as I am not very sure > about this point, I will change it to "Needs Review" (correct me if I am > wrong). > > Summarization of latest changes: > 1. Change file name from symlink_label to tablespace_map and changed > the same every where in comments and variable names. > 2. This feature will be supportted for both windows and linux; > tablespace_map > file will be generated on both windows and linux to restore tablespace > links > during archive recovery. > 3. Handling for special characters in tablesapce path name. > 4. Updation of docs. > This generally looks good, but I have a couple of questions before I commit it. First, why is the new option for the BASE_BACKUP command of the Streaming Replication protcol "TAR"? It seems rather misleading. Shouldn't it be something like "TABLESPACEMAP"? I realize we ask for it when pg_basebackup is operating in TAR format mode, but the backend has no notion of that, does it? The only thing this does is trigger the sending of the tablespace map, so surely that's what the protocol option should suggest. Second, these lines in xlog.c seem wrong: else if ((ch == '\n' || ch == '\r') && prev_ch == '\\') str[i-1] = '\n'; It looks to me like we should be putting ch in the string, not arbitrarily transforming \r into \n. cheers andrew
>
>
>
> This generally looks good, but I have a couple of questions before I commit it.
>
> First, why is the new option for the BASE_BACKUP command of the Streaming Replication protcol "TAR"? It seems rather misleading. Shouldn't it be something like "TABLESPACEMAP"?
> Second, these lines in xlog.c seem wrong:
>
> else if ((ch == '\n' || ch == '\r') && prev_ch == '\\')
> str[i-1] = '\n';
>
> It looks to me like we should be putting ch in the string, not arbitrarily transforming \r into \n.
>
You are right, I have changed it as per your suggestion.
Вложения
On 05/11/2015 02:02 AM, Amit Kapila wrote: > On Sun, May 10, 2015 at 6:01 AM, Andrew Dunstan <andrew@dunslane.net > <mailto:andrew@dunslane.net>> wrote: > > > > > > > > This generally looks good, but I have a couple of questions before I > commit it. > > > > First, why is the new option for the BASE_BACKUP command of the > Streaming Replication protcol "TAR"? It seems rather misleading. > Shouldn't it be something like "TABLESPACEMAP"? > > > > The reason to keep new option's name as TAR was that tablespace_map > was generated for that format type, but I agree with you that something > like "TABLESPACEMAP" suits better, so I have changed it to > "TABLESPACE_MAP". Putting '_' in name makes it somewhat consistent > with other names and filename it generates with this new option. > > > > Second, these lines in xlog.c seem wrong: > > > > else if ((ch == '\n' || ch == '\r') && prev_ch == '\\') > > str[i-1] = '\n'; > > > > It looks to me like we should be putting ch in the string, not > arbitrarily transforming \r into \n. > > > > You are right, I have changed it as per your suggestion. > > OK, I have cleaned this up a bit - I had already started so I didn't take your latest patch but instead applied relevant changes to my changeset. Here is my latest version. In testing I notice that now "pg_baseback -F t" leaves it completely up to the user on all platforms to create the relevant links in pg_tblspc/. It includes the tablespace_map file in base.tar, but that's really just informational. I think we need to add something to the pg_basebackup docs about that, at the very least (and it will also need to be a release note item.) cheers andrew
Вложения
>
> On 05/11/2015 02:02 AM, Amit Kapila wrote:
>>
>> On Sun, May 10, 2015 at 6:01 AM, Andrew Dunstan <andrew@dunslane.net <mailto:andrew@dunslane.net>> wrote:
>> >
>> >
>> >
>> > This generally looks good, but I have a couple of questions before I commit it.
>> >
>> > First, why is the new option for the BASE_BACKUP command of the Streaming Replication protcol "TAR"? It seems rather misleading. Shouldn't it be something like "TABLESPACEMAP"?
>> >
>>
>> The reason to keep new option's name as TAR was that tablespace_map
>> was generated for that format type, but I agree with you that something
>> like "TABLESPACEMAP" suits better, so I have changed it to
>> "TABLESPACE_MAP". Putting '_' in name makes it somewhat consistent
>> with other names and filename it generates with this new option.
>>
>>
>> > Second, these lines in xlog.c seem wrong:
>> >
>> > else if ((ch == '\n' || ch == '\r') && prev_ch == '\\')
>> > str[i-1] = '\n';
>> >
>> > It looks to me like we should be putting ch in the string, not arbitrarily transforming \r into \n.
>> >
>>
>> You are right, I have changed it as per your suggestion.
>>
>>
>
>
> OK, I have cleaned this up a bit - I had already started so I didn't take your latest patch but instead applied relevant changes to my changeset. Here is my latest version.
>
> In testing I notice that now "pg_baseback -F t" leaves it completely up to the user on all platforms to create the relevant links in pg_tblspc/. It includes the tablespace_map file in base.tar, but that's really just informational.
+ /* read the tablespace_map file if present and create symlinks. */
+ if (read_tablespace_map(&tablespaces))
+ {
>
On 05/11/2015 11:01 PM, Amit Kapila wrote: > > > On Tue, May 12, 2015 at 5:50 AM, Andrew Dunstan <andrew@dunslane.net > <mailto:andrew@dunslane.net>> wrote: > > > > > > On 05/11/2015 02:02 AM, Amit Kapila wrote: > >> > >> On Sun, May 10, 2015 at 6:01 AM, Andrew Dunstan > <andrew@dunslane.net <mailto:andrew@dunslane.net> > <mailto:andrew@dunslane.net <mailto:andrew@dunslane.net>>> wrote: > >> > > >> > > >> > > >> > This generally looks good, but I have a couple of questions > before I commit it. > >> > > >> > First, why is the new option for the BASE_BACKUP command of the > Streaming Replication protcol "TAR"? It seems rather misleading. > Shouldn't it be something like "TABLESPACEMAP"? > >> > > >> > >> The reason to keep new option's name as TAR was that tablespace_map > >> was generated for that format type, but I agree with you that something > >> like "TABLESPACEMAP" suits better, so I have changed it to > >> "TABLESPACE_MAP". Putting '_' in name makes it somewhat consistent > >> with other names and filename it generates with this new option. > >> > >> > >> > Second, these lines in xlog.c seem wrong: > >> > > >> > else if ((ch == '\n' || ch == '\r') && prev_ch == '\\') > >> > str[i-1] = '\n'; > >> > > >> > It looks to me like we should be putting ch in the string, not > arbitrarily transforming \r into \n. > >> > > >> > >> You are right, I have changed it as per your suggestion. > >> > >> > > > > > > OK, I have cleaned this up a bit - I had already started so I didn't > take your latest patch but instead applied relevant changes to my > changeset. Here is my latest version. > > > > In testing I notice that now "pg_baseback -F t" leaves it completely > up to the user on all platforms to create the relevant links in > pg_tblspc/. It includes the tablespace_map file in base.tar, but > that's really just informational. > > > > Sorry, but I am not able to follow your point. User don't need to create > the relevant links, they will get created during first startup (during > recovery) > from the backup. I have tested and it works both on Windows and Linux. > > Refer below code of patch in xlog.c > > + > + /* read the tablespace_map file if present and create symlinks. */ > + if (read_tablespace_map(&tablespaces)) > + { > .. > > > I think we need to add something to the pg_basebackup docs about > that, at the very least (and it will also need to be a release note item.) > > > > Currently, below lines in patch suggest that this file is required for > recovery. > Do you expect more information to be added? > > + These files are not merely for your information; their presence and > + contents are critical to the proper operation of the system's > recovery > + process. > Yes, sorry, I had a moment of brain fade yesterday. However, I think we're a bit under-documented on the pg_basebackup page, regarding both tar mode and tablespace_map (which isn't even mentioned). And there is this which I noticed this morning: the --tablespace-mapping feature of pg_basebackup seems to be quite broken in --format=tar mode - it simply has no effect AFAICT. I assume it was broken before, but we should either fix it (possibly hard) or disallow the combination (which would be a pity). I'm going to go ahead and commit this in the state I have it now, because for the most part these are preexisting deficiencies. cheers andrew
On 05/12/2015 08:35 AM, Andrew Dunstan wrote: > > Yes, sorry, I had a moment of brain fade yesterday. However, I think > we're a bit under-documented on the pg_basebackup page, regarding both > tar mode and tablespace_map (which isn't even mentioned). > > And there is this which I noticed this morning: the > --tablespace-mapping feature of pg_basebackup seems to be quite broken > in --format=tar mode - it simply has no effect AFAICT. I assume it was > broken before, but we should either fix it (possibly hard) or disallow > the combination (which would be a pity). > > I'm going to go ahead and commit this in the state I have it now, > because for the most part these are preexisting deficiencies. > > One more thing: I think pg_basebackup will now not work in tar mode with pre-9.5 servers, since it will be using an unrecognized option of the BASE_BACKUP protocol command. If so that certainly needs to be documented and release noted. cheers andrew
>
>
> On 05/12/2015 08:35 AM, Andrew Dunstan wrote:
>>
>>
>> Yes, sorry, I had a moment of brain fade yesterday. However, I think we're a bit under-documented on the pg_basebackup page, regarding both tar mode and tablespace_map (which isn't even mentioned).
>>
>> And there is this which I noticed this morning: the --tablespace-mapping feature of pg_basebackup seems to be quite broken in --format=tar mode - it simply has no effect AFAICT. I assume it was broken before, but we should either fix it (possibly hard) or disallow the combination (which would be a pity).
>>
>> I'm going to go ahead and commit this in the state I have it now, because for the most part these are preexisting deficiencies.
>>
>>
>
>
> One more thing: I think pg_basebackup will now not work in tar mode with pre-9.5 servers, since it will be using an unrecognized option of the BASE_BACKUP protocol command. If so that certainly needs to be documented and release noted.
>
Yes, thats true and I have added the same in docs, updated