Re: Add system identifier to backup manifest

Поиск
Список
Период
Сортировка
От Sravan Kumar
Тема Re: Add system identifier to backup manifest
Дата
Msg-id CA+=NbjgZtBAz8rd7KKSCrFOwKVpG9S8XnYAnHS7xA-mux_7oOw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Add system identifier to backup manifest  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Add system identifier to backup manifest  (Amul Sul <sulamul@gmail.com>)
Список pgsql-hackers
I have also done a review of the patch and some testing. The patch looks good, and I agree with Robert's comments.

On Wed, Jan 17, 2024 at 8:40 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Jan 17, 2024 at 6:31 AM Amul Sul <sulamul@gmail.com> wrote:
> > With the attached patch, the backup manifest will have a new key item as
> > "System-Identifier" 64-bit integer whose value is derived from pg_control while
> > generating it, and the manifest version bumps to 2.
> >
> > This helps to identify the correct database server and/or backup for the
> > subsequent backup operations.  pg_verifybackup validates the manifest system
> > identifier against the backup control file and fails if they don’t match.
> > Similarly, pg_basebackup increment backup will fail if the manifest system
> > identifier does not match with the server system identifier.  The
> > pg_combinebackup is already a bit smarter -- checks the system identifier from
> > the pg_control of all the backups, with this patch the manifest system
> > identifier also validated.
>
> Thanks for working on this. Without this, I think what happens is that
> you can potentially take an incremental backup from the "wrong"
> server, if the states of the systems are such that all of the other
> sanity checks pass. When you run pg_combinebackup, it'll discover the
> problem and tell you, but you ideally want to discover such errors at
> backup time rather than at restore time. This addresses that. And,
> overall, I think it's a pretty good patch. But I nonetheless have a
> bunch of comments.
>
> -      The associated value is always the integer 1.
> +      The associated value is the integer, either 1 or 2.
>
> is an integer. Beginning in <productname>PostgreSQL</productname> 17,
> it is 2; in older versions, it is 1.
>
> + context.identity_cb = manifest_process_identity;
>
> I'm not really on board with calling the system identifier "identity"
> throughout the patch. I think it should just say system_identifier. If
> we were going to abbreviate, I'd prefer something like "sysident" that
> looks like it's derived from "system identifier" rather than
> "identity" which is a different word altogether. But I don't think we
> should abbreviate unless doing so creates *ridiculously* long
> identifier names.
>
> +static void
> +manifest_process_identity(JsonManifestParseContext *context,
> +   int manifest_version,
> +   uint64 manifest_system_identifier)
> +{
> + uint64 system_identifier;
> +
> + /* Manifest system identifier available in version 2 or later */
> + if (manifest_version == 1)
> + return;
>
> I think you've got the wrong idea here. I think this function would
> only get called if System-Identifier is present in the manifest, so if
> it's a v1 manifest, this would never get called, so this if-statement
> would not ever do anything useful. I think what you should do is (1)
> if the client supplies a v1 manifest, reject it, because surely that's
> from an older server version that doesn't support incremental backup;
> but do that when the version is parsed rather than here; and (2) also
> detect and reject the case when it's supposedly a v2 manifest but this
> is absent.
>
> (1) should really be done when the version number is parsed, so I
> suspect you may need to add manifest_version_cb.
>
> +static void
> +combinebackup_identity_cb(JsonManifestParseContext *context,
> +   int manifest_version,
> +   uint64 manifest_system_identifier)
> +{
> + parser_context *private_context = context->private_data;
> + uint64 system_identifier = private_context->system_identifier;
> +
> + /* Manifest system identifier available in version 2 or later */
> + if (manifest_version == 1)
> + return;
>
> Very similar to the above case. Just reject a version 1 manifest as
> soon as we see the version number. In this function, set a flag
> indicating we saw the system identifier; if at the end of parsing that
> flag is not set, kaboom.
>
> - parse_manifest_file(manifest_path, &context.ht, &first_wal_range);
> + parse_manifest_file(manifest_path, &context.ht, &first_wal_range,
> + context.backup_directory);
>
> Don't do this! parse_manifest_file() should just record everything
> found in the manifest in the context object. Syntax validation should
> happen while parsing the manifest (e.g. "CAT/DOG" is not a valid LSN
> and we should reject that at this stage) but semantic validation
> should happen later (e.g. "0/0" can't be a the correct backup end LSN
> but we don't figure that out while parsing, but rather later). I think
> you should actually move validation of the system identifier to the
> point where the directory walk encounters the control file (and update
> the docs and tests to match that decision). Imagine if you wanted to
> validate a tar-format backup; then you wouldn't have random access to
> the directory. You'd see the manifest file first, and then all the
> files in a random order, with one chance to look at each one.
>
> (This is, in fact, a feature I think we should implement.)
>
> - if (strcmp(token, "1") != 0)
> + parse->manifest_version = atoi(token);
> + if (parse->manifest_version != 1 && parse->manifest_version != 2)
>   json_manifest_parse_failure(parse->context,
>   "unexpected manifest version");
>
> Please either (a) don't do a string-to-integer conversion and just
> strcmp() twice or (b) use strtol so that you can check that it
> succeeded. I don't want to accept manifest version 1a as 1.
>
> +/*
> + * Validate manifest system identifier against the database server system
> + * identifier.
> + */
>
> This comment assumes you know what the callback is going to do, but
> you don't. This should be more like the comment for
> json_manifest_finalize_file or json_manifest_finalize_wal_range.

This comment caught me off-guard too. After some testing and detailed review I found that this is
called by pg_verifybackup and pg_combinebackup both of which do not validate against any
running database system.
 
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
>


--
Thanks & Regards,
Sravan Velagandula
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: vignesh C
Дата:
Сообщение: Re: pgsql: Clean up role created in new subscription test.
Следующее
От: vignesh C
Дата:
Сообщение: Re: Where can I find the doxyfile?