Re: recovery modules

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: recovery modules
Дата
Msg-id 20230128002319.362oxrxf7ardfz2a@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: recovery modules  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: recovery modules  (Nathan Bossart <nathandbossart@gmail.com>)
Список pgsql-hackers
Hi,

On 2023-01-25 16:34:21 +0900, Michael Paquier wrote:
> diff --git a/src/include/access/xlogarchive.h b/src/include/access/xlogarchive.h
> index 299304703e..71c9b88165 100644
> --- a/src/include/access/xlogarchive.h
> +++ b/src/include/access/xlogarchive.h
> @@ -30,9 +30,45 @@ extern bool XLogArchiveIsReady(const char *xlog);
>  extern bool XLogArchiveIsReadyOrDone(const char *xlog);
>  extern void XLogArchiveCleanup(const char *xlog);
>  
> -extern bool shell_restore(const char *file, const char *path,
> -                          const char *lastRestartPointFileName);
> -extern void shell_archive_cleanup(const char *lastRestartPointFileName);
> -extern void shell_recovery_end(const char *lastRestartPointFileName);
> +/*
> + * Recovery module callbacks
> + *
> + * These callback functions should be defined by recovery libraries and
> + * returned via _PG_recovery_module_init().  For more information about the
> + * purpose of each callback, refer to the recovery modules documentation.
> + */
> +typedef bool (*RecoveryRestoreCB) (const char *file, const char *path,
> +                                   const char *lastRestartPointFileName);
> +typedef void (*RecoveryArchiveCleanupCB) (const char *lastRestartPointFileName);
> +typedef void (*RecoveryEndCB) (const char *lastRestartPointFileName);
> +typedef void (*RecoveryShutdownCB) (void);

I think the signature of these forces bad coding practices, because there's no
way to have state within a recovery module (as there's no parameter for it).


It's possible we would eventually support multiple modules, e.g. restoring
from shorter term file based archiving and from longer term archiving in some
blob store. Then we'll regret not having a varible for this.


> +typedef struct RecoveryModuleCallbacks
> +{
> +    RecoveryRestoreCB restore_cb;
> +    RecoveryArchiveCleanupCB archive_cleanup_cb;
> +    RecoveryEndCB recovery_end_cb;
> +    RecoveryShutdownCB shutdown_cb;
> +} RecoveryModuleCallbacks;
> +
> +extern RecoveryModuleCallbacks RecoveryContext;

I think that'll typically be interpreteted as a MemoryContext by readers.

Also, why is this a global var? Exported too?


> +/*
> + * Type of the shared library symbol _PG_recovery_module_init that is looked up
> + * when loading a recovery library.
> + */
> +typedef void (*RecoveryModuleInit) (RecoveryModuleCallbacks *cb);

I think this is a bad way to return callbacks. This way the
RecoveryModuleCallbacks needs to be modifiable, which makes the job for the
compiler harder (and isn't the greatest for security).

I strongly encourage you to follow the model used e.g. by tableam. The init
function should return a pointer to a *constant* struct. Which is compile-time
initialized with the function pointers.

See the bottom of heapam_handler.c for how that ends up looking.


> +void
> +LoadRecoveryCallbacks(void)
> +{
> +    RecoveryModuleInit init;
> +
> +    /*
> +     * If the shell command is enabled, use our special initialization
> +     * function.  Otherwise, load the library and call its
> +     * _PG_recovery_module_init().
> +     */
> +    if (restoreLibrary[0] == '\0')
> +        init = shell_restore_init;
> +    else
> +        init = (RecoveryModuleInit)
> +            load_external_function(restoreLibrary, "_PG_recovery_module_init",
> +                                   false, NULL);

Why a special rule for shell, instead of just defaulting the GUC to it?


> +    /*
> +     * If using shell commands, remove callbacks for any commands that are not
> +     * set.
> +     */
> +    if (restoreLibrary[0] == '\0')
> +    {
> +        if (recoveryRestoreCommand[0] == '\0')
> +            RecoveryContext.restore_cb = NULL;
> +        if (archiveCleanupCommand[0] == '\0')
> +            RecoveryContext.archive_cleanup_cb = NULL;
> +        if (recoveryEndCommand[0] == '\0')
> +            RecoveryContext.recovery_end_cb = NULL;

I'd just mandate that these are implemented and that the module has to handle
if it doesn't need to do anything.



> +    /*
> +     * Check for invalid combinations of the command/library parameters and
> +     * load the callbacks.
> +     */
> +    CheckMutuallyExclusiveGUCs(restoreLibrary, "restore_library",
> +                               recoveryRestoreCommand, "restore_command");
> +    CheckMutuallyExclusiveGUCs(restoreLibrary, "restore_library",
> +                               recoveryEndCommand, "recovery_end_command");
> +    before_shmem_exit(call_recovery_module_shutdown_cb, 0);
> +    LoadRecoveryCallbacks();

This kind of sequence is duplicated into several places.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amin
Дата:
Сообщение: Re: How to find the number of cached pages for a relation?
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: recovery modules