Обсуждение: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
Hi All, Currently, we have pg_current_wal_insert_lsn and pg_walfile_name sql functions which gives us information about the next wal insert location and the WAL file that the next wal insert location belongs to. Can we have a binary version of these sql functions? It would be like any other binaries we have for e.g. pg_waldump to which we can pass the location of the pg_wal directory. This binary would scan through the directory to return the next wal insert location and the wal file the next wal insert pointer belongs to. The binary version of these sql functions can be used when the server is offline. This can help us to know the overall WAL data that needs to be replayed when the server is in recovery. In the control file we do have the redo pointer. Knowing the end pointer would definitely be helpful. If you are ok then I will prepare a patch for it and share it. Please let me know your thoughts/comments. thank you.! -- With Regards, Ashutosh Sharma.
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Bharath Rupireddy
Дата:
On Mon, Sep 19, 2022 at 8:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi All, > > Currently, we have pg_current_wal_insert_lsn and pg_walfile_name sql > functions which gives us information about the next wal insert > location and the WAL file that the next wal insert location belongs > to. Can we have a binary version of these sql functions? +1 for the idea in general. As said, pg_waldump seems to be the right candidate. I think we want the lsn of the last WAL record and its info and the WAL file name given an input data directory or just the pg_wal directory or any directory where WAL files are located. For instance, one can use this on an archive location containing archived WAL files or on a node where pg_receivewal is receiving WAL files. Am I missing any other use-cases? pg_waldump currently can't understand compressed and partial files. I think that we need to fix this as well. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Ashutosh Sharma
Дата:
On Tue, Sep 20, 2022 at 5:13 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Mon, Sep 19, 2022 at 8:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > Hi All, > > > > Currently, we have pg_current_wal_insert_lsn and pg_walfile_name sql > > functions which gives us information about the next wal insert > > location and the WAL file that the next wal insert location belongs > > to. Can we have a binary version of these sql functions? > > +1 for the idea in general. > > As said, pg_waldump seems to be the right candidate. I think we want > the lsn of the last WAL record and its info and the WAL file name > given an input data directory or just the pg_wal directory or any > directory where WAL files are located. For instance, one can use this > on an archive location containing archived WAL files or on a node > where pg_receivewal is receiving WAL files. Am I missing any other > use-cases? > Yeah, we can either add this functionality to pg_waldump or maybe add a new binary itself that would return this information. -- With Regards, Ashutosh Sharma.
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Bharath Rupireddy
Дата:
On Wed, Sep 21, 2022 at 9:53 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Yeah, we can either add this functionality to pg_waldump or maybe add > a new binary itself that would return this information. IMV, a separate tool isn't the way, since pg_waldump already reads WAL files and decodes WAL records, what's proposed here can be an additional functionality of pg_waldump. It will be great if an initial patch is posted here. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Ashutosh Sharma
Дата:
On Thu, Sep 22, 2022 at 7:41 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Sep 21, 2022 at 9:53 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > Yeah, we can either add this functionality to pg_waldump or maybe add > > a new binary itself that would return this information. > > IMV, a separate tool isn't the way, since pg_waldump already reads WAL > files and decodes WAL records, what's proposed here can be an > additional functionality of pg_waldump. > > It will be great if an initial patch is posted here. > PFA that enhances pg_waldump to show the latest LSN and the corresponding WAL file when the -l or --lastLSN option is passed an argument to pg_waldump. Below is an example: ashu@92893de650ed:~/pgsql$ pg_waldump -l -D ./data-dir Latest LSN: 0/148A45F8 Latest WAL filename: 000000010000000000000014 How has it been coded? When the user passes the '-l' command line option along with the data directory path to pg_waldump, it reads the control file from the data directory. From the control file, it gets information like redo pointer and current timeline id. The redo pointer is considered to be the start pointer from where the pg_waldump starts reading wal data until end-of-wal to find the last LSN. For details please check the attached patch. Please note that for compressed and .partial wal files this doesn't work. -- With Regards, Ashutosh Sharma.
Вложения
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Bharath Rupireddy
Дата:
On Thu, Sep 22, 2022 at 10:25 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > PFA that enhances pg_waldump to show the latest LSN and the > corresponding WAL file when the -l or --lastLSN option is passed an > argument to pg_waldump. Below is an example: Thanks for the patch. I have some quick thoughts about it. > When the user passes the '-l' command line option along with the data > directory path to pg_waldump, it reads the control file from the data > directory. I don't think we need a new option for data directory -D. pg_waldump's option 'p' can be used, please see the comments around identify_target_directory(). > From the control file, it gets information like redo > pointer and current timeline id. Is there any reason for not using get_control_file() from src/common/controldata_utils.c, but defining the exact same function in pg_waldump.c? > The redo pointer is considered to be > the start pointer from where the pg_waldump starts reading wal data > until end-of-wal to find the last LSN. For details please check the > attached patch. Making it dependent on the controlfile limits the usability of this feature. Imagine, using this feature on an archive location or pg_receivewal target directory where there are WAL files but no controlfile. I think we can choose the appropriate combinations of existing pg_waldump options, for instance, let users enter the start WAL segment via startseg and/or start LSN via --start and the new option for end WAL segment and end LSN. > Please note that for compressed and .partial wal files this doesn't work. Looking forward to the above capability because it expands the usability of this feature. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Ashutosh Sharma
Дата:
On Fri, Sep 23, 2022 at 6:05 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Sep 22, 2022 at 10:25 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > PFA that enhances pg_waldump to show the latest LSN and the > > corresponding WAL file when the -l or --lastLSN option is passed an > > argument to pg_waldump. Below is an example: > > Thanks for the patch. I have some quick thoughts about it. > > > When the user passes the '-l' command line option along with the data > > directory path to pg_waldump, it reads the control file from the data > > directory. > > I don't think we need a new option for data directory -D. pg_waldump's > option 'p' can be used, please see the comments around > identify_target_directory(). > -p is the path to the WAL directory. It doesn't necessarily have to be a data directory, however the user can specify the data directory path here as well using which the path to the WAL directory can be recognized, but as I said it doesn't mean -p will always represent the data directory. > > From the control file, it gets information like redo > > pointer and current timeline id. > > Is there any reason for not using get_control_file() from > src/common/controldata_utils.c, but defining the exact same function > in pg_waldump.c? > Will give it a thought on it later. If possible, will try to reuse it. > > The redo pointer is considered to be > > the start pointer from where the pg_waldump starts reading wal data > > until end-of-wal to find the last LSN. For details please check the > > attached patch. > > Making it dependent on the controlfile limits the usability of this > feature. Imagine, using this feature on an archive location or > pg_receivewal target directory where there are WAL files but no > controlfile. I think we can choose the appropriate combinations of > existing pg_waldump options, for instance, let users enter the start > WAL segment via startseg and/or start LSN via --start and the new > option for end WAL segment and end LSN. > I have written this patch assuming that the end user is not aware of any LSN or any other WAL data and wants to know the last LSN. So all he can do is take the help of the control data to find the redo LSN and use that as a reference point (start pointer) to find the last LSN. And whatever is the WAL directory (be it archive location or wall collected via pg_receivewal or pg_wal directory), we will consider the redo pointer as the start pointer. Now, it's possible that the WAL corresponding to the start pointer is not at all available in the WAL directory like archive location or pg_receivewal directory in which this cannot be used, but this is very unlikely. > > Please note that for compressed and .partial wal files this doesn't work. > > Looking forward to the above capability because it expands the > usability of this feature. > This is a different task altogether. We will probably need to work on it separately. -- With Regards, Ashutosh Sharma.
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Ashutosh Sharma
Дата:
PFA v2 patch. Changes in the v2 patch: - Reuse the existing get_controlfile function in src/common/controldata_utils.c instead of adding a new one. - Set env variable PGDATA with the data directory specified by the user. -- With Regards, Ashutosh Sharma.
Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions
От
Ashutosh Sharma
Дата:
On Fri, Sep 23, 2022 at 12:24 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > PFA v2 patch. > > Changes in the v2 patch: > > - Reuse the existing get_controlfile function in > src/common/controldata_utils.c instead of adding a new one. > > - Set env variable PGDATA with the data directory specified by the user. > Forgot to attach the patch with above changes. Here it is. -- With Regards, Ashutosh Sharma.