Re: Make unlogged table resets detectable
От | Jeff Davis |
---|---|
Тема | Re: Make unlogged table resets detectable |
Дата | |
Msg-id | 1811a9765adeb2e5aef0c094366e167994fbea16.camel@j-davis.com обсуждение исходный текст |
Ответ на | Re: Make unlogged table resets detectable (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Make unlogged table resets detectable
|
Список | pgsql-hackers |
On Tue, 2021-06-08 at 12:52 -0400, Tom Lane wrote: > Yeah. If there are actually use-cases for knowing both things, then > we ought to record both. However, it's not real clear to me why > LSN would be interesting. Let me expand on my use case: in a sharded environment, how do you figure out if you need to repopulate an UNLOGGED table? For a single node, there's not much risk, because you either have the data or you don't. But in a sharded environment, if one node crashes, you might end up with some shards empty and others populated, and that's inconsistent. If Postgres provides a way to figure out when the last crash happened, then that would give the sharding solution the basic information it needs to figure out if it needs to clear and repopulate the entire unlogged table (i.e. all its shards on all nodes). Clearly, the sharding solution would need to do some tracking of its own, like recording when the last TRUNCATE happened, to figure out what to do. For that tracking, I think using the LSN makes more sense than a timestamp. > (If there is a reason to log LSN, maybe the argument is different > for that? Although I'd think that looking at the last checkpoint > REDO location is sufficient for figuring out where the current > crash recovery attempt started.) I came to a similar conclusion for my use case: tracking the LSN at the end of the recovery makes more sense. I attached a patch to track last recovery LSN, time, and total count. But there are a few issues: 1. Do we want a way to reset the counter? If so, should it be done with pg_resetwal or a superuser SQL function? 2. It would be helpful to also know the last time a promotion happened, for the same reason (e.g. a failover of a single node leading to an unlogged table with some empty shards and some populated ones). Should also store the last promotion LSN and time as well? Does "promotion count" make sense, and should we track that, too? 3. Should we try to track crash information across promotions, or just start them at the initial values when promoted? Regards, Jeff Davis
Вложения
В списке pgsql-hackers по дате отправления: