Dear all,
I am a newcomer to postgres and love it so far. I've given this problem a lot of thought already, RTFM to the best of my ability, but hit a dead end, so I need a nudge in the right direction.
I'm designing a database where each entity of interest has a "rowversion" column that gets assigned a value from a global sequence. So, in the simplest scenario, if I have two rows in table "emps": emp1 with rowversion@3 and emp2 with rowversion@5, then I know emp2 was modified after emp1.
This is to form the foundation of a data sync scenario, where a client knows they have everything up until @3 and now they need the latest updates (SELECT * FROM emps WHERE rowversion>3 and rowversion<=new_anchor). The problem here is, there's no way to interrogate the database for the "latest committed @rowversion that has none pending before it". An example scenario:
@3 - committed
@4 - committed
@5 - committed
@6 - in progress - not committed yet
@7 - in progress - not committed yet
@8 - committed
@9 - committed
When client asks for updated records, we ask the database for an appropriate new_anchor. Since the rows with rowversion @6 and @7 are still in progress, new_anchor has to be @5, so that our range query doesn't miss any uncommitted updates. Now the client can be confident it has everything up until @5.
So the actual problem distilled: how can this new_anchor be safely determined each time?
As you can probably tell I've borrowed this idea from SQL Server, where this problem is trivially solved by the min_active_rowversion() function. This function would return @6 in the above scenario, so your new_anchor is always going to be "min_active_rowversion()-1". I sort of had an idea how this could be implemented in postgres using an "active_rowversions" table, and a "SELECT min(id) FROM active_rowversions" but that would require READ UNCOMMITTED isolation, which is not available in postgres.
I would really appreciate any help or ideas.
Best regards,
Nick