Discussion: MVCC and all that...
Reading this article:

https://firebirdsql.org/migrating-from-firebird-to-postgresql-what-can-go-wrong-

I'm a bit confused (not the first time...)

In part 1. Differences in MVCC implementation - he's saying that "It's not that the PostgreSQL implementation of MVCC is bad — it's just fundamentally different"

But, I thought Firebird had SWEEP instead of VACUUM (how much closer can the words be?) and that FB's implementation of MVCC was essentially similar. Is this correct? Then why is he pointing out the performance diffs because of MVCC? I'm puzzled!

--

El!
> On Sep 9, 2025, at 10:27 AM, Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
>
> Reading this article
> https://firebirdsql.org/migrating-from-firebird-to-postgresql-what-can-go-wrong-
> I'm a bit confused (not the first time...)
>
> In part 1. Differences in MVCC implementation - he's saying that "It's
> not that the PostgreSQL implementation of MVCC is bad — it's just
> fundamentally different"
>
> But, I thought Firebird had SWEEP instead of VACUUM (how much closer
> can the words be?) and that FB's implementation of MVCC was
> essentially similar.
> Is this correct? Then why is he pointing out the performance diffs
> because of MVCC? I'm puzzled!
>
> --
>
> El!

It is written by someone @firebirdsql.org so one assumes a few grains of salt necessary.
On Tue, Sep 9, 2025 at 10:27 AM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
Reading this article
https://firebirdsql.org/migrating-from-firebird-to-postgresql-what-can-go-wrong-
I'm a bit confused (not the first time...)
In part 1. Differences in MVCC implementation - he's saying that "It’s
not that the PostgreSQL implementation of MVCC is bad — it’s just
fundamentally different"
But, I thought Firebird had SWEEP instead of VACUUM (how much closer
can the words be?) and that FB's implementation of MVCC was
essentially similar.
Is this correct? Then why is he pointing out the performance diffs
because of MVCC? I'm puzzled!
Note: your link is wrong, corrected here:
What the article is driving at is that postgres does not use rollback logs to handle updated records in the MVCC implementation. There are absolutely performance tradeoffs in that decision and, if you do a lot of development against postgresql, those tradeoffs should influence how you design databases. The author then cherry picked the 'worst case' case, large unconstrained updates.
The article is a bit of a cheesy dig on postgres. Another example is the complaint about autonomous transactions with another cherry picked example to make postgres look bad. In the real world, these would not matter much, and can be worked around (if you want to see my take on how to deal with it, see here: https://github.com/leaselock/pgasync).
merlin
> > In part 1. Differences in MVCC implementation - he's saying that "It's
> > not that the PostgreSQL implementation of MVCC is bad — it's just
> > fundamentally different"
>
> It is written by someone @firebirdsql.org so one assumes a few grains of salt necessary.

I know - but the guy does stress that he's not knocking PostgreSQL, just that there are differences.

However, it *_was_* my understanding that MVCC was implemented similarly in PostgreSQL and Firebird - PG has VACUUM and FB has SWEEP. Why would FB need SWEEP if it didn't have to clear up after transactions - a problem that apparently doesn't affect Oracle/MySQL? Oracle and MySQL (InnoDB) implement a different model (as does OrioleDB, IIUC) where there are UNDO/REDO logs.

So, my question is: Is FB's MVCC implementation fundamentally different from that of PG, or have I mixed things up?

Thanks for your input.

--

El!
> Note: your link is wrong, corrected here:

Extra hyphen - sorry about that, and thanks for pointing it out!

> What the article is driving at is that postgres does not use rollback logs to handle updated records in the MVCC implementation. There are absolutely performance tradeoffs in that decision and, if you do a lot of development against postgresql, those tradeoffs should influence how you design databases. The author then cherry picked the 'worst case' case, large unconstrained updates.

Hmm... I was wondering about that - even though he stressed that there was (paraphrasing) no right or wrong - just different design decisions!

> The article is a bit of a cheesy dig on postgres. Another example is the complaint about autonomous transactions with another cherry picked example to make postgres look bad. In the real world, these would not matter much, and can be worked around (if you want to see my take on how to deal with it, see here: https://github.com/leaselock/pgasync).

OK - so, I was wrong in my original assumption that somehow (and it wasn't simply because of the phraseology - sweep vs vacuum) I thought that PG and FB had a similar MVCC implementation vs. Oracle and MySQL (InnoDB) (and OrioleDB). I'll do a deep dive into their docco and see what they actually do! I'm actually very interested in the benchmarking side of database technology - but I do know the old adage - there are lies, damned lies, statistics and *_then_* there are database benchmarks (as seen with the link I posted!).

Thanks for your input.

Best regards,

El!
On Tue, Sep 9, 2025 at 11:57 AM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
> Note: your link is wrong, corrected here:
Extra hyphen - sorry about that, and thanks for pointing it out!
> What the article is driving at is that postgres does not use rollback logs to handle updated records in the MVCC implementation. There are absolutely performance tradeoffs in that decision and, if you do a lot of development against postgresql, those tradeoffs should influence how you design databases. The author then cherry picked the 'worst case' case, large unconstrained updates.
Hmm... I was wondering about that - even though he stressed that there
was (paraphrasing) no right or wrong - just different design
decisions!
> The article is a bit of a cheesy dig on postgres. Another example is the complaint about autonomous transactions with another cherry picked example to make postgres look bad. In the real world, these would not matter much, and can be worked around (if you want to see my take on how to deal with it, see here: https://github.com/leaselock/pgasync).
OK - so, I was wrong in my original assumption that somehow (and it
wasn't simply because of the phraseology - sweep vs vacuum) I thought
that PG and FB had a similar MVCC implementation vs. Oracle and MySQL
(InnoDB) (and OrioleDB). I'll do a deep dive into their docco and see
what they actually do! I'm actually very interested in the
benchmarking side of database technology - but I do know the old adage
- there are lies, damned lies, statistics and *_then_* there are
database benchmarks (as seen with the link I posted!).
Sure. I think you'll find that the postgres approach to MVCC is somewhat unusual relative to other players in this space, which typically write 'old' records or changes to a rollback log; if the transaction commits, the log is discarded, but if it rolls back, the rollback log is written back to the heap. This makes rollbacks potentially very painful. I suspect the postgres approach can also do better in cases of highly contended records, but that's just a guess.
For small (one or a small number of records) updates, the approach doesn't make a whole lot of difference especially if you are aware of and exploit HOT. For very large updates however, it absolutely does, and one might try to avoid them using various strategies. I very much appreciate fast rollbacks though.
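As a quick aside, whether HOT is actually kicking in is visible in the cumulative statistics; a minimal sketch (the table name is a stand-in):

    -- Share of updates that took the HOT path (no index maintenance needed).
    SELECT relname,
           n_tup_upd,
           n_tup_hot_upd,
           round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_pct
    FROM pg_stat_user_tables
    WHERE relname = 'mytable';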
merlin
I read through the article, it's click bait/flame war just waiting to happen.
Article is a list of cherry picked PG drawbacks that can be mitigated or worked around.
On the bulk updating. I'm shaking my finger at anyone that locks up 25% of a table with an update or delete. That is asking for problems in a production database with a high TPS rate.
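If such an update truly cannot be avoided, the usual mitigation is to batch it so each slice commits separately, keeping locks short-lived and letting vacuum keep up in between; a rough sketch with invented table/column names (PG 11+, since it relies on COMMIT inside a procedure):

    CREATE PROCEDURE archive_in_batches(batch_size int DEFAULT 10000)
    LANGUAGE plpgsql AS $$
    DECLARE
      rows_done bigint;
    BEGIN
      LOOP
        UPDATE big_table
           SET status = 'archived'
         WHERE id IN (SELECT id FROM big_table
                       WHERE status = 'active'
                       LIMIT batch_size);
        GET DIAGNOSTICS rows_done = ROW_COUNT;
        EXIT WHEN rows_done = 0;
        COMMIT;  -- release locks; dead tuples become reclaimable per batch
      END LOOP;
    END $$;

    CALL archive_in_batches();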
The author brings up threaded vs multi-process. That's an old old old old old conversation that has been shown there is no clear better way.
Number of open connections: so Firebird can do 1000 open sessions with a smaller memory footprint, but you still cannot have 1000 simultaneously running sessions unless you have 1000 CPUs. Where is the win here? We should be managing resources better on the application side, not opening thousands of connections that sit idle doing nothing.
On autonomous transactions: we now have procedures that allow transactions inside of transactions that can be committed and rolled back; that has been around for several years now.
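A minimal sketch of that feature, assuming a plain table t (PG 11+):

    CREATE PROCEDURE two_commits_demo()
    LANGUAGE plpgsql AS $$
    BEGIN
      INSERT INTO t VALUES (1);
      COMMIT;    -- the first row is durable no matter what happens next
      INSERT INTO t VALUES (2);
      ROLLBACK;  -- only the second row is discarded
    END $$;

    CALL two_commits_demo();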
The backup argument is cherry-picking, not discussing pgBackRest and other solutions, or the use of tablespaces to isolate databases in a cluster at the disk layer, or disk snapshots.
"PostgreSQL has a relatively simple, but fast query planning algorithm" Compared to what.... What feature is PG missing these days... the only thing I know it can't do is change the plan in the middle of the execution stage. Which is not a query planner thing but the execution layer saying to itself I am taking too long maybe go back to the planning stage... Query Hints that have been discussed endlessly. Adding hints adds its own problems and has become a big mess for databases that support it.
Multiple transactions per connection: I am asking WHY that is a feature. When one can have multiple sessions, what is the difference? Running multiple transactions in a single session or in multiple sessions means moving part of the transaction logic into the application space. What do we gain here?
No application packaging: this is an Oracle thing that Firebird has duplicated at some level; we can simulate it with namespaces/schemas.
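A minimal sketch of that simulation (schema and function names invented):

    CREATE SCHEMA billing;  -- the "package"

    CREATE FUNCTION billing.net_total(gross numeric, vat_rate numeric)
    RETURNS numeric
    LANGUAGE sql
    AS $$ SELECT gross / (1 + vat_rate) $$;

    SET search_path = billing, public;  -- "import" the package
    SELECT net_total(121, 0.21);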
I can keep going on here.
There are legitimate points here:
Compression,
not being able to return partial result sets from functions
XID being 32 bit
anonymous functions in PG have several limitations, not just input arguments (not sure I see the need for that)
Temporary tables are a pain and cause issues for big databases
The article is unfair in many places.
On Tue, Sep 9, 2025 at 6:55 PM Merlin Moncure <mmoncure@gmail.com> wrote:
On Tue, Sep 9, 2025 at 11:57 AM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:

> Note: your link is wrong, corrected here:
Extra hyphen - sorry about that, and thanks for pointing it out!
> What the article is driving at is that postgres does not use rollback logs to handle updated records in the MVCC implementation. There are absolutely performance tradeoffs in that decision and, if you do a lot of development against postgresql, those tradeoffs should influence how you design databases. The author then cherry picked the 'worst case' case, large unconstrained updates.
Hmm... I was wondering about that - even though he stressed that there
was (paraphrasing) no right or wrong - just different design
decisions!
> The article is a bit of a cheesy dig on postgres. Another example is the complaint about autonomous transactions with another cherry picked example to make postgres look bad. In the real world, these would not matter much, and can be worked around (if you want to see my take on how to deal with it, see here: https://github.com/leaselock/pgasync).
OK - so, I was wrong in my original assumption that somehow (and it
wasn't simply because of the phraseology - sweep vs vacuum) I thought
that PG and FB had a similar MVCC implementation vs. Oracle and MySQL
(InnoDB) (and OrioleDB). I'll do a deep dive into their docco and see
what they actually do! I'm actually very interested in the
benchmarking side of database technology - but I do know the old adage
- there are lies, damned lies, statistics and *_then_* there are
database benchmarks (as seen with the link I posted!).

Sure. I think you'll find that the postgres approach to MVCC is somewhat unusual relative to other players in this space, which typically write 'old' records or changes to a rollback log; if the transaction commits, the log is discarded, but if it rolls back, the rollback log is written back to the heap. This makes rollbacks potentially very painful. I suspect the postgres approach can also do better in cases of highly contended records, but that's just a guess.

For small (one or a small number of records) updates, the approach doesn't make a whole lot of difference especially if you are aware of and exploit HOT. For very large updates however, it absolutely does, and one might try to avoid them using various strategies. I very much appreciate fast rollbacks though.

merlin
On Tue, Sep 9, 2025 at 8:41 PM Justin <zzzzz.graf@gmail.com> wrote:
I read through the article, it's click bait/flame war just waiting to happen.
Article is a list of cherry picked PG drawbacks that can be mitigated or worked around.
On the bulk updating. I'm shaking my finger at anyone that locks up 25% of a table with an update or delete. That is asking for problems in a production database with a high TPS rate.
The author brings up threaded vs multi-process. That's an old old old old old conversation that has been shown there is no clear better way.

Number of open connections: so Firebird can do 1000 open sessions with a smaller memory footprint, but you still cannot have 1000 simultaneously running sessions unless you have 1000 CPUs. Where is the win here? We should be managing resources better on the application side, not opening thousands of connections that sit idle doing nothing.
On autonomous transactions: we now have procedures that allow transactions inside of transactions that can be committed and rolled back; that has been around for several years now.
The backup argument is cherry-picking, not discussing pgBackRest and other solutions, or the use of tablespaces to isolate databases in a cluster at the disk layer, or disk snapshots.

"PostgreSQL has a relatively simple, but fast query planning algorithm" - compared to what? What feature is PG missing these days? The only thing I know it can't do is change the plan in the middle of the execution stage, which is not a query planner thing but the execution layer saying to itself "I am taking too long, maybe go back to the planning stage"... Query hints have been discussed endlessly. Adding hints adds its own problems and has become a big mess for databases that support it.
Multiple transactions per connection: I am asking WHY that is a feature. When one can have multiple sessions, what is the difference? Running multiple transactions in a single session or in multiple sessions means moving part of the transaction logic into the application space. What do we gain here?

No application packaging: this is an Oracle thing that Firebird has duplicated at some level; we can simulate it with namespaces/schemas.
I can keep going on here.
There are legitimate points here
Compression,
not being able to return partial result sets from functions
XID being 32 bit
Would converting them to 64 bits require changing the on-disk structure of database files?
anonymous functions in PG have several limitations, not just input arguments (not sure I see the need for that)
Aren't transience and "ad hockery" the whole point of anonymous procedures? Thus, I don't see the point of passing them parameters, either.
(When I do need something similar, I build the DO block as a bash string variable with environment variables as "parameters", and then execute it via psql -c "$sql". More like a template, TBH. It's great for purging old data from tables, since I can bypass records whose DELETE statements fail due to a FK constraint.)
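Roughly like this, with made-up table/column/cutoff names:

    #!/bin/bash
    # DO block built as a template; shell variables act as "parameters".
    TABLE="audit_log"
    CUTOFF="90 days"

    sql="
    DO \$\$
    DECLARE
      r record;
    BEGIN
      FOR r IN SELECT id FROM ${TABLE}
               WHERE created < now() - interval '${CUTOFF}' LOOP
        BEGIN
          DELETE FROM ${TABLE} WHERE id = r.id;
        EXCEPTION WHEN foreign_key_violation THEN
          NULL;  -- skip rows that are still referenced
        END;
      END LOOP;
    END
    \$\$;
    "
    psql -X -c "$sql"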
Temporary tables are a pain and cause issues for big databases
I'd like to see GLOBAL TEMPORARY tables. Each connection gets its own private copy of the table, so that applications don't need to carry around CREATE TEMPORARY TABLE code with them.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Tue, Sep 9, 2025 at 9:12 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Tue, Sep 9, 2025 at 8:41 PM Justin <zzzzz.graf@gmail.com> wrote:
XID being 32 bit

Would converting them to 64 bits require changing the on-disk structure of database files?
Yes, this is one of the reasons 64-bit XIDs have not been adopted yet. pg_upgrade is not an option, as the database must be reloaded.
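In the meantime, wraparound pressure is easy to keep an eye on:

    -- Age is measured in transactions; freezing must happen well before ~2 billion.
    SELECT datname, age(datfrozenxid) AS xid_age
    FROM pg_database
    ORDER BY xid_age DESC;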
anonymous functions in PG have several limitations, not just input arguments (not sure I see the need for that)

Aren't transience and "ad hockery" the whole point of anonymous procedures? Thus, I don't see the point of passing them parameters, either.
I agree. It would be nice to be able to return a result. I cheat using RAISE NOTICE to get something back.
Temporary tables are a pain and cause issues for big databases

I'd like to see GLOBAL TEMPORARY tables. Each connection gets its own private copy of the table, so that applications don't need to carry around CREATE TEMPORARY TABLE code with them.
The temporary tables implementation is not optimal. In most cases I am able to remove temp tables with better SQL that does not require intermediate temp tables to hold result sets used later in the logic. The SQL is harder to write and understand, though.
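As an illustration of that kind of rewrite (schema invented), an intermediate temp table often folds into a CTE:

    -- Instead of CREATE TEMP TABLE recent AS SELECT ... and joining to it later,
    -- keep it in one statement:
    WITH recent AS (
      SELECT customer_id, sum(total) AS spent
      FROM orders
      WHERE ordered_at > now() - interval '30 days'
      GROUP BY customer_id
    )
    SELECT c.name, r.spent
    FROM customers c
    JOIN recent r ON r.customer_id = c.id;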
Thanks
Justin
Hi, and thanks for your input,

Just before I reply - if you (at least here in Ireland - Google's answers vary per location, unlike Duckduckgo's) search for "firebird mvcc mechanism", the "AI assistant" tells me twice that FB's MVCC implementation is "like PostgreSQL's"... I'll investigate further and report back. Egor Rogov's book looks like a good place to start!

> I read through the article, it's click bait/flame war just waiting to happen.
> Article is a list of cherry picked PG drawbacks that can be mitigated or worked around.

Pity - I took the guy at his word when he said that PostgreSQL was just different, not better or worse.

> On the bulk updating. I'm shaking my finger at anyone that locks up 25% of a table with an update or delete. That is asking for problems in a production database with a high TPS rate.

OK - I'm going to run the benchmarks myself and see what happens - but I'm sure he didn't pick that test for nothing - come to think of it, the table structure is bizarre!

> The author brings up threaded vs multi-process. That's an old old old old old conversation that has been shown there is no clear better way.

This is where things become interesting. Firebird actually has 3 process/threading models - and they manage to maintain these with a team that is *_much_* smaller than the PostgreSQL one - FB is a minnow compared to PG!

AIUI, Michael Stonebraker suggested that the process model would/should be "upgraded" to a threaded one at some point in the system's development?

> Number of open connections: so Firebird can do 1000 open sessions with a smaller memory footprint, but you still cannot have 1000 simultaneously running sessions unless you have 1000 CPUs. Where is the win here? We should be managing resources better on the application side, not opening thousands of connections that sit idle doing nothing.

Agreed on that point.

> On autonomous transactions: we now have procedures that allow transactions inside of transactions that can be committed and rolled back; that has been around for several years now.

OK.

> The backup argument is cherry-picking, not discussing pgBackRest and other solutions, or the use of tablespaces to isolate databases in a cluster at the disk layer, or disk snapshots.

OK again. I'm just wondering if the single file per database isn't a fundamental architectural flaw in itself? AIUI, you could have multiple files (back in 32-bit land) "chained" - but (again AIUI) the same table could be spread over x files - all "intermingled"... weird.

> "PostgreSQL has a relatively simple, but fast query planning algorithm" - compared to what? What feature is PG missing these days? The only thing I know it can't do is change the plan in the middle of the execution stage, which is not a query planner thing but the execution layer saying to itself "I am taking too long, maybe go back to the planning stage"... Query hints have been discussed endlessly. Adding hints adds its own problems and has become a big mess for databases that support it.

I know - personally, I'm in favour of the PostgreSQL approach - rather than improve the hints, improve the planner! Plus, if you really want to, you can go here: https://www.postgresql.org/docs/current/runtime-config-query.html and, for example, SET enable_seqscan = OFF; Plus, there are extensions which allow one to provide hints - I did think this was a bit of a whopper alright!
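For the record, the knob-twiddling I mean looks like this (the orders table is just an illustration):

    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

    SET enable_seqscan = OFF;  -- planner now heavily penalizes sequential scans
    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

    RESET enable_seqscan;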
> Multiple transactions per connection: I am asking WHY that is a feature. When one can have multiple sessions, what is the difference? Running multiple transactions in a single session or in multiple sessions means moving part of the transaction logic into the application space. What do we gain here?

No idea - I'll take your word for it!

> No application packaging: this is an Oracle thing that Firebird has duplicated at some level; we can simulate it with namespaces/schemas.

Again, I'm not too sure of my ground here - but I do know that Oracle (and SQL Server) are ahead in this domain.

> There are legitimate points here
> Compression,
> not being able to return partial result sets from functions
> XID being 32 bit

There's a lot of talk about 64-bit ones - FB has 48-bit ones, AIUI - that could kick the can down the road for PostgreSQL at the price of 2 bytes per record - is it worth it to alleviate the difficulties associated with VACUUM-ing?

> anonymous functions in PG have several limitations, not just input arguments (not sure I see the need for that)
> Temporary tables are a pain and cause issues for big databases
> The article is unfair in many places.

Accepted now - thanks for your input.

--

El!
On 9/10/25 00:41, Ellen Allhatatlan wrote:

> Hi, and thanks for your input,
>
>> The author brings up threaded vs multi-process. That's an old old old old old conversation that has been shown there is no clear better way.
>
> This is where things become interesting. Firebird actually has 3
> process/threading models - and they manage to maintain these with a
> team that is *_much_* smaller than the PostgreSQL one - FB is a minnow
> compared to PG!

Though I would like to know what happened in mid 2010?:

https://github.com/FirebirdSQL/firebird/graphs/contributors

>> Backup argument is cherry picking and not discussing pgBackrest and other solutions or the use of tablespaces to isolate databases in a cluster at the disk layer or disk snapshots.
>
> OK again. I'm just wondering if the single file per database isn't a
> fundamental architectural flaw in itself? AIUI, you could have
> multiple files (back in 32-bit land) "chained" - but (again AIUI) the
> same table could be spread over x files - all "intermingled"... weird.

What are you referring to above?

At any rate, from the article:

"pg_basebackup always dumps the entire cluster, when restoring one database, your entire PostgreSQL cluster will be restored, which will contain only one database. The files of all your other databases will be zero size."

AFAIK there is no option to restore one database with pg_basebackup. I think the author is referring to pgBackRest:

https://pgbackrest.org/user-guide.html#restore/option-db-include

"The test1 database, despite successful recovery, is not accessible. This is because the entire database was restored as sparse, zeroed files."

--
Adrian Klaver
adrian.klaver@aklaver.com
> Though I would like to know what happened in mid 2010?:
> https://github.com/FirebirdSQL/firebird/graphs/contributors

Yes, indeed, WTF? I'm not a member of the FB Illuminati - so I can't say!

> > OK again. I'm just wondering if the single file per database isn't a
> > fundamental architectural flaw in itself? AIUI, you could have
> > multiple files (back in 32-bit land) "chained" - but (again AIUI) the
> > same table could be spread over x files - all "intermingled"... weird.
>
> What are you referring to above?

I'm sorry - the single file flaw I was referring to occurs in FB and has nothing to do with PG. FB dbs are single files - or were - 32 bit - up to 2GB, and then there was another file. I don't know what happens for 64 bit - (note to self - find out)!

So, you have table X - it has 2M rows (say, 0.5 GB) in the first file (along with all the other tables). The 2GB limit is hit, more data is added. 0.7 GB is added to table X - these records go into a new database file - the table is split in two - you have 2 "extents" of 2GB with X split 0.5 - in extent1, 0.7 in extent2. All mixed up with other tables as well!

That was the architectural flaw to which I was referring. Nothing to do with PG, backups or anything like that - again, apologies for any confusion - my phraseology wasn't the best! And I should have put what I wrote elsewhere anyway!

--

El!
On Wed, Sep 10, 2025 at 11:08 AM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
[snip]
So, you have table X - it has 2M rows (say, 0.5 GB) in the first file
(along with all the other tables). The 2GB limit is hit, more data is
added. 0.7 GB is added to table X - these records go into a new
database file - the table is split in two - you have 2 "extents" of
2GB with X split 0.5 - in extent1, 0.7 in extent2. All mixed up with
other tables as well!
That was the architectural flaw to which I was referring. Nothing to
do with PG
You're gonna be in for a very rude surprise after adding 1GB of data to a PG table...
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On 9/10/25 08:11, Ron Johnson wrote:

> On Wed, Sep 10, 2025 at 11:08 AM Ellen Allhatatlan
> <ellenallhatatlan@gmail.com> wrote:
> [snip]
>
> So, you have table X - it has 2M rows (say, 0.5 GB) in the first file
> (along with all the other tables). The 2GB limit is hit, more data is
> added. 0.7 GB is added to table X - these records go into a new
> database file - the table is split in two - you have 2 "extents" of
> 2GB with X split 0.5 - in extent1, 0.7 in extent2. All mixed up with
> other tables as well!
>
> That was the architectural flaw to which I was referring. Nothing to
> do with PG
>
> You're gonna be in for a very rude surprise after adding 1GB of data to
> a PG table...

Yes, but the table is split, not the database as a whole, per:

https://www.postgresql.org/docs/current/storage-file-layout.html

--
Adrian Klaver
adrian.klaver@aklaver.com
On Wed, Sep 10, 2025 at 3:41 AM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
Hi, and thanks for your input,
Just before I reply - if you (at least here in Ireland - Google's
answers vary per location, unlike Duckduckgo's) search for "firebird
mvcc mechanism" the "AI assistant" tells me twice that FB's MVCC
implementation is "like PostgreSQL's"... I'll investigate further and
report back. Egor Rogov's book looks like a good place to start!
> I read through the article, it's click bait/flame war just waiting to happen.
> Article is a list of cherry picked PG drawbacks that can be mitigated or worked around.
Pity - I took the guy at his word when he said that PostgreSQL was
just different, not better or worse.
> On the bulk updating. I'm shaking my finger at anyone that locks up 25% of a table with an update or delete. That is asking for problems in a production database with a high TPS rate.
OK - I'm going to run the benchmarks myself and see what happens - but
I"m sure he didn't pick that test for nothing - come to think of it,
the table stable structure is bizarre!
My point here is not to criticize the article as being wrong - PG will bloat the table and be slower than Firebird. My criticism is that this argument/example is not reflective of the real world. What DBA/programmer would not be in trouble locking 25% of a table for a prolonged period of time doing an update? This approach would all but guarantee an outage or other issues across the database. The argument is bogus and very much against best practices: updating large amounts of data in a single transaction creates blockers for other sessions.
> The author brings up threaded vs multi-process. That's an old old old old old conversation that has been shown there is no clear better way.
This is where things become interesting. Firebird actually has 3
process/threading models - and they manage to maintain these with a
team that is *_much_* smaller than the PostgreSQL one - FB is a minnow
compared to PG!
AIUI, Michael Stonebraker suggested that the process model
would/should be "upgraded" to a threaded one at some point in the
system's development?
I am going to need a source on this. Process vs Threads: pros and cons are very well documented and proven today.
> Backup argument is cherry picking and not discussing pgBackrest and other solutions or the use of tablespaces to isolate databases in a cluster at the disk layer or disk snapshots.
OK again. I'm just wondering if the single file per database isn't a
fundamental architectural flaw in itself? AIUI, you could have
multiple files (back in 32-bit land) "chained" - but (again AIUI) the
same table could be spread over x files - all "intermingled"... weird.
Single file approach vs multiple file approach: this boils down to preference; there is no clear plus or minus to either approach. The path to the actual data has to go through several maps.
Multiple files: disk internal structure => file system (ZFS, EXT, FAT, etc.) => table mapped to a file or group of files => data page mapped to data types and columns.

Single file: disk internal structure => file system (ZFS, EXT, FAT, etc.) => single file => mapping of tables to locations in the single file => data page mapped to data types and columns.
The single file has the advantage of removing IO context switching at the OS level; however, it moves the context switch to the database side, jumping around in the single file itself to find the necessary data. This does not reduce IO.
> "PostgreSQL has a relatively simple, but fast query planning algorithm" Compared to what.... What feature is PG missing these days... the only thing I know it can't do is change the plan in the middle of the execution stage. Which is not a query planner thing but the execution layer saying to itself I am taking too long maybe go back to the planning stage... Query Hints that have been discussed endlessly. Adding hints adds its own problems and has become a big mess for databases that support it.
My criticism here "The claim postgresql query planner is simple" is not backed up with actual examples and data. Looks at a specific case which does not even involve the query planner.
> Multiple transactions per connection. I am asking WHY is that a feature. when one can have multiple sessions, what is the difference? running multiple transactions in single or multiple sessions means moving part of transaction logic into the application space. What do we gain here.....
No idea - I'll take your word for it!
Making a logical argument here: the claim that this feature is a plus comes without any examples to back it up. I'm trying to show a counterexample: why would this be a plus when we can do the same thing, just differently? It ignores the obvious issue that moving the transaction logic into the app has big drawbacks.
Accepted now - thanks for your input.
--
El!
On Tue, Sep 9, 2025 at 8:41 PM Justin <zzzzz.graf@gmail.com> wrote:
[snip]
Multiple transactions per connection. I am asking WHY is that a feature. when one can have multiple sessions, what is the difference? running multiple transactions in single or multiple sessions means moving part of transaction logic into the application space. What do we gain here.....
If I interpret "Multiple transactions per connection" correctly, they're a great way to emulate WITH HOLD cursors.
For example:
1. In channel #1, open a Read Only cursor. (Yes, this was long ago.)
2. Fetch a row.
3. Switch to channel #2, begin R/W txn, modify the table, commit.
4. Switch back to channel #1
5. Go to step 2.
Not being an application developer anymore, I've never needed to use WITH HOLD cursors or wish for multiple channels in PG.
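(For reference, plain PG covers the held-cursor part of this directly nowadays; a minimal sketch with an invented table:)

    BEGIN;
    DECLARE c CURSOR WITH HOLD FOR
      SELECT id, payload FROM work_queue ORDER BY id;
    COMMIT;  -- WITH HOLD materializes the result, so the cursor outlives the txn

    FETCH NEXT FROM c;                                    -- step 2
    UPDATE work_queue SET processed = true WHERE id = 1;  -- step 3, its own txn
    FETCH NEXT FROM c;                                    -- step 5

    CLOSE c;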
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On 2025-Sep-10, Justin wrote:

> On Wed, Sep 10, 2025 at 3:41 AM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
>
> > > The author brings up threaded vs multi-process. That's an old old old
> > > old old conversation that has been shown there is no clear better way.
> >
> > This is where things become interesting. Firebird actually has 3
> > process/threading models - and they manage to maintain these with a
> > team that is *_much_* smaller than the PostgreSQL one - FB is a minnow
> > compared to PG!
> >
> > AIUI, Michael Stonebraker suggested that the process model
> > would/should be "upgraded" to a threaded one at some point in the
> > system's development?
>
> I am going to need a source on this. Process vs Threads: pros and cons are
> very well documented and proven today.

My recollection is that this is correct -- Stonebraker and team used a process model because it was the expedient thing to do. Decades later we're still using it, but there's an ongoing effort to implement a threaded model; there are patches already being committed for that. See https://wiki.postgresql.org/wiki/Multithreading

Heikki Linnakangas gave two talks on this effort, one at pgconf.eu 2023 titled "Multithreaded PostgreSQL?" and another at pgconf.eu 2024 titled "Multithreaded PostgreSQL!". You kinda see a progression there. I expect the next one should be "Multithreaded PostgreSQL!!1yksitoista!" or something.

--
Álvaro Herrera    Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program will work
forever" (Oliver Silfridge)
On Tue, Sep 9, 2025 at 7:11 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Tue, Sep 9, 2025 at 8:41 PM Justin <zzzzz.graf@gmail.com> wrote:
On autonomous transactions: we now have procedures that allow transactions inside of transactions that can be committed and rolled back; that has been around for several years now.
[snip]
anonymous functions in PG have several limitations, not just input arguments (not sure I see the need for that)

Aren't transience and "ad hockery" the whole point of anonymous procedures? Thus, I don't see the point of passing them parameters, either.

(When I do need something similar, I build the DO block as a bash string variable with environment variables as "parameters", and then execute it via psql -c "$sql". More like a template, TBH. It's great for purging old data from tables, since I can bypass records whose DELETE statements fail due to a FK constraint.)
IMO, you can't equate procedures to autonomous transactions. The only way to do AT in a procedure today is with dblink if you are limited to core/contrib. pg_background can do them, as can my library, pgasync. I would humbly state it's a very high quality library to deal with them in a general way :).
Autonomous transactions are basically threading at the SQL level. The classic use case for them is to emit log records while you're processing some bigger transaction, so that if/when rollback occurs you have some evidence of processing. There are many, many other use cases for them however if you are deep into backend programming.
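The dblink flavor of that logging pattern looks roughly like this (connection string and log table are invented):

    CREATE EXTENSION IF NOT EXISTS dblink;

    BEGIN;
    -- The INSERT runs (and commits) on its own connection, so the log row
    -- survives even though the surrounding transaction rolls back.
    SELECT dblink_exec('dbname=mydb',
        $q$INSERT INTO processing_log(msg, at) VALUES ('step 1 done', now())$q$);
    ROLLBACK;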
I will say that the stored procedure COMMIT feature made this library possible; before, it was not possible to have long-running processes in the database at the SQL level.
Personally, rather than having explicit syntax supporting AT (or at least, in addition to it), I would like to see the dblink library beefed up; we ought to have asynchronous connections, and a multi-connection dblink_poll() to avoid the dblink_is_busy polling loop.
Getting back to firebird, I suppose we ought not to treat the article author too harshly. There's nothing wrong with advocating for your solution, although a more balanced treatment might have been more compelling.
merlin
On Tue, Sep 09, 2025 at 08:41:02PM -0400, Justin wrote:

> The author brings up threaded vs multi-process. That's an old old old old
> old conversation that has been shown there is no clear better way.

This is relevant to the next part:

> Number of open connections. so firebird can do 1000 open sessions with a
> smaller memory footprint, still can not have 1000 simultaneous running
> sessions unless we have 1000 CPU's. Where is the win here?? We should be
> managing resources better on the application side, not opening thousands of
> connections that sit idle doing nothing.

When a service is written in such a way as to minimize the memory footprint of each request/client, then it can process more of them, assuming it's only memory-bound. Why? Because less memory per thing == less bandwidth use, and also less thrashing of caches and higher cache hit ratios.

Minimizing request/client state means not spreading any of it on the stack, thus not requiring a stack per client. This means not thread-per-client (green or otherwise) or process-per-client. It means essentially some flavor of continuation passing style (CPS). For a query plan executor that's really: the query plan, all the in-flight I/O requests, and all cached data still needed to continue processing the plan. If you have a Duff's device style / CPS style implementation, then nothing on the stack needs to be preserved while waiting for I/Os, and the state of the query plan is effectively minimized.

But for a database with storage I/O costs, the memory footprint doesn't matter quite so much, because most likely it will be I/O bound rather than CPU- or memory-bound.

> "PostgreSQL has a relatively simple, but fast query planning algorithm"
> Compared to what.... What feature is PG missing these days... the only
> thing I know it can't do is change the plan in the middle of the
> execution stage. Which is not a query planner thing but the execution
> layer saying to itself I am taking too long maybe go back to the planning
> stage... Query Hints that have been discussed endlessly. Adding hints
> adds its own problems and has become a big mess for databases that support
> it.

I would really like out-of-band hints. These would be hints not specified in the SQL itself but sent separately, and which address table sources or joins by name, like this:

    psql> SELECT .. FROM x x1 JOIN y y1 ON .. JOIN y y2 ON .. WHERE ..;
    ...> \hint y1 indexed by ..
    ...> \hint y2 indexed by ..
    ...> ;

> Multiple transactions per connection. I am asking WHY is that a feature.
> when one can have multiple sessions, what is the difference? running
> multiple transactions in single or multiple sessions means moving part of
> transaction logic into the application space. What do we gain here.....

I agree it's not really important. Moreover, interleaving multiple queries over one TCP connection will lead to having to manage how much bandwidth each query consumes so as not to drown out the others.

> No application packaging. This Oracle thing that firebird has duplicated
> at some level. we can simulate this with namespace/schemas.

And extensions.

> XID being 32 bit

This is a huge problem.

> Temporary tables are a pain and cause issues for big databases

Yes. PG badly needs GLOBAL TEMP.

Another thing that would be nice is if PG could have tables that are not heaps.

Nico
--
On Wed, Sep 10, 2025 at 5:28 PM Nico Williams <nico@cryptonector.com> wrote:
On Tue, Sep 09, 2025 at 08:41:02PM -0400, Justin wrote:
> The author brings up threaded vs multi-process. That's an old old old old
> old conversation that has been shown there is no clear better way.
This is relevant to the next part:
> Number of open connections. so firebird can do 1000 open sessions with a
> smaller memory footprint, still can not have 1000 simultaneous running
> sessions unless we have 1000 CPU's. Where is the win here?? We should be
> managing resources better on the application side, not opening thousands of
> connections that sit idle doing nothing.
When a service is written in such a way as to minimize the memory
footprint of each request/client then it can process more of them
assuming it's only memory-bound. Why? Because less memory per thing ==
less bandwidth use, and also less thrashing of caches and higher cache
hit ratios.
Minimizing request/client state means not spreading any of it on the
stack, thus not requiring a stack per-client. This means not
thread-per-client (green or otherwise) or process-per-client. It means
essentially some flavor of continuation passing style (CPS). For a
query plan executor that's really: the query plan, all the in-flight I/O
requests, all cached data still needed to continue processing the plan.
If you have a Duff's device style / CPS style implementation, then
nothing on the stack needs to be preserved while waiting for I/Os, and
the state of the query plan is effectively minimized.
But for a database with storage I/O costs the memory footprint doesn't
matter quite so much because most likely it will be I/O bound rather
than CPU- or memory-bound.
I am not following you here. Databases are going to be bound somewhere at some point: disk IO, network IO, memory, or CPU. Which one is causing the bottleneck just depends on the workload and the size of the database.
The number of idle sessions does not really matter; it is just wasting resources across the entire application stack.
> "PostgreSQL has a relatively simple, but fast query planning algorithm"
> Compared to what.... What feature is PG missing these days... the only
> thing I know it can't do is change the plan in the middle of the
> execution stage. Which is not a query planner thing but the execution
> layer saying to itself I am taking too long maybe go back to the planning
> stage... Query Hints that have been discussed endlessly. Adding hints
> adds its own problems and has become a big mess for databases that support
> it.
I would really like out-of-band hints. These would be hints not
specified in the SQL itself but to be sent separately and which address
table sources or joins by name, like this:
psql> SELECT .. FROM x x1 JOIN y y1 ON .. JOIN y y2 ON .. WHERE ..;
...> \hint y1 indexed by ..
...> \hint y2 indexed by ..
...> ;
I humbly disagree; the point of SQL being a 4th-generation language is that I tell it what I want, not how to go get what I want.
Thank you,
Justin
On Wed, Sep 10, 2025 at 06:20:18PM -0400, Justin wrote:

> I am not following you here, Databases are going to be bound somewhere at
> some point, Disk,IO, Network IO, Memory, or CPU bound. Which one is
> causing the bottle neck just depends on the workload and size of the
> database.
>
> The number of idle sessions does not really matter it is just
> wasting resources across the entire application stack.

Idle sessions do cost something, but I agree it's not much.

The point is that if you've got a thread- or process-per-client/request architecture, then you can easily make the per-client memory footprint (and cache pressure) of the service much higher than if per-client/request state was minimized (by not spreading it over the stack). If you do that you might go from I/O-bound to memory-bound, though frankly I doubt it. Instead the likely cost of PG's architecture is just that you need more memory per (active) client, which is probably a fine price to pay considering how featureful PG is and how active the PG community is.

> > I would really like out-of-band hints. These would be hints not
> > specified in the SQL itself but to be sent separately and which address
> > table sources or joins by name, like this:
> >
> > psql> SELECT .. FROM x x1 JOIN y y1 ON .. JOIN y y2 ON .. WHERE ..;
> > ...> \hint y1 indexed by ..
> > ...> \hint y2 indexed by ..
> > ...> ;
>
> I humbly disagree, the point of SQL being a 4th generation language is, I
> tell it what I want, not how to go get what I want.

This debate will go on and on. And query planners will continue to plan some queries very poorly. And telling users to deal with it sucks for them.

Nico
--
On Wed, Sep 10, 2025 at 6:20 PM Justin <zzzzz.graf@gmail.com> wrote:
On Wed, Sep 10, 2025 at 5:28 PM Nico Williams <nico@cryptonector.com> wrote:
[snip]
I would really like out-of-band hints. These would be hints not
specified in the SQL itself but to be sent separately and which address
table sources or joins by name, like this:
psql> SELECT .. FROM x x1 JOIN y y1 ON .. JOIN y y2 ON .. WHERE ..;
...> \hint y1 indexed by ..
...> \hint y2 indexed by ..
...> ;
I humbly disagree, the point of SQL being a 4th generation language is, I tell it what I want, not how to go get what I want.
And the point of high-level languages is to eliminate GOTO, and yet GOTO still exists.
The real problem is requiring the application to carry the hints.
My old RDBMS let you CREATE HINT, in which you (the DBA) would put a SELECT, UPDATE or DELETE statement (similar to PREPARE) and what indices to use. That way, the application doesn't carry them, and "you" can update them as needed.
Rarely useful, but occasionally *VERY* useful.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
>> AIUI, Michael Stonebraker suggested that the process model
>> would/should be "upgraded" to a threaded one at some point in the
>> system's development?
>
> I am going to need a source on this. Process vs Threads: pros and cons are very well documented and proven today.

Ask, and it will be given to you; seek, and you will find; knock, and it will be opened to you! Matthew 7:7

From here: https://dsf.berkeley.edu/papers/ERL-M85-95.pdf (bottom of page 13) - 5.1. Process Structure - Stonebraker expresses regret that the project doesn't have the resources to use the "server model" - i.e. threading:

"The server model has many performance benefits (e.g., sharing of open file descriptors and buffers and optimized task switching and message sending overhead) in a large machine environment in which high performance is critical. However, this approach requires that a fairly complete special-purpose operating system be built. In contrast, the process-per-user model is simpler to implement but will not perform as well on most conventional operating systems. We decided after much soul searching to implement POSTGRES using a process-per-user model architecture because of our limited programming resources. POSTGRES is an ambitious undertaking and we believe the additional complexity introduced by the server architecture was not worth the additional risk of not getting the system running. Our current plan then is to implement POSTGRES as a process-per-user model on Unix 4.3 BSD."

No date, but the latest date in the references was 1986.

And from here: https://dsf.berkeley.edu/papers/ERL-M90-34.pdf (middle of page 29):

"A last aspect of our design concerns the operating system process structure. Currently, POSTGRES runs as one process for each active user. This was done as an expedient to get a system operational as quickly as possible. We plan on converting POSTGRES to use lightweight processes available in the operating systems we are using. These include PRESTO for the Sequent Symmetry and threads in Version 4 of Sun/OS."

Again, no date, but this time the latest reference was from 1989 - (threading advancing?).

The use of "because of our limited programming resources" (1st ref.) and "expedient" (2nd) clearly indicates to me that Stonebraker was eyeing up the threading model and saw the per-process one as an albatross around the neck of the project!

I hope my sources are sufficient?

--

El!
On Thu, Sep 11, 2025 at 5:38 AM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
"A last aspect of our design concerns the operating system process
structure. Currently, POSTGRES runs as one process for each active
user. This was done as an expedient to get a system operational as
quickly as possible. We plan on converting POSTGRES to use lightweight
processes available in the operating systems we are using. These
include PRESTO for the Sequent Symmetry and threads in Version 4 of
Sun/OS."
Technical discussions from the 80's are more or less historically interesting only. At that time, support for threads was pretty immature relative to today, and the general state of operating system technology was pretty crude by modern standards. Process spinup via fork() might also have been much more performance relevant than it is today, and various synchronization primitives might have been pretty lousy as well.
I find the threads/processes debate to be pretty silly in general. Things have changed a lot, IPC has improved a lot, and I would argue the decision to use/not use SSL is much more important to database session startup than the database spinning up a process vs a thread. The mythology around this architectural decision is pervasive and mostly incorrect IMO, and there are many high quality solutions to work around this (connection poolers, pgbouncer, etc.), which are employed against all databases in some way or another, and are essentially universally employed in scenarios where scale and reliable performance are important.
Maybe Microsoft is the odd man out here, as its weird, non-standard process model made porting multi-process servers (including postgres) difficult and poorly performing. Those issues are (mostly) long since worked out, though.
Having said that, I suspect 3rd-party vendor support for postgres on Microsoft being relatively limited is based much more on business calculation than on technical constraints.
Final thoughts on this: firebird (fmrly interbase) did not achieve the level of success in the market that postgres did, even though they may have been similarly positioned. My take: that disparity in success has more to do with postgres having a more open development model, stronger community, and (especially) timing; postgres was pretty well established in the open source world when Borland open sourced it around the year 2000. Firebird had (and has) some neat stuff, in particular a nice embedding option and strong windows support, but the market was already pretty crowded at that time.
merlin
Hi
Final thoughts on this: firebird (fmrly interbase) did not achieve the level of success in the market that postgres did, even though they may have been similarly positioned. My take: that disparity in success has more to do with postgres having a more open development model, stronger community, and (especially) timing; postgres was pretty well established in the open source world when Borland open sourced it around the year 2000. Firebird had (and has) some neat stuff, in particular a nice embedding option and strong windows support, but the market was already pretty crowded at that time.
Still, around 2005 Firebird was more popular than Postgres (in the Czech Republic). I think after that time Firebird didn't get any wave of popularity (and Postgres did - RoR, PostGIS, ora2pg, JSON, ...), and it was too tied to the Windows desktop and Delphi. I have no idea about the current situation and the current performance of Firebird, but 15 years ago (Firebird 2.x) Postgres was better for web applications running on a Linux server with more than thousands of users. I did some benchmarks in 2005, and Firebird was really fast on Windows and slower on Linux (slower than Postgres).
You (Merlin Moncure) said:

> Technical discussions from the 80's are more or less historically interesting only.

I agree with your technical points - and the fact that I brought up "history".

I was replying to Justin in this context:

I wrote:

> AIUI, Michael Stonebraker suggested that the process model
> would/should be "upgraded" to a threaded one at some point in the
> system's development?

To which Justin replied:

> I am going to need a source on this. Process vs Threads: pros and cons are very well documented and proven today.

Hence the history lesson - to provide the "source" from Michael Stonebraker.

HTH.

> merlin

--

El!
On Fri, Sep 12, 2025 at 3:18 PM Ellen Allhatatlan <ellenallhatatlan@gmail.com> wrote:
You (Merlin Moncure) said:

Thank you for the documents.
> Technical discussions from the 80's are more or less historically interesting only.
I agree with your technical points - and the fact that I brought up "history".
I was replying to Justin in this context:
I wrote:
> AIUI, Michael Stonebraker suggested that the process model
> would/should be "upgraded" to a threaded one at some point in the
> system's development?
To which Justin replied:
> I am going to need a source on this. Process vs Threads: pros and cons are very well documented and proven today.
Hence the history lesson - to provide the "source" from Michael Stonebraker
The reason PostgreSQL was not developed using threads was that it was not technically feasible at the time. The entire idea of PostgreSQL was experimental, and one of the things you want to do is test out as many ideas as you feasibly can.
Jump forward 20-40 years into the future: what have we learned? What was thought to be a clear advantage for threads has been shown not to be a clear advantage.
Each approach has pluses and minuses. I agree with many others that the time spent trying to get threads to work in PostgreSQL is just not worth it when we should be spending our time on other issues.
Moving XID to 64 bits, moving away from the current file format, and no longer storing row versions in the tables would pay far more dividends in performance than threading will...
The append-only storage design, aka storing row versions in the table, is a Stonebraker idea which would allow for time-travel queries; I am pretty sure this proved to be unworkable.
The PostgreSQL table layout, with MVCC being tracked in the tables, has been shown to be problematic and a performance bottleneck. Just look at all the time the community has spent making autovacuum better, all the IO spent keeping the XID from wrapping around, HOT updates, or the fillfactor setting.
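(For reference, the fillfactor lever is per table; a small sketch with an invented table:)

    -- Keep 20% of each heap page free so updates can stay on-page (HOT).
    ALTER TABLE hot_updated_table SET (fillfactor = 80);
    -- Only newly written pages honor this; rewrite the table to apply it everywhere:
    VACUUM FULL hot_updated_table;  -- takes an ACCESS EXCLUSIVE lock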
The question I should have asked is what is Stonebraker's current thought on process vs threading today.
Thank you
Justin