Exporting Snapshots
From | Markus Wanner |
---|---|
Subject | Exporting Snapshots |
Date | |
Msg-id | 4B6D1F4E.7070104@bluegap.ch |
Responses | Re: Exporting Snapshots |
List | pgsql-cluster-hackers |
Hi,

the very first item on the ClusterFeatures [1] wishlist is "Export snapshots to other sessions". Joachim Wieland has recently sent in a patch to hackers [2] which he called "Synchronized Snapshots". To me that sounded similar enough to review it.

That patch doesn't really "export" a snapshot, but rather just tries to make sure the transactions start with the same snapshot. They can then do whatever they want, including writing and committing or aborting whenever they want.

But for any kind of parallel querying (be it on the same node or across multiple nodes) we need to be able to export a snapshot of a transaction to another backend - from any point in time of the origin transaction. This includes the full XIP array (the list of transactions in progress at the time of snapshot creation) as well as making sure the data that's already written (but uncommitted) by that transaction is available to the destination backend (which is a no-op for a single node, but needs care for remote backends).

Additionally, some access control information needs to be transferred, to ensure parallel querying isn't a security hole. Joachim's patch currently circumvents this issue by requiring superuser privileges.

A worker backend for parallel querying should never need to write any data, so it should be forced into read-only mode. And I'd say the origin transaction should not be allowed to continue with another query before having "collected" all worker backends that attached to its snapshot. So we have yet another difference from Joachim's approach: continuing independently versus being bound to the origin transaction.

I realize this is not quite the same as what Joachim has in mind for parallel pg_dump. It seems to be a more general approach, which certainly also requires more work. However, I think it could fit the requirements of a parallel pg_dump as well.

Cluster hackers, is this a good summary which covers your needs as well? Is anything still missing?

Joachim, would you be willing to work on such a more general approach?

Regards

Markus Wanner

[1]: feature wish list of cluster hackers: http://wiki.postgresql.org/wiki/ClusterFeatures
[2]: Synchronized Snapshots, by Joachim Wieland: http://archives.postgresql.org/message-id/dc7b844e1001081136k12ae4eq6d1f7689ed1adfe6@mail.gmail.com
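
To make the requirements above a bit more concrete, here is a minimal sketch of what an exported snapshot might carry and how a worker could use it. It is not taken from Joachim's patch or from PostgreSQL itself: the class `ExportedSnapshot`, its field names, the JSON wire format and the simplified visibility rule are all assumptions chosen for illustration, and the sketch ignores transaction ID wraparound, subtransactions and command IDs.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportedSnapshot:
    """Hypothetical payload an origin transaction hands to a worker backend."""
    xmin: int            # every xid below this is finished
    xmax: int            # every xid at or above this had not started yet
    xip: frozenset       # xids in progress when the snapshot was taken
    origin_xid: int      # the exporting transaction, whose writes must be visible
    read_only: bool = True  # workers are forced into read-only mode

    def to_wire(self) -> str:
        # Serialize for transfer to another backend (format is an assumption).
        return json.dumps({
            "xmin": self.xmin,
            "xmax": self.xmax,
            "xip": sorted(self.xip),
            "origin_xid": self.origin_xid,
        })

    @classmethod
    def from_wire(cls, payload: str) -> "ExportedSnapshot":
        d = json.loads(payload)
        return cls(d["xmin"], d["xmax"], frozenset(d["xip"]), d["origin_xid"])

    def sees(self, xid: int, committed: bool) -> bool:
        """Simplified visibility check for a row written by transaction `xid`."""
        if xid == self.origin_xid:
            return True       # uncommitted data of the origin transaction
        if xid >= self.xmax:
            return False      # started after the snapshot was taken
        if xid in self.xip:
            return False      # was still in progress at snapshot time
        return committed      # finished before the snapshot: visible if committed


# The origin transaction (xid 107) exports; a worker imports the same view.
snap = ExportedSnapshot(xmin=100, xmax=110, xip=frozenset({102, 105}), origin_xid=107)
worker = ExportedSnapshot.from_wire(snap.to_wire())
```

The point of the `origin_xid` branch is the difference from merely synchronized snapshots: the worker has to see the origin transaction's own uncommitted writes, which the XIP array alone does not express.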