Autonomous transactions 2023, WIP

Поиск
Список
Период
Сортировка
От Ivan Kush
Тема Autonomous transactions 2023, WIP
Дата
Msg-id f7470d5a-3cf1-4919-8404-5c4d91341a9f@tantorlabs.com
обсуждение исходный текст
Ответы Re: Autonomous transactions 2023, WIP  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: Autonomous transactions 2023, WIP  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Список pgsql-hackers

Hello. I'm working on the support of autonomous transactions in Postgres.
Could you please make a preliminary review and give advices (see section 
#TODO)

# Patch
v0001-Autonomous-transactions.patch

# Introduction
This patch implements Autonomous Transactions for PL/pgSQL.
Autonomous transaction is a transaction that can be succesfully commited 
even if base transaction is rolled back. Common use cases: 
logging/auditing/tracking progress in tables, so that information about 
the execution attempt is preserved even when the main transaction is 
rolled back — for example, due to an error.

# Glossary
Session - entity that groups multiple related SQL commands into a single 
transaction.
Main session (backend, foreground session) - session through which the 
user interacts.
Main transaction (parent) - transaction that runs in the main session.
Autonomous session - session that performs an offline transaction. It 
starts from the main session.
Autonomous transaction - independent transaction that runs inside an 
autonomous session.
Autonomous function - function with the pragma AUTONOMOUS_TRANSACTION. 
When it is executed, an autonomous session is created in it.
Background worker - background process that performs some actions in the 
background, without the user's participation.
dsm - dynamic shared memory.
shm_mq - shared memory message queue.

# Internals
This patch introduces a "pragma AUTONOMOUS_TRANSACTION" to functions. 
When one such function is executed all (at the current time not all, 
WIP) statements from it are executed in an autonomous session.

* Example *
*SQL-request:*

```sql
CREATE TABLE tbl (a int);
CREATE OR REPLACE FUNCTION func() RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
   PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO tbl VALUES (1);
END;
$$;

START TRANSACTION;
SELECT func();
ROLLBACK;

SELECT * FROM tbl;

DROP FUNCTION func;
DROP TABLE tbl;
```

*Output:*

```bash
  a
---
  1
(1 row)
```


For each backend the patch lazily creates a pool of autonomous sessions. 
When backend calls autonomous function, backend takes one autonomous 
session from this pool and sends there function's statements for 
execution. When execution is finished backend returns session to pool. 
Lazily means that pool is created only when first autonomous session is 
needed.
Backend and autonomous session communicate with the help of Postgres 
client-server protocol. Messages are sent through dynamic shared memory. 
Execution of backend and autonomous session is synchronous: autonomous 
session waits for messages from backendand backend waits for messages 
from autonomous session.
Autonomous session uses Background workers internally. As it's a 
separate process, it contains caches, etc. In order to prevent infinite 
grow of resources usage we reset all caches by timeout using restart of 
autonomous sessions. This timeout is set by guc setting 
autonomous_session_lifetime.
Source code contains more detailed comments.

# Alternatives
At the current time for this functionality may be uses extensions: 
dblink and pg_background. But they have shortcomings:
     1) not in the Postgres core, they are extensions
     2) lower performance. Each call creates new process that is 
destroyed immediately after transaction is finished.

# TODO
Could you please give advices how implement public pool shared between 
all backends?
1) Support execution of remaining statements in autonomous sessions.
2) Public pool shared between all backends. At the current time for each 
backend private pool is created.

# Tests
Implementation contains many regression tests of varying complexity, 
which check supported features.

# Platform
This patch was checkouted from tag 15.4. This is WIP. I've developed in 
Linux, code doesn't contain platfrom-specific code, only Postgres 
internal data structures and functions.

# Documentation
Regression tests contain many examples
* Describe the effect your patch has on performance, if any.
It adds a new feature and increase performance compared to dblink and 
pg_background

# History
## 1st feature requests and discussions in pgsql-hackers (without code)
1) 2008
https://www.postgresql.org/message-id/flat/1A6E6D554222284AB25ABE3229A9276271549A%40nrtexcus702.int.asurion.com
2) 2010
https://www.postgresql.org/message-id/flat/AANLkTi%3DuogmYxLKWmUfFSg-Ki2bejsQiO2g5GTMxvdW2%40mail.gmail.com
3) 2011
https://www.postgresql.org/message-id/flat/1303399444.9126.8.camel%40vanquo.pezone.net
4) 2011
https://wiki.postgresql.org/wiki/Autonomous_subtransactions
5) 2011
https://www.postgresql.org/message-id/flat/20111218082812.GA14355%40leggeri.gi.lan
https://wiki.postgresql.org/wiki/Autonomous_subtransactions
## Implementaion
1) 2014, Rajeev Rastogi, implementation based on subtransactions
https://www.postgresql.org/message-id/flat/BF2827DCCE55594C8D7A8F7FFD3AB7713DDDEF59%40SZXEML508-MBX.china.huawei.com
2) 2015, Rajeev Rastogi, new theme, continues discussion about semantics 
and syntax of autonomous transactions
https://www.postgresql.org/message-id/flat/BF2827DCCE55594C8D7A8F7FFD3AB7715990499A%40szxeml521-mbs.china.huawei.com
3) 2016, Peter Eisentraut, implementation based on background workers
https://www.postgresql.org/message-id/flat/659a2fce-b6ee-06de-05c0-c8ed6a01979e%402ndquadrant.com

# Summary
* Add pragma AUTONOMOUS_TRANSACTION in the functions. When function 
contains this pragma, the it's executed autonomously
* Background workers are used to run autonomous sessions.
* Synchronous execution between backend and autonomous session
* Postgres Client-Server Protocol is used to communicate between them
* Pool of autonomous sessions. Pool is created lazily.
* Infinite nested calls of autonomous functions are allowed. Limited 
only by computer resources.
* If another 2nd autonomous function is called in the 1st autonomous 
function, the 2nd is executed at the beginning, and then the 1st 
continues execution.

-- 
Best wishes,
Ivan Kush
Tantor Labs LLC

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jakub Wartak
Дата:
Сообщение: Re: trying again to get incremental backup
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: trying again to get incremental backup