pg_background (and more parallelism infrastructure patches)
From | Robert Haas |
---|---|
Subject | pg_background (and more parallelism infrastructure patches) |
Date | |
Msg-id | CA+Tgmoam66dTzCP8N2cRcS6S6dBMFX+JMba+mDf68H=KAkNjPQ@mail.gmail.com |
Responses | Re: pg_background (and more parallelism infrastructure patches) |
List | pgsql-hackers |
Attached is a contrib module that lets you launch arbitrary commands in a background worker, and supporting infrastructure patches for core. You can launch queries and fetch the results back, much as you could do with a dblink connection back to the local database but without the hassles of dealing with authentication; and you can also run utility commands, like VACUUM. For people who have always wanted to be able to launch a vacuum (or an autonomous transaction, or a background task) from a procedural language ... enjoy.

Here's an example of running vacuum and then fetching the results. Notice that the notices from the original session are propagated to our session; if an error had occurred, it would be re-thrown locally when we try to read the results.

rhaas=# create table foo (a int);
CREATE TABLE
rhaas=# select pg_background_launch('vacuum verbose foo');
 pg_background_launch
----------------------
                51101
(1 row)

rhaas=# select * from pg_background_result(51101) as (x text);
INFO:  vacuuming "public.foo"
INFO:  "foo": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
   x
--------
 VACUUM
(1 row)

Here's an overview of the attached patches:

Patches 1 and 2 add a few new interfaces to the shm_mq and dsm APIs that happen to be convenient for later patches in the series. I'm pretty sure I could make all this work without these, but it would take more code and be less efficient, so here they are.

Patch 3 adds the ability for a backend to request that the protocol messages it would normally send to the frontend get redirected to a shm_mq. I did this by adding a couple of hook functions. The best design is definitely arguable here, so if you'd like to bikeshed, this is probably the patch to look at.
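Among the protocol messages that would travel over such a queue are ErrorResponse and NoticeResponse. In the frontend/backend protocol, the body of either message is a list of fields, each made of a one-byte type code followed by a NUL-terminated string, with a single zero byte ending the list. The following is a minimal sketch of splitting such a body into its fields. It is not the code from the patch: `ParsedNotice` and `parse_notice_body` are hypothetical names invented for this illustration, and a real implementation would go on to re-throw the result through the backend's error reporting machinery rather than just store pointers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for a few fields of an ErrorResponse/NoticeResponse. */
typedef struct ParsedNotice
{
    const char *severity;   /* field code 'S' */
    const char *sqlstate;   /* field code 'C' */
    const char *message;    /* field code 'M' */
    const char *detail;     /* field code 'D', NULL if absent */
} ParsedNotice;

/*
 * Walk a message body: each field is one type byte followed by a
 * NUL-terminated string, and a lone zero byte ends the list.
 * The output pointers refer into 'body', so it must outlive 'out'.
 * Returns 0 on success, -1 if the body is truncated or malformed.
 */
static int
parse_notice_body(const char *body, size_t len, ParsedNotice *out)
{
    size_t      pos = 0;

    out->severity = NULL;
    out->sqlstate = NULL;
    out->message = NULL;
    out->detail = NULL;

    while (pos < len)
    {
        char        code = body[pos++];
        const char *value;
        const char *end;

        if (code == '\0')
            return 0;           /* terminator reached */

        value = body + pos;
        end = memchr(value, '\0', len - pos);
        if (end == NULL)
            return -1;          /* field string is not terminated */
        pos += (size_t) (end - value) + 1;

        switch (code)
        {
            case 'S': out->severity = value; break;
            case 'C': out->sqlstate = value; break;
            case 'M': out->message = value; break;
            case 'D': out->detail = value; break;
            default: break;     /* fields this sketch does not track */
        }
    }
    return -1;                  /* ran out of bytes before the terminator */
}
```

Called on a body holding the fields `S` = `ERROR`, `C` = `22012`, and `M` = `division by zero`, the function returns 0 and leaves `detail` as NULL; a body that lacks the final zero byte is rejected with -1.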
Patch 3 also adds a function to help you parse an ErrorResponse or NoticeResponse and re-throw the error or notice in the originating backend. Obviously, parallelism is going to need this kind of functionality, but I suspect a variety of other applications people may develop using background workers may want it too; and it's certainly important for pg_background itself.

Patch 4 adds infrastructure that allows one session to save all of its non-default GUC values and another session to reload those values. This was written by Amit Khandekar and Noah Misch. It allows pg_background to start up the background worker with the same GUC settings that the launching process is using. I intend this as a demonstration of how to synchronize any given piece of state between cooperating backends. For real parallelism, we'll need to synchronize snapshots, combo CIDs, transaction state, and so on, in addition to GUCs. But GUCs are ONE of the things that we'll need to synchronize in that context, and this patch shows the kind of API we're thinking about for these sorts of problems.

Patch 5 is a trivial patch to add a function to get the authenticated user ID. Noah pointed out to me that it's important for the authenticated user ID, session user ID, and current user ID to all match between the original session and the background worker. Otherwise, pg_background could be used to circumvent restrictions that we normally impose when those values differ from each other. The session and current user IDs are restored by the GUC save-and-restore machinery ("session_authorization" and "role") but the authenticated user ID requires special treatment. To make that happen, it has to be exposed somehow.

Patch 6 is pg_background itself. I'm quite pleased with how easily this came together. The existing background worker, dsm, shm_toc, and shm_mq infrastructure handles most of the heavy lifting here - obviously with some exceptions addressed by the preceding patches.
Again, this is the kind of set-up that I'm expecting will happen in a background worker used for actual parallelism - clearly, more state will need to be restored there than here, but nonetheless the general flow of the code here is about what I'm imagining, just with several more kinds of state. Most of the work of writing this patch was actually figuring out how to execute the query itself; what I ended up with is mostly copied from exec_simple_query, but with some differences here and there. I'm not sure if it would be possible/advisable to try to refactor to reduce duplication.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
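As a rough illustration of the save-and-reload idea described for Patch 4, the sketch below copies every setting that differs from its default into a flat buffer as "name\0value\0" pairs, then applies that buffer to a second set of settings. This is a toy model under stated assumptions, not the patch's API or its on-the-wire format: `Setting`, `save_settings`, and `restore_settings` are hypothetical names, and the flat buffer merely stands in for a chunk of shared memory handed to the worker.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one configuration setting (not a real GUC struct). */
typedef struct Setting
{
    const char *name;
    const char *boot_value;     /* built-in default */
    char        value[64];      /* current value in this session */
} Setting;

/*
 * Save: append "name\0value\0" for each setting whose current value
 * differs from its default. Stores the byte count in *used.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
save_settings(const Setting *s, size_t n, char *buf, size_t buflen,
              size_t *used)
{
    size_t      pos = 0;

    for (size_t i = 0; i < n; i++)
    {
        size_t      nlen;
        size_t      vlen;

        if (strcmp(s[i].value, s[i].boot_value) == 0)
            continue;           /* default values need not be shipped */

        nlen = strlen(s[i].name) + 1;
        vlen = strlen(s[i].value) + 1;
        if (pos + nlen + vlen > buflen)
            return -1;

        memcpy(buf + pos, s[i].name, nlen);
        pos += nlen;
        memcpy(buf + pos, s[i].value, vlen);
        pos += vlen;
    }
    *used = pos;
    return 0;
}

/*
 * Restore: apply each saved pair to the matching setting by name.
 * Assumes 'buf' was produced by save_settings and is well-formed.
 */
static void
restore_settings(Setting *s, size_t n, const char *buf, size_t len)
{
    size_t      pos = 0;

    while (pos < len)
    {
        const char *name = buf + pos;
        const char *value = name + strlen(name) + 1;

        pos += strlen(name) + 1 + strlen(value) + 1;
        for (size_t i = 0; i < n; i++)
        {
            if (strcmp(s[i].name, name) == 0)
                snprintf(s[i].value, sizeof(s[i].value), "%s", value);
        }
    }
}
```

With a launching session that has raised `work_mem` from `4MB` to `64MB` and left `search_path` alone, only the `work_mem` pair is saved, and restoring it into a worker that started from defaults leaves the worker with `work_mem` at `64MB`. Shipping only the non-default values keeps the saved state small, which matches the email's description of saving "all of its non-default GUC values".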
Attachments
- 0001-Extend-shm_mq-API-with-new-functions-shm_mq_sendv-an.patch
- 0002-Extend-dsm-API-with-a-new-function-dsm_unkeep_mappin.patch
- 0003-Support-frontend-backend-protocol-communication-usin.patch
- 0004-Add-infrastructure-to-save-and-restore-GUC-values.patch
- 0005-Add-a-function-to-get-the-authenticated-user-ID.patch
- 0006-pg_background-Run-commands-in-a-background-worker-an.patch