Creating a DSA area to provide work space for parallel execution
From | Thomas Munro |
---|---|
Subject | Creating a DSA area to provide work space for parallel execution |
Date | |
Msg-id | CAEepm=0HmRefi1+xDJ99Gj5APHr8Qr05KZtAxrMj8b+ay3o6sA@mail.gmail.com |
Responses |
Re: Creating a DSA area to provide work space for parallel execution
|
List | pgsql-hackers |
Hi hackers,

A couple of months ago I proposed dynamic shared areas[1]. DSA areas are dynamically sized shared memory heaps that backends can use to share data, building on top of the existing DSM infrastructure.

One target use case for DSA areas is to provide work space for parallel query execution. To that end, here is a patch to create a DSA area for use by executor code. The area is automatically attached to the leader and all worker processes for the duration of the parallel query, and is available as estate->es_query_area.

Backends already have access to shared memory through a single DSM segment managed with a table-of-contents. The TOC provides a way to carve out some shared storage space for individual executor nodes and look it up later by plan node ID. That works for things like ParallelHeapScanDescData whose size is known up front, but not so well if you need something more like a heap in which to build shared data structures. Through estate->es_query_area, a parallel-aware executor node can use and recycle arbitrary amounts of shared memory with an allocate/free interface. Motivating use cases include shared bitmaps and shared hash tables (patches to follow).

Currently, you still need the existing DSM segment as well. In order to share data structures in the DSA area, you need a way to exchange pointers to find them, and the existing segment + TOC mechanism is ideal for that.

One obvious problem is that this patch results in at least *two* DSM segments being created for every parallel query execution: the main segment used for parallel execution, and then the initial segment managed by the DSA area. One thought is that DSA areas are the more general mechanism, so perhaps we should figure out how to store the contents of the existing segment in one.
The TOC interface would need a few tweaks to be able to live in memory allocated with dsa_allocate, and then we'd need to share that address with other backends so that they could find it (cf. the current approach of finding the TOC at the start of the segment). I haven't prototyped that yet. That'd involve changing the wording "InitializeDSM" that appears in various places including the FDW API, which has been putting me off...

This patch depends on dsa-v2.patch[1].

[1] https://www.postgresql.org/message-id/flat/CAEepm%3D1z5WLuNoJ80PaCvz6EtG9dN0j-KuHcHtU6QEfcPP5-qA%40mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com
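
The point in the message about needing a fixed place to exchange pointers can be sketched with a toy model. This is not code from the patch and does not use the PostgreSQL DSA API: every name below (toy_pointer, toy_header, toy_allocate, toy_get_address, toy_demo) is hypothetical, and a second heap buffer stands in for another backend mapping the same memory at a different address. The idea it illustrates is that the heap hands out relative pointers, and a small fixed-layout header (playing the role of the existing segment + TOC) tells other processes where the root structure is.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types; these are NOT the PostgreSQL DSA API. */
typedef uint32_t toy_pointer;       /* offset from the start of the area */
#define TOY_INVALID_POINTER 0u
#define TOY_AREA_SIZE 1024u

/* Fixed-layout header at offset 0, standing in for the TOC lookup. */
typedef struct toy_header
{
    uint32_t    used;               /* bytes handed out, header included */
    toy_pointer root;               /* where the shared structure lives */
} toy_header;

/* Example structure built inside the area. */
typedef struct toy_counter
{
    int         value;
} toy_counter;

static void
toy_area_init(void *base)
{
    toy_header *h = base;

    h->used = (uint32_t) sizeof(toy_header);
    h->root = TOY_INVALID_POINTER;
}

/* Bump allocator: returns an offset, or TOY_INVALID_POINTER if full. */
static toy_pointer
toy_allocate(void *base, uint32_t size)
{
    toy_header *h = base;
    uint32_t    start = (h->used + 7u) & ~7u;   /* 8-byte alignment */

    if (size > TOY_AREA_SIZE - start)
        return TOY_INVALID_POINTER;
    h->used = start + size;
    return start;
}

/* Convert a relative pointer into an address valid for this mapping. */
static void *
toy_get_address(void *base, toy_pointer p)
{
    assert(p != TOY_INVALID_POINTER && p < TOY_AREA_SIZE);
    return (char *) base + p;
}

/* The "leader" builds a structure; the "worker" finds it via the header. */
static int
toy_demo(void)
{
    void       *leader_map = calloc(1, TOY_AREA_SIZE);
    void       *worker_map = calloc(1, TOY_AREA_SIZE);
    int         result = -1;

    if (leader_map != NULL && worker_map != NULL)
    {
        toy_header  *h = leader_map;
        toy_counter *c;

        toy_area_init(leader_map);
        h->root = toy_allocate(leader_map, (uint32_t) sizeof(toy_counter));
        c = toy_get_address(leader_map, h->root);
        c->value = 42;

        /* Stand-in for a second process mapping the same bytes elsewhere. */
        memcpy(worker_map, leader_map, TOY_AREA_SIZE);

        h = worker_map;
        c = toy_get_address(worker_map, h->root);
        result = c->value;
    }
    free(leader_map);
    free(worker_map);
    return result;
}
```

Calling toy_demo() from a main function returns the value the "worker" reads back through the relative pointer. A raw address stored in the header would only be valid in the process that wrote it, which is why the sketch publishes an offset instead.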