O(1) DSM handle operations
От | Thomas Munro |
---|---|
Тема | O(1) DSM handle operations |
Дата | |
Msg-id | CAEepm=3-EX8sTW7G5a_9TXABMZNpTuWeR1gtq5s+88OXGpo=vA@mail.gmail.com обсуждение исходный текст |
Список | pgsql-hackers |
Hi hackers, This is just a thought for discussion, no patch attached... DSM operations dsm_create(), dsm_attach(), dsm_unpin_segment() perform linear searches of the dsm_control->item array for either a free slot or a slot matching a given handle. Maybe no one thinks this is a problem, because in practice the number of DSM slots you need to scan should be something like number of backends * some small factor at peak. On the other hand it makes DSM slots a scarce resource and creates an incentive for us never to allow many segments to be created, which has knock-on implications, like dsa.c's geometric growth pattern: it starts allocating bigger and bigger DSM segments so that we don't run out of DSM slots which increases fragmentation (ie allocation of memory you don't need). Would it make sense to implement $SUBJECT like this? 1. Perform O(1) handle lookup by using N bits of dsm_handle to identify a dsm_control->item slot number, and detect ABA slot reuse by using the rest of the bits of dsm_handle to hold a counter which must match a cycling counter in the control slot. This would replace the current scheme where handles are random numbers. 2. Perform O(1) slot allocation with a freelist using a next-free-slot link pointer in the slots themselves. Perhaps the freelist from (2) should be a FIFO queue so that slots are reused as slowly as possible, strengthening the counter scheme from (1), or perhaps that's optimising for the wrong thing, I don't know. The biggest problem with point (1) above would seem to be the SysV implementation, where dsm_handle is transmogrified into key_t and some of the bits might get chomped if the latter is smaller, which seems to imply that at least the slot identification bits would need to be sure to fit into key_t, and also the use of the special key_t sentinel values IPC_PRIVATE which somehow needs to be avoided. Do any systems really have sizeof(key_t) < 4? Point (2) seems to be implementable independently of (1). -- Thomas Munro http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: