Sequence Access Methods, round two

From Michael Paquier
Subject Sequence Access Methods, round two
Date
Msg-id ZWlohtKAs0uVVpZ3@paquier.xyz
Replies Re: Sequence Access Methods, round two  (Michael Paquier <michael@paquier.xyz>)
Re: Sequence Access Methods, round two  (Peter Eisentraut <peter@eisentraut.org>)
List pgsql-hackers
Hi all,

Back in 2016, a patch set was proposed to add support for
sequence access methods:
https://www.postgresql.org/message-id/flat/CA%2BU5nMLV3ccdzbqCvcedd-HfrE4dUmoFmTBPL_uJ9YjsQbR7iQ%40mail.gmail.com

This included quite a few concepts, somewhat adapted to the state of
the code at the time this feature was proposed:
- Addition of USING clause for CREATE/ALTER SEQUENCE.
- Addition of WITH clause for sequences, with custom reloptions.
- All sequences rely on heap.
- The use case focused on was the possibility to have cluster-wide
sequences, with sequence storage always linked to heap.
- Dump/restore logic depended on that, with a set of get/set functions
to be able to retrieve and force a set of properties to sequences.

A bunch of the implementation and design choices back then come down
to the fact that *all* the sequence properties were located in a
single heap file, including start, restart, cycle, increment, etc.
Postgres 10 split this data with the introduction of the catalog
pg_sequence, which now holds all the sequence properties.
As a result, the sequence "heap" metadata got reduced to its
last_value, is_called and log_cnt (to count if a metadata tuple should
be WAL-logged).  Honestly, I think we can do simpler than the original
proposal, while satisfying more cases than what the original thread
wanted to address.  One thing is that a sequence AM may want storage,
but it should be able to plug in to a table AM of its choice.
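
To make the remaining per-sequence state concrete, here is a minimal
sketch of those three fields and of how log_cnt feeds the WAL decision.
The struct and helper names are made up for illustration, int64_t stands
in for PostgreSQL's own int64 typedef, and the check is a simplification
of what nextval() does on HEAD as I understand it (the checkpoint-related
condition is left out).

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of values pre-logged per WAL record (SEQ_LOG_VALS in sequence.c). */
#define SEQ_LOG_VALS 32

/*
 * Sketch of the metadata tuple left in a sequence's heap relation.
 * Field names follow the ones listed above; the struct name is
 * hypothetical.
 */
typedef struct SeqHeapData
{
	int64_t		last_value;		/* last value handed out or set */
	int64_t		log_cnt;		/* used to decide when to WAL-log the tuple */
	bool		is_called;		/* false until the first nextval() */
} SeqHeapData;

/* Hypothetical helper: initial state for a sequence starting at "start". */
static SeqHeapData
seq_heap_data_init(int64_t start)
{
	SeqHeapData d = {.last_value = start, .log_cnt = 0, .is_called = false};

	return d;
}

/*
 * Simplified check: a new WAL record is needed when fewer pre-logged
 * values remain than the number about to be fetched, or when the
 * sequence has never been called.
 */
static bool
seq_needs_wal(const SeqHeapData *seq, int64_t fetch)
{
	return seq->log_cnt < fetch || !seq->is_called;
}
```

A sequence AM that does not store anything locally has no need for any
of this, which is part of the point of moving it out of sequence.c.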

Please find attached a patch set that aims at implementing sequence
access methods, with callbacks following a model close to table and
index AMs, with a few cases in mind:
- Global sequences (including range-allocation, local caching).
- Local custom computations (a-la-snowflake).
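
As an illustration of the second case, here is a minimal sketch of a
snowflake-style computation.  The bit layout (41 bits of milliseconds
since a custom epoch, 10 bits of node ID, 12 bits of per-millisecond
counter) is the commonly cited one and is an assumption here; nothing
in the patch set mandates it, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SNOW_NODE_BITS	10
#define SNOW_SEQ_BITS	12

/*
 * Compose a snowflake-style 64-bit value:
 * | 41 bits: ms since a custom epoch | 10 bits: node | 12 bits: counter |
 *
 * The caller provides all three inputs; a real AM would get them from
 * a clock, a configured node ID and a per-millisecond counter.
 */
static int64_t
snowflake_compose(uint64_t ms_since_epoch, uint32_t node_id, uint32_t counter)
{
	assert(ms_since_epoch < (UINT64_C(1) << 41));
	assert(node_id < (1u << SNOW_NODE_BITS));
	assert(counter < (1u << SNOW_SEQ_BITS));

	return (int64_t) ((ms_since_epoch << (SNOW_NODE_BITS + SNOW_SEQ_BITS)) |
					  ((uint64_t) node_id << SNOW_SEQ_BITS) |
					  (uint64_t) counter);
}
```

A computation like this needs no storage at all, which is why a sequence
AM should not be forced to have a heap relation behind it.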

The patch set has been reduced to what I consider the minimum
acceptable for an implementation, with some properties like:
- Implementation of USING in CREATE SEQUENCE only, no need for WITH
and reloptions (could be added later).
- Implementation of dump/restore, with a GUC to force a default
sequence AM, and a way to dump/restore without a sequence AM set (like
table AMs, this relies on SET and a --no-sequence-access-method).
- Sequence AMs can use a custom table AM to store their metadata, which
could be heap, or something else.  A sequence AM is free to choose if
it should store data or not, and can plug into a custom RMGR to log
data.
- Ensure compatibility with the existing in-core method, called
"local" in this patch set.  This uses a heap table, and a local
sequence AM registers the same pg_class entry as past releases.
Perhaps this should have a less generic name, like "seqlocal" or
"sequence_local", but I have a bad track record when it comes to
naming things.  I've just inherited the name from the patch of 2016.
- pg_sequence is used to provide hints (or advice) to the sequence
AM about the way to compute values.  A nice side effect of that is
that cross-property checks for sequences are the same for all sequence
AMs.  This gives a clean split between pg_sequence and the metadata
managed by sequence AMs.

On HEAD, sequence.c holds three different concepts, and I have decided
that this patch set should actually split them for sequence AMs:
1) pg_sequence and general sequence properties.
2) Local sequence cache, for lastval(), depending on the last sequence
value fetched.
3) In-core sequence metadata, used to grab or set values for all
the flavors of setval(), nextval(), etc.

With a focus on compatibility, the key design choices here are that 1)
and 2) have the same rules shared across all AMs, and 3) is something
that sequence AMs are free to play with as they want.  Using this
concept, the contents of 3) in sequence.c now live in the
"local" sequence AM:
- RMGR for sequences, as of xl_seq_rec and RM_SEQ_ID (renamed to use
"local" as well).
- FormData_pg_sequence_data for the local sequence metadata, including
its attributes, SEQ_COL_*, the internal routines managing rewrites of
its heap, etc.
- In sequence.c, log_cnt is not a counter, just a way to decide if a
sequence's metadata should be reset or not (note that init_params() only
resets it to 0 if sequence properties are changed).
As a result, 30% of sequence.c is trimmed of its in-core AM concepts,
all moved to local.c.

While working on this patch, I ended up focusing on dump/restore
portability and on being able to use nextval(), setval(), lastval()
and even pg_sequence_last_value() across all AMs, so that integration
with things like SERIAL or GENERATED columns is natural.  Hence, the
callbacks are shaped so that these functions are transparent across
all sequence AMs.  See sequenceam.h
for the details about them, and local.c for the "local" sequence AM.
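
To give a rough idea of the shape, here is a standalone sketch of a
callback table with an in-memory implementation.  All the names below
are hypothetical and are not the ones defined in sequenceam.h; the
increment is passed in by the caller to mimic pg_sequence acting as a
hint, and min/max/cycle handling is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-sequence state; a real AM decides where this lives. */
typedef struct SeqState
{
	int64_t		last_value;
	bool		is_called;
} SeqState;

/* Hypothetical callback table, loosely modeled on table and index AMs. */
typedef struct SeqAmSketch
{
	int64_t		(*nextval) (SeqState *state, int64_t increment);
	void		(*setval) (SeqState *state, int64_t value, bool is_called);
	SeqState	(*get_state) (const SeqState *state);	/* for dump/restore */
} SeqAmSketch;

static int64_t
mem_nextval(SeqState *state, int64_t increment)
{
	/* The first call returns the current value itself, like nextval(). */
	if (state->is_called)
		state->last_value += increment;
	state->is_called = true;
	return state->last_value;
}

static void
mem_setval(SeqState *state, int64_t value, bool is_called)
{
	state->last_value = value;
	state->is_called = is_called;
}

static SeqState
mem_get_state(const SeqState *state)
{
	return *state;
}

/* An "in-memory" AM: no storage, no WAL, just the callbacks. */
static const SeqAmSketch mem_seqam = {
	.nextval = mem_nextval,
	.setval = mem_setval,
	.get_state = mem_get_state,
};
```

With something of this shape, the SQL-level functions only need to look
up the table of callbacks for the sequence and call through it, without
knowing whether a heap relation sits behind the sequence.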

The attached patch set covers all the ground I wanted to cover with
this stuff, including dump/restore, tests, docs, compatibility, etc,
etc.  I've done a first split of things to make the review more
digestible, as there are a few independent pieces I've bumped into while
implementing the callbacks.

Here are some independent refactoring pieces:
- 0001 is something to make dump/restore transparent across all
sequence AMs.  Currently, pg_dump queries sequence heap tables, but a
sequence AM may not have any storage locally, or could grab its values
from elsewhere.  pg_sequence_last_value(), an undocumented function
used for pg_sequence, is extended so that it returns a row made of
(last_value, is_called), which can then be used to dump data across
all AMs.
- 0002 introduces sequence_open() and sequence_close().  Like tables
and indexes, this is coupled with a relkind check, and used as the
sole way to open sequences in sequence.c.
- 0003 groups the sequence cache updates of sequence.c closer to each
other.  This stuff was hidden in the middle of unrelated computations.
- 0004 removes all traces of FormData_pg_sequence_data from
init_params(), which is used to guess the start value and is_called
for a sequence depending on its properties in the catalog pg_sequence.
- 0005 is an interesting one.  I've noticed that I wanted to attach
custom attributes to the pg_class entry of a sequence, or just not
store any attributes at *all* within it.  One approach I considered
first was to list the attributes to send to DefineRelation() within
each AM, but this requires an early lookup of the sequence AM
routines, which was gross.  Instead, I've chosen a method similar to
views, where attributes are added after the relation is defined,
using AlterTableInternal().  This simplifies the set of callbacks, as
initialization is in charge of creating the sequence attributes (if
necessary) and adding the first metadata tuple for a sequence (again,
if necessary).  The resulting changes to event triggers and to the
commands collected are also something I wanted, as it becomes
possible to track what gets internally created for a sequence
depending on its AM (see test_ddl_deparse).
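
As an illustration of why 0001 is enough for dump data, here is a
minimal sketch of how a (last_value, is_called) pair can be turned into
the kind of setval() statement pg_dump emits.  The helper name is
hypothetical, and the sequence name is assumed to be already qualified
and quoted by the caller; no escaping is done here.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a setval() statement from the pair returned for a sequence.
 * Returns the value of snprintf(), i.e. the length of the statement
 * (excluding the terminating NUL) had the buffer been large enough.
 */
static int
format_setval(char *buf, size_t buflen, const char *qualified_name,
			  int64_t last_value, bool is_called)
{
	return snprintf(buf, buflen,
					"SELECT pg_catalog.setval('%s', %" PRId64 ", %s);",
					qualified_name, last_value,
					is_called ? "true" : "false");
}
```

Nothing in there depends on how or where the AM keeps its state, which
is the property the dump code needs.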

Then comes the core of the changes, with a split depending on code
paths:
- 0006 includes the backend changes, which cache a set of callback
routines for each sequence Relation, with an optional rd_tableam.
Callbacks are documented in sequenceam.h.  Perhaps the sequence RMGR
renames should be split into a patch of their own, or just left as-is
as it could be shared across more than one AM, but I did not see a
huge argument one way or another.  The diffs are not that bad
considering that the original patch was at +1200 lines for src/backend/,
with less documentation for the internal callbacks:
 45 files changed, 1414 insertions(+), 718 deletions(-)
- 0007 adds some documentation.
- 0008 adds support for dump/restore, where I have also incorporated
tests and docs.  The implementation ends up being really
straightforward, relying on a new option switch to control if
SET queries for sequence AMs should be dumped and/or restored,
depending on a GUC called default_sequence_access_method.
- 0009 is a short example of a sequence AM, which is a kind of in-memory
sequence reset each time a new connection is made, without any
physical storage.  I am not clear yet if that's useful as local.c can
be used as a point of reference, but I was planning to include that in
one of my own repos on GitHub, like my blackhole_am.

I am adding that to the next CF.  Thoughts and comments are welcome.
--
Michael
