Re: New Table Access Methods for Multi and Single Inserts
От | Andres Freund |
---|---|
Тема | Re: New Table Access Methods for Multi and Single Inserts |
Дата | |
Msg-id | 20230603223824.o7iyochli2dwwi7k@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: New Table Access Methods for Multi and Single Inserts (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Ответы |
Re: New Table Access Methods for Multi and Single Inserts
(Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Re: New Table Access Methods for Multi and Single Inserts (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Список | pgsql-hackers |
Hi, This patch was referenced in a discussion at pgcon, so I thought I'd give it a look, even though Bharat said that he won't have time to drive it forward... On 2021-04-19 10:21:36 +0530, Bharath Rupireddy wrote: > diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c > index bd5faf0c1f..655de8e6b7 100644 > --- a/src/backend/access/heap/heapam_handler.c > +++ b/src/backend/access/heap/heapam_handler.c > @@ -2558,6 +2558,11 @@ static const TableAmRoutine heapam_methods = { > .tuple_insert_speculative = heapam_tuple_insert_speculative, > .tuple_complete_speculative = heapam_tuple_complete_speculative, > .multi_insert = heap_multi_insert, > + .tuple_insert_begin = heap_insert_begin, > + .tuple_insert_v2 = heap_insert_v2, > + .multi_insert_v2 = heap_multi_insert_v2, > + .multi_insert_flush = heap_multi_insert_flush, > + .tuple_insert_end = heap_insert_end, > .tuple_delete = heapam_tuple_delete, > .tuple_update = heapam_tuple_update, > .tuple_lock = heapam_tuple_lock, I don't think we should have multiple callback for the insertion APIs in tableam.h. I think it'd be good to continue supporting the old table_*() functions, but supporting multiple insert APIs in each AM doesn't make much sense to me. > +/* > + * GetTupleSize - Compute the tuple size given a table slot. > + * > + * For heap tuple, buffer tuple and minimal tuple slot types return the actual > + * tuple size that exists. For virtual tuple, the size is calculated as the > + * slot does not have the tuple size. If the computed size exceeds the given > + * maxsize for the virtual tuple, this function exits, not investing time in > + * further unnecessary calculation. > + * > + * Important Notes: > + * 1) Size calculation code for virtual slots is being used from > + * tts_virtual_materialize(), hence ensure to have the same changes or fixes > + * here and also there. > + * 2) Currently, GetTupleSize() handles the existing heap, buffer, minimal and > + * virtual slots. Ensure to add related code in case any new slot type is > + * introduced. > + */ > +inline Size > +GetTupleSize(TupleTableSlot *slot, Size maxsize) > +{ > + Size sz = 0; > + HeapTuple tuple = NULL; > + > + if (TTS_IS_HEAPTUPLE(slot)) > + tuple = ((HeapTupleTableSlot *) slot)->tuple; > + else if(TTS_IS_BUFFERTUPLE(slot)) > + tuple = ((BufferHeapTupleTableSlot *) slot)->base.tuple; > + else if(TTS_IS_MINIMALTUPLE(slot)) > + tuple = ((MinimalTupleTableSlot *) slot)->tuple; > + else if(TTS_IS_VIRTUAL(slot)) I think this embeds too much knowledge of the set of slot types in core code. I don't see why it's needed either? > diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h > index 414b6b4d57..2a1470a7b6 100644 > --- a/src/include/access/tableam.h > +++ b/src/include/access/tableam.h > @@ -229,6 +229,32 @@ typedef struct TM_IndexDeleteOp > TM_IndexStatus *status; > } TM_IndexDeleteOp; > > +/* Holds table insert state. */ > +typedef struct TableInsertState I suspect we should design it to be usable for updates and deletes in the future, and thus name it TableModifyState. > +{ > + Relation rel; > + /* Bulk insert state if requested, otherwise NULL. */ > + struct BulkInsertStateData *bistate; > + CommandId cid; Hm - I'm not sure it's a good idea to force the cid to be the same for all inserts done via one TableInsertState. > + int options; > + /* Below members are only used for multi inserts. */ > + /* Array of buffered slots. */ > + TupleTableSlot **mi_slots; > + /* Number of slots that are currently buffered. */ > + int32 mi_cur_slots; > + /* > + * Access method specific information such as parameters that are needed > + * for buffering and flushing decisions can go here. > + */ > + void *mistate; I think we should instead have a generic TableModifyState, which each AM then embeds into an AM specific AM state. Forcing two very related structs to be allocated separately doesn't seem wise in this case. > @@ -1430,6 +1473,50 @@ table_multi_insert(Relation rel, TupleTableSlot **slots, int nslots, > cid, options, bistate); > } > > +static inline TableInsertState* > +table_insert_begin(Relation rel, CommandId cid, int options, > + bool alloc_bistate, bool is_multi) Why have alloc_bistate and options? > +static inline void > +table_insert_end(TableInsertState *state) > +{ > + /* Deallocate bulk insert state here, since it's AM independent. */ > + if (state->bistate) > + FreeBulkInsertState(state->bistate); > + > + state->rel->rd_tableam->tuple_insert_end(state); > +} Seems like the order in here should be swapped? Greetings, Andres Freund
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Michael PaquierДата:
Сообщение: Re: [PATCH] Slight improvement of worker_spi.c example