Re: Make COPY format extendable: Extract COPY TO format implementations
От | Michael Paquier |
---|---|
Тема | Re: Make COPY format extendable: Extract COPY TO format implementations |
Дата | |
Msg-id | ZcGcQq3Y6FeX1z4e@paquier.xyz обсуждение исходный текст |
Ответ на | Re: Make COPY format extendable: Extract COPY TO format implementations (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: Make COPY format extendable: Extract COPY TO format implementations
|
Список | pgsql-hackers |
On Mon, Feb 05, 2024 at 05:41:25PM -0800, Andres Freund wrote: > On 2024-02-06 10:01:36 +0900, Michael Paquier wrote: >> If you have concerns about that, I'm OK to revert, I'm not wedded to >> this level of control. Note that I've actually seen *better* >> runtimes. > > I'm somewhat worried that handling the different formats at that level will > make it harder to improve copy performance - it's quite attrociously slow > right now. The more we reduce the per-row/field overhead, the more the > dispatch overhead will matter. Yep. That's the hard part when it comes to design these callbacks. We don't want something too high level because this leads to more code duplication churns when someone wants to plug in its own routine set, and we don't want to be at a too low level because of the indirect calls as you said. I'd like to think that the current CopyFromOneRow offers a good balance here, avoiding the "if" branch with the binary and non-binary paths. >> Hmm. Do you have concerns about v13 posted on [2] then? > > As is I'm indeed not a fan. It imo doesn't make sense to have an indirect > dispatch for *both* ->copy_attribute_out *and* ->CopyToOneRow. After all, when > in ->CopyToOneRow for text, we could know that we need to call > CopyAttributeOutText etc. Right. >> If yes, then I'd assume that this shuts down the whole thread or that it >> needs a completely different approach, because we will multiply indirect >> function calls that can control how data is generated for each row, which is >> the original case that Sutou-san wanted to tackle. > > I think it could be rescued fairly easily - remove the dispatch via > ->copy_attribute_out(). To avoid duplicating code you could use a static > inline function that's used with constant arguments by both csv and text mode. Hmm. So you basically mean to tweak the beginning of CopyToTextOneRow() and CopyToTextStart() so as copy_attribute_out is saved in a local variable outside of cstate and we'd save the "if" checked for each attribute. If I got that right, it would mean something like the v13-0002 attached, on top of the v13-0001 of upthread. Is that what you meant? > I think it might also be worth ensuring that future patches can move branches > like > if (cstate->encoding_embeds_ascii) > if (cstate->need_transcoding) > into the choice of per-row callback. Yeah, I'm still not sure how much we should split CopyToStateData in the initial patch set. I'd like to think that the best result would be to have in the state data an opaque (void *) that points to a structure that can be set for each format, so as there is a clean split between which variable gets set and used where (same remark applies to COPY FROM with its raw_fields, raw_fields, for example). -- Michael
Вложения
В списке pgsql-hackers по дате отправления: