Обсуждение: [HACKERS] Separation walsender & normal backends

Поиск
Список
Период
Сортировка

[HACKERS] Separation walsender & normal backends

От
Andres Freund
Дата:
Hi,

I've for a while suspected that the separation & duplication of
infrastructure between walsenders and normal backends isn't nice.  But
looking at the code changes in v10 drove that home quite a bit.

The major changes in this area are
1) d1ecd539477fe640455dc890216a7c1561e047b4 - Add a SHOW command to the replication command language.
2) 7c4f52409a8c7d85ed169bbbc1f6092274d03920 - Logical replication support for initial data copy

The first adds SHOW support for the replication protocol. With such
oddities that "SHOW hba_file;" works, but "show hba_file;"
doesn't. Unless your're connected to a database, when feature 2) kicks
in and falls back to the normal SQL parser (as it'd for most other SQL
commands).

With 2) we can execute normal SQL over a walsender connection. That's
useful for logical rep because it needs to able to copy data (via
e.g. COPY), read catalogs (SELECT) and stream data (via
START_REPLICATION).

The problem is that we now end up with fairly similar infrastructure,
that's duplicated in walsender and normal backends. We've already seen a
couple bugs stem from this:
- parallel queries hang when executed over walsender connection
- query cancel doesn't work over walsender connection
- parser/scanner hack deciding between replication and normal grammer
  isn't correct
- SnapBuildClearExportedSnapshot isn't called reliably anymore
- config files aren't reloaded while idle in a walsender connection

while all of those are fixable, and several of them already have
proposed fixes, I think this is some indication that the current split
isn't well thought out.  I think in hindsight, the whole idea of
introducing a separate protocol/language for the replication protocol
was a rather bad one.

I think we should give up on the current course, and decide to unify as
much as possible.  Petr already proposed some fixes for the above
(unifying signal and config file handling), but I don't think that
solves the more fundamental issue of different grammars.

For me it's fairly obvious that we should try to merge the two grammars,
and live with the slight uglyness that'll caused by doing so.  The
primary problem here is that both languages "look" different, primary
that the replication protocol uses enforced-all-caps-with-underscores as
style.

Attached is a very rough POC, that unifies the two grammars.
Unsurprisingly that requires the addition of a bunch of additional
keywords, but at least they're all unreserved.  Both grammars seem to
mostly merge well, except for repl_scanner.l's RECPTR which I wasn't
able to add to scan.l within a couple minutes (therefore single quotes
are now necessary).  This really is WIP, and not meant as a patch to be
reviewed in detail, but just as something to be discussed.

I'm probably going to be tarred and feathered for this position, but I
think we need to fix this before v10 is coming out.  We're taking on
quite some architectural burden here, and I'm pretty sure we'd otherwise
have to fix it in v11, so we'll be stuck with some odd-duck v10, that
behaves different from all other versions.

If so, we'd have to:
- gate catalog access more careful in !am_db_walsender connections
- find a better way to deal with repl_scan.l's RECPTR
- remove replnodes.h (or move at least parts of it) into parsenodes.h
- do a lot of other minor cleanup

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] Separation walsender & normal backends

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> I've for a while suspected that the separation & duplication of
> infrastructure between walsenders and normal backends isn't nice.

I think we should consider a more radical solution: trying to put
general SQL query capability into the replication protocol was a
bad idea and we should revert it while we still can.  The uglinesses
you mention aren't merely implementation issues, they're an indication
that that concept is broken in itself.
        regards, tom lane



Re: [HACKERS] Separation walsender & normal backends

От
Fujii Masao
Дата:
On Tue, Apr 25, 2017 at 11:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> I've for a while suspected that the separation & duplication of
>> infrastructure between walsenders and normal backends isn't nice.
>
> I think we should consider a more radical solution: trying to put
> general SQL query capability into the replication protocol was a
> bad idea and we should revert it while we still can.  The uglinesses
> you mention aren't merely implementation issues, they're an indication
> that that concept is broken in itself.

I think that it's worth considering this option in order to "stabilize"
logical replication stuff before the release. The table sync patch
(which allows walsender to run normal queries) introduced such
uglinesses and increased the complexity in logical rep code.
OTOH, I believe that logical replication is still useful even without
initial table sync feature. So reverting the table sync patch seems
possible idea.

Regards,

-- 
Fujii Masao



Re: [HACKERS] Separation walsender & normal backends

От
Andres Freund
Дата:
On 2017-04-25 10:34:20 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I've for a while suspected that the separation & duplication of
> > infrastructure between walsenders and normal backends isn't nice.
> 
> I think we should consider a more radical solution: trying to put
> general SQL query capability into the replication protocol was a
> bad idea and we should revert it while we still can.  The uglinesses
> you mention aren't merely implementation issues, they're an indication
> that that concept is broken in itself.

I don't think that's the right solution, I think it's evidence that the
split was a bad idea in the first place.  There's a growing amount of
duplication between the protocols, and a growing number of use-cases
that need facilities of both protocols.  We e.g. now already have SHOW
statements in both, except that they behave slightly differently.  For
logical rep we'd alternatively add more complexity because we'd need
both replication and non-replication connections (to stream changes, to
copy tables, to query config), which'd also complicate administration
because users & hba config have to be setup so the same user can connect
over both.

Therefore I think it's the implementation that's not perfect, but the
idea is perfectly sound.  Having two awkwardly & increasingly different
languages in postgres doesn't sound like a sound idea.

- Andres



Re: [HACKERS] Separation walsender & normal backends

От
Petr Jelinek
Дата:
On 25/04/17 17:13, Fujii Masao wrote:
> On Tue, Apr 25, 2017 at 11:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> I've for a while suspected that the separation & duplication of
>>> infrastructure between walsenders and normal backends isn't nice.
>>
>> I think we should consider a more radical solution: trying to put
>> general SQL query capability into the replication protocol was a
>> bad idea and we should revert it while we still can.  The uglinesses
>> you mention aren't merely implementation issues, they're an indication
>> that that concept is broken in itself.
> 
> I think that it's worth considering this option in order to "stabilize"
> logical replication stuff before the release. The table sync patch
> (which allows walsender to run normal queries) introduced such
> uglinesses and increased the complexity in logical rep code.
> OTOH, I believe that logical replication is still useful even without
> initial table sync feature. So reverting the table sync patch seems
> possible idea.
> 

I don't think that's good idea, the usefulness if much lower without the
initial copy. The original patch for this added new commands to
replication protocol, adding generic SQL interface was result of request
in the reviews.

I personally don't mind moving back my original idea of special commands
if that was the consensus, but previous consensus was to do SQL instead.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Separation walsender & normal backends

От
"David G. Johnston"
Дата:
On Tue, Apr 25, 2017 at 2:24 PM, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
On 25/04/17 17:13, Fujii Masao wrote:
> On Tue, Apr 25, 2017 at 11:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> OTOH, I believe that logical replication is still useful even without
> initial table sync feature. So reverting the table sync patch seems
> possible idea.
>

I don't think that's good idea, the usefulness if much lower without the
initial copy. The original patch for this added new commands to
replication protocol, adding generic SQL interface was result of request
in the reviews.

Haven't followed this feature closely but my first thoughts when recently reading about it were related to the initial copy and table synchronization - so I'd have to agree with Petr here.  Full table sync is big and for any table with activity on it the confidence level of knowing you have everything is greatly reduced if the system isn't making a guarantee.

David J.

Re: [HACKERS] Separation walsender & normal backends

От
Andres Freund
Дата:
On 2017-04-25 23:24:40 +0200, Petr Jelinek wrote:
> On 25/04/17 17:13, Fujii Masao wrote:
> > On Tue, Apr 25, 2017 at 11:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >>> I've for a while suspected that the separation & duplication of
> >>> infrastructure between walsenders and normal backends isn't nice.
> >>
> >> I think we should consider a more radical solution: trying to put
> >> general SQL query capability into the replication protocol was a
> >> bad idea and we should revert it while we still can.  The uglinesses
> >> you mention aren't merely implementation issues, they're an indication
> >> that that concept is broken in itself.
> > 
> > I think that it's worth considering this option in order to "stabilize"
> > logical replication stuff before the release. The table sync patch
> > (which allows walsender to run normal queries) introduced such
> > uglinesses and increased the complexity in logical rep code.
> > OTOH, I believe that logical replication is still useful even without
> > initial table sync feature. So reverting the table sync patch seems
> > possible idea.
> > 
> 
> I don't think that's good idea, the usefulness if much lower without the
> initial copy.

Agreed. I think that'd move us way backwards, and we'd have to tackle
exactly the same issue in a few weeks again.

> The original patch for this added new commands to
> replication protocol, adding generic SQL interface was result of request
> in the reviews.

Yea, I still think it's the right approach in general - I don't think
the patch itself was properly discussed and such though, being
essentially burried in another commit.


> I personally don't mind moving back my original idea of special commands
> if that was the consensus, but previous consensus was to do SQL instead.

I really don't think that'll solve anything.

- Andres



Re: [HACKERS] Separation walsender & normal backends

От
Craig Ringer
Дата:
On 26 April 2017 at 02:36, Andres Freund <andres@anarazel.de> wrote:

> For
> logical rep we'd alternatively add more complexity because we'd need
> both replication and non-replication connections (to stream changes, to
> copy tables, to query config), which'd also complicate administration
> because users & hba config have to be setup so the same user can connect
> over both.

We have experience of this from BDR and pglogical, of course, and it's
definitely a pain and source of user misconfiguration.

I'm not sure how practical it is to merge walsenders and regular
backends at this stage of pg10 development, but it seems like a
worthwhile goal down the track. I'd very much like to reduce the
amount of magic global juggling done by the walsender, unify the
XLogRead paths, unify the timeline following logic for physical
walsenders, logical walsenders and logical decoding on normal
backends, allow normal backends to be signaled when there's new WAL,
etc. I think there's a fair bit to do in order to do this well though.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Separation walsender & normal backends

От
Andres Freund
Дата:
On 2017-04-26 08:41:46 +0800, Craig Ringer wrote:
> I'd very much like to reduce the amount of magic global juggling done
> by the walsender, unify the XLogRead paths, unify the timeline
> following logic for physical walsenders, logical walsenders and
> logical decoding on normal backends, allow normal backends to be
> signaled when there's new WAL, etc. I think there's a fair bit to do
> in order to do this well though.

That all seems orthogonal however.

- Andres