Re: logical changeset generation v6.4
От | Hannu Krosing |
---|---|
Тема | Re: logical changeset generation v6.4 |
Дата | |
Msg-id | 526547B7.5030009@2ndQuadrant.com обсуждение исходный текст |
Ответ на | Re: logical changeset generation v6.4 (Andres Freund <andres@2ndquadrant.com>) |
Список | pgsql-hackers |
On 10/21/2013 05:06 PM, Andres Freund wrote: > On 2013-10-21 16:40:43 +0200, Hannu Krosing wrote: >> On 10/18/2013 08:50 PM, Andres Freund wrote: >>> On 2013-10-18 08:11:29 -0400, Robert Haas wrote: >> ... >>>> 2. If that seems too complicated, how about just logging the whole old >>>> tuple for version 1? >>> I think that'd make the patch much less useful because it bloats WAL >>> unnecessarily for the primary user (replication) of it. I'd rather go >>> for primary keys only if that proves to be the contentious point. >>> >>> How about modifying the selection to go from: >>> * all rows if ALTER TABLE ... REPLICA IDENTITY NOTHING|FULL; >>> * index chosen by ALTER TABLE ... REPLICA IDENTITY USING indexname >>> * [later, maybe] ALTER TABLE ... REPLICA IDENTITY (cola, colb) >>> * primary key >>> * candidate key with the smallest oid >>> >>> Including the candidate key will help people using changeset extration >>> for auditing that do not have primary key. That really isn't an >>> infrequent usecase. >> As I understand it for a table with *no* unique index, >> the "candidate key" is the full tuple, so if we get an UPDATE for >> it then this should be replicated as >> "UPDATE first row matching (NOT DISTINCT FROM) all columns" >> which on replay side will be equivalent to >> CREATE CURSOR ...; FETCH 1 ...; UPDATE ... WHERE CURRENT...' > No, it's not a candidate key since it's not uniquely identifying a > row. You can play tricks as you describe, but that still doesn't make > the whole row a candidate key. > > But anyway, I suggest allowing for logging all columns above... I the "all columns" option this ? How about modifying the selection to go from: * all rows if ALTER TABLE ... REPLICA IDENTITY NOTHING|FULL; for some reason I thought it to be option to either log or not log PK column ... > >> I know that this will slow down replication, as you can not use direct >> index updates internally - at least not easily - but need to let postgreSQL >> actually plan this, but such single row update is no faster on origin >> either. > That's not actually true. Consider somebody doing something like: > UPDATE big_table_without_indexes SET column = ...; > On the source side that's essentialy O(n). If you replicate on a > row-by-row basis it will be O(n^2) on the replay side. Probably more like O(n^2 / 2) but yes, this is what I meant with the sentence after that ;) Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
В списке pgsql-hackers по дате отправления: