Re: Postgres-R: tuple serialization
From | Decibel! |
---|---|
Subject | Re: Postgres-R: tuple serialization |
Date | |
Msg-id | D697DBAF-D495-4602-A1EF-E013DB804B0F@decibel.org |
In reply to | Postgres-R: tuple serialization (Markus Wanner <markus@bluegap.ch>) |
Replies | Re: Postgres-R: tuple serialization |
List | pgsql-hackers |
On Jul 22, 2008, at 3:04 AM, Markus Wanner wrote:

> Yesterday, I promised to outline the requirements of Postgres-R for
> tuple serialization, which we have been talking about before. There
> are basically three ways to serialize tuple changes, depending on
> whether they originate from an INSERT, UPDATE or DELETE. For updates
> and deletes, it saves the old pkey as well as the origin (a global
> transaction id) of the tuple (required for consistent serialization
> on remote nodes). For inserts and updates, all added or changed
> attributes need to be serialized as well.
>
>          pkey+origin  changes
>  INSERT       -          x
>  UPDATE       x          x
>  DELETE       x          -
>
> Note that the pkey attributes may never be NULL, so an isnull bit
> field can be skipped for those attributes. For the insert case, all
> attributes (including primary key attributes) are serialized.
> Updates require an additional bit field (well, I'm using chars ATM)
> to store which attributes have changed. Only those should be
> transferred.
>
> I'm tempted to unify that, so that inserts are serialized as the
> difference against the default values or NULL. That would make
> things easier for Postgres-R. However, what about other uses of such
> a fast tuple applicator? Does such a use case exist at all? I mean,
> for parallelizing COPY FROM STDIN, one certainly doesn't want to
> serialize all input tuples into that format before feeding multiple
> helper backends. Instead, I'd recommend letting the helper backends
> do the parsing and thereby parallelize that as well.
>
> For other features, like parallel pg_dump or even parallel query
> execution, this tuple serialization code doesn't help much, IMO. So
> I'm thinking that optimizing it for Postgres-R's internal use is
> the best way to go.
>
> Comments? Opinions?

ISTM that both londiste and Slony would be able to make use of these improvements as well.
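For illustration only, here is a minimal sketch of the UPDATE layout Markus describes (old pkey + origin xid + changed-attribute bit field + only the changed values). All names are hypothetical and the encoding (text values, 16-bit length prefixes) is an assumption for the sketch; Postgres-R would use its own binary attribute representation:

```python
import struct

def serialize_update(pkey_values, origin_xid, old_tuple, new_tuple):
    """Hypothetical encoding of an UPDATE change set:
    origin xid, old pkey, changed-attribute bitmap, changed values only."""
    # find which attributes actually changed
    changed = [i for i, (o, n) in enumerate(zip(old_tuple, new_tuple)) if o != n]
    # one bit per attribute, marking the changed ones
    bitmap = bytearray((len(old_tuple) + 7) // 8)
    for i in changed:
        bitmap[i // 8] |= 1 << (i % 8)
    parts = [struct.pack("!I", origin_xid)]
    # pkey attributes may never be NULL, so no isnull bits are needed here
    for v in pkey_values:
        b = v.encode()
        parts.append(struct.pack("!H", len(b)) + b)
    parts.append(bytes(bitmap))
    # only the changed attributes travel on the wire
    for i in changed:
        b = new_tuple[i].encode()
        parts.append(struct.pack("!H", len(b)) + b)
    return b"".join(parts)
```

A DELETE under this scheme would carry just the xid and pkey portion, and an INSERT would carry the full attribute list with the bitmap marking every non-default attribute, which is the unification Markus is considering.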
A modular replication system should be able to use a variety of methods for logging data changes and then applying them on a subscriber, so long as some kind of common transport can be agreed upon (such as text). So having a change capture and apply mechanism that isn't dependent on a lot of extra stuff would be generally useful to any replication mechanism.

--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828