Re: Postgres-R: tuple serialization
From | Decibel! |
---|---|
Subject | Re: Postgres-R: tuple serialization |
Date | |
Msg-id | D697DBAF-D495-4602-A1EF-E013DB804B0F@decibel.org |
In reply to | Postgres-R: tuple serialization (Markus Wanner <markus@bluegap.ch>) |
Replies | Re: Postgres-R: tuple serialization |
List | pgsql-hackers |
On Jul 22, 2008, at 3:04 AM, Markus Wanner wrote:

> Yesterday, I promised to outline the requirements of Postgres-R for
> tuple serialization, which we have been talking about before. There
> are basically three ways to serialize tuple changes, depending on
> whether they originate from an INSERT, UPDATE or DELETE. For updates
> and deletes, it saves the old pkey as well as the origin (a global
> transaction id) of the tuple (required for consistent serialization
> on remote nodes). For inserts and updates, all added or changed
> attributes need to be serialized as well.
>
>          pkey+origin  changes
>  INSERT       -          x
>  UPDATE       x          x
>  DELETE       x          -
>
> Note that the pkey attributes may never be NULL, so an isnull bit
> field can be skipped for those attributes. For the insert case, all
> attributes (including primary key attributes) are serialized.
> Updates require an additional bit field (well, I'm using chars ATM)
> to store which attributes have changed. Only those should be
> transferred.
>
> I'm tempted to unify that, so that inserts are serialized as the
> difference against the default values or NULL. That would make
> things easier for Postgres-R. However, what about other uses of such
> a fast tuple applicator? Does such a use case exist at all? I mean,
> for parallelizing COPY FROM STDIN, one certainly doesn't want to
> serialize all input tuples into that format before feeding multiple
> helper backends. Instead, I'd recommend letting the helper backends
> do the parsing and thereby parallelize that as well.
>
> For other features, like parallel pg_dump or even parallel query
> execution, this tuple serialization code doesn't help much, IMO. So
> I'm thinking that optimizing it for Postgres-R's internal use is
> the best way to go.
>
> Comments? Opinions?

ISTM that both londiste and Slony would be able to make use of these improvements as well.
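For illustration only, here is a minimal sketch of the UPDATE layout Markus describes (old pkey + origin xid + changed-attribute bit field + only the changed values). All names are hypothetical and the encoding (text values, 16-bit length prefixes) is an assumption for the sketch; Postgres-R would use its own binary attribute representation:

```python
import struct

def serialize_update(pkey_values, origin_xid, old_tuple, new_tuple):
    """Hypothetical encoding of an UPDATE change set:
    origin xid, old pkey, changed-attribute bitmap, changed values only."""
    # find which attributes actually changed
    changed = [i for i, (o, n) in enumerate(zip(old_tuple, new_tuple)) if o != n]
    # one bit per attribute, marking the changed ones
    bitmap = bytearray((len(old_tuple) + 7) // 8)
    for i in changed:
        bitmap[i // 8] |= 1 << (i % 8)
    parts = [struct.pack("!I", origin_xid)]
    # pkey attributes may never be NULL, so no isnull bits are needed here
    for v in pkey_values:
        b = v.encode()
        parts.append(struct.pack("!H", len(b)) + b)
    parts.append(bytes(bitmap))
    # only the changed attributes travel on the wire
    for i in changed:
        b = new_tuple[i].encode()
        parts.append(struct.pack("!H", len(b)) + b)
    return b"".join(parts)
```

A DELETE under this scheme would carry just the xid and pkey portion, and an INSERT would carry the full attribute list with the bitmap marking every non-default attribute, which is the unification Markus is considering.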
A modular replication system should be able to use a variety of methods for logging data changes and then applying them on a subscriber, so long as some kind of common transport can be agreed upon (such as text). So having a change capture and apply mechanism that isn't dependent on a lot of extra stuff would be generally useful to any replication mechanism.

--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828