Re: row filtering for logical replication
From | Andres Freund
Subject | Re: row filtering for logical replication
Date |
Msg-id | 20220203182922.344fhhqzjp2ah6yp@alap3.anarazel.de
In reply to | Re: row filtering for logical replication (Peter Smith <smithpb2250@gmail.com>)
Responses | Re: row filtering for logical replication
List | pgsql-hackers
Hi,

On 2022-02-01 13:31:36 +1100, Peter Smith wrote:
> TEST STEPS - Workload case a
>
> 1. Run initdb for pub and sub and start both postgres instances (use the
>    nosync postgresql.conf)
>
> 2. Run psql for both instances and create tables
>    CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));
>
> 3. Create the PUBLICATION on the pub instance (e.g. choose from below depending on filter)
>    CREATE PUBLICATION pub_1 FOR TABLE test;                        -- 100% (no filter)
>    CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0);       -- 100% allowed
>    CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 250000);  -- 75% allowed
>    CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 500000);  -- 50% allowed
>    CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 750000);  -- 25% allowed
>    CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 1000000); -- 0% allowed
>
> 4. Create the SUBSCRIPTION on the sub instance
>    CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432 dbname=postgres application_name=sync_sub' PUBLICATION pub_1;
>
> 5. On the pub side, modify postgresql.conf on the publisher side and restart
>    \q to quit psql
>    edit synchronous_standby_names = 'sync_sub'
>    restart the pub instance
>
> 6. Run psql (pub side) and perform the test run.
>    \timing
>    INSERT INTO test SELECT i, i::text, row_to_json(row(i)) FROM generate_series(1,1000001)i;
>    select count(*) from test;
>    TRUNCATE test;
>    select count(*) from test;
>
> Repeat 6 for each test run.

I think using syncrep as the mechanism for benchmarking the decoding side makes the picture less clear than it could be - you're measuring a lot of things other than the decoding, e.g. the overhead of applying those changes.
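As a side note, the "% allowed" comments in step 3 can be sanity-checked with a quick sketch: the workload inserts keys 1..1000001, so the share of rows passing `key > threshold` is simply `(1000001 - threshold) / 1000001`. This script is not from the original mail, just an illustration of the arithmetic:

```shell
#!/bin/sh
# For each filter threshold used in step 3, compute how many of the
# 1,000,001 inserted rows (keys 1..1000001) satisfy "key > threshold",
# and the approximate percentage.
total=1000001
for threshold in 0 250000 500000 750000 1000000; do
    passed=$((total - threshold))
    pct=$((100 * passed / total))
    echo "key > $threshold: $passed rows (~$pct%)"
done
```

This confirms the annotated percentages: key > 250000 passes 750001 rows (~75%), key > 1000000 passes a single row (~0%).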
I think it'd be more accurate to do something like:

/* create publications, table, etc */

-- create a slot from before the changes
SELECT pg_create_logical_replication_slot('origin', 'pgoutput');

/* the changes you're going to measure */

-- save end LSN
SELECT pg_current_wal_lsn();

-- create a slot for pg_recvlogical to consume
SELECT * FROM pg_copy_logical_replication_slot('origin', 'consume');

-- benchmark; endpos is from pg_current_wal_lsn() above
time pg_recvlogical -S consume --endpos 0/2413A720 --start -o proto_version=3 -o publication_names=pub_1 -f /dev/null -d postgres

-- clean up
SELECT pg_drop_replication_slot('consume');

Then repeat this with the different publications and compare the time taken by pg_recvlogical. That way the WAL is exactly the same, and there is no overhead of actually doing anything with the data on the other side.

Greetings,

Andres Freund
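When comparing such runs it can also help to normalize by the amount of WAL decoded. Server-side, pg_wal_lsn_diff() gives the byte distance between two LSNs; the same arithmetic can be done client-side, since an LSN "X/Y" is a 64-bit position with X as the high 32 bits and Y as the low 32 bits. This is a hypothetical helper, not part of the recipe above:

```shell
#!/bin/sh
# lsn_bytes: convert a PostgreSQL LSN such as "0/2413A720" into an
# absolute byte position, ((high << 32) + low), mirroring the math
# behind pg_wal_lsn_diff(). Hypothetical helper for normalizing
# benchmark times by WAL volume.
lsn_bytes() {
    high=${1%/*}   # part before the slash (high 32 bits, hex)
    low=${1#*/}    # part after the slash (low 32 bits, hex)
    echo $(( (0x$high << 32) + 0x$low ))
}

# Example: bytes of WAL between a start and end LSN
start=$(lsn_bytes "0/2413A720")
end=$(lsn_bytes "1/0000A000")
echo "WAL decoded: $((end - start)) bytes"
```

Dividing the pg_recvlogical wall-clock time by this byte count gives a per-byte decoding cost that is comparable across runs even if the measured WAL ranges differ.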