Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
From:        Pawel Kudzia
Subject:     Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Date:
Msg-id:      CAJYBUS_5Z+9aHyNQ7wkTEkZF+c_RMn7hMjPkTGU0Uc7OHMU3sQ@mail.gmail.com
In reply to: Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows  (Pawel Kudzia <kudzia@gmail.com>)
Responses:   Re: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
List:        pgsql-bugs
Based on the suggestions we've received from IRC we have:

* moved some of our PostgreSQL instances to another physical server,
* enabled data_checksums on all instances,
* upgraded our servers to 13.3 (Debian 13.3-1.pgdg100+1) in the hope that
  https://github.com/postgres/postgres/commit/0d779d22a290a89b6c892137a37280b9588ad0cc
  addresses the issue [ although it's tsvector-specific, so unlikely to be
  related to our case ].

Yet soon after we observed another inconsistency.

Recap of the problem: SELECTs that go through the GIN index return rows which
do not actually meet the criteria expressed in the WHERE clause.

SELECT entity_id, attribute_name_ids
  FROM entity
 WHERE ( attribute_name_ids && '{7572}' )
   AND NOT ( (attribute_name_ids||0) && '{7572}' )
 LIMIT 100;

 entity_id |                                attribute_name_ids
-----------+------------------------------------------------------------------------------------------
  22327791 | {1,2,3,4,6,8,9,10,11,13,14,17,19,21,35,72,366,1659,2208,2211,3270,3273,3279,5787,7650}
(1 row)

EXPLAIN ANALYZE of the query above:

                                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=35.46..46.67 rows=100 width=132) (actual time=307.221..335.842 rows=1 loops=1)
   ->  Bitmap Heap Scan on entity  (cost=35.46..3705.32 rows=32724 width=132) (actual time=307.220..335.839 rows=1 loops=1)
         Recheck Cond: (attribute_name_ids && '{7572}'::integer[])
         Rows Removed by Index Recheck: 72012
         Filter: (NOT ((attribute_name_ids || 0) && '{7572}'::integer[]))
         Rows Removed by Filter: 21501
         Heap Blocks: exact=8998 lossy=9257
         ->  Bitmap Index Scan on entity_attribute_name_ids_gin  (cost=0.00..35.46 rows=32889 width=0) (actual time=19.790..19.790 rows=115485 loops=1)
               Index Cond: (attribute_name_ids && '{7572}'::integer[])
 Planning Time: 0.068 ms
 Execution Time: 335.879 ms
(11 rows)

Relevant part of the table structure:

CREATE TABLE public.entity (
    entity_id bigint NOT NULL,
    attribute_value_ids integer[] NOT NULL,
    attribute_name_ids integer[] NOT NULL,
    [..]
);

ALTER TABLE ONLY public.entity
    ADD CONSTRAINT entity_pkey PRIMARY KEY (entity_id);

CREATE INDEX entity_attribute_name_ids_gin
    ON public.entity USING gin (attribute_name_ids public.gin__int_ops);

CREATE INDEX entity_attribute_value_ids_gin
    ON public.entity USING gin (attribute_value_ids public.gin__int_ops);

There are at least tens of updates / inserts per second to the table at all
times. The issue above manifests very rarely, after a few days if not weeks of
uptime. We did not find any deterministic way of reproducing it, but it's only
a matter of time until it shows up again.

The issue is persistent and related to the on-disk data; it replicates from
streaming replication masters to slaves.

Thank you in advance for your suggestions on how to tackle this.

--
regards,
Pawel Kudzia
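PS. For completeness, below is a minimal sketch of how the same trick could be
turned into a sweep of the whole id space to spot further phantom matches. The
upper bound of 100000 on attribute name ids is an arbitrary assumption chosen
for illustration, not something from our schema. The first condition can be
satisfied from the GIN index, while appending 0 to the array keeps the second
condition out of the index, so it is re-evaluated against the heap tuple,
exactly as in the Filter line of the plan above; any id reported therefore has
index entries pointing at rows whose stored array does not contain it.

-- consistency sweep: report attribute name ids for which the GIN index
-- returns rows whose heap copy of attribute_name_ids lacks that id
DO $$
DECLARE
    bad bigint;
BEGIN
    FOR aid IN 1..100000 LOOP          -- assumed upper bound on attribute ids
        SELECT count(*) INTO bad
          FROM entity
         WHERE attribute_name_ids && ARRAY[aid]                  -- may use the GIN index
           AND NOT ( (attribute_name_ids || 0) && ARRAY[aid] );  -- forced heap check
        IF bad > 0 THEN
            RAISE NOTICE 'attribute_name_id % matches % phantom row(s)', aid, bad;
        END IF;
    END LOOP;
END $$;

Each iteration is a single indexed probe, so the sweep is slow on a table this
size but only takes ordinary read locks.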