Re: Fillfactor for GIN indexes
От | Tomas Vondra |
---|---|
Тема | Re: Fillfactor for GIN indexes |
Дата | |
Msg-id | 54EC8779.6000103@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: Fillfactor for GIN indexes (Alexander Korotkov <aekorotkov@gmail.com>) |
Ответы |
Re: Fillfactor for GIN indexes
|
Список | pgsql-hackers |
Hi, I've been wondering whether this might improve behavior with one of my workloads, suffering by GIN bloat - the same one I used to test GIN fastscan, for example. It's a batch process that loads a mailing list archive into a table with a GIN index on message body, by doing something like this: for file in files: BEGIN; for message in file: SAVEPOINT s; INSERT INTO messages VALUES (...) if error: ROLLBACK TO s; COMMIT; And there are multiple processes, each processing subset of mbox files. There are ~1M messages and right after the load I see this: List of relations Schema | Name | Type | Owner | Table | Size --------+------------------+-------+-------+----------+--------- public | message_body_idx | index | tomas | messages |2247 MB (1 row) and after VACUUM FULL: List of relations Schema | Name | Type | Owner | Table | Size --------+------------------+-------+-------+----------+--------- public | message_body_idx | index | tomas | messages |403 MB (1 row) So the index is ~5x larger, which is probably expected due to the amount of random inserts within a very short time (~15 minutes), executed in parallel. I hoped lowering the fillfactor will improve this, but fillfactor=75 had pretty much no effect in this case. Is that expected for this kind of workload? I see the previous discussion talked about random updates, not inserts, so maybe that's the culprit? -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-hackers по дате отправления: