Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
От | Richard Neill |
---|---|
Тема | Re: Why does the query planner use two full indexes, when a dedicated partial index exists? |
Дата | |
Msg-id | 50D5EDDF.6010702@richardneill.org обсуждение исходный текст |
Ответ на | Re: Why does the query planner use two full indexes, when a dedicated partial index exists? (Jeff Janes <jeff.janes@gmail.com>) |
Ответы |
Re: Why does the query planner use two full indexes, when a
dedicated partial index exists?
|
Список | pgsql-performance |
> I've now installed 9.2. As you said, thanks to the change in 9.2 it > initially prefers the partial index. > > BUT, after 1 cycle of inserting 500k rows, then deleting them all, > then starting to insert again, I find that the planner has reverted > to the former bad behaviour. > > > Presumably the real work load has this type of turn over happen one row > at a time, rather than all in one giant mass update transaction, right? > That makes a big difference in the way space is re-used. Sorry - I meant a "real" workload here. I replayed a whole day's worth of real data into the DB, and that's what I meant by a cycle. Everything was row-at-a-time. (It currently takes about an hour to do this) > > Reindexing only takes a couple of seconds, and restores correctness. > > > Even your slow query is pretty fast. If you can't afford that, can you > afford to take an exclusive lock for a couple of seconds every few minutes? Yes, I can. If that's the root cause, I'll do that. But it seems to me that I've stumbled upon some rather awkward behaviour that I need to understand fully, and if the index is bloating that badly and that quickly, then perhaps it's a PG bug (or at least cause for a logfile warning). BTW, The index has gone from 16kB to 4.5MB in 6 hours of runtime today. It still only has 252 matching rows. > What's going on? Do I need to run reindex in a cron-job? I thought > that reindex wasn't "normally" needed, and that index bloat happened > only after every row had changed value hundreds of times. > > > The partial index is highly leveraged. If every tuple in the table is > updated once, that amounts to every tuple in the index being updated > 25,000 times. How so? That sounds like O(n_2) behaviour. > > For the same reason, it is probably not getting vacuum often enough. > The default settings have the table vacuumed once 20% of its rows > turns over, but that means the partial index has been turned over many > many times. You could crank down the auto-vacuum settings for that > table, or run manual vacuum with a cron job. > > Vacuum will not unbloat the index, but if you run it often enough it > will keep the bloat from getting too bad in the first place. Thanks. I've reduced autovacuum_vacuum_scale_factor from 0.2 to 0.05 (and set autovacuum_analyze_scale_factor = 0.05 for good measure) As I understand it, both of these can run in parallel, and I have 7 cores usually idle, while the other is maxed out. > But what I think I'd do is change one of your full indexes to contain > the other column as well, and get rid of the partial index. It might > not be quite as efficient as the partial index might theoretically be, > but it should be pretty good and also be less fragile. I'll try that. Thanks, Richard
В списке pgsql-performance по дате отправления: