Re: Why creating GIN table index is so slow than inserting data into empty table with the same index?
| From | Heikki Linnakangas |
|---|---|
| Subject | Re: Why creating GIN table index is so slow than inserting data into empty table with the same index? |
| Date | |
| Msg-id | 49C8BEF6.3080909@enterprisedb.com |
| In reply to | Re: Why creating GIN table index is so slow than inserting data into empty table with the same index? (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Why creating GIN table index is so slow than inserting data into empty table with the same index? |
| List | pgsql-performance |
Tom Lane wrote:
> Sergey Burladyan <eshkinkot@gmail.com> writes:
>> show maintenance_work_mem ;
>>  maintenance_work_mem
>> ----------------------
>>  128MB
>
>> create table a (i1 int, i2 int, i3 int, i4 int, i5 int, i6 int);
>> insert into a select n, n, n, n, n, n from generate_series(1, 100000) as n;
>> create index arr_gin on a using gin ( (array[i1, i2, i3, i4, i5, i6]) );
>
> [ takes forever ]
>
> Seems the reason this is so awful is that the incoming data is exactly
> presorted, meaning that the tree structure that ginInsertEntry() is
> trying to build degenerates to a linear list (ie, every new element
> becomes the right child of the prior one). So the processing becomes
> O(N^2) up till you reach maintenance_work_mem and flush the tree. With
> a large setting for maintenance_work_mem it gets spectacularly slow.

Yes, this is probably the same issue I bumped into a while ago:
http://archives.postgresql.org/message-id/49350A13.3020105@enterprisedb.com

> I think a reasonable solution for this might be to keep an eye on
> maxdepth and force a flush if that gets too large (more than a few
> hundred, perhaps?). Something like this:
>
>  		/* If we've maxed out our available memory, dump everything to the index */
> +		/* Also dump if the tree seems to be getting too unbalanced */
> -		if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L)
> +		if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L ||
> +			buildstate->accum.maxdepth > DEPTH_LIMIT)
> 		{
>
> The new fast-insert code likely will need a similar defense.

I fooled around with a balanced tree, which solved the problem but
unfortunately made the unsorted case slower. Limiting the depth like
that should avoid that problem, so it's worth trying.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
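To make the failure mode Tom describes concrete, here is a minimal,
self-contained C sketch of the same degeneration and of the proposed
depth-capped flush. It is not the actual GIN build code: the Node tree,
flush_tree(), and the DEPTH_LIMIT value of 100 are illustrative
stand-ins for the entry accumulator and its dump-to-index step.

#include <stdio.h>
#include <stdlib.h>

#define DEPTH_LIMIT 100			/* stand-in for the proposed cutoff */

typedef struct Node
{
	int			key;
	struct Node *left;
	struct Node *right;
} Node;

static Node *root = NULL;
static int	maxdepth = 0;

/*
 * Plain unbalanced BST insert.  With ascending keys every new node
 * becomes the right child of the previous one, so the depth grows by
 * one per insert and the total work is O(N^2).
 */
static void
tree_insert(int key)
{
	Node	  **slot = &root;
	int			depth = 0;

	while (*slot != NULL)
	{
		depth++;
		slot = (key < (*slot)->key) ? &(*slot)->left : &(*slot)->right;
	}
	*slot = calloc(1, sizeof(Node));
	(*slot)->key = key;
	if (depth > maxdepth)
		maxdepth = depth;
}

/*
 * Stand-in for dumping the accumulated entries to the index and
 * starting over with an empty tree (leaks the nodes; sketch only).
 */
static void
flush_tree(void)
{
	root = NULL;
	maxdepth = 0;
}

int
main(void)
{
	int			flushes = 0;

	for (int n = 1; n <= 100000; n++)	/* presorted, like generate_series */
	{
		tree_insert(n);
		if (maxdepth > DEPTH_LIMIT)		/* the proposed extra condition */
		{
			flush_tree();
			flushes++;
		}
	}
	printf("flushed %d times due to depth limit\n", flushes);
	return 0;
}

Without the cap, sorted input walks the whole right spine on every
insert; the depth check bounds each insert to at most DEPTH_LIMIT
comparisons at the price of more frequent flushes, so the sorted case
drops from O(N^2) to roughly O(N * DEPTH_LIMIT), while random input,
which rarely hits the limit, is left essentially untouched.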