Re: Hash index build patch has *worse* performance at small table sizes
От | Bruce Momjian |
---|---|
Тема | Re: Hash index build patch has *worse* performance at small table sizes |
Дата | |
Msg-id | 200806212234.m5LMYl114305@momjian.us обсуждение исходный текст |
Ответ на | Hash index build patch has *worse* performance at small table sizes (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Hash index build patch has *worse* performance at small table sizes
|
Список | pgsql-hackers |
Did we ever do anything about this? --------------------------------------------------------------------------- Tom Lane wrote: > I've been reviewing the hash index build patch submitted here: > http://archives.postgresql.org/pgsql-patches/2007-10/msg00154.php > > Although it definitely helps on large indexes, it's actually > counterproductive on not-so-large ones. The test case I'm using > is random integers generated like this: > create table foo as select (random() * N)::int as f1 > from generate_series(1,N); > select count(*) from foo; -- force hint bit updates > checkpoint; > then timing > create index fooi on foo using hash(f1); > > Using all-default configuration settings on some not-very-new hardware, > at N = 1E6 I see > > 8.3.1: 30 sec > With pre-expansion of index (CVS HEAD): 24 sec > With sorting: 72 sec > To build a btree index on same data: 34 sec > > Now this isn't amazingly surprising, because the original argument for > doing sorting was to improve locality of access to the index during > the build, and that only matters if you've got an index significantly > bigger than memory. If the index fits in RAM then the sort is pure > overhead. > > The obvious response to this is to use the sorting approach only when > the estimated index size exceeds some threshold. One possible choice of > threshold would be shared_buffers (or temp_buffers for a temp index) > but I think that is probably too conservative, because in most scenarios > the kernel's disk cache is available too. Plus you can't tweak that > setting without a postmaster restart. I'm tempted to use > effective_cache_size, which attempts to measure an appropriate number > and can be set locally within the session doing the CREATE INDEX if > necessary. Or we could invent a new GUC parameter, but that is probably > overkill. > > Comments? > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
В списке pgsql-hackers по дате отправления: