Discussion: pgsql: Bloom index contrib module
Bloom index contrib module

Module provides new access method. It is actually a simple Bloom filter
implemented as pgsql's index. It could give some benefits on search
with large number of columns.

Module is a single way to test generic WAL interface committed earlier.

Author: Teodor Sigaev, Alexander Korotkov
Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/9ee014fc899a28a198492b074e32b60ed8915ea9

Modified Files
--------------
contrib/Makefile                 |   1 +
contrib/bloom/.gitignore         |   4 +
contrib/bloom/Makefile           |  24 ++
contrib/bloom/blcost.c           |  48 ++++
contrib/bloom/blinsert.c         | 313 ++++++++++++++++++++++++++
contrib/bloom/bloom--1.0.sql     |  19 ++
contrib/bloom/bloom.control      |   5 +
contrib/bloom/bloom.h            | 178 +++++++++++++++
contrib/bloom/blscan.c           | 175 +++++++++++++++
contrib/bloom/blutils.c          | 463 +++++++++++++++++++++++++++++++++++++++
contrib/bloom/blvacuum.c         | 212 ++++++++++++++++++
contrib/bloom/blvalidate.c       | 220 +++++++++++++++++++
contrib/bloom/expected/bloom.out | 122 +++++++++++
contrib/bloom/sql/bloom.sql      |  47 ++++
contrib/bloom/t/001_wal.pl       |  75 +++++++
doc/src/sgml/bloom.sgml          | 218 ++++++++++++++++++
doc/src/sgml/contrib.sgml        |   1 +
doc/src/sgml/filelist.sgml       |   1 +
18 files changed, 2126 insertions(+)
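
[Editorial illustration, not part of the commit] The commit message describes the access method as a simple Bloom filter that may help searches over many columns. The sketch below is one way to picture that idea, not the code in contrib/bloom: every column value sets a few bits in a fixed-size per-row signature, and a query on any subset of columns is checked with a bitwise test. The signature width, the bits per column, the hash mixer, and all function names here are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIG_BITS     64   /* signature width in bits (illustrative choice) */
#define BITS_PER_COL 2    /* bits set per column value (illustrative choice) */

/* Integer mixer (splitmix64 finalizer); a stand-in for a real hash function. */
static uint64_t mix(uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

/* Set BITS_PER_COL bits in *sig for the value stored in column number col. */
void sig_add(uint64_t *sig, int col, int32_t value)
{
    uint64_t h = mix(((uint64_t) col << 32) | (uint32_t) value);

    for (int i = 0; i < BITS_PER_COL; i++)
    {
        *sig |= UINT64_C(1) << (h % SIG_BITS);
        h = mix(h);
    }
}

/* Build the signature of one row made of ncols integer columns. */
uint64_t sig_row(const int32_t *row, int ncols)
{
    uint64_t sig = 0;

    assert(ncols >= 0);
    for (int c = 0; c < ncols; c++)
        sig_add(&sig, c, row[c]);
    return sig;
}

/*
 * A row can match only if every bit of the query signature is set in the
 * row signature. "true" may be a false positive, so the row itself must
 * still be rechecked; "false" is definite.
 */
bool sig_may_match(uint64_t row_sig, uint64_t query_sig)
{
    return (row_sig & query_sig) == query_sig;
}
```

A query such as "column 0 = 1 AND column 2 = 1" would build its own signature with two sig_add() calls and test it against each stored row signature. Because the signature does not depend on column order or on which columns are constrained, one such index can serve many column combinations, at the price of false positives.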
On 2016-04-01 15:49, Teodor Sigaev wrote:
> Bloom index contrib module
>
> Module provides new access method. It is actually a simple Bloom filter
> implemented as pgsql's index. It could give some benefits on search
> with large number of columns.
>
> doc/src/sgml/bloom.sgml | 218 ++++++++++++++++++

I edited the bloom.sgml text a bit.

Great stuff, thanks!

Erik Rijkers
Several non-x86 members of pgbuildfarm aren't happy with it; we are
investigating the problem.

Teodor Sigaev wrote:
> Bloom index contrib module
>
> Module provides new access method. It is actually a simple Bloom filter
> implemented as pgsql's index. It could give some benefits on search
> with large number of columns.
>
> Module is a single way to test generic WAL interface committed earlier.
>
> Author: Teodor Sigaev, Alexander Korotkov
> Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby
>
> Branch
> ------
> master
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/9ee014fc899a28a198492b074e32b60ed8915ea9
>
> Modified Files
> --------------
> contrib/Makefile                 |   1 +
> contrib/bloom/.gitignore         |   4 +
> contrib/bloom/Makefile           |  24 ++
> contrib/bloom/blcost.c           |  48 ++++
> contrib/bloom/blinsert.c         | 313 ++++++++++++++++++++++++++
> contrib/bloom/bloom--1.0.sql     |  19 ++
> contrib/bloom/bloom.control      |   5 +
> contrib/bloom/bloom.h            | 178 +++++++++++++++
> contrib/bloom/blscan.c           | 175 +++++++++++++++
> contrib/bloom/blutils.c          | 463 +++++++++++++++++++++++++++++++++++++++
> contrib/bloom/blvacuum.c         | 212 ++++++++++++++++++
> contrib/bloom/blvalidate.c       | 220 +++++++++++++++++++
> contrib/bloom/expected/bloom.out | 122 +++++++++++
> contrib/bloom/sql/bloom.sql      |  47 ++++
> contrib/bloom/t/001_wal.pl       |  75 +++++++
> doc/src/sgml/bloom.sgml          | 218 ++++++++++++++++++
> doc/src/sgml/contrib.sgml        |   1 +
> doc/src/sgml/filelist.sgml       |   1 +
> 18 files changed, 2126 insertions(+)

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Teodor Sigaev <teodor@sigaev.ru> writes:
> Bloom index contrib module
skink provided some pretty suggestive evidence about why this
is unstable:
==32446== VALGRINDERROR-BEGIN
==32446== Conditional jump or move depends on uninitialised value(s)
==32446== at 0x4E2E71: writeDelta (generic_xlog.c:137)
==32446== by 0x4E341E: GenericXLogFinish (generic_xlog.c:313)
==32446== by 0x14E83324: blbulkdelete (blvacuum.c:149)
==32446== by 0x4BCEE7: index_bulk_delete (indexam.c:627)
==32446== by 0x5DE577: lazy_vacuum_index (vacuumlazy.c:1581)
==32446== by 0x5DFB52: lazy_scan_heap (vacuumlazy.c:1273)
==32446== by 0x5E03AA: lazy_vacuum_rel (vacuumlazy.c:249)
==32446== by 0x5DC7B7: vacuum_rel (vacuum.c:1375)
==32446== by 0x5DD5F7: vacuum (vacuum.c:296)
==32446== by 0x693B71: autovacuum_do_vac_analyze (autovacuum.c:2807)
==32446== by 0x695B2A: do_autovacuum (autovacuum.c:2328)
==32446== by 0x696055: AutoVacWorkerMain (autovacuum.c:1647)
==32446== Uninitialised value was created by a stack allocation
==32446== at 0x14E82CAB: blbulkdelete (blvacuum.c:36)
==32446==
==32446== VALGRINDERROR-END
regards, tom lane
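
[Editorial illustration, not part of the thread] The trace above says two things: writeDelta branched on an uninitialised value, and that value originated in a stack allocation inside blbulkdelete. It does not name the variable involved. The sketch below is therefore only a generic picture of this class of problem, not the code in blvacuum.c or generic_xlog.c: a partly filled stack array is copied into a page image, and a later byte-wise comparison of page images reads whatever bytes were copied. All names (copy_used_slots, count_changed_bytes, demo, MAX_SLOTS) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS 8   /* capacity of the hypothetical slot array */

/*
 * Copy only the initialised prefix of src into dest and zero the rest.
 * Copying all MAX_SLOTS entries instead would also copy indeterminate
 * stack bytes, which a later byte-wise comparison would then read.
 */
size_t copy_used_slots(uint32_t *dest, const uint32_t *src, size_t nused)
{
    assert(nused <= MAX_SLOTS);
    memset(dest, 0, MAX_SLOTS * sizeof(uint32_t));
    memcpy(dest, src, nused * sizeof(uint32_t));
    return nused;
}

/* Byte-wise comparison of two buffers, similar in spirit to a page diff. */
size_t count_changed_bytes(const void *a, const void *b, size_t len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t changed = 0;

    for (size_t i = 0; i < len; i++)
        if (pa[i] != pb[i])
            changed++;
    return changed;
}

/* Fill two of eight slots on the stack, publish them, and diff the images. */
size_t demo(void)
{
    uint32_t scratch[MAX_SLOTS];            /* deliberately not fully initialised */
    uint32_t page_old[MAX_SLOTS] = {0};
    uint32_t page_new[MAX_SLOTS];

    scratch[0] = 10;
    scratch[1] = 20;
    copy_used_slots(page_new, scratch, 2);  /* reads scratch[0..1] only */
    return count_changed_bytes(page_old, page_new, sizeof(page_new));
}
```

With the bounded copy, every byte the comparison reads has a defined value, so a tool such as Valgrind has nothing to report. Replacing the bounded memcpy with a copy of the whole scratch array would reproduce the kind of warning shown in the trace.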
On 2016-04-01 14:36, Erik Rijkers wrote:
> On 2016-04-01 15:49, Teodor Sigaev wrote:
>> Bloom index contrib module
>>
>> doc/src/sgml/bloom.sgml | 218 ++++++++++++++++++
>
The size of the example table (in bloom.sgml):
CREATE TABLE tbloom AS
SELECT
random()::int as i1,
random()::int as i2,
[...]
random()::int as i12,
random()::int as i13
FROM
generate_series(1,1000);
seems too small to demonstrate the index use.
For me, both on $BigServer at work and on $ModestDesktop at home, the 1000
rows are not enough.
I suggest making the rowcount in that example larger, for instance
10000, so: generate_series(1,10000).
Does that make sense? I realize the behavior is probably somewhat
dependent on hardware and settings...
thanks,
Erik Rijkers