Re: BRIN indexes

Поиск
Список
Период
Сортировка
От Felipe Santos
Тема Re: BRIN indexes
Дата
Msg-id CAPYcRiXJTf8Zk+vXdk5fNFjCNm_=9A6uHvGmGU6=F0QG5MTq0A@mail.gmail.com
обсуждение исходный текст
Ответ на BRIN indexes  (Melvin Davidson <melvin6925@gmail.com>)
Ответы Re: BRIN indexes
Список pgsql-general


2016-01-28 16:33 GMT-02:00 Igor Neyman <ineyman@perceptron.com>:

 

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Felipe Santos
Sent: Thursday, January 28, 2016 1:17 PM
To: Joshua D. Drake <jd@commandprompt.com>
Cc: Melvin Davidson <melvin6925@gmail.com>; David Rowley <david.rowley@2ndquadrant.com>; pgsql-general@postgresql.org; Thomas Kellerer <spam_eater@gmx.net>
Subject: Re: [GENERAL] BRIN indexes

 

"Further to the point, it is self defeating to have more than one BRIN index on the table if the columns involved would have mutually  non-adjacent pages."

 

   Not really, if both columns are ordered, BRIN will work

 

"Therefore, it actually would be good to state that in the documentation, even it were just a comment."

 

   It is = "BRIN is designed for handling very large tables in which certain columns have some natural correlation with their physical location within the table"

 

 

Also, I did some tests and here are the results I got:

 

Query with no index = completion time 43s

Same Query with BRIN = completion time 14s / index size 0,5 MB

Same Query without BRIN and with BTREE = completion time 10s / index size 5.000,00 MB

 

As you can see, BRIN can save 99% of disk space for just a slightly worse performance.

 

It seems like a huge improvement, given that your data fits BRIN's use case.

 

Felipe,

 

What kind of queries you used in your test?

Where they based on clustering columns?

 

Regards

Igor Neyman



Hello Igor,

I took the sample BRIN test from the new release's wiki and added the BTREE test:

The results today may vary from the reported above but are still in the same levels of performance gain:

brin_db=# CREATE TABLE orders (
brin_db(#      id int,
brin_db(#      order_date timestamptz,
brin_db(#      item text);
CREATE TABLE

brin_db=# INSERT INTO orders (order_date, item)
brin_db-#  SELECT x, 'dfiojdso' 
brin_db-#  FROM generate_series('2000-01-01 00:00:00'::timestamptz, '2015-03-01 00:00:00'::timestamptz,'2 seconds'::interval) a(x);
INSERT 0 239243401

brin_db=# \dt+ orders 
                    List of relations
 Schema |  Name  | Type  |  Owner   | Size  | Description 
--------+--------+-------+----------+-------+-------------
 public | orders | table | postgres | 12 GB | 
(1 row)

brin_db=# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';
                                                                         QUERY PLAN                                                                   
       
------------------------------------------------------------------------------------------------------------------------------------------------------
-------
 Aggregate  (cost=4108912.01..4108912.02 rows=1 width=0) (actual time=81116.722..81116.722 rows=1 loops=1)
   ->  Seq Scan on orders  (cost=0.00..4106759.58 rows=860972 width=0) (actual time=60173.531..78566.113 rows=31589101 loops=1)
         Filter: ((order_date >= '2012-01-04 09:00:00-02'::timestamp with time zone) AND (order_date <= '2014-01-04 14:30:00-02'::timestamp with time 
zone))
         Rows Removed by Filter: 207654300
 Planning time: 0.443 ms
 Execution time: 81118.168 ms
(6 rows)

brin_db=#  CREATE INDEX idx_order_date_brin
brin_db-#    ON orders
brin_db-#    USING BRIN (order_date);
CREATE INDEX

brin_db=# \di+ idx_order_date_brin 
                                List of relations
 Schema |        Name         | Type  |  Owner   | Table  |  Size  | Description 
--------+---------------------+-------+----------+--------+--------+-------------
 public | idx_order_date_brin | index | postgres | orders | 432 kB | 
(1 row)

brin_db=# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';
                                                                              QUERY PLAN                                                              
                 
------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------
 Aggregate  (cost=2408269.34..2408269.35 rows=1 width=0) (actual time=14164.923..14164.923 rows=1 loops=1)
   ->  Bitmap Heap Scan on orders  (cost=326808.28..2328609.76 rows=31863832 width=0) (actual time=42.742..11430.562 rows=31589101 loops=1)
         Recheck Cond: ((order_date >= '2012-01-04 09:00:00-02'::timestamp with time zone) AND (order_date <= '2014-01-04 14:30:00-02'::timestamp with
 time zone))
         Rows Removed by Index Recheck: 21907
         Heap Blocks: lossy=201344
         ->  Bitmap Index Scan on idx_order_date_brin  (cost=0.00..318842.32 rows=31863832 width=0) (actual time=36.151..36.151 rows=2013440 loops=1)
               Index Cond: ((order_date >= '2012-01-04 09:00:00-02'::timestamp with time zone) AND (order_date <= '2014-01-04 14:30:00-02'::timestamp 
with time zone))
 Planning time: 0.297 ms
 Execution time: 14164.985 ms
(9 rows)

brin_db=# drop index idx_order_date_brin ;
DROP INDEX

brin_db=# create index idx_order_date_btree on orders(order_date);
CREATE INDEX

brin_db=# \di+ idx_order_date_btree 
                                 List of relations
 Schema |         Name         | Type  |  Owner   | Table  |  Size   | Description 
--------+----------------------+-------+----------+--------+---------+-------------
 public | idx_order_date_btree | index | postgres | orders | 5125 MB | 
(1 row)

brin_db=# EXPLAIN ANALYSE SELECT count(*) FROM orders WHERE order_date BETWEEN '2012-01-04 09:00:00' and '2014-01-04 14:30:00';
                                                                           QUERY PLAN                                                                 
           
------------------------------------------------------------------------------------------------------------------------------------------------------
-----------
 Aggregate  (cost=1269366.79..1269366.80 rows=1 width=0) (actual time=10435.148..10435.148 rows=1 loops=1)
   ->  Index Only Scan using idx_order_date_btree on orders  (cost=0.57..1189707.21 rows=31863832 width=0) (actual time=0.656..7919.754 rows=31589101 
loops=1)
         Index Cond: ((order_date >= '2012-01-04 09:00:00-02'::timestamp with time zone) AND (order_date <= '2014-01-04 14:30:00-02'::timestamp with t
ime zone))
         Heap Fetches: 31589101
 Planning time: 6.285 ms
 Execution time: 10435.197 ms
(6 rows)


Att.,

Felipe

В списке pgsql-general по дате отправления:

Предыдущее
От: Nikhil
Дата:
Сообщение: BDR replication
Следующее
От: John R Pierce
Дата:
Сообщение: Re: BDR replication