Re: again on index usage
From | Daniel Kalchev |
---|---|
Subject | Re: again on index usage |
Date | |
Msg-id | 200201110532.HAA15650@dcave.digsys.bg |
In reply to | Re: again on index usage (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
>>>Tom Lane said:
> It would be interesting to check whether there is any correlation
> between ipaddr and ipdate in your test data. Perhaps clustering
> on ipaddr might not destroy the ordering on ipdate as much as you
> thought. A more clearly random-order test would go:
>
> select * into iplog_test from iplog_gate200112 order by random();
> create index iplog_test_ipdate_idx on iplog_test(ipdate);
> vacuum verbose analyze iplog_test;
> << run queries >>

NOTICE: --Relation iplog_test--
NOTICE: Pages 17761: Changed 17761, reaped 0, Empty 0, New 0; Tup 1706202: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 80, MaxLen 88; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 2.36s/0.32u sec.
NOTICE: Index iplog_test_ipdate_idx: Pages 4681; Tuples 1706202. CPU 0.26s/1.98u sec.
NOTICE: --Relation pg_toast_275335644--
NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_275335644_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE: Analyzing...
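Tom's correlation question can also be checked directly: after ANALYZE, PostgreSQL exposes a per-column estimate of how well the column's value order tracks the physical row order in the pg_stats view. A minimal sketch, assuming a PostgreSQL version with pg_stats.correlation and using the table names from this thread:

```sql
-- correlation near +1 or -1: rows are nearly sorted on that column;
-- near 0: physically random order (what the ORDER BY random() copy should show)
SELECT attname, correlation
FROM pg_stats
WHERE tablename IN ('iplog_gate200112', 'iplog_test')
  AND attname IN ('ipaddr', 'ipdate');
```

If iplog_gate200112 shows a high correlation on ipdate while iplog_test shows one near zero, that would confirm the original table's ordering was flattering the index scan.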
explain (with enable_seqscan to on)

Aggregate (cost=56151.97..56151.97 rows=1 width=16)
  -> Seq Scan on iplog_test (cost=0.00..56150.54 rows=284 width=16)

average run time 4 minutes

(iostat before seq run)
tin tout   sps tps msps    sps tps msps  usr nic sys int idl
  5   44    94   6  5.9   6842 120  3.3   25  11   6   0  58
  1   14    27   1  5.9   5968 138  3.6   22  11   5   0  62
  5   44    58   2  6.3   5647 117  3.0   27   9   6   0  58
  1   13     7   1  5.0   5723 125  3.4   24  10   5   0  61
  0   15    50   2  4.5   5606 110  3.1   27  10   5   0  58
  5   44    90   5  9.1   4702  87  2.5   28  12   4   0  55
  1   15    52   1  6.3   5045 114  4.1   24  10   4   0  61

(iostat during seq run)
tin tout   sps tps msps    sps tps msps  usr nic sys int idl
  1   40    41   2  2.1   5847 123  3.6   25  11   4   0  60
  1   16   164  13  6.2   7280 128  3.2   28   8   8   0  57
  0   13    36   1  7.9   6059 147  3.9   25   8   5   0  62
  0   13    48   3  5.3   6691 126  3.4   26   8   7   0  59
  0   13    28   3  4.6   6473 103  2.3   28  11   7   0  54
  0   13   138  11  7.6  10848 151  4.9   19   6   6   0  69
  0   13    33   3  3.3   5568 100  3.6   21  16   3   0  59
  0   13    24   2  1.1   6752 144  3.4   22  12   2   0  64

(sometime at the end of query run)
tin tout   sps tps msps    sps tps msps  usr nic sys int idl
  0   38    20   2  5.5   5621  57  1.2   23  23   3   0  51
  0   13    89   7  7.7   8854 101  3.4   21  18   4   0  57
  0   13    72   6  7.3   9929  88  2.2   18  21   4   0  57
  0   13   129   6  9.6   4865  43  1.0   21  24   4   0  51
  0   13    72   3  4.2   5421  46  0.5   24  22   4   0  50
  0   13    52   2  3.5   5877  64  1.8   22  21   4   0  53
  0   13    50   3  6.1   5561  54  1.7   19  26   3   0  52
  0   13    42   1  3.8   5455  76  1.4   22  22   3   0  53

(see lower msps - perhaps other queries slowed down? - but then again returned to 'normal')

  0   13   244  20  6.6   6629 199  4.1   19   9   5   0  67
  0   13    68   4  6.2   6080 191  4.3   14  14   3   0  70
  0   13    75   3  5.9   6542 214  4.1   19   8   4   0  70
  0   13   615  18  5.0   5454 129  4.1   20  18   3   0  59
  0   13    88   2  5.7   3718  48  2.5   21  21   4   0  54
  0   13    46   3  3.1   4533  75  2.9   20  19   5   0  56
  0   13   143   7  5.1   4349  58  2.7   22  18   4   0  55
  0   13    58   2  9.9   4038  45  2.0   20  24   4   0  52
  0   13   111   4  5.8   4523  60  3.4   18  22   4   0  56

(with enable_seqscan to off)

Aggregate (cost=85110.08..85110.08 rows=1 width=16)
  -> Index Scan using iplog_test_ipdate_idx on iplog_test (cost=0.00..85108.66 rows=284 width=16)

average run time 2 minutes (worst 5 minutes, but this happened only once)

(iostat output)
tin tout   sps tps msps    sps tps msps  usr nic sys int idl
  0   38     1   1  6.7   5110 224  3.2    7  16   3   0  73
  0   13    21   2  3.6   5249 219  4.1   12  13   2   0  73
  0   13     0   0 10.0   5341 216  3.9   12  11   4   0  73
  0   13     6   0  9.9   5049 218  3.5   10  14   3   0  72
  0   13     6   0  0.0   7654 216  3.8   10  10   2   0  78
  0   13     2   1  4.0   8758 222  4.1    6  11   4   0  80
  0   13     6   0  9.9   8594 219  4.4    6  10   3   0  81
  0   13     4   1  0.0   7290 210  4.3    6  10   4   0  80
  0   13    36   3  4.9   5912 196  4.7    9  10   4   0  76
  0   13     7   1 13.3   4787 209  3.7    9  17   2   0  72
  0   13     0   0 10.0   4691 209  3.8    9  17   2   0  72

This sort of proves that Tom is right about the need for random data.
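For reference, the two plans compared above can be reproduced by toggling the planner's seq-scan switch for the session before running EXPLAIN. A hedged sketch; the exact query is not quoted in the message, so the aggregate and WHERE clause below are hypothetical placeholders:

```sql
-- Default behaviour: planner may choose a sequential scan.
SET enable_seqscan = on;
EXPLAIN
SELECT count(*), sum(octets)          -- hypothetical aggregate, not from the message
FROM iplog_test
WHERE ipdate >= '2001-12-01' AND ipdate < '2001-12-02';

-- Heavily penalize sequential scans so the index plan is chosen.
SET enable_seqscan = off;
EXPLAIN
SELECT count(*), sum(octets)
FROM iplog_test
WHERE ipdate >= '2001-12-01' AND ipdate < '2001-12-02';
```

SET only affects the current session, so other backends keep the default planner behaviour while the comparison runs.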
<grin> Having the chance to play with it at such an idle time in the morning, when nobody is supposed to be working with the database, I dared to shut down all other sessions.

(idle iostat)
iostat 5
      tty          sd0              sd1          % cpu
tin tout   sps tps msps    sps tps msps  usr nic sys int idl
  0  559   148   2  7.7      8   1  3.3    0   0   0   0  99
  0  559   235   4  4.1     30   2  2.0    0   0   1   0  98
  0  559   189   2  8.7      3   0 10.0    0   0   1   0  99

seq scan (10 sec)
  0  575 12103 101  9.2  18798 164  1.4   19   0   8   0  73
  0  575 10118  82 11.2  30507 243  0.9   33   0  11   0  55
  0  559 12704 101  9.3   7642  61  0.7    9   0   8   0  83

index scan (45 sec)
  0   38     8   1  4.0   6179 375  2.5    1   0   2   0  98
  0   13     4   1  2.5   6208 378  2.4    1   0   2   0  97
  0   13     7   1  3.3   6323 382  2.4    1   0   2   0  97
  0   13     9   2  0.0   6310 389  2.4    1   0   2   0  97
  0   13     0   0  0.0   3648 226  2.3    0   0   1   0  98

This proves to me that 15000 RPM Cheetah drives are damn fast at sequential reads.

So why does the index scan win at 'normal' database load??? I would expect that if the disk subsystem is overloaded, it will be more overloaded seek-wise than throughput-wise. Perhaps this penalizes the 'expected to be faster' sequential reads much more than the random-page index reads.

I will try 4.2 with the same data as soon as it is available and see the differences - on a separate machine with about the same configuration (same OS, same postgres install, same disks, only one 733 MHz CPU instead of two 550 MHz CPUs).

Daniel