Re: PostgreSQL block size for SSD RAID setup?
From | PFC
---|---
Subject | Re: PostgreSQL block size for SSD RAID setup?
Date |
Msg-id | op.upw6xtwdcigqcu@soyouz
In reply to | PostgreSQL block size for SSD RAID setup? (henk de wit <henk53602@hotmail.com>)
Responses | Re: PostgreSQL block size for SSD RAID setup?
List | pgsql-performance
> Hi,
> I was reading a benchmark that sets out block sizes against raw IO
> performance for a number of different RAID configurations involving high
> end SSDs (the Mtron 7535) on a powerful RAID controller (the Areca
> 1680IX with 4GB RAM). See
> http://jdevelopment.nl/hardware/one-dvd-per-second/

Lucky guys ;)

Something that bothers me about SSDs is the interface... The latest flash chips from Micron (32 Gb = 4 GB per chip) have something like a 25 us "access time" (lol) and push data at 166 MB/s (yes, megabytes per second) per chip. So two of these chips are enough to bottleneck a SATA 3 Gbps link, and there would be 8 of those chips in a 32 GB SSD.

Parallelizing depends on the block size: putting all the chips in parallel would increase the effective block size, so in practice I don't know how it's implemented; it probably depends on the make and model of the SSD. And then RAIDing those drives (to get back the throughput lost to SATA) will again increase the block size, which is bad for random writes. So it's a bit of a chicken-and-egg problem.

Also, since hard disks have high throughput but slow seeks, all the OSes, RAID cards, drivers, etc. are probably optimized for throughput, not IOPS. You need a very different strategy for 100K/s 8 KB IOs versus 1K/s 1 MB IOs: huge queues, smarter hardware, etc.

FusionIO built an interesting product by using the PCI-e interface, which brings lots of benefits like much higher throughput and the possibility of custom drivers optimized for handling many more IO requests per second than the OS, RAID cards, and even the SATA protocol were designed for.

Intrigued by this, I looked at the FusionIO benchmarks: more than 100,000 IOPS, really mind-boggling, but measured as random access over a 10 MB file. A little bit of Google image search reveals the board contains a lot of flash chips (expected), a fat FPGA (expected), probably a high-end chip from X or A, and two DDR RAM chips from Samsung, probably acting as cache.
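The arithmetic above can be sketched quickly. This is a back-of-envelope check using the figures from the post; the ~300 MB/s effective SATA 3 Gbps bandwidth (after 8b/10b encoding overhead) is my assumption, not something the post states:

```python
# Per-chip throughput claimed in the post, and effective SATA 3G bandwidth
# (assumed ~300 MB/s after 8b/10b line encoding).
CHIP_MBPS = 166
SATA_3G_MBPS = 300

# How many chips does it take to saturate the link?
chips_to_saturate = SATA_3G_MBPS / CHIP_MBPS
print(f"chips to saturate SATA 3G: {chips_to_saturate:.2f}")  # fewer than 2

# The two workload shapes mentioned: similar aggregate bandwidth,
# wildly different request rates -- hence the different queueing needs.
small_io_mbps = 100_000 * 8 / 1024   # 100K IOPS of 8 KB each
large_io_mbps = 1_000 * 1            # 1K IOPS of 1 MB each
print(f"small IOs: {small_io_mbps:.0f} MB/s, large IOs: {large_io_mbps} MB/s")
```

So the link saturates with under two chips' worth of bandwidth, while the two workloads differ by a factor of 100 in requests per second at comparable throughput.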
So I wonder: was the 10 MB file used to reach those humongous IOPS actually in the flash, or did they actually benchmark the device's onboard cache? It probably has a writeback cache, so on a random-write benchmark this is an interesting question. A good RAID card with a BBU cache would have the same benchmarking gotcha: if you go crazy with random writes on a 10 MB file, which is very small, and the device is smart, possibly nothing at all was written to the disks by the end of the benchmark! Anyway, in a database use case, if random writes are going to be a pain, they are probably not going to be confined to a tiny 10 MB zone that the controller cache could absorb... (just rambling XDD)
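The gotcha can be illustrated with a toy simulation. This is only a sketch of the scenario described above; the 64 MB cache size is a hypothetical figure, and real controllers flush dirty blocks far more gradually than this:

```python
import random

BLOCK = 8 * 1024
FILE_BLOCKS = 10 * 1024 * 1024 // BLOCK    # 1280 blocks in the 10 MB file
CACHE_BLOCKS = 64 * 1024 * 1024 // BLOCK   # hypothetical 64 MB writeback cache

# Toy writeback cache: every random write just dirties a cached block;
# blocks only reach the flash when the cache overflows.
cache = set()
flushed = 0
random.seed(0)
for _ in range(100_000):                   # 100K random 8 KB writes
    cache.add(random.randrange(FILE_BLOCKS))
    if len(cache) > CACHE_BLOCKS:          # never triggers: 10 MB < 64 MB
        flushed += len(cache)
        cache.clear()

print("blocks flushed to flash during the run:", flushed)  # 0
```

Because the whole 10 MB working set fits in the cache, the benchmark can complete without a single block reaching the flash, which is exactly why a 10 MB random-write test says little about sustained flash IOPS.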