Re: [HACKERS] A Better External Sort?
From | Josh Berkus |
---|---|
Subject | Re: [HACKERS] A Better External Sort? |
Date | |
Msg-id | 200509301341.22795.josh@agliodbs.com |
In response to | Re: [HACKERS] A Better External Sort? (Ron Peacetree <rjpeace@earthlink.net>) |
Responses |
Re: [HACKERS] A Better External Sort?
Re: [HACKERS] A Better External Sort? |
List | pgsql-performance |
Ron,

> That 11MBps was your =bulk load= speed. If just loading a table
> is this slow, then there are issues with basic physical IO, not just
> IO during sort operations.

Oh, yeah. Well, that's separate from sort. See multiple posts on this list from the GreenPlum team, the COPY patch for 8.1, etc. We've been concerned about I/O for a while.

Realistically, you can't do better than about 25MB/s on single-threaded I/O on current Linux machines, because your bottleneck isn't the actual disk I/O. It's CPU. Databases which "go faster" than this are all, to my knowledge, using multi-threaded disk I/O.

(and I'd be thrilled to get a consistent 25MB/s on PostgreSQL, but that's another thread ... )

> As I said, the obvious candidates are inefficient physical layout
> and/or flawed IO code.

Yeah, that's what I thought too. But try sorting a 10GB table, and you'll see: disk I/O is practically idle, while CPU averages 90%+. We're CPU-bound, because sort is being really inefficient about something. I just don't know what yet.

If we move that CPU-binding to a higher level of performance, then we can start looking at things like async I/O, O_DIRECT, pre-allocation etc. that will give us incremental improvements. But what we need now is a 5-10x improvement and that's somewhere in the algorithms or the code.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
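
To make the CPU-bound versus I/O-bound distinction above concrete, here is a minimal Python sketch (not PostgreSQL code, and not how the numbers in this thread were measured). It compares the CPU time a process consumes with the wall-clock time that elapses: a ratio near 100% suggests CPU-bound work, a ratio near 0% suggests the process is mostly waiting. The helper names `cpu_fraction`, `in_memory_sort`, and `mostly_waiting` are hypothetical, and the `sleep` call only stands in for a process blocked on disk.

```python
import random
import time


def cpu_fraction(work):
    """Run work() and return (wall_seconds, cpu_seconds, cpu/wall ratio)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu, (cpu / wall if wall > 0 else 0.0)


def in_memory_sort(n=200_000):
    # Pure computation: building and sorting a list keeps the CPU busy.
    data = [random.random() for _ in range(n)]
    data.sort()


def mostly_waiting(seconds=0.2):
    # Assumption: sleeping stands in for time spent blocked on I/O.
    time.sleep(seconds)


if __name__ == "__main__":
    for name, work in [("sort", in_memory_sort), ("wait", mostly_waiting)]:
        wall, cpu, frac = cpu_fraction(work)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s cpu/wall={frac:.0%}")
```

For a real backend the same comparison is usually made from outside the process, with tools such as `top` and `iostat`, which is presumably where the "disk idle, CPU at 90%+" observation comes from.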