Re: out-of-line (TOAST) storage ineffective when loading from dump?
От | Richard Huxton |
---|---|
Тема | Re: out-of-line (TOAST) storage ineffective when loading from dump? |
Дата | |
Msg-id | 47BAB68A.1060709@archonet.com обсуждение исходный текст |
Ответ на | Re: out-of-line (TOAST) storage ineffective when loading from dump? ("Markus Bertheau" <mbertheau.pg@googlemail.com>) |
Ответы |
Re: out-of-line (TOAST) storage ineffective when loading from dump?
|
Список | pgsql-general |
Markus Bertheau wrote: > 2008/2/19, Richard Huxton <dev@archonet.com>: > I'm loading a table with some short attributes and a large toastable attribute. > That means that for every main table heap page several toast table heap pages > are written. This happens through the buffer cache and the background writer, > so maybe the pages aren't written in the order in which they were created in > the buffer cache, but if they are, they end up on disk (assuming that the file > system is not fragmented) roughly like that: > > main table heap page 1 > toast table heap page 1 > toast table heap page . > toast table heap page n > main table heap page 2 > toast table heap page n+1 > toast table heap page . > toast table heap page 2n Well, that's assuming: 1. You're not re-using space from previously deleted/updated rows. 2. You've not got a RAID array striping writes over multiple disks 3. The underlying filesystem + buffering isn't doing anything too clever. > Significantly later a sequential scan of the table has to be made, the > toastable attribute is not needed for the operation. The caches are cold or > otherwise occupied. OK > If the granularity of caches that are nearer to the disk in > the cache hierarchy than the PG buffer cache is higher than the PG page size > (i.e. a cache unit is bigger than the PG page size), then every read of a main > table heap page will inescapably read some toast table heap pages into the > cache (whichever cache that may be). I think disk blocks on your more common file-systems are 4KB by default (ext2/3 and ntfs for example). I'm not aware of any default disk-block sizes more than the 8KB page-size of PG. Of course, the OS may read ahead if it sees you scanning, but it will do that on a file basis. > If all the main table heap pages were > laid out adjecently on disk, they could be read faster and caches be polluted > less. True enough. The key word there though is "if" - it means that PG is trying to out-think the filesystem, OS and hardware. It should be easy enough to test on a particular system though. 1. pg_restore a sample table with TOASTed data. 2. pg_restore the same data but no TOASTed data. 3. cluster the table with TOASTed data (which should force a rewrite of the whole table but not its TOASTed data) If the timing of various selects differ hugely then there's something worth investigating. -- Richard Huxton Archonet Ltd
В списке pgsql-general по дате отправления: