Обсуждение: BUG #14290: materialized view refresh doesn't use temp_tablespace

Поиск
Список
Период
Сортировка

BUG #14290: materialized view refresh doesn't use temp_tablespace

От
rotten@windfish.net
Дата:
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz
aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDI5MApMb2dnZWQgYnk6ICAg
ICAgICAgIFJpY2sgT3R0ZW4KRW1haWwgYWRkcmVzczogICAgICByb3R0ZW5A
d2luZGZpc2gubmV0ClBvc3RncmVTUUwgdmVyc2lvbjogOS41LjQKT3BlcmF0
aW5nIHN5c3RlbTogICBNYWMgb3IgVWJ1bnR1CkRlc2NyaXB0aW9uOiAgICAg
ICAgCgpJJ3ZlIHJlY2VudGx5IG9ic2VydmVkIHRoYXQgaWYgd2hpbGUgcmVm
cmVzaGluZyBhIG1hdGVyaWFsaXplZCB2aWV3LCB0ZW1wCnNwYWNlIGlzIHJl
cXVpcmVkLCBpdCB3aWxsIG5vdCB1c2UgdGhlIGRlZmF1bHQgdGVtcF90YWJs
ZXNwYWNlcyBkZWZpbmVkIGluCnBvc3RncmVzcWwuY29uZi4gIEluc3RlYWQg
aXQgdXNlcyB0aGUgdGFibGVzcGFjZSB3aGVyZSB0aGUgbWF0ZXJpYWxpemVk
IHZpZXcKbGl2ZXMuDQoNClRoaXMgc3VycHJpc2VkIG1lIHRoZSBvdGhlciBk
YXkgd2hlbiBteSBwcm9kdWN0aW9uIGRhdGFiYXNlIHJhbiBvdXQgb2YgZGlz
awpzcGFjZS4gIEkgd2FzIGFibGUgdG8gY29uZmlybSB0aGUgYmVoYXZpb3Ig
aW4gbXkgZGV2ZWxvcG1lbnQgZW52aXJvbm1lbnQuDQoNClRvIHJlcHJvZHVj
ZToNCjEpIFNldCB1cCBhIHRlbXAgdGFibGVzcGFjZS4NCjIpIENvbmZpZ3Vy
ZSB0ZW1wX3RhYmxlc3BhY2VzIGluIHBvc3RncnNxbC5jb25mIHRvIHVzZSB0
aGF0IHRhYmxlc3BhY2UuDQozKSBUdXJuIG9uIGxvZ2dpbmcgb2YgdGFibGVz
cGFjZSBuYW1lcy4NCjQpIFJlZHVjZSB3b3JrX21lbSAoc28gdGhhdCB0ZW1w
IHNwYWNlIGlzIG1vcmUgbGlrZWx5IHRvIGJlIHJlcXVpcmVkKS4NCjUpIEJv
dW5jZSB0aGUgREIsIG9yIHJlbG9hZCB0aGUgcG9zdGdyZXNxbC5jb25mDQo2
KSBSZWZyZXNoIGEgZmFpcmx5IGxhcmdlIG1hdGVyaWFsaXplZCB2aWV3Lg0K
NykgT2JzZXJ2ZSBpbiB0aGUgbG9ncyB3aGVyZSB0aGUgdGVtcCB0YWJsZXNw
YWNlIHdhcyBjcmVhdGVkLg0KDQpZb3UgY2FuIGRvIGl0IGNvbmN1cnJlbnRs
eSwgb3Igbm90LiAgSW4gYm90aCBjYXNlcyB0aGUgdGVtcG9yYXJ5IHRhYmxl
cyBlbmQKdXAgaW4gdGhlIHNhbWUgdGFibGVzcGFjZSBhcyB0aGUgbWF0ZXJp
YWxpemVkIHZpZXcsIG5vdCBpbiB0aGUKdGVtcF90YWJsZXNwYWNlLg0KDQpJ
ZiB0aGlzIGNhbid0IGJlIGVhc2lseSBjaGFuZ2VkIHRvIHVzZSB0aGUgZGVm
YXVsdCB0ZW1wIHRhYmxlc3BhY2VzLCBpdAp3b3VsZCBiZSBnb29kIGlmIHRo
ZSBkb2N1bWVudGF0aW9uIHdlcmUgdXBkYXRlZCB0byBhbGVydCBtYXRlcmlh
bGl6ZWQgdmlldwp1c2VycyB0byB0aGlzIGNhdmVhdC4NCg0KUmVmcmVzaGlu
ZyBhIG1hdGVyaWFsaXplZCB2aWV3IG1heSB1c2UgYXMgbXVjaCB0ZW1wb3Jh
cnkgc3BhY2UsIG9yIG1vcmUsCnRoYW4gaXRzIGN1cnJlbnQgc2l6ZS4gIEZv
ciBleGFtcGxlLCBJIGhhdmUgYSAxMDBHIG1hdGVyaWFsaXplZCB2aWV3LiAK
UmVmcmVzaGluZyBpdCB1c2VzIGFsbW9zdCAzMDBHIGFkZGl0aW9uYWwgZGlz
ayB3aGlsZSB0aGUgcmVmcmVzaCBpcyBydW5uaW5nLgogDQoKCg==

Re: BUG #14290: materialized view refresh doesn't use temp_tablespace

От
Michael Paquier
Дата:
On Fri, Aug 19, 2016 at 4:06 AM,  <rotten@windfish.net> wrote:
> I've recently observed that if while refreshing a materialized view, temp
> space is required, it will not use the default temp_tablespaces defined in
> postgresql.conf.  Instead it uses the tablespace where the materialized view
> lives.
>
> If this can't be easily changed to use the default temp tablespaces, it
> would be good if the documentation were updated to alert materialized view
> users to this caveat.
>
> Refreshing a materialized view may use as much temporary space, or more,
> than its current size.  For example, I have a 100G materialized view.
> Refreshing it uses almost 300G additional disk while the refresh is running.

Per the docs:
https://www.postgresql.org/docs/9.6/static/runtime-config-client.html
This variable specifies tablespaces in which to create temporary
objects (temp tables and indexes on temp tables) when a CREATE command
does not explicitly specify a tablespace. Temporary files for purposes
such as sorting large data sets are also created in these tablespaces.

And as far as I know, there is no concept of temporary object for
matviews, so the documentation is correct IMO, and the behavior you
are seeing expected.
--
Michael

Re: BUG #14290: materialized view refresh doesn't usetemp_tablespace

От
Rick Otten
Дата:

Refreshing materialized views can definitely create temporary objects.  They even get logged when I turn on logging of temporary objects.

 

On 2016-08-24 03:47, Michael Paquier wrote:

On Fri, Aug 19, 2016 at 4:06 AM,  <rotten@windfish.net> wrote:
I've recently observed that if while refreshing a materialized view, temp space is required, it will not use the default temp_tablespaces defined in postgresql.conf. Instead it uses the tablespace where the materialized view lives. If this can't be easily changed to use the default temp tablespaces, it would be good if the documentation were updated to alert materialized view users to this caveat. Refreshing a materialized view may use as much temporary space, or more, than its current size. For example, I have a 100G materialized view. Refreshing it uses almost 300G additional disk while the refresh is running.
Per the docs:
https://www.postgresql.org/docs/9.6/static/runtime-config-client.html
This variable specifies tablespaces in which to create temporary
objects (temp tables and indexes on temp tables) when a CREATE command
does not explicitly specify a tablespace. Temporary files for purposes
such as sorting large data sets are also created in these tablespaces.

And as far as I know, there is no concept of temporary object for
matviews, so the documentation is correct IMO, and the behavior you
are seeing expected.

 

Re: BUG #14290: materialized view refresh doesn't use temp_tablespace

От
Kevin Grittner
Дата:
On Thu, Aug 18, 2016 at 2:06 PM,  <rotten@windfish.net> wrote:

> I've recently observed that if while refreshing a materialized view, temp
> space is required, it will not use the default temp_tablespaces defined in
> postgresql.conf.  Instead it uses the tablespace where the materialized view
> lives.
>
> This surprised me the other day when my production database ran out of disk
> space.  I was able to confirm the behavior in my development environment.
>
> To reproduce:
> 1) Set up a temp tablespace.
> 2) Configure temp_tablespaces in postgrsql.conf to use that tablespace.
> 3) Turn on logging of tablespace names.
> 4) Reduce work_mem (so that temp space is more likely to be required).
> 5) Bounce the DB, or reload the postgresql.conf
> 6) Refresh a fairly large materialized view.
> 7) Observe in the logs where the temp tablespace was created.
>
> You can do it concurrently, or not.  In both cases the temporary tables end
> up in the same tablespace as the materialized view, not in the
> temp_tablespace.

You are confusing two completely different things: temporary
objects (created by statements starting with CREATE TEMPORARY) and
temporary files (created when data spills to disk during, for
example, a sort, hash, or materialization of data internal to
processing some statement).  The former are placed based on
temp_tablespaces; the latter are normally placed in the
base/pgsql_tmp/ subdirectory.

It would be possible to place them underneath a tablespace
specified by temp_tablespaces, and it might even be a good
enhancement to implement, but that is not the normal or default
location for temporary files.

Out of curiosity, and to help justify this as a feature request
worth pursuing, can you explain why you want to do this?  For
example, have you placed your temporary tablespace on a faster
medium?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #14290: materialized view refresh doesn't usetemp_tablespace

От
Rick Otten
Дата:

The distinction between temporary objects that write to disk, and temporary files, never occurred to me before.  I had always thought they were the same thing.  I think it is a pretty confusing fine point.

Yes, I do have the temporary tablespace on the fastest disk since I thought that when I spilled out of memory, I'd still want the best performance I could get during those operations.

The temporary files generated during the refresh are several times larger than the final materialized view.  This tells me they are probably doing much more I/O than the final (concurrent) write into the materialized view.  It made sense to put those temporary writes onto the faster disk.

Additionally, in a tiered storage environment, I might put my regular tablespace on a highly redundant (ie, expensive) disk array, and then put my temporary tablespace on a fast, but cheaper disk array because I don't really need the redundancy in the temporary tables the way I do with my regular tables.

Another issue is a psychology of disk management.   When I look at a 1000G "regular" tablespace that only has 200G of tables on it, I might think "oh, there is plenty of space there for more tables" or "oh, look at all the space I'm wasting, I should pare that back and save a few dollars".

Whereas, when I see an 800G temp tablespace on the database, even if it isn't full _at_the_moment_, I think to myself "wow, there must be some big transient queries that need that space which must be why it is there".

I now know that I need all that extra space in the "regular tablespace" to support refreshing the materialized views and that in general I should plan for lots of empty space in my regular tablespaces to support that.  As I mentioned, I discovered this when a refresh view concurrently failed because it ran the regular tablespace out of disk, even though the disk was only at 40% capacity (when the refresh wasn't running) and I had a large temporary tablespace set up to support things like that.  Fortunately I had everything on logical volume managers so it was easy to extend the regular tablespace on the fly once the issue became apparent.   This could have been a much more painful thing to discover.

Lastly, if I had several tablespaces with these sorts of materialized views in them, they could all share the same temporary tablespace for the refresh.  Which means I wouldn't need so much extra space in every tablespace  (as long as they didn't all refresh at the same time).

 

On 2016-08-24 13:46, Kevin Grittner wrote:

You are confusing two completely different things: temporary
objects (created by statements starting with CREATE TEMPORARY) and
temporary files (created when data spills to disk during, for
example, a sort, hash, or materialization of data internal to
processing some statement).  The former are placed based on
temp_tablespaces; the latter are normally placed in the
base/pgsql_tmp/ subdirectory.

It would be possible to place them underneath a tablespace
specified by temp_tablespaces, and it might even be a good
enhancement to implement, but that is not the normal or default
location for temporary files.

Out of curiosity, and to help justify this as a feature request
worth pursuing, can you explain why you want to do this?  For
example, have you placed your temporary tablespace on a faster
medium?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

 

Re: BUG #14290: materialized view refresh doesn't use temp_tablespace

От
Kevin Grittner
Дата:
On Wed, Aug 24, 2016 at 1:21 PM, Rick Otten <rotten@windfish.net> wrote:

> The distinction between temporary objects that write to disk, and temporary
> files, never occurred to me before.  I had always thought they were the same
> thing.  I think it is a pretty confusing fine point.

On further investigation, both are supposed to use the tablespaces
you specify, which does make this a bug.  Will fix.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #14290: materialized view refresh doesn't use temp_tablespace

От
Andrew Gierth
Дата:
>>>>> "Kevin" == Kevin Grittner <kgrittn@gmail.com> writes:

 Kevin> You are confusing two completely different things: temporary
 Kevin> objects (created by statements starting with CREATE TEMPORARY)
 Kevin> and temporary files (created when data spills to disk during,
 Kevin> for example, a sort, hash, or materialization of data internal
 Kevin> to processing some statement).  The former are placed based on
 Kevin> temp_tablespaces; the latter are normally placed in the
 Kevin> base/pgsql_tmp/ subdirectory.

This is not true.

If temp_tablespaces is set, then all sort / hash / materialization files
will use the list of tablespaces in temp_tablespaces on a rotational
basis (to spread load). The only exception is for temp files that might
have to outlive the current transaction; these are forced into the
default tablespace of the current database. See storage/file/fd.c

However, materialized view refresh (without CONCURRENTLY) needs the new
copy of the matview's data to be in the same tablespace as the old copy,
since it moves the data into place via a heap swap. This restriction
does not apply to sort/hash/materialization files created by the
matview's query.

--
Andrew (irc:RhodiumToad)

Re: BUG #14290: materialized view refresh doesn't use temp_tablespace

От
Andrew Gierth
Дата:
>>>>> "rotten" == rotten  <rotten@windfish.net> writes:

 rotten> I've recently observed that if while refreshing a materialized
 rotten> view, temp space is required, it will not use the default
 rotten> temp_tablespaces defined in postgresql.conf.  Instead it uses
 rotten> the tablespace where the materialized view lives.

 rotten> This surprised me the other day when my production database ran
 rotten> out of disk space.  I was able to confirm the behavior in my
 rotten> development environment.

 rotten> To reproduce:
 rotten> 1) Set up a temp tablespace.
 rotten> 2) Configure temp_tablespaces in postgrsql.conf to use that tablespace.
 rotten> 3) Turn on logging of tablespace names.
 rotten> 4) Reduce work_mem (so that temp space is more likely to be required).
 rotten> 5) Bounce the DB, or reload the postgresql.conf
 rotten> 6) Refresh a fairly large materialized view.
 rotten> 7) Observe in the logs where the temp tablespace was created.

I can't reproduce this.

postgres=# show server_version;
 server_version
----------------
 9.5.4

postgres=# create materialized view mvx1 as select * from generate_series(1,100000);
SELECT 100000
postgres=# set temp_tablespaces = 't1';
SET
postgres=# set log_temp_files = 0;
SET
postgres=# set work_mem = '64kB';
SET

postgres=# refresh materialized view mvx1;
LOG:  temporary file: path "pg_tblspc/18002/PG_9.5_201510051/pgsql_tmp/pgsql_tmp91239.1", size 1400000
STATEMENT:  refresh materialized view mvx1;
REFRESH MATERIALIZED VIEW

(this clearly shows that the temp file created by the FunctionScan of
generate_series is located in tablespace oid 18002, which happens to be
t1)

--
Andrew (irc:RhodiumToad)

Re: BUG #14290: materialized view refresh doesn't usetemp_tablespace

От
Rick Otten
Дата:

Is t1 also the tablespace for your materialized view?  (It isn't clear from what you've posted below.)

In my environment my default tablespace for the database is different than the temp_tablespace.

 

On 2016-08-24 14:58, Andrew Gierth wrote:

"rotten" == rotten <rotten@windfish.net> writes:
 rotten> I've recently observed that if while refreshing a materializedrotten> view, temp space is required, it will not use the defaultrotten> temp_tablespaces defined in postgresql.conf.  Instead it usesrotten> the tablespace where the materialized view lives.
rotten> This surprised me the other day when my production database ranrotten> out of disk space.  I was able to confirm the behavior in myrotten> development environment.
rotten> To reproduce:rotten> 1) Set up a temp tablespace.rotten> 2) Configure temp_tablespaces in postgrsql.conf to use that tablespace.rotten> 3) Turn on logging of tablespace names.rotten> 4) Reduce work_mem (so that temp space is more likely to be required).rotten> 5) Bounce the DB, or reload the postgresql.confrotten> 6) Refresh a fairly large materialized view.rotten> 7) Observe in the logs where the temp tablespace was created.

I can't reproduce this.

postgres=# show server_version;server_version 
----------------9.5.4

postgres=# create materialized view mvx1 as select * from generate_series(1,100000);
SELECT 100000
postgres=# set temp_tablespaces = 't1';
SET
postgres=# set log_temp_files = 0;
SET
postgres=# set work_mem = '64kB';
SET

postgres=# refresh materialized view mvx1;
LOG:  temporary file: path "pg_tblspc/18002/PG_9.5_201510051/pgsql_tmp/pgsql_tmp91239.1", size 1400000
STATEMENT:  refresh materialized view mvx1;
REFRESH MATERIALIZED VIEW

(this clearly shows that the temp file created by the FunctionScan of
generate_series is located in tablespace oid 18002, which happens to be
t1)

 

Re: BUG #14290: materialized view refresh doesn't use temp_tablespace

От
Andrew Gierth
Дата:
>>>>> "Rick" == Rick Otten <rotten@windfish.net> writes:

 Rick> Is t1 also the tablespace for your materialized view? (It isn't
 Rick> clear from what you've posted below.)

No. The matview was created in pg_default.

 Rick> In my environment my default tablespace for the database is
 Rick> different than the temp_tablespace.

Possibly relevant here is that to put a tablespace into
temp_tablespaces, you must have CREATE permission on it. If you set the
value in postgresql.conf rather than with SET, you may not see errors
from lack of permission, instead tablespaces without permission will be
silently omitted.

--
Andrew (irc:RhodiumToad)

Re: BUG #14290: materialized view refresh doesn't use temp_tablespace

От
Rick Otten
Дата:

Aha.  Maybe that is my problem.  The refresh is silently failing to use TEMP_TABLESPACES.

When I explicitly do the grant create, the refresh then uses the temp tablespace as originally expected.

Thanks.

 

On 2016-08-24 16:14, Andrew Gierth wrote:

"Rick" == Rick Otten <rotten@windfish.net> writes:
 Rick> Is t1 also the tablespace for your materialized view? (It isn'tRick> clear from what you've posted below.)

No. The matview was created in pg_default.
Rick> In my environment my default tablespace for the database isRick> different than the temp_tablespace.

Possibly relevant here is that to put a tablespace into
temp_tablespaces, you must have CREATE permission on it. If you set the
value in postgresql.conf rather than with SET, you may not see errors
from lack of permission, instead tablespaces without permission will be
silently omitted.