Обсуждение: nested hstore - large insert crashes server

Поиск
Список
Период
Сортировка

nested hstore - large insert crashes server

От
"Erik Rijkers"
Дата:
CentOS Release 6.5 (final)
AMD FX(tm)-8120 Eight-Core
2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
memory: 8GB

I am testing nested hstore, on a server with both with these patches: jsonb-9.patch.gz nested-hstore-9.patch.gz

One of the first tries brings down the server (gobbles up too much memory, I think).

When I run:

--------------------
select version();

drop   table if exists t_10000000 ;
create table           t_10000000 (   id    serial , hs    hstore primary key , hs2   hstore
);
insert into t_10000000 (hs,hs2) select      ( '['   || i || ','  || i|| ']' ) ::hstore   ,  ( '{'   || i || '=>' || i||
'}') ::hstore from generate_series(1, 10000000) as f(i)
 
;
---------------------


I get:

$ time psql -af nestedhs.sql
\timing on
Timing is on.
select version();                                                        version

--------------------------------------------------------------------------------------------------------------------------PostgreSQL
9.4devel_nested_hstore_20140204_0814_00d4f2aon x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2, 64-bit
 
(1 row)

Time: 1.288 ms
drop   table if exists t_10000000 ;
DROP TABLE
Time: 0.808 ms
create table           t_10000000 (   id    serial , hs    hstore primary key , hs2   hstore
);
CREATE TABLE
Time: 111.397 ms
insert into t_10000000 (hs,hs2) select     ( '['   || i || ','  || i|| ']' ) ::hstore   , ( '{'   || i || '=>' || i||
'}') ::hstore from generate_series(1, 10000000) as f(i)
 
;
[psql:nestedhs.sql:14: connection to server was lost

real    5m4.780s
user    0m0.005s
sys     0m0.009s

logging:

2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was terminated by signal 9: Killed
2014-02-04 10:34:25.854 CET 29133 DETAIL:  Failed process was running: insert into t_10000000 (hs,hs2)           select
              ( '['   || i || ','  || i|| ']' ) ::hstore             ,  ( '{'   || i || '=>' || i|| '}' ) ::hstore
    from generate_series(1, 10000000) as f(i)       ;
 
2014-02-04 10:34:25.884 CET 29133 LOG:  terminating any other active server processes
2014-02-04 10:34:28.541 CET 29133 LOG:  all server processes terminated; reinitializing
2014-02-04 10:34:30.002 CET 29534 LOG:  database system was interrupted; last known up at 2014-02-04 10:30:42 CET
2014-02-04 10:34:30.933 CET 29535 FATAL:  the database system is in recovery mode
2014-02-04 10:34:31.150 CET 29534 LOG:  database system was not properly shut down; automatic recovery in progress
2014-02-04 10:34:31.344 CET 29534 LOG:  redo starts at 1/B1CC92F8
2014-02-04 10:34:46.681 CET 29534 LOG:  unexpected pageaddr 1/86F4A000 in log segment 0000000100000001000000CC, offset
16031744
2014-02-04 10:34:46.681 CET 29534 LOG:  redo done at 1/CCF49F50
2014-02-04 10:34:52.039 CET 29133 LOG:  database system is ready to accept connections



(and btw, I end up with a large but unusable table:

testdb=# \dt+ t_10000000                      List of relationsSchema |    Name    | Type  |  Owner   |  Size  |
Description
--------+------------+-------+----------+--------+-------------public | t_10000000 | table | aardvark | 291 MB |
(1 row)

testdb=# select count(*) from t_10000000;count
-------    0
(1 row)

)

Main .conf settings:
       setting          |               current_setting
--------------------------+----------------------------------------------autovacuum               | offport
       | 6541shared_buffers           | 512MBeffective_cache_size     | 2GBwork_mem                 |
50MBmaintenance_work_mem    | 1GBcheckpoint_segments      | 20server_version           |
9.4devel_nested_hstore_20140204_0814_00d4f2apg_postmaster_start_time| 2014-02-04 10:12 (uptime: 0d 0h 33m
42s)data_checksums          | off
 
feature_id |    feature_name    | is_supported | is_verified_by |                                      comments

------------+--------------------+--------------+----------------+-------------------------------------------------------------------------------------PKG100
   | project name       | YES          | ej             | nested_hstorePKG101     | patched            | YES          |
ej            | YESPKG102     | patch file         | YES          | ej             |
 
/home/aardvark/download/pgpatches/0094/nested_hstore/20140130/jsonb-9.patch        +           |                    |
          |                |
 
/home/aardvark/download/pgpatches/0094/nested_hstore/20140130/nested-hstore-9.patchPKG103     | build time         |
YES         | ej             | 2014-02-04 08:19:13.600371+01PKG104     | server_version     | YES          | ej
   | 9.4devel_nested_hstore_20140204_0814_00d4f2aPKG105     | server_version_num | YES          | ej             |
90400PKG106    | port               | YES          | ej             | 6541PKG110     | commit hash        | YES
| ej             | 00d4f2af8bd6a1b9db2f676cc76b64d98ace99fbPKG111     | catversion         | YES          | ej
  | 201402031PKG112     | control version    | YES          | ej             | 937PKG113     | xlog_page_magic    | YES
        | ej             | 0xD07B
 


I suppose for larger machines the crash will happen later; on this small machine I can generate_series a 1 million row
table without crashing.


Thanks,

Erik Rijkers














Re: nested hstore - large insert crashes server

От
Christian Kruse
Дата:
Hi,

On 04/02/14 17:41, Erik Rijkers wrote:

> 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was terminated by signal 9: Killed

Did you check if this was the OOM killer? Should be logged in dmesg.

Best regards,

-- Christian Kruse               http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services


Re: nested hstore - large insert crashes server

От
"Erik Rijkers"
Дата:
On Tue, February 4, 2014 18:56, Christian Kruse wrote:
> Hi,
>
> On 04/02/14 17:41, Erik Rijkers wrote:
>
>> 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was terminated by signal 9: Killed
>
> Did you check if this was the OOM killer? Should be logged in dmesg.
>

I would be surprised if it wasn't. (no access to that machine at the moment)

How do we regard such crashes?  It seems to me this was rather eaasily 'provoked' (for want of a better word).

I am inclined to blame the patch...










Re: nested hstore - large insert crashes server

От
Tom Lane
Дата:
"Erik Rijkers" <er@xs4all.nl> writes:
> On Tue, February 4, 2014 18:56, Christian Kruse wrote:
>> Did you check if this was the OOM killer? Should be logged in dmesg.

> I would be surprised if it wasn't. (no access to that machine at the moment)

> How do we regard such crashes?  It seems to me this was rather eaasily 'provoked' (for want of a better word).

Well, it suggests that there may be a memory leak, which would be a bug
even though we'd assign it lower priority than a true crash.
        regards, tom lane



Re: nested hstore - large insert crashes server

От
Stephen Frost
Дата:
* Erik Rijkers (er@xs4all.nl) wrote:
> On Tue, February 4, 2014 18:56, Christian Kruse wrote:
> > On 04/02/14 17:41, Erik Rijkers wrote:
> >> 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was terminated by signal 9: Killed
> >
> > Did you check if this was the OOM killer? Should be logged in dmesg.
>
> I would be surprised if it wasn't. (no access to that machine at the moment)
>
> How do we regard such crashes?  It seems to me this was rather eaasily 'provoked' (for want of a better word).
>
> I am inclined to blame the patch...

It sounds like there is at least some investigation which should happen
here to see why we're using so much memory (well beyond work_mem and
even maint_work_mem it sounds like), but it would also be good to have
the machine reconfigured to not allow OOM killing.
Thanks,        Stephen