Discussion: Disk Performance Problem on Large DB
Hello,
I have a RHEL 5 box: 4 GB RAM, a single SATA hard drive, Intel Core 2 Duo at 2.4 GHz. A basic workstation.
I have a simple database, with one table for now. It has five columns:
anid serial primary key unique,
time timestamp,
source varchar(5),
unitid varchar(15),
guid varchar(32)
There is a btree index on each.
I am loading data 1,000,000 (1M) rows at a time using psql and a COPY command. Once I hit 2M rows, my performance just drops out, and the next 1M never finishes. It takes 7 minutes for 1M rows to load. Once 2M are in there, I’ve waited an hour, and nothing. It doesn’t seem to matter which 1M rows I try to load next, none ever finish. Each 1M rows is about 70MB on disk in the raw input file.
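For reference, a minimal sketch of the setup as described (the table name, index names, and file path are made up here; the post doesn't give them):

    CREATE TABLE events (
        anid   serial PRIMARY KEY,  -- the post also declares UNIQUE, which is redundant with PRIMARY KEY
        "time" timestamp,
        source varchar(5),
        unitid varchar(15),
        guid   varchar(32)
    );
    CREATE INDEX events_time_idx   ON events ("time");
    CREATE INDEX events_source_idx ON events (source);
    CREATE INDEX events_unitid_idx ON events (unitid);
    CREATE INDEX events_guid_idx   ON events (guid);

    -- One 1M-row batch; COPY FROM a server-side file needs superuser,
    -- psql's \copy is the client-side equivalent:
    COPY events ("time", source, unitid, guid) FROM '/data/batch_01.dat';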
I have “atop” installed, and it reports the drive at 100% busy, as it did for the first 1M rows too. Write throughput goes from 20+ MB/s on the first 2M rows down to about 4 MB/s or less now. Processor usage (by postgres) is about 2 to 8% at this time.
I have even waited for 1M rows to load, then done a vacuum for no good reason, then even restarted PostgreSQL. I’ve made sure no disk or proc activity is happening before I start the next 1M rows. None of that seems to matter.
I have a total of about 70M rows to load, but am at a standstill. I’ve read up on whatever performance docs I can find online, but I am not getting anywhere.
I’ve increased shared_buffers to 256MB, and I’ve tried it with fsync commented out as per the default config. I’ve also tried it with fsync=off. No difference.
Ideas? Thanks in advance,
Jon
On Thu, Nov 04, 2010 at 01:42:49PM -0700, Jonathan Hoover wrote:
> I am loading data 1,000,000 (1M) rows at a time using psql and a COPY command. Once I hit 2M rows, my performance just drops out, and the next 1M never finishes.

The initial 1M load has no WAL logging if the table has just been truncated or created. You can boost maintenance_work_mem to increase index creation/update performance. You are severely I/O limited, and would be better off dropping your indexes during the load and re-creating them afterwards. If you are starting with an empty table, truncate it and then load all the data in a single transaction, all 7 COPY commands. Then COMMIT and build the indexes. Your question is also missing key information such as config details, PostgreSQL version, ...

Cheers,
Ken
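A sketch of the load pattern Ken suggests, reusing the hypothetical names from above:

    -- Drop the secondary indexes before loading:
    DROP INDEX events_time_idx;
    DROP INDEX events_source_idx;
    DROP INDEX events_unitid_idx;
    DROP INDEX events_guid_idx;

    -- Truncate and load everything in one transaction, then commit:
    BEGIN;
    TRUNCATE events;
    COPY events ("time", source, unitid, guid) FROM '/data/batch_01.dat';
    -- ...one COPY per input file...
    COPY events ("time", source, unitid, guid) FROM '/data/batch_07.dat';
    COMMIT;

    -- Only now rebuild the indexes:
    CREATE INDEX events_time_idx ON events ("time");
    -- ...and likewise for source, unitid, and guid.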
"Jonathan Hoover" <jhoover@yahoo-inc.com> wrote: > I have a simple database, with one table for now. It has 4 > columns: > > anid serial primary key unique, > time timestamp, > source varchar(5), > unitid varchar(15), > guid varchar(32) > > There is a btree index on each. > > I am loading data 1,000,000 (1M) rows at a time using psql and a > COPY command. Once I hit 2M rows, my performance just drops out Drop the indexes and the primary key before you copy in. Personally, I strongly recommend a VACUUM FREEZE ANALYZE after the bulk load. Then use ALTER TABLE to restore the primary key, and create all the other indexes. Also, if you don't mind starting over from initdb if it crashes partway through you can turn fsync off. You want a big maintenance_work_mem setting during the index builds -- at least 200 MB. -Kevin
1. I have now set maintenance_work_mem to 256 MB (it was previously commented out in the default config).
2. The version is PostgreSQL 8.1.18 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46).
3. What would be the best and cheapest thing I could do for I/O performance?
4. I need to read up on TRUNCATE, which I have not used before. Care to give a quick overview before I RTFM?

Thanks,
jon
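One caveat on point 1, assuming stock 8.1 behavior: unit suffixes such as "256MB" in postgresql.conf only arrived in 8.2, so on 8.1 the value is given in kilobytes. The equivalent per-session setting would be:

    SET maintenance_work_mem = 262144;  -- 256 MB, expressed in kB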
Just FYI, I removed the PK and the indexes for now. Load time for 1M rows is now 7 SECONDS instead of 7 MINUTES (or just never finishing). Granted, I also made the maintenance_work_mem change from #1 in my previous message, but WOW! So, question: what helped the most: 1) no PK, 2) no indexes, 3) the maintenance_work_mem being uncommented? I will test myself when I have time, but I'd like to know everyone's thoughts.

Jon
TRUNCATE removes all data from the table, leaving the schema structure in place.

What helped the most was probably the drop of the indexes.

Sam
TRUNCATE doesn't generate any rollback data, which makes it lightning fast. It just de-allocates the data pages used by the table.

Sam

Jonathan Hoover wrote:
> How does TRUNCATE differ from DELETE FROM <table>? Sorry, probably an easy RTFM question, but I'll ask anyhow.
On Thu, Nov 4, 2010 at 9:03 PM, Jonathan Hoover <jhoover@yahoo-inc.com> wrote:
> 1. I have now set maintenance_work_mem to 256 MB (it was previously commented out in the default config)
> 2. The version is PostgreSQL 8.1.18 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)

If performance matters, look at upgrading to at LEAST 8.3.latest or 8.4.latest. Preferably 9.0.latest.

> 3. What would be the best and cheapest thing I could do for I/O performance?

Get an SSD drive, or a RAID controller with battery-backed cache. If your data isn't valuable (i.e. you can reproduce it at will), then turning off things like fsync and full page writes can help a lot (I don't think 8.1 has the ability to turn off full page writes).

> 4. I need to read up on TRUNCATE, which I have not used before. Care to give a quick overview before I RTFM?

TRUNCATE is basically a DDL (data definition language) command as opposed to a DML (data manipulation language) command. It effectively drops the table underneath the table definition and recreates it as empty.
How does TRUNCATE differ from DELETE FROM <table>? Sorry, probably an easy RTFM question, but I'll ask anyhow.
Actually, this is a better definition of TRUNCATE than my previous post. From the doco:

TRUNCATE quickly removes all rows from a set of tables. It has the same effect as an unqualified DELETE on each table, but since it does not actually scan the tables it is faster. Furthermore, it reclaims disk space immediately, rather than requiring a subsequent VACUUM operation. This is most useful on large tables.

Sam
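Side by side, with a placeholder table name:

    DELETE FROM events;  -- scans the table; dead rows remain until VACUUM reclaims them
    TRUNCATE events;     -- no scan; the table's data files are deallocated immediately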
On Thu, Nov 4, 2010 at 9:33 PM, Samuel Stearns <SStearns@internode.com.au> wrote:
> TRUNCATE doesn't generate any rollback data, which makes it lightning fast. It just de-allocates the data pages used by the table.

Also, truncate, like nearly everything in pgsql, can be rolled back. I still remember my Oracle co-DBAs at my last job freaking out when I showed them things like:

begin;
update table yada...
truncate table;
rollback;

or

begin;
drop index xyz;
explain select ...
rollback;

Transactable everything is pretty cool. (Note that database and tablespace create/drop are the only things that aren't transactable, which makes some sense.)
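A runnable version of that demo (the table name is a placeholder):

    BEGIN;
    TRUNCATE events;
    SELECT count(*) FROM events;  -- returns 0 inside the transaction
    ROLLBACK;
    SELECT count(*) FROM events;  -- the original rows are back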
Yep! Coming from a previous Oracle job into Postgres, discovering the transactable stuff is, indeed, pretty cool.
I am in the middle of re-creating the indexes now, and what is interesting is that atop is not reporting heavy use of the hard drive now. Instead, I see postgres using 80% of the proc (instead of 8% earlier), and drive usage is 20+ MBr/s and 16+ MBw/s now (instead of 0.1 and 3.0 respectively earlier). Could it really be the PK causing the delay, or is it really the maintenance_work_mem, or simply the idea of creating the indexes after? Good info; hopefully I can do some testing on these ideas over the next few days. For now, I'm hoping I can just get things moving enough.

As I finished this up, I noticed disk performance is down to 4+ MBw/s and MBr/s, but it is not "red" in atop any longer, and proc usage now seems to be the limiting factor.

Good stuff...
Jon
Your biggest benefit was dropping the indexes before the load, most likely.
On Fri, Nov 05, 2010 at 02:19:41PM +1030, Samuel Stearns wrote:
> Your biggest benefit was dropping the indexes before the load, most likely.

Correct. With a single drive and no write cache, once you get more than 1 I/O running simultaneously -- i.e. 1 writing the data and 1 writing each index = 5 I/Os at once -- you effectively devolve to your drive's random I/O rate, which can be an order of magnitude slower than its sequential I/O rate. You can use bonnie or some other disk speed test to get those numbers for your system. When you do the indexes after the load, each step can use sequential I/O much more of the time, which is why it runs so much faster.

Cheers,
Ken
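To put rough numbers on Ken's point (ballpark figures, not measurements from this system): a single 7200 rpm SATA drive can sustain something like 80 MB/s of sequential writes but only on the order of 100 seeks per second, and at 8 kB per page, fully random writes work out to well under 1 MB/s (100 x 8 kB = 0.8 MB/s). A mixed workload lands somewhere in between, which is consistent with the observed drop from 20+ MB/s to about 4 MB/s once every COPY also had to update five separate index structures.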