Discussion: Why so long between archive calls?
I am working on getting my development system working with archive logging so I can use the pitr features of Postgres. However, I am finding that the actual archiving is not keeping up with the generation of the logs. I am now over 500 wal archive files behind (8+ GB).
What controls how often the archive_command is called?
The actual running of my archive script is taking 1-2 seconds, so I'm not sure why we are so far behind.
Thanks for any advice,
Chris
PG 8.1.3
RH AS 4
Chris,
I take it you've checked that the archive_command is actually succeeding
(check the postgresql log file.)
Perhaps try running it manually with an early log file as the PostgreSQL
user account?
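Something along these lines would do it (a rough sketch only; the script path and arguments mirror the archive_command shown later in the thread, and the data-directory path and segment name are just placeholders):
# Rough sketch (run as the postgres user, from the data directory): invoke
# the archive script by hand for one early segment and check its exit status.
# %p normally expands to a path relative to the data directory, %f to the
# bare file name.
cd /path/to/data/50001
/home/postgres/bin/archive_pg_xlog.sh \
    pg_xlog/00000001000001930000004E 00000001000001930000004E 50001
echo "exit status: $?"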
When you say you're 500 files behind, how old is the earliest log?
Andy.
Chris Hoover wrote:
> I am working on getting my development system working with archive
> logging so I can use the pitr features of Postgres. However, I am
> finding that the actual archiving is not keeping up with the
> generation of the logs. I am now over 500 wal archive files behind
> (8+ GB).
>
> What controls how often the archive_command is called?
>
> The actual running of my archive script is taking 1-2 seconds, so I'm
> not sure why we are so far behind.
>
> Thanks for any advice,
>
> Chris
>
> PG 8.1.3
> RH AS 4
Here is my postgresql.conf
listen_addresses = '*'
port = 50001
max_connections = 256
shared_buffers = 32768
max_prepared_transactions = 256
work_mem = 4096
maintenance_work_mem = 1048576
max_fsm_pages = 8000000
max_fsm_relations = 10000
vacuum_cost_delay = 1000
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 8000
wal_buffers = 64
checkpoint_segments = 256
checkpoint_timeout = 1800
archive_command = '/home/postgres/bin/archive_pg_xlog.sh %p %f 50001'
effective_cache_size = 383490
random_page_cost = 2
default_statistics_target = 100
constraint_exclusion = on
redirect_stderr = on
log_directory = 'pg_log'
log_truncate_on_rotation = on
log_rotation_age = 1440
log_rotation_size = 1048576
log_min_messages = debug2
log_min_duration_statement = 0
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%d,%p,%u,%m,%c,%l,%s,%x,%i,'
log_statement = 'all'
stats_start_collector = on
stats_command_string = on
stats_block_level = on
stats_row_level = on
stats_reset_on_server_start = on
autovacuum = on
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 1000
autovacuum_analyze_threshold = 500
autovacuum_vacuum_scale_factor = 0.001
autovacuum_analyze_scale_factor = 0.0005
autovacuum_vacuum_cost_delay = -1
autovacuum_vacuum_cost_limit = -1
statement_timeout = 0
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
add_missing_from = on
The oldest log file that is not archived is:
-rw------- 1 postgres postgres 16777216 Aug 28 15:24 00000001000001930000005A
The command I am using is running successfully and I am getting good entries in my log files.
,15282,,2006-08-29 11:07:04.443 EDT,,13552,,,,LOG: archived transaction log file "00000001000001930000004E"
,15282,,2006-08-29 11:22:54.982 EDT,,13553,,,,LOG: archived transaction log file "00000001000001930000004F"
,15282,,2006-08-29 11:24:40.126 EDT,,13554,,,,LOG: archived transaction log file "000000010000019300000050"
,15282,,2006-08-29 11:32:04.650 EDT,,13555,,,,LOG: archived transaction log file "000000010000019300000051"
,15282,,2006-08-29 11:33:30.900 EDT,,13556,,,,LOG: archived transaction log file "000000010000019300000052"
,15282,,2006-08-29 11:37:21.333 EDT,,13557,,,,LOG: archived transaction log file "000000010000019300000053"
,15282,,2006-08-29 11:44:50.792 EDT,,13558,,,,LOG: archived transaction log file "000000010000019300000054"
,15282,,2006-08-29 12:24:54.566 EDT,,13559,,,,LOG: archived transaction log file "000000010000019300000055"
,15282,,2006-08-29 12:25:08.530 EDT,,13560,,,,LOG: archived transaction log file "000000010000019300000056"
,15282,,2006-08-29 12:25:10.754 EDT,,13561,,,,LOG: archived transaction log file "000000010000019300000057"
,15282,,2006-08-29 12:25:14.619 EDT,,13562,,,,LOG: archived transaction log file "000000010000019300000058"
So it looks like everything is good; the archive command is just not getting called often enough.
Thanks,
Chris
On 8/29/06, Andy Shellam <andy.shellam-lists@mailnetwork.co.uk> wrote:
Chris,
I take it you've checked that the archive_command is actually succeeding
(check the postgresql log file.)
Perhaps try running it manually with an early log file as the PostgreSQL
user account?
When you say you're 500 files behind, how old is the earliest log?
Andy.
Chris Hoover wrote:
> I am working on getting my development system working with archive
> logging so I can use the pitr features of Postgres. However, I am
> finding that the actual archiving is not keeping up with the
> generation of the logs. I am now over 500 wal archive files behind
> (8+ GB).
>
> What controls how often the archive_command is called?
>
> The actual running of my archive script is taking 1-2 seconds, so I'm
> not sure why we are so far behind.
>
> Thanks for any advice,
>
> Chris
>
> PG 8.1.3
> RH AS 4
"Chris Hoover" <revoohc@gmail.com> writes: > Any other ideas? I am still 508 archive log files behind. Try "strace -f -p <pid>" on the archiver subprocess to see what it's doing and what the launched scripts are doing. regards, tom lane
I just ran this for several minutes and have about an 11MB trace file. Is there anything in particular you want to see?
Also, is the data in the file decipherable, or does it have actual database data in it? The reason I ask is that I work under HIPAA regulations, so I can't release anything that has potential personal/patient data in it.
Chris
"Chris Hoover" <revoohc@gmail.com> writes:
> Any other ideas? I am still 508 archive log files behind.
Try "strace -f -p <pid>" on the archiver subprocess to see what it's
doing and what the launched scripts are doing.
regards, tom lane
"Chris Hoover" <revoohc@gmail.com> writes: > I just ran this for several minutes and have about an 11MB trace file. Is > there anything particular you want to see? > Also, is the data in the file deciferable or have actual database data in > it? The reason I ask, is that I work under HIPPA regulations, so I can't > release anything if it has potential personal/patient data in it. Yeah, you probably want to strip out the data portion of any read() or write() calls shown in the trace. BTW, I forgot to ask for timestamp information in the trace --- if you didn't use -t please do it over. It might be possible to get strace to skip the read/write data by itself so you don't have to do it manually, but I don't see anything obvious in its help output. regards, tom lane
Tom Lane wrote:
> "Chris Hoover" <revoohc@gmail.com> writes:
> > I just ran this for several minutes and have about an 11MB trace file. Is
> > there anything in particular you want to see?
> > Also, is the data in the file decipherable, or does it have actual database
> > data in it? The reason I ask is that I work under HIPAA regulations, so I
> > can't release anything that has potential personal/patient data in it.
> Yeah, you probably want to strip out the data portion of any read() or
> write() calls shown in the trace.
> BTW, I forgot to ask for timestamp information in the trace --- if you
> didn't use -t please do it over. It might be possible to get strace
> to skip the read/write data by itself so you don't have to do it
> manually, but I don't see anything obvious in its help output.
strace -s0
That'll cut any strings though, not only for read/writes. You'll still
get file names.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> strace -s0
> That'll cut any strings though, not only for read/writes. You'll still
> get file names.
Oh, that would work then. I saw that but assumed it would trim file
names too, which would be a problem here.
regards, tom lane
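Putting the suggestions together, the trace invocation would look roughly like this (a sketch; the process-title grep is an assumption about how ps labels the archiver):
# -f follows the launched archive scripts, -t adds timestamps, and -s0
# truncates printed strings so read()/write() buffer contents stay out of
# the trace.  Per the strace man page, file names are not treated as strings
# and are still printed in full.
ARCHIVER_PID=$(ps -ef | grep '[a]rchiver process' | awk '{print $2}')
strace -f -t -s0 -p "$ARCHIVER_PID" -o /tmp/archiver.trace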
Hi,
My config file has the following setting:
log_min_messages = fatal
This setting keeps postgres logging to a minimum. However, I would like to change it dynamically, while our embedded application is running, without restarting postgres. For example, when the application appears to be running sluggishly, I would like to change the setting to debug1 so that I can collect detailed information. The problem is that the SET command applies only to the current session:
set log_min_messages='debug1';
Is there a way to accomplish this?
Currently, we have postgres generate a lot of debug messages (debug1) and then have syslog filter the messages as required. This works because I can edit syslog.conf and restart syslog independently of our application. It is less than ideal, though, since we normally don't want postgres producing lots of log data (wasting CPU) when we only need it while debugging.
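Roughly, the current workaround looks like this (a sketch only; the syslog facility, log path, and restart command are assumptions for a typical setup, not taken from our actual config):
# postgresql.conf sends messages to syslog (log_destination = 'syslog',
# syslog_facility = 'LOCAL0' by default); /etc/syslog.conf then decides how
# much of it is kept:
#   local0.debug   /var/log/postgresql.log    # verbose while debugging
#   local0.err     /var/log/postgresql.log    # quieter normal setting
# After editing syslog.conf, restart (or HUP) syslog without touching postgres:
/etc/init.d/syslog restart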
Thanks
Ike
Did you get my reply to this message with my data? I have not seen it come onto the list. I sent it out yesterday.
Chris
Alvaro Herrera <alvherre@commandprompt.com> writes:
> strace -s0
> That'll cut any strings though, not only for read/writes. You'll still
> get file names.
Oh, that would work then. I saw that but assumed it would trim file
names too, which would be a problem here.
regards, tom lane
Chris Hoover wrote:
> Tom,
> Did you get my reply to this message with my data? I have not seen it come
> onto the list. I sent it out yesterday.
I got it at least (but then, I'm on Cc). Not sure if the list received it.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
"Chris Hoover" <revoohc@gmail.com> writes: > Right now, I have 510 log files waiting to be archived totaling 8GB. > ... > Why is the server so far behind? Good question. The trace shows the archiver scanning pg_xlog/archive_status/ and apparently not finding any .ready files. What do you see in that directory? regards, tom lane
Like I mentioned, the db server was stopped and restarted on 9/5 am, but the oldest unarchived log file is:
-rw------- 1 postgres postgres 16M Sep 1 04:04 0000000100000197000000F1
I'm not sure what is going on; since the db was stopped and restarted, there should be no hanging transactions. I would have expected all of the archive files from before the reboot to have been archived by now.
Let me know if I can provide you with further information. This is my development/non-production db server, so I do have a little bit of latitude in what I do with it.
Chris
"Chris Hoover" < revoohc@gmail.com> writes:
> Right now, I have 510 log files waiting to be archived totaling 8GB.
> ...
> Why is the server so far behind?
Good question. The trace shows the archiver scanning pg_xlog/archive_status/
and apparently not finding any .ready files. What do you see in that
directory?
regards, tom lane
"Chris Hoover" <revoohc@gmail.com> writes: > Currenty, there are no .ready files in the pg_xlog/archive_status file. Well, that explains why the archiver thinks it has nothing to do. > Like I mentioned, the db server was stoped and restarted on 9/5 am, but the > oldest unarchived log file is: > -rw------- 1 postgres postgres 16M Sep 1 04:04 0000000100000197000000F1 Are you sure that's an unarchived file, and not a file waiting to be used later? What's the current WAL endpoint address? (pg_controldata would tell you ... but your trace showed 000000010000019700000078 being archived, which would suggest that this one is just waiting to be reused.) Have you got a particularly large checkpoint_segments setting? regards, tom lane
I am able to tell how far behind the archiving is running because every minute I copy the current archive logs to a backup directory. This allows me to do an up-to-the-minute PITR if required. The archive_command removes the file from this backup dir when the file is archived by PostgreSQL.
Here is my script for that:
#!/bin/bash
## archive the current log file every minute to make sure we can recover within 1 minute
###set -vx
source ~postgres/etc/postgres.profile
date >~/log
if [ -z $1 ]
then
echo "archive_current_pg_xlog.sh <port>"
exit
fi
if [ -f /tmp/archive_log ]
then
echo "archive is already running - exiting"
echo "`/bin/hostname` - archive is running long" | mail -s "`/bin/hostname` problem" $DBAPAGER $DBAMAIL
exit
fi
touch /tmp/archive_log
serverPort=$1
serverName=`/bin/hostname`
currentArchiveDir=`echo ${archiveBaseDir}/${serverName}/${serverPort}/current_log`
echo "currentArchiveDir: "$currentArchiveDir >> ~/log
mkdir -p $currentArchiveDir
find $DB_PATH/50001/pg_xlog/ -cmin 1 -type f | grep -v "archive_status" > ~/cp_list
ls $DB_PATH/50001/pg_xlog/archive_status/*.done | while read FILE
do
FILE2=`echo $FILE|sed 's/\.done//'`
cat ~/cp_list | grep -v $FILE2 > ~/cp_list2
mv ~/cp_list2 ~/cp_list
rm $currentArchiveDir/$FILE2 2>/dev/null
done
cat ~/cp_list | while read FILE
do
echo "archiving $FILE" >> ~/log
cp $FILE ${currentArchiveDir}/.
done
find ${archiveBaseDir}/${serverName}/${serverPort}/pg_xlog_arch/ -type f -mmin -5 |
sed 's/.*pg_xlog_arch\///' |
while read FILE
do
rm -v ${currentArchiveDir}/$FILE >> log 2>/dev/null
done
rm /tmp/archive_log
date >> ~/log
--------------------
Here is my postgresql.conf
listen_addresses = '*'
port = 50001
max_connections = 256
shared_buffers = 32768
max_prepared_transactions = 256
work_mem = 4096
maintenance_work_mem = 1048576
max_fsm_pages = 8000000
max_fsm_relations = 10000
vacuum_cost_delay = 1000
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 8000
wal_buffers = 64
checkpoint_segments = 256
checkpoint_timeout = 1800
archive_command = '/home/postgres/bin/archive_pg_xlog.sh %p %f 50001'
effective_cache_size = 383490
random_page_cost = 2
default_statistics_target = 100
constraint_exclusion = on
redirect_stderr = on
log_directory = 'pg_log'
log_truncate_on_rotation = on
log_rotation_age = 1440
log_rotation_size = 1048576
log_min_messages = debug2
log_min_duration_statement = 0
log_connections = on
log_disconnections = on
log_duration = on
log_line_prefix = '%d,%p,%u,%m,%c,%l,%s,%x,%i,'
log_statement = 'all'
stats_start_collector = on
stats_command_string = on
stats_block_level = on
stats_row_level = on
stats_reset_on_server_start = on
autovacuum = on
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 1000
autovacuum_analyze_threshold = 500
autovacuum_vacuum_scale_factor = 0.001
autovacuum_analyze_scale_factor = 0.0005
autovacuum_vacuum_cost_delay = -1
autovacuum_vacuum_cost_limit = -1
statement_timeout = 0
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
add_missing_from = on
Chris
"Chris Hoover" <revoohc@gmail.com> writes:
> Currently, there are no .ready files in the pg_xlog/archive_status directory.
Well, that explains why the archiver thinks it has nothing to do.
> Like I mentioned, the db server was stopped and restarted on 9/5 am, but the
> oldest unarchived log file is:
> -rw------- 1 postgres postgres 16M Sep 1 04:04 0000000100000197000000F1
Are you sure that's an unarchived file, and not a file waiting to be
used later? What's the current WAL endpoint address? (pg_controldata
would tell you ... but your trace showed 000000010000019700000078 being
archived, which would suggest that this one is just waiting to be reused.)
Have you got a particularly large checkpoint_segments setting?
regards, tom lane
"Chris Hoover" <revoohc@gmail.com> writes: > I am able to tell how far behind the archiving is running since every > minute, I copy the current archive_logs to a backup directory. This is > allowing me to be able to do an up to the minute pitr if required. The > archive_command removes the file from this backup dir when the file is > archived by PostgreSQL. > Here my script for that: It looks to me like your script does not understand the difference between xlog files that have been used and xlog files that haven't been used yet. Since you've got checkpoint_segments = 256, the xlog code will allow as many as 513 (2*checkpoint_segments+1) not-yet-used xlog files to be allocated. The only simple way to tell where the end of WAL is at the moment is to look for the xlog file with the latest mod time. (There'll be more support for this in 8.2.) Any files with names numerically larger than that one are just sitting there waiting to be used, they are not interesting for archiving purposes. regards, tom lane