out of memory errors
From | Bruce McAlister |
---|---|
Subject | out of memory errors |
Date | |
Msg-id | 539EE977.4020906@blueface.com |
Responses | Re: out of memory errors |
List | pgsql-general |
Hi All,

I need some assistance with a particular out of memory issue I am currently experiencing; your thoughts would be greatly appreciated.

Configuration:

[1] 3 x ESX VMs
    [a] 8 vCPUs each
    [b] 16GB memory each
[2] CentOS 6.5 64-bit on each
    [a] Kernel Rev: 2.6.32-431.17.1.el6.x86_64
[3] PostgreSQL from official repository
    [a] Version 9.3.4
[4] Configured as a master-slave pacemaker/cman/pgsql cluster
    [a] Pacemaker version: 1.1.10-14
    [b] CMAN version: 3.0.12.1-59
    [c] pgsql RA version: taken from clusterlabs git repo 3 months ago (can't find version in RA file)

I did not tune any OS IPC parameters, as I believe PostgreSQL v9.3 doesn't use those anymore (please correct me if I am wrong).

I have the following OS settings in place to try to get optimal use of memory and smooth out fsync operations (comments may not be 100% accurate :) ):

    # Shrink FS cache before paging to swap
    vm.swappiness = 0
    # Don't hand out more memory than necessary
    vm.overcommit_memory = 2
    # Smooth out FS Sync
    vm.dirty_ratio = 10
    vm.dirty_background_ratio = 5

I have the following memory-related settings for PostgreSQL:

    work_mem = 1MB
    maintenance_work_mem = 128MB
    effective_cache_size = 6GB
    max_connections = 700
    shared_buffers = 4GB
    temp_buffers = 8MB
    wal_buffers = 16MB
    max_stack_depth = 2MB

Currently there are roughly 300 client connections active when this error occurs.

What appears to have happened is that an autovacuum process attempts to kick off and fails with an out of memory error. Shortly after that, the cluster resource agent attempts a connection to template1 to see if the database is up; this connection fails with an out of memory error as well, at which point the cluster fails the database over to another node.

Looking at the system memory usage when this error occurs, there is roughly 4GB - 5GB of free physical memory, swap (21GB) is not in use at all, and the page cache is roughly 3GB in size.

I have attached the two memory dump logs: the first error is related to autovacuum, and the second is the cluster RA connection attempt, which fails too. I do not know how to read that memory information to come up with any ideas to correct this issue.

The OS default for stack depth is 10MB; should I attempt to increase max_stack_depth to 10MB too?

The system does not appear to be running out of memory, so I'm wondering if I have some issue with limits or some memory-related settings.

Any thoughts, tips, or suggestions would be greatly appreciated. If you need any additional info from me, please don't hesitate to ask.

Thanks
Bruce
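
With vm.overcommit_memory = 2, as set above, the Linux kernel refuses new allocations once the total committed address space (Committed_AS in /proc/meminfo) would exceed CommitLimit, even if physical memory is still free. One way to see how close a host is to that limit is to compare the two fields. The following is a minimal sketch, assuming a Linux host; the function names and the sample figures are illustrative and are not taken from the servers described in this message.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(fields):
    """Return CommitLimit - Committed_AS in kB.

    A small or negative result suggests that strict overcommit accounting,
    rather than physical memory, is what new allocations are running into.
    """
    return fields["CommitLimit"] - fields["Committed_AS"]


# Hypothetical sample input, in the same format as /proc/meminfo.
SAMPLE = """MemTotal:       16334244 kB
MemFree:         4500000 kB
CommitLimit:    30000000 kB
Committed_AS:   29500000 kB
"""

sample_fields = parse_meminfo(SAMPLE)
sample_headroom = commit_headroom_kb(sample_fields)
```

On a live host the same functions can be fed the real file, for example `commit_headroom_kb(parse_meminfo(open("/proc/meminfo").read()))`, ideally sampled around the time the errors appear.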