Re: Performance problems testing with Spamassassin 3.1.0
From | Jim C. Nasby |
---|---|
Subject | Re: Performance problems testing with Spamassassin 3.1.0 |
Date | |
Msg-id | 20050731171012.GS60019@decibel.org |
In response to | Re: Performance problems testing with Spamassassin 3.1.0 (Matthew Schumacher <matt.s@aptalaska.net>) |
Responses |
Re: Performance problems testing with Spamassassin 3.1.0
|
List | pgsql-performance |
On Sun, Jul 31, 2005 at 08:51:06AM -0800, Matthew Schumacher wrote:
> Ok, here is the current plan.
>
> Change the spamassassin API to pass a hash of tokens into the storage
> module, pass the tokens to the proc as an array, start a transaction,
> load the tokens into a temp table using copy, select the tokens distinct
> into the token table for new tokens, update the token table for known
> tokens, then commit.

You might consider:

UPDATE tokens
    FROM temp_table (this updates existing records)

INSERT INTO tokens
    SELECT ...
    FROM temp_table
    WHERE NOT IN (SELECT ... FROM tokens)

This way you don't do an update to newly inserted tokens, which helps
keep vacuuming needs in check.

> This solves the following problems:
>
> 1. Each email is a transaction instead of each token.
> 2. The update statement is only called when we really need an update
> which avoids all of those searches.
> 3. The looping work is done inside the proc instead of perl calling a
> method a zillion times per email.
>
> I'm not sure how vacuuming will be done yet, if we vacuum once per email
> that may be too often, so I may do that every 5 mins in cron.

I would suggest leaving an option to have SA vacuum every n emails,
since some people may not want to mess with cron, etc. I suspect that
pg_autovacuum would be able to keep up with things pretty well, though.
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
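
The update-then-insert ordering suggested in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not SpamAssassin's actual storage code: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere, and the column names `token` and `hits` and the helper `merge_tokens` are hypothetical. In PostgreSQL the temp table would more likely be loaded with COPY, as described in the quoted plan.

```python
import sqlite3


def merge_tokens(conn, tokens):
    """Apply one email's tokens in a single transaction (hypothetical helper)."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.cursor()
        # Stage the tokens; the table and column names are assumptions.
        cur.execute(
            "CREATE TEMP TABLE IF NOT EXISTS temp_table (token TEXT NOT NULL)"
        )
        cur.execute("DELETE FROM temp_table")
        cur.executemany(
            "INSERT INTO temp_table (token) VALUES (?)",
            [(t,) for t in tokens],
        )
        # Step 1: update only the tokens that already exist.
        cur.execute(
            """
            UPDATE tokens
               SET hits = hits + 1
             WHERE token IN (SELECT token FROM temp_table)
            """
        )
        # Step 2: insert the new tokens. Because this runs after the update,
        # freshly inserted rows are never updated in the same pass.
        cur.execute(
            """
            INSERT INTO tokens (token, hits)
            SELECT DISTINCT token, 1
              FROM temp_table
             WHERE token NOT IN (SELECT token FROM tokens)
            """
        )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tokens (token TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
    )
    merge_tokens(conn, ["viagra", "cheap", "viagra"])
    merge_tokens(conn, ["cheap", "meeting"])
    print(conn.execute("SELECT token, hits FROM tokens ORDER BY token").fetchall())
```

Running the update first means each row is written at most once per email, which is the point made above about keeping dead rows, and therefore vacuuming, in check.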