Re: Performance Question
От | Tom Lane |
---|---|
Тема | Re: Performance Question |
Дата | |
Msg-id | 7741.918142074@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Performance Question ("Tim Perdue" <tim@directricity.com>) |
Список | pgsql-sql |
"Tim Perdue" <tim@directricity.com> writes: > Am I going to run into huge performance problems on this? Should each > mailing list be archived in its own table?? > When I do a > SELECT * FROM tbl_mail_archive WHERE fld_mail_list=1 AND > fld_mail_body~~'%keyword%'; > am I going to get killed with a performance hit?? Should be OK as long as you make an index on fld_mail_list (and don't forget to vacuum regularly). Of course, maintaining that index is not zero-cost. The appropriate thing to ask is what your usage patterns will be. If you frequently make searches across multiple mailing lists, then you undoubtedly want to do it as you show above. If you never (or hardly ever) do that, you might as well keep each mailing list in its own table and live with having to do multiple SELECTs when you do want to look across multiple lists. > It's running OK now, with the table at 20MB, but I have 100MB more > worth of letters to drop into the table..... 8-) I suspect your real problem is going to be that searching 100MB with "fld_mail_body~~'%keyword%'" is going to be dog-slow. I think you are going to want a full-text index if you expect to do that a lot. There is a simple all-Postgres FTI in the contrib part of the distribution, but I think it'd probably run out of steam long before you got to 100MB. What I'd probably do in your situation is to use Glimpse (http://glimpse.cs.arizona.edu/) for the text index. That'd likely mean storing the message bodies in separate files outside the database proper, and keeping only the file names in the database rows. (But that'd get rid of the message-over-8K problem, so it isn't all bad...) regards, tom lane
В списке pgsql-sql по дате отправления: