Re: to pickle or not to pickle
From | Jurgen Defurne |
---|---|
Subject | Re: to pickle or not to pickle |
Date | |
Msg-id | 3938A2A4.69CA09AC@glo.be |
In reply to | to pickle or not to pickle (Marc Tardif <intmktg@CAM.ORG>) |
List | pgsql-general |
Marc Tardif wrote:

> I'm writing a search engine using python and postgresql which requires to
> store a temporary list of results in an sql table for each request. This
> list will contain at least 50 records and could grow to about 300. My
> options are either to pickle the list and store a single entry or use the
> postgresql COPY command (as opposed to INSERT which would be too slow) to
> store each of the temporary records.
>
> You are writing a search engine: does that mean that you need to search
> the web and that you want to store your temporary results in a table, OR
> does that mean that you are writing a QUERY screen, from which you
> generate a SELECT statement to query your POSTGRES database?
>
> Also, what size are your tuples?
>
> Do you need these temporary results within the same program, or do you
> need to pass them somewhere to another program?
>
> Question is, how can I make an educated decision on which option to
> select? What kind of questions should I be asking myself? Should I
> actually go through the trouble of implementing both alternatives and
> profiling each separately? If so, how can I predict what will happen under
> a heavy load which is hard to simulate when benchmarking each option?

Always go for a simple solution. This may (paradoxically) need some more study.

One of the first questions you should ask yourself: is it really necessary to store this temporary result? If so, then why take the pickle option? Pickling is meant for persistent data; it is really more a mechanism for storing data between sessions.

Maybe you should consider the option which is used in traditional IT: just store your data in a sequential file. Much less overhead, because your OS handles it directly.

Concerning the benchmarking, it seems the only way to do this is to automatically start scripts which do what needs to be done and then measure what happens: number of processes, CPU and I/O load.

Jurgen Defurne
defurnj@glo.be
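
To make the benchmarking suggestion concrete, here is a minimal sketch that times a save-and-reload round trip for two of the options discussed: pickling the whole list versus writing it to a sequential (tab-separated) file. The row layout `(doc_id, score, title)`, the function names, and the file names are assumptions made for illustration; the thread does not say what the result tuples look like. The COPY option is left out because it needs a live PostgreSQL server, and a single-process timing like this says nothing about behaviour under heavy concurrent load.

```python
import csv
import os
import pickle
import tempfile
import timeit


def make_results(n):
    # Hypothetical result rows: (doc_id, score, title). Not taken from the thread.
    return [(i, 1.0 / (i + 1), f"title {i}") for i in range(n)]


def save_pickle(results, path):
    # Store the whole list as one pickled blob.
    with open(path, "wb") as f:
        pickle.dump(results, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def save_sequential(results, path):
    # Store one record per line in a plain tab-separated file.
    with open(path, "w", newline="") as f:
        csv.writer(f, delimiter="\t").writerows(results)


def load_sequential(path):
    # Convert the text fields back to the assumed (int, float, str) layout.
    with open(path, newline="") as f:
        return [(int(a), float(b), c) for a, b, c in csv.reader(f, delimiter="\t")]


def benchmark(n_rows=300, repeats=200):
    """Return the average seconds per save+load round trip for each option."""
    results = make_results(n_rows)
    options = (
        ("pickle", save_pickle, load_pickle),
        ("sequential", save_sequential, load_sequential),
    )
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, save, load in options:
            path = os.path.join(tmp, name + ".dat")

            def round_trip():
                save(results, path)
                return load(path)

            # Check correctness once before timing anything.
            assert round_trip() == results
            timings[name] = timeit.timeit(round_trip, number=repeats) / repeats
    return timings


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds * 1e6:.1f} microseconds per round trip")
```

Running the script with 50 and with 300 rows covers the range mentioned in the question. To approximate load, the same script could be started several times in parallel while watching process count, CPU, and I/O with the operating system's own tools.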