Обсуждение: Temporary Tables and Web Application
Hi all I assume this is not an uncommon problem, but so far, I haven't been able to find a good answer to it. I've got a table that holds log entries and fills up very fast during the day, it gets approx. 25 million rows per day. I'm now building a web application using apache/mod_php where you can query the database and then should be able to page through the results. My idea was that whenever a user constructs a query, I create a temporary table holding the results and then page through this table, which should work very well in principle. But from what I've been able to find out, temporary tables live only in the Postgres Session they have been created in and are destroyed upon session descructuion. Now, with apache/php in a mpm environment, I have no guarantee that a user will get the same postgresql session for a subsequent request, thus he will not see the temporary table. Is there a way to create temporary tables in another way, so they are visible between sessions, or do I need to create real tables for my purpose? And is the perfomance penalty big for real tables, as they have been written to disk/read from disk? Tim
In response to Tim Tassonis <timtas@cubic.ch>: > Hi all > > I assume this is not an uncommon problem, but so far, I haven't been > able to find a good answer to it. > > I've got a table that holds log entries and fills up very fast during > the day, it gets approx. 25 million rows per day. I'm now building a web > application using apache/mod_php where you can query the database and > then should be able to page through the results. > > My idea was that whenever a user constructs a query, I create a > temporary table holding the results and then page through this table, > which should work very well in principle. > > But from what I've been able to find out, temporary tables live only in > the Postgres Session they have been created in and are destroyed upon > session descructuion. > > Now, with apache/php in a mpm environment, I have no guarantee that a > user will get the same postgresql session for a subsequent request, thus > he will not see the temporary table. > > Is there a way to create temporary tables in another way, so they are > visible between sessions, or do I need to create real tables for my > purpose? And is the perfomance penalty big for real tables, as they have > been written to disk/read from disk? Build a framework that creates the tables in a special schema, and then can access them through any session. Use some method to generate unique table names and store the names in the HTTP session. Create some sort of garbage collection routines that removes tables when they're no longer needed. The details of exactly how you pull this off are going to depend heavily on the rest of your application architecture. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023
Hi, Tim Tassonis wrote: > Hi all > > I assume this is not an uncommon problem, but so far, I haven't been > able to find a good answer to it. > > I've got a table that holds log entries and fills up very fast during > the day, it gets approx. 25 million rows per day. I'm now building a web > application using apache/mod_php where you can query the database and > then should be able to page through the results. you should be aware that PHP isnt the only scripting language with an apache module and not neccessary the best choice among them. > My idea was that whenever a user constructs a query, I create a > temporary table holding the results and then page through this table, > which should work very well in principle. That means you are more or less constructing materialized views :-) But if you hold the session anyway, then see below. > But from what I've been able to find out, temporary tables live only in > the Postgres Session they have been created in and are destroyed upon > session descructuion. > > Now, with apache/php in a mpm environment, I have no guarantee that a > user will get the same postgresql session for a subsequent request, thus > he will not see the temporary table. Thats the problem and if you have failover/loadbalancing situations, even more so. > Is there a way to create temporary tables in another way, so they are > visible between sessions, or do I need to create real tables for my > purpose? And is the perfomance penalty big for real tables, as they have > been written to disk/read from disk? To start with, you should avoid reconnecting to the database for every request. Not only because of loosing the session context but also to avoid connection overhead. Usually this is done by connection pooling. You can then try to trac user:connection relationship as much as possible thru the connection pool. If you have that, there is actually no need for the temp tables. Instead you can just use a regular cursor and scroll it as neccessary. Almost all frameworks should give you reasonable pool implementations, some additional memory caching on top of it and there are also a lot of other methods to help you with that, for example pgpool and pgbouncer. Regards Tino
Вложения
In response to Tim Tassonis <timtas@cubic.ch>: > > Bill Moran wrote: > > In response to Tim Tassonis <timtas@cubic.ch>: > > > >> > >> Now, with apache/php in a mpm environment, I have no guarantee that a > >> user will get the same postgresql session for a subsequent request, thus > >> he will not see the temporary table. > >> > >> Is there a way to create temporary tables in another way, so they are > >> visible between sessions, or do I need to create real tables for my > >> purpose? And is the perfomance penalty big for real tables, as they have > >> been written to disk/read from disk? > > > > Build a framework that creates the tables in a special schema, and then > > can access them through any session. Use some method to generate unique > > table names and store the names in the HTTP session. Create some sort > > of garbage collection routines that removes tables when they're no longer > > needed. > > > > The details of exactly how you pull this off are going to depend heavily > > on the rest of your application architecture. > > > > What you describe is what I referred to as "create real tables". I've > done that and it works, but I wondered if there's something similar > built in postgres apart from classical temporary tables. Not that I'm aware of. If you keep the mailing list in the CC, others can answer as well. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023
Tino Wildenhain wrote: > Hi, > > Tim Tassonis wrote: >> Hi all >> >> I assume this is not an uncommon problem, but so far, I haven't been >> able to find a good answer to it. >> >> I've got a table that holds log entries and fills up very fast during >> the day, it gets approx. 25 million rows per day. I'm now building a >> web application using apache/mod_php where you can query the database >> and then should be able to page through the results. > > you should be aware that PHP isnt the only scripting language with an > apache module and not neccessary the best choice among them. There's no need to become insulting. I am aware of the truly astonishing fact that there are other scripting languages apart from php and that not everybody loves php. Apart from the sad fact that I quite like php, the problem is not the choice of scripting language, but the nature of apache mpm processing, making the postgres connection stuck to an apache process. > >> My idea was that whenever a user constructs a query, I create a >> temporary table holding the results and then page through this table, >> which should work very well in principle. > > That means you are more or less constructing materialized views :-) No, I want the data to remain fixed after the query is executed. > But if you hold the session anyway, then see below. I don't hold the session, see above. > > >> But from what I've been able to find out, temporary tables live only >> in the Postgres Session they have been created in and are destroyed >> upon session descructuion. >> >> Now, with apache/php in a mpm environment, I have no guarantee that a >> user will get the same postgresql session for a subsequent request, >> thus he will not see the temporary table. > > Thats the problem and if you have failover/loadbalancing situations, > even more so. > >> Is there a way to create temporary tables in another way, so they are >> visible between sessions, or do I need to create real tables for my >> purpose? And is the perfomance penalty big for real tables, as they >> have been written to disk/read from disk? > > To start with, you should avoid reconnecting to the database for every > request. Not only because of loosing the session context but also > to avoid connection overhead. I don't reconnect after every request, but I'm not guaranteed by mpm that I get the same session/process. I might, but that's hardly what I'd call a stable application, even as a php programmer. > > Usually this is done by connection pooling. You can then try to trac > user:connection relationship as much as possible thru the connection pool. As far as I can see, there is no implementation of a multi client process connection pool in mod_php. I admit that my interprocess communication know-how is not very deep, but that would mean the client postgres/tcpip connection part would have to be held somewhere in shared memory between the different apache processes. From reading the documentation, php does not do that. > > If you have that, there is actually no need for the temp tables. Instead > you can just use a regular cursor and scroll it as neccessary. My problem ist that I don't have that. > > Almost all frameworks should give you reasonable pool implementations, > some additional memory caching on top of it and there are also a lot > of other methods to help you with that, for example pgpool and > pgbouncer. I'm afraid you somehow missed the point, but thanks for your response. Bye Tim
On Thu, Jun 5, 2008 at 5:36 PM, Tim Tassonis <timtas@cubic.ch> wrote: > Is there a way to create temporary tables in another way, so they are > visible between sessions, or do I need to create real tables for my purpose? > And is the perfomance penalty big for real tables, as they have been written > to disk/read from disk? You could create a real table on disk, inserting just the primary keys of the table; then, you could join on the main table, to get the real results. Regards Marco -- Marco Bizzarri http://iliveinpisa.blogspot.com/
Hi Tim, Off the top of my head, from somewhat left field, using filesystems to manage this sort of effect. Would "real" tables in a tablespace defined on a ramdisk meet this need? So the functionality/accessibility of a physical table is provided, along with the performance of a filesystem actually residing in memory. Presumeably viable ifyou have the memory to spare & know the size of the temp tables won't exceed this. You could also mount a tablespace on a physical disk with a filesystem which has delayed/deferred writes to disk, so thatif it is created & deleted quickly enough, it is never actually written to disk, but just generally sits in the cache. Cheers, Brent Wood >>> Bill Moran <wmoran@collaborativefusion.com> 06/06/08 8:01 AM >>> In response to Tim Tassonis <timtas@cubic.ch>: > > Bill Moran wrote: > > In response to Tim Tassonis <timtas@cubic.ch>: > > > >> > >> Now, with apache/php in a mpm environment, I have no guarantee that a > >> user will get the same postgresql session for a subsequent request, thus > >> he will not see the temporary table. > >> > >> Is there a way to create temporary tables in another way, so they are > >> visible between sessions, or do I need to create real tables for my > >> purpose? And is the perfomance penalty big for real tables, as they have > >> been written to disk/read from disk? > > > > Build a framework that creates the tables in a special schema, and then > > can access them through any session. Use some method to generate unique > > table names and store the names in the HTTP session. Create some sort > > of garbage collection routines that removes tables when they're no longer > > needed. > > > > The details of exactly how you pull this off are going to depend heavily > > on the rest of your application architecture. > > > > What you describe is what I referred to as "create real tables". I've > done that and it works, but I wondered if there's something similar > built in postgres apart from classical temporary tables. Not that I'm aware of. If you keep the mailing list in the CC, others can answer as well. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ wmoran@collaborativefusion.com Phone: 412-422-3463x4023 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
On 2008-06-06 07:25, Brent Wood wrote:
> Would "real" tables in a tablespace defined on a ramdisk meet this
> need?
Bad idea. This would mean an unusable database after a restart.
> You could also mount a tablespace on a physical disk with a
> filesystem which has delayed/deferred writes to disk
Not a vary good idea. This could mean an unusable database after a crash
or power failure.
Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
Winnie the Pooh
Tomasz Ostrowski wrote: > On 2008-06-06 07:25, Brent Wood wrote: > >> Would "real" tables in a tablespace defined on a ramdisk meet this >> need? > > Bad idea. This would mean an unusable database after a restart. Funnily, I was thinking the same this night, somehow defining a tablespace on tmpfs or something, so I really liked Brents suggestion. Sadly, it seems this would have some real downsides... > >> You could also mount a tablespace on a physical disk with a >> filesystem which has delayed/deferred writes to disk > > Not a vary good idea. This could mean an unusable database after a crash > or power failure. > > Regards > Tometzky