Обсуждение: logging stuff
(Responding to the deafening silence regarding my posts a couple of days ago about logging dbnames and disconnections) ;-) The dbname patch is now done. If nobody objects to the format ("[db:yourdbname]") I'll submit it - I did it that way to make it fairly easy to split a log file based on it, although you would have to be careful with multiline log entries such as query strings. It is intentionally minimalist. I had some thoughts about logging disconnections - I can see a way to do it via an on_proc_exit handler, I think. Then I started wondering if it might be useful to log session times instead of just noting a disconnect and letting the user have to calculate the time. But I won't bother with this if there's no interest. *I* have no current use for it, but I could well imagine others might. (I might too in the future if I wanted to debug my connection pooling app). cheers andrew
On Tue, 5 Aug 2003, Andrew Dunstan wrote: > > (Responding to the deafening silence regarding my posts a couple of days > ago about logging dbnames and disconnections) ;-) > > The dbname patch is now done. If nobody objects to the format > ("[db:yourdbname]") I'll submit it - I did it that way to make it fairly > easy to split a log file based on it, although you would have to be > careful with multiline log entries such as query strings. It is > intentionally minimalist. > > I had some thoughts about logging disconnections - I can see a way to do > it via an on_proc_exit handler, I think. Then I started wondering if it > might be useful to log session times instead of just noting a disconnect > and letting the user have to calculate the time. > > But I won't bother with this if there's no interest. *I* have no current > use for it, but I could well imagine others might. (I might too in the > future if I wanted to debug my connection pooling app). Actually, I'd certainly like to see it done (both dbname and disconnect). I'd guess the deafening silence was more because of no objctions than lack of interest. I know for me it was.
I think we need a more general variable that can take several values, separated by commas, like: log_line: dbname,user or something like that. In fact, looking at the postgresql.conf file, I see only two setting that print on every line: log_pid and log_timestamp. Perhaps those two should be merged into log_line. Of course, this is all for 7.5. --------------------------------------------------------------------------- scott.marlowe wrote: > On Tue, 5 Aug 2003, Andrew Dunstan wrote: > > > > > (Responding to the deafening silence regarding my posts a couple of days > > ago about logging dbnames and disconnections) ;-) > > > > The dbname patch is now done. If nobody objects to the format > > ("[db:yourdbname]") I'll submit it - I did it that way to make it fairly > > easy to split a log file based on it, although you would have to be > > careful with multiline log entries such as query strings. It is > > intentionally minimalist. > > > > I had some thoughts about logging disconnections - I can see a way to do > > it via an on_proc_exit handler, I think. Then I started wondering if it > > might be useful to log session times instead of just noting a disconnect > > and letting the user have to calculate the time. > > > > But I won't bother with this if there's no interest. *I* have no current > > use for it, but I could well imagine others might. (I might too in the > > future if I wanted to debug my connection pooling app). > > Actually, I'd certainly like to see it done (both dbname and disconnect). > > I'd guess the deafening silence was more because of no objctions than lack > of interest. I know for me it was. > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
It's more work for a very small cosmetic benefit, ISTM. Setting up atomic GUC variables is trivially easy, I'm happy to say. Not that I can't write a simple parser, but why bother for something so tiny? Also, would we be saying (or implying) that the order given on the line would affect the output? If so, that could make LOTS more work, again for very small benefit, I think. (Yes, of course this is for 7.5). cheers andrew Bruce Momjian wrote: >I think we need a more general variable that can take several values, >separated by commas, like: > > log_line: dbname,user > >or something like that. > >In fact, looking at the postgresql.conf file, I see only two setting >that print on every line: log_pid and log_timestamp. Perhaps those two >should be merged into log_line. > >Of course, this is all for 7.5. > > >
Andrew Dunstan <andrew@dunslane.net> writes: > The dbname patch is now done. If nobody objects to the format > ("[db:yourdbname]") I'll submit it - I did it that way to make it fairly > easy to split a log file based on it, although you would have to be > careful with multiline log entries such as query strings. I'd tend to just put the dbname in a known column, and not bother with the decoration --- decoration adds up fast when it's on every log line, and I don't think it helps make the log more parseable. Compare what we do with timestamps and pids. regards, tom lane
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I think we need a more general variable that can take several values, > separated by commas, like: > log_line: dbname,user > or something like that. Strikes me as overkill --- adding two more booleans comparable to log_pid would do the job just as well with lots less coding effort. regards, tom lane
Not quite the same - timestamps and pids have known formats, while db names are almost arbitrary. I know including spaces in names is horrible to my *nix way of thinking, but others might not have my prejudices. (interesting question - what characters are NOT allowed in a database name?). BTW, we're talking about 5 extra chars per line here. I know it adds up, but these days disk space is cheap and plentiful - for a million log lines we're still only talking 5Mb (says he whose first machine had a whopping 20Mb of disk space, and who paid hundreds of (aussie) dollars for the extra 10 Mb.) Syslog puts [] around pids, and typically has a lot of redundancy. I'm prepared to be guided by concensus, though. cheers andrew Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>The dbname patch is now done. If nobody objects to the format >>("[db:yourdbname]") I'll submit it - I did it that way to make it fairly >>easy to split a log file based on it, although you would have to be >>careful with multiline log entries such as query strings. >> >> > >I'd tend to just put the dbname in a known column, and not bother with >the decoration --- decoration adds up fast when it's on every log line, >and I don't think it helps make the log more parseable. Compare what we >do with timestamps and pids. > > regards, tom lane > > >
Andrew Dunstan <andrew@dunslane.net> writes: > I'm prepared to be guided by concensus, though. I'm not dead set on it either, just wanted to raise a flag. Who else has an opinion? regards, tom lane
Tom, > > I'm prepared to be guided by concensus, though. > > I'm not dead set on it either, just wanted to raise a flag. Who else > has an opinion? From my perspective, we could really use a "delimiter" between the "fields" of log output which is unlikely to appear within those fields instead of parsing by character count, rather than making dbname a special case. Or do we already do this and I'm not parsing my log right? -- Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: > From my perspective, we could really use a "delimiter" between the > "fields" of log output which is unlikely to appear within those fields > instead of parsing by character count, rather than making dbname a > special case. Well, Andrew was arguing that space is a likely character in dbnames, but then again it's possible to put "]" into dbnames. I think the only way to make this completely unambiguous would be to introduce a quoting convention for dbnames (and usernames when those get added). But if the goal is to allow simple parsing of log entries, that won't improve matters at all. My feeling about it is that DBAs who want to automatically parse log entries can just forbid spaces in the dbnames and usernames they allow. The KISS principle applies here. A variant (which'd be okay with me) is to separate these fields with tabs instead of spaces; then the rule for DBAs would be "don't allow tabs in db/user names". regards, tom lane
--On Tuesday, August 05, 2003 16:27:55 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote: > Josh Berkus <josh@agliodbs.com> writes: >> From my perspective, we could really use a "delimiter" between the >> "fields" of log output which is unlikely to appear within those fields >> instead of parsing by character count, rather than making dbname a >> special case. > > Well, Andrew was arguing that space is a likely character in dbnames, > but then again it's possible to put "]" into dbnames. I think the only > way to make this completely unambiguous would be to introduce a quoting > convention for dbnames (and usernames when those get added). But if > the goal is to allow simple parsing of log entries, that won't improve > matters at all. > > My feeling about it is that DBAs who want to automatically parse log > entries can just forbid spaces in the dbnames and usernames they allow. > The KISS principle applies here. > > A variant (which'd be okay with me) is to separate these fields with > tabs instead of spaces; then the rule for DBAs would be "don't allow > tabs in db/user names". Tabs cause issues in syslog output if I remember correctly. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
Larry Rosenman <ler@lerctr.org> writes: > --On Tuesday, August 05, 2003 16:27:55 -0400 Tom Lane <tgl@sss.pgh.pa.us> >> A variant (which'd be okay with me) is to separate these fields with >> tabs instead of spaces; then the rule for DBAs would be "don't allow >> tabs in db/user names". > Tabs cause issues in syslog output if I remember correctly. Really? What kind of issues? We're already quite likely to be sticking tabs into the log output, e.g. if log_statement is turned on. regards, tom lane
--On Tuesday, August 05, 2003 16:41:34 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote: > Larry Rosenman <ler@lerctr.org> writes: >> --On Tuesday, August 05, 2003 16:27:55 -0400 Tom Lane >> <tgl@sss.pgh.pa.us> >>> A variant (which'd be okay with me) is to separate these fields with >>> tabs instead of spaces; then the rule for DBAs would be "don't allow >>> tabs in db/user names". > >> Tabs cause issues in syslog output if I remember correctly. > > Really? What kind of issues? We're already quite likely to be sticking > tabs into the log output, e.g. if log_statement is turned on. I remember dealing with \n's, and don't recall if \t is a problem as well in SCO's syslogd. I'll play tonite, and see what it does/doesn't deal with. LER > > regards, tom lane -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
My issue is that I think there are reasonable people might want username, dbname, hostname, host ip, and host port on the log lines. This is the information that we currently report during a connection, if enabled. I have no problem adding those five booleans if people prefer booleans --- my big point is that dbname isn't the only possible thing folks would want. And of course, we already have pid and timestamp, so once we are done, we will have seven possible data items on each line, and with booleans there will be no control over their order on the line. --------------------------------------------------------------------------- Andrew Dunstan wrote: > Not quite the same - timestamps and pids have known formats, while db > names are almost arbitrary. I know including spaces in names is horrible > to my *nix way of thinking, but others might not have my prejudices. > (interesting question - what characters are NOT allowed in a database > name?). > > BTW, we're talking about 5 extra chars per line here. I know it adds up, > but these days disk space is cheap and plentiful - for a million log > lines we're still only talking 5Mb (says he whose first machine had a > whopping 20Mb of disk space, and who paid hundreds of (aussie) dollars > for the extra 10 Mb.) > > Syslog puts [] around pids, and typically has a lot of redundancy. > > I'm prepared to be guided by concensus, though. > > cheers > > andrew > > > Tom Lane wrote: > > >Andrew Dunstan <andrew@dunslane.net> writes: > > > > > >>The dbname patch is now done. If nobody objects to the format > >>("[db:yourdbname]") I'll submit it - I did it that way to make it fairly > >>easy to split a log file based on it, although you would have to be > >>careful with multiline log entries such as query strings. > >> > >> > > > >I'd tend to just put the dbname in a known column, and not bother with > >the decoration --- decoration adds up fast when it's on every log line, > >and I don't think it helps make the log more parseable. Compare what we > >do with timestamps and pids. > > > > regards, tom lane > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Another interesting idea, if we do a single log variable with parameters separated by commas, is to allow some syntax where you could specify the delimiter between fields, so it could be: log_line: "|", dbname, username or log_line: " ", dbname, username --------------------------------------------------------------------------- Josh Berkus wrote: > Tom, > > > > I'm prepared to be guided by concensus, though. > > > > I'm not dead set on it either, just wanted to raise a flag. Who else > > has an opinion? > > >From my perspective, we could really use a "delimiter" between the "fields" of > log output which is unlikely to appear within those fields instead of parsing > by character count, rather than making dbname a special case. > > Or do we already do this and I'm not parsing my log right? > > -- > Josh Berkus > Aglio Database Solutions > San Francisco > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
If we're looking at this, we might want to look at how apache does it with it's customlog feature. This allows you to first define custom log types, then set them according to which virtual server you're setting up. I could see that being nice so you could create a couple of different custom log types, then use one for one database, and another for a different database. For people running large numbers of databases on a single cluster, it might be quite useful to be able to log each database differently. On Tue, 5 Aug 2003, Bruce Momjian wrote: > > Another interesting idea, if we do a single log variable with parameters > separated by commas, is to allow some syntax where you could specify the > delimiter between fields, so it could be: > > log_line: "|", dbname, username > > or > > log_line: " ", dbname, username > > --------------------------------------------------------------------------- > > Josh Berkus wrote: > > Tom, > > > > > > I'm prepared to be guided by concensus, though. > > > > > > I'm not dead set on it either, just wanted to raise a flag. Who else > > > has an opinion? > > > > >From my perspective, we could really use a "delimiter" between the "fields" of > > log output which is unlikely to appear within those fields instead of parsing > > by character count, rather than making dbname a special case. > > > > Or do we already do this and I'm not parsing my log right? > > > > -- > > Josh Berkus > > Aglio Database Solutions > > San Francisco > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 8: explain analyze is your friend > > > >
Bruce Momjian <pgman@candle.pha.pa.us> writes: > ... And of course, we already have pid and timestamp, so once > we are done, we will have seven possible data items on each line, and > with booleans there will be no control over their order on the line. Which is exactly the way I want it ;-). I can't see any use that would justify the amount of extra logic needed to allow user-specified ordering of the entries. This feature discussion seems to be degenerating into a gild-the-lily contest ... regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > ... And of course, we already have pid and timestamp, so once > > we are done, we will have seven possible data items on each line, and > > with booleans there will be no control over their order on the line. > > Which is exactly the way I want it ;-). I can't see any use that would > justify the amount of extra logic needed to allow user-specified > ordering of the entries. This feature discussion seems to be > degenerating into a gild-the-lily contest ... Depends if someone needs a lilly, though I have not heard anyone say they do. ;-) Adding several new variables is fine, but what do we call the hostname option if we already have log_hostname? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> I could see that being nice so you could create a couple of different > custom log types, then use one for one database, and another for a > different database. Point them to different log files and you've got yourself a great feature. > For people running large numbers of databases on a single cluster, it > might be quite useful to be able to log each database differently. You bet. Hosting companies give clients the option to pick up their weblogs. This would be a good feature for those. Make the log come out similarly as apache, and you've got yourself some webalizer screens with funny URLs (queries ;).
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Adding several new variables is fine, but what do we call the hostname > option if we already have log_hostname? <shrug> We've renamed GUC variables before for consistency. I'd opt for picking names that show the common purpose, maybe log_line_FOO? regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Adding several new variables is fine, but what do we call the hostname > > option if we already have log_hostname? > > <shrug> We've renamed GUC variables before for consistency. I'd opt > for picking names that show the common purpose, maybe log_line_FOO? That's what I was thinking, that we might rename log_pid and log_timestamp to log_line_pid and log_line_timestamp to indicate it is printed on every line, and have the new booleans follow that pattern. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Josh Berkus <josh@agliodbs.com> writes: > > From my perspective, we could really use a "delimiter" between the > > "fields" of log output which is unlikely to appear within those fields > > instead of parsing by character count, rather than making dbname a > > special case. > > Well, Andrew was arguing that space is a likely character in dbnames, > but then again it's possible to put "]" into dbnames. I think the only > way to make this completely unambiguous would be to introduce a quoting > convention for dbnames (and usernames when those get added). But if > the goal is to allow simple parsing of log entries, that won't improve > matters at all. > > My feeling about it is that DBAs who want to automatically parse log > entries can just forbid spaces in the dbnames and usernames they allow. > The KISS principle applies here. > > A variant (which'd be okay with me) is to separate these fields with > tabs instead of spaces; then the rule for DBAs would be "don't allow > tabs in db/user names". The tabs are not properly treated by syslogd, look this piece of log: Aug 6 01:02:16 dell-01 postgres[19490]: [157296] LOG: connection received: host=[local] Aug 6 01:02:16 dell-01 postgres[19490]: [157297] LOG: connection authorized: user=kalman database=kalman Aug 6 01:02:16 dell-01 postgres[19490]: [157298] DEBUG: /usr/bin/postmaster child[19490]: starting with ( Aug 6 01:02:16 dell-01 postgres[19490]: [157299] DEBUG: ^Ipostgres Aug 6 01:02:16 dell-01 postgres[19490]: [157300] DEBUG: ^I-v131072 Aug 6 01:02:16 dell-01 postgres[19490]: [157301] DEBUG: ^I-p Aug 6 01:02:16 dell-01 postgres[19490]: [157302] DEBUG: ^Ikalman Aug 6 01:02:16 dell-01 postgres[19490]: [157303] DEBUG: ) do you see: ^I that are "chars" that syslogd hate. Regards Gaeatano Mendola
"Mendola Gaetano" <mendola@bigfoot.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> A variant (which'd be okay with me) is to separate these fields with >> tabs instead of spaces; then the rule for DBAs would be "don't allow >> tabs in db/user names". > do you see: ^I that are "chars" that syslogd hate. Ugh. Okay, scratch that idea ... regards, tom lane
Gathering some of the threads, here's what I see: 1. There is some demand for per database logging. Doing that is rather beyond the scope of what I intended - in fact what I intended was to enable sensible log splitting out of band. I'll even write a perl script to do it for you if you like :-) 2. There is a suggestion of custom log formats, beyond the current on/off flags used, similar to Apache's logging mechanism. Of course, there are some problems - web logs cover one thing - http transactions. Even there, custom logging has led people to grief, and led to the common logging format that web log analysers usually use. And Pg logs cover various kinds of events. Some don't have a sensible db name, for example (like server startup logs messages). Writing code that would allow administrators to specify the log format almost arbitrarily, or even just to order the fields arbitrarily, would be quite a bit of work, and I suspect a constant source of irritating bugs, and I have no interest in doing the work. There are far more valuable things that need to be done. 3. There is debate about field delimiters in the logs. Currently, timestamp when used is a fixed length field at the start of the record, and pid when used is inside [ ]. This is complicated in the case of dbname by the facts that a) dbnames can be almost anything and b) not all records will have a (sensible) dbname to report. I chose [ ] originally to conform to what was done with pid, but now I'm leaning towards something else - too many M$ people have used [ ] in db work for years, so it is likely to crop up in names somehow. Maybe < > or | | - whatever seems least liketo to occur inside a dbname. I want to logs to be readable as well as parseable, or I'd use something like ^ or ~ or `. The other reason to move away from [ ] is that Tom (quite reasonably) wants to keep the char count down, and that would let me drop the "db:" prefix, so the fixed overhead would be 3 chars per line instead of 6 (if you count the trailing space). The bottom line for me is to get the info needed in the logs, and then make them look pretty, split them, or whatever, out of band - that isn't (and shouldn't, IMNSHO) be the server's job. andrew