Обсуждение: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
TLDR; If I spend the time necessary to instrument the many functions that are the equivalent of the Oracle counterparts, would anyone pull those changes and use them? Specifically, for those who know Oracle, I'm talking about implementing:
- The portion of the ALTER SESSION that enables extended SQL trace
- Most of the DBMS_MONITOR and DBMS_APPLICATION_INFO packages
- Instrument the thousand or so functions that are the equivalent of those found in Oracle's V$EVENT_NAME
- Dynamic performance view V$DIAG_INFO
For the last 35 years, I've made my living helping people solve Oracle performance problems by looking at it, which means:
Trace a user experience and profile the trace file to (a) reveal where the time has gone and its algorithm and (b) make it easy to imagine the cost of possible solutions as well as the savings in response time or resources.
I've even submitted change requests to improve Oracle's tracing features while working for them and since those glorious five years.
Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list.
I've come to this point because I see many roadblocks for users who want to see a detailed "receipt" for their response time. The biggest roadblock is that without a lot of automation, a user of any kind must log into the server and attempt to get the data that are now traditionally child's play for Oracle. The second biggest roadblock I see is the recompilation that is required for the server components (i.e., postgreSQL, operating system). My initial attempts to get anything useful out of postgreSQL were dismal failures and I think it should be infinitely easier.
Running either dtrace and eBPF scripts on the server should not be required. The instrumentation and the code being instrumented should be tightly coupled. Doing so will allow anyone on any platform for any PostgreSQL version to get a trace file just as easily as people do for Oracle.
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On 10/4/21 02:34, Laurenz Albe wrote: > On Fri, 2021-10-01 at 15:06 -0500, Jeff Holt wrote: >> TLDR; If I spend the time necessary to instrument the many functions that are the equivalent >> of the Oracle counterparts, would anyone pull those changes and use them? >> Specifically, for those who know Oracle, I'm talking about implementing: >> 1. The portion of the ALTER SESSION that enables extended SQL trace >> 2. Most of the DBMS_MONITOR and DBMS_APPLICATION_INFO packages >> 3. Instrument the thousand or so functions that are the equivalent of those found in Oracle's V$EVENT_NAME >> 4. Dynamic performance view V$DIAG_INFO >> For the last 35 years, I've made my living helping people solve Oracle performance problems by looking at it >> > [...] >> Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list. > Anything that improves user experience in that respect is welcome, but consider > that each database has different approaches to solve the same problems. > > Before you go to the length of implementing a lot of stuff, check in with > the -hackers list and discuss your ideas. > > Please be a lot more specific than in this e-mail. While it is certainly > fine to sketch your ambitios vision, focus on one specific thing you can > imagine implementing and come up with a design for that. > > Note that "Oracle has it" is not a good enough reason for a PostgreSQL > feature. We think we can do better than they do (at least in many respects). > Also, don't assume that everyone on the -hackers list will be familiar with > certain PostgreSQL features. > > One think that you should keep in mind is that Oracle has to provide different > features in that area because they are not open source. In PostgreSQL, I can > simply read the code or attach a debugger to a backend, and when it comes to > profiling, "perf" works pretty well. So there is less need for these things. > > I don't want to discourage you, but contributing to PostgreSQL can be a lengthy > and tedious process. On the upside, things that make it into core are usually > fairly mature. > > Yours, > Laurenz Albe Laurenz, you are obviously not aware who are you talking to. Let me introduce you: Cary Millsap and Jeff Holt are authors of the "Optimizing Oracle for Performance", one of the most influential books in the entire realm of Oracle literature. The book describes the method of tuning Oracle applications by examining where are they spending time and what are they waiting for. The book can be found on Amazon and I would seriously advise you to read it: https://www.amazon.com/Optimizing-Oracle-Performance-Practitioners-Response-ebook/dp/B00BJ9A8SU/ref=sr_1_1?dchild=1&keywords=Optimizing+Oracle+for+Performance&qid=1633395886&s=books&sr=1-1 Haughty lectures about "Oracle has it" not being good enough could hardly be more out of place here. To put it as politely as is possible in this case, shut your pie hole. What Jeff is asking for is not something that "Oracle has", it's something that customers want. That was the case few years ago when I was asking for the optimizer hints. I was castigated by the former pastry baker turned Postgres guru and my reaction was simple: I threw Postgres out of the company that I was a working for as the lead DBA. You see, customer is always right, whether the database is open source or not. Needless to say, Postgres has optimizer hints these days. It still has them in "we do not want" part of the Wiki, which is hilarious. You see, without proper event instrumentation, and knowing where the application spends time, it is not possible to exactly tune that application. Oracle used to have a witchcraft based lore like that, where the performance was estimated, based on buffer cache hit ratio, the famous "BCHR". That was known as "Method C". The name comes from Cary's and Jeff's book. Jeff and Cary are the ones who made the BCHR based black magic - obsolete. In other words, Jeff is asking for a method to fine tune the applications with precision. Instead of being an a....rrogant person, you should have given him the answer: https://github.com/postgrespro/pg_wait_sampling Postgres already has an extension which implements around 60% of what Oracle has. Of course, Oracle's mechanism is somewhat more refined but it is also 20 years older. Cary Millsap, Anjo Kolk and Jeff Holt were implementing the instrumentation for Oracle 7. There was a huge pile of paper, printed off Metalink, a predecessor of "My Oracle Support", describing Oracle 7 events and explaining what Oracle was actually waiting for. At that time Cary Millsap was a VP in Oracle development. The book came out for Oracle8. You see, Jeff Holt really knows what he's asking for. You are the ignorant one, the one who engaged in talking at Jeff, not knowing that there already is an answer. There is no shame in not knowing something, people ask questions all the time. Arrogantly talking at someone and giving unsolicited lectures in what is appropriate and what is not is another thing altogether. Finally, about the tone of this message: you really pissed me off. I had to restrain myself from using even stronger language, that was surprisingly hard to do. I wouldn't be surprised to see you giving haughty lectures about programming to Brian Kernighan or Dennis Ritchie. And yes, those two have allegedly also written a book. Regards -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
Mladen, Shame on u lecturing a top notch guy in the PostgreSQL world, Laurenz Albe. I think Laurenz knows “a little bit” about Oraclehaving written the popular extension, fdw_oracle, among his many other contributions to the PG world. So ironic thatLaurenz was just named “PostgReSQL person of the week”, and then has to be subjected to this “tirade” of yours! Follow the PG protocol in submitting your change requests to core PG and stop your Bitchin! Michael Vitale Sent from my iPad > On Oct 4, 2021, at 9:51 PM, Mladen Gogala <gogala.mladen@gmail.com> wrote: > > >> On 10/4/21 02:34, Laurenz Albe wrote: >>> On Fri, 2021-10-01 at 15:06 -0500, Jeff Holt wrote: >>> TLDR; If I spend the time necessary to instrument the many functions that are the equivalent >>> of the Oracle counterparts, would anyone pull those changes and use them? >>> Specifically, for those who know Oracle, I'm talking about implementing: >>> 1. The portion of the ALTER SESSION that enables extended SQL trace >>> 2. Most of the DBMS_MONITOR and DBMS_APPLICATION_INFO packages >>> 3. Instrument the thousand or so functions that are the equivalent of those found in Oracle's V$EVENT_NAME >>> 4. Dynamic performance view V$DIAG_INFO >>> For the last 35 years, I've made my living helping people solve Oracle performance problems by looking at it >>> >> [...] >>> Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list. >> Anything that improves user experience in that respect is welcome, but consider >> that each database has different approaches to solve the same problems. >> >> Before you go to the length of implementing a lot of stuff, check in with >> the -hackers list and discuss your ideas. >> >> Please be a lot more specific than in this e-mail. While it is certainly >> fine to sketch your ambitios vision, focus on one specific thing you can >> imagine implementing and come up with a design for that. >> >> Note that "Oracle has it" is not a good enough reason for a PostgreSQL >> feature. We think we can do better than they do (at least in many respects). >> Also, don't assume that everyone on the -hackers list will be familiar with >> certain PostgreSQL features. >> >> One think that you should keep in mind is that Oracle has to provide different >> features in that area because they are not open source. In PostgreSQL, I can >> simply read the code or attach a debugger to a backend, and when it comes to >> profiling, "perf" works pretty well. So there is less need for these things. >> >> I don't want to discourage you, but contributing to PostgreSQL can be a lengthy >> and tedious process. On the upside, things that make it into core are usually >> fairly mature. >> >> Yours, >> Laurenz Albe > > Laurenz, you are obviously not aware who are you talking to. Let me introduce you: Cary Millsap and Jeff Holt are authorsof the "Optimizing Oracle for Performance", one of the most influential books in the entire realm of Oracle literature. The book describes the method of tuning Oracle applications by examining where are they spending time and whatare they waiting for. The book can be found on Amazon and I would seriously advise you to read it: > > https://www.amazon.com/Optimizing-Oracle-Performance-Practitioners-Response-ebook/dp/B00BJ9A8SU/ref=sr_1_1?dchild=1&keywords=Optimizing+Oracle+for+Performance&qid=1633395886&s=books&sr=1-1 > > Haughty lectures about "Oracle has it" not being good enough could hardly be more out of place here. To put it as politelyas is possible in this case, shut your pie hole. What Jeff is asking for is not something that "Oracle has", it'ssomething that customers want. That was the case few years ago when I was asking for the optimizer hints. I was castigatedby the former pastry baker turned Postgres guru and my reaction was simple: I threw Postgres out of the companythat I was a working for as the lead DBA. You see, customer is always right, whether the database is open source ornot. Needless to say, Postgres has optimizer hints these days. It still has them in "we do not want" part of the Wiki,which is hilarious. > > You see, without proper event instrumentation, and knowing where the application spends time, it is not possible to exactlytune that application. Oracle used to have a witchcraft based lore like that, where the performance was estimated,based on buffer cache hit ratio, the famous "BCHR". That was known as "Method C". The name comes from Cary's andJeff's book. Jeff and Cary are the ones who made the BCHR based black magic - obsolete. > > In other words, Jeff is asking for a method to fine tune the applications with precision. Instead of being an a....rrogantperson, you should have given him the answer: > > https://github.com/postgrespro/pg_wait_sampling > > Postgres already has an extension which implements around 60% of what Oracle has. Of course, Oracle's mechanism is somewhatmore refined but it is also 20 years older. Cary Millsap, Anjo Kolk and Jeff Holt were implementing the instrumentationfor Oracle 7. There was a huge pile of paper, printed off Metalink, a predecessor of "My Oracle Support",describing Oracle 7 events and explaining what Oracle was actually waiting for. At that time Cary Millsap was aVP in Oracle development. The book came out for Oracle8. You see, Jeff Holt really knows what he's asking for. You are theignorant one, the one who engaged in talking at Jeff, not knowing that there already is an answer. There is no shame innot knowing something, people ask questions all the time. Arrogantly talking at someone and giving unsolicited lecturesin what is appropriate and what is not is another thing altogether. > > Finally, about the tone of this message: you really pissed me off. I had to restrain myself from using even stronger language,that was surprisingly hard to do. I wouldn't be surprised to see you giving haughty lectures about programming toBrian Kernighan or Dennis Ritchie. And yes, those two have allegedly also written a book. > > Regards > > > -- > Mladen Gogala > Database Consultant > Tel: (347) 321-1217 > https://dbwhisperer.wordpress.com > > >
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Mon, Oct 4, 2021 at 6:51 PM Mladen Gogala <gogala.mladen@gmail.com> wrote: > Haughty lectures about "Oracle has it" not being good enough could > hardly be more out of place here. To put it as politely as is possible > in this case, shut your pie hole. What Jeff is asking for is not > something that "Oracle has", it's something that customers want. That > was the case few years ago when I was asking for the optimizer hints. I > was castigated by the former pastry baker turned Postgres guru and my > reaction was simple: I threw Postgres out of the company that I was a > working for as the lead DBA. You see, customer is always right, whether > the database is open source or not. Needless to say, Postgres has > optimizer hints these days. It still has them in "we do not want" part > of the Wiki, which is hilarious. In all sincerity: Chill out. I don't think that this is worth getting into an argument over. I think that there is a good chance that you'd have had a much better experience if the conversation had been in person. Text has a way of losing a lot of important nuance. I have personally met and enjoyed talking to quite a few people that personally worked on Oracle, in various capacities -- the world of database internals experts is not huge. I find Tanel Poder very interesting, too -- never met the man, but we follow each other on Twitter. Oracle is a system that has some interesting properties in general (not just as a counterpoint to PostgreSQL), and I definitely respect it. Same with SQL Server. There are lots of smart and accomplished people in the world, including Jeff. I think that it's worth understanding these differences in perspective. There is likely to be merit in all of the specific points made by both Laurenze and Jeff. They may not be irreconcilable, or anything like it. -- Peter Geoghegan
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On 10/4/21 22:25, Michaeldba@sqlexec.com wrote: > Mladen, > > Shame on u lecturing a top notch guy in the PostgreSQL world, Laurenz Albe. I think Laurenz knows “a little bit” aboutOracle having written the popular extension, fdw_oracle, among his many other contributions to the PG world. So ironicthat Laurenz was just named “PostgReSQL person of the week”, and then has to be subjected to this “tirade” of yours! > > Follow the PG protocol in submitting your change requests to core PG and stop your Bitchin! > > Michael Vitale First, a matter of format: please don't top-post. Replies go under the original posts. That's an unwritten rule, but a very time honored one. Second, I know very well who Laurenz Albe is. We have met on the oracle-l few decades ago. Third, I think that my reproach to Laurenz's tone is very justified. You don't say "the argument that Python has it is not good enough" to Dennis Ritchie. Hopefully, you get my analogy, but one cannot ever be sure. Last, I didn't request any new features from the Postgres community. That's a mistake that I'll never commit again. Last time I tried, this has happened: https://www.toolbox.com/tech/data-management/blogs/why-postgresql-doesnt-have-query-hints-020411/ I still keep it in my bookmark folder, under "Humor". I used that article several times on the oracle-l as an illustration some properties of Postgres community. That article was a gift and I am sincerely grateful. Of course, PostgreSQL now has query hints. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On 10/4/21 23:08, Peter Geoghegan wrote: > n all sincerity: Chill out. I don't think that this is worth getting > into an argument over. I think that there is a good chance that you'd > have had a much better experience if the conversation had been in > person. Text has a way of losing a lot of important nuance. > > I have personally met and enjoyed talking to quite a few people that > personally worked on Oracle, in various capacities -- the world of > database internals experts is not huge. I find Tanel Poder very > interesting, too -- never met the man, but we follow each other on > Twitter. Oracle is a system that has some interesting properties in > general (not just as a counterpoint to PostgreSQL), and I definitely > respect it. Same with SQL Server. > > There are lots of smart and accomplished people in the world, > including Jeff. I think that it's worth understanding these > differences in perspective. There is likely to be merit in all of the > specific points made by both Laurenze and Jeff. They may not be > irreconcilable, or anything like it. What angered me was the presumptuous tone of voice directed to an Oracle legend. I have probably talked to many more Oracle people than you, including Tanel, whom I have met personally. I am not on Twitter, unfortunately I am older than 20. Before you ask, I am not on Instagram, Facebook or Tiktok. I am not on OnlyFans either. I have never understood the need to share one's every thought in real time. Being rather private person has probably stymied my career of an internet influencer. I'll never rival Kim Kardashian. As for Jeff Holt, I believe that a person of his stature needs to be taken seriously and not lectured "how are things done in Postgres community". I am rather confused by the thinly veiled hostility toward Oracle. In my opinion, Postgres community should be rather welcoming to Oracle people like Frits Hoogland, Frank Pachot or Jeff Holt. After all, we are using Postgres and telling us "you can't have what you used to get from Oracle" is not either encouraging or smart. If you want consulting jobs in big banks and for a decent money, you might just take Oracle people seriously. Have you ever wondered why Oracle has so many customers despite the fact that it's so freakishly expensive? -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Mon, Oct 4, 2021 at 9:04 PM Mladen Gogala <gogala.mladen@gmail.com> wrote: > What angered me was the presumptuous tone of voice directed to an Oracle > legend. I have probably talked to many more Oracle people than you, > including Tanel, whom I have met personally. I am not on Twitter, > unfortunately I am older than 20. Before you ask, I am not on Instagram, > Facebook or Tiktok. I am not on OnlyFans either. I have never understood > the need to share one's every thought in real time. Being rather private > person has probably stymied my career of an internet influencer. I'll > never rival Kim Kardashian. You do seem shy. > As for Jeff Holt, I believe that a person of his stature needs to be > taken seriously and not lectured "how are things done in Postgres > community". I haven't met Jeff Holt either, but I believe that he is also older than 20. I have to imagine that he doesn't particularly need you to defend his honor. > I am rather confused by the thinly veiled hostility toward > Oracle. In my opinion, Postgres community should be rather welcoming to > Oracle people like Frits Hoogland, Frank Pachot or Jeff Holt. After all, > we are using Postgres and telling us "you can't have what you used to > get from Oracle" is not either encouraging or smart. I agree with all that. I am also friendly with Frank, as it happens. I think that Laurenze was just trying to establish common terms of reference for discussion -- it's easy for two groups of people with similar but different terminology to talk past each other. For example, I think that there may be confusion about what is possible with a tool like eBPF today, and what might be possible in an ideal world. -- Peter Geoghegan
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
As for Jeff Holt, I believe that a person of his stature needs to be
taken seriously and not lectured "how are things done in Postgres
community". I am rather confused by the thinly veiled hostility toward
Oracle. In my opinion, Postgres community should be rather welcoming to
Oracle people like Frits Hoogland, Frank Pachot or Jeff Holt.
Just follow the pgsql-hackers list, and you'll see that newbies are very unwelcome,
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
Comments in-line:
have never heard of Jeff Holt, but then there are a lot of wonderful and smart people I have never heard of. I tend to be respectful in my conversation, regardless if I know the other person or not.
That much is apparent. However, that's no excuse.
Haughty lectures about "Oracle has it" not being good enough could hardly be more out of place here.I have no idea how you arrive at the conclusion that I was delivering a haughty lecture. Somebody asked if PostgreSQL would consider applying patches he is ready to write, somebody who seems not to be familiar with the way PostgreSQL development works, so I tried to give helpful pointers.
Your tone of voice did. Plus, you took it on yourself to explain "how things are done in the Postgres community". I always use hints and Josh Berkus as an example "how things are done in the Postgres community" and why is the Postgres progress so slow. You have just provided me another perfect example of the "community spirit".
If you are under impression that I want to take part in a conversation, then you're sorely mistaken. And I have to adjust my style of writing to make things even more clear. As for Jeff, I don't need to 'champion him'. He did that all by himself. In his place, I would simply ignore both this topic and you, Mr. Postgres Community.To put it as politely as is possible in this case, shut your pie hole.I think you have just disqualified yourself from taking part in this conversation. I recommend that you don't embarrass Jeff Holt by trying to champion him.
-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Fri, Oct 1, 2021 at 1:06 PM Jeff Holt <jeff.holt@method-r.com> wrote: > Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list. > > I've come to this point because I see many roadblocks for users who want to see a detailed "receipt" for their responsetime. I have heard of method R. Offhand it seems roughly comparable to something like the Top-down Microarchitecture Analysis Method that low level systems programmers sometimes use, along with Intel's pmu-tools -- at least at a very high level. The point seems to be to provide a workflow that can plausibly zero in on low-level bottlenecks, by providing high level context. Many tricky real world problems are in some sense a high level problem that is disguised as a low level problem. And so all of the pieces need to be present on the board, so to speak. Does that sound accurate? One obvious issue with much of the Postgres instrumentation is that it makes it hard to see how things change over time. I think that that is often *way* more informative than static snapshots. I can see why you'd emphasize the need for PostgreSQL to more or less own the end to end experience for something like this. It doesn't necessarily follow that the underlying implementation cannot make use of infrastructure like eBPF, though. Fast user space probes provably have no overhead, and can be compiled-in by distros that can support it. There hasn't been a consistent effort to make that stuff available, but I doubt that that tells us much about what is possible. The probes that we have today are somewhat of a grab-bag, that aren't particularly useful -- so it's a chicken-and-egg thing. It would probably be helpful if you could describe what you feel is missing in more general terms -- while perhaps giving specific practical examples of specific scenarios that give us some sense of what the strengths of the model are. ISTM that it's not so much a lack of automation in PostgreSQL. It's more like a lack of a generalized model, which includes automation, but also some high level top-down theory. -- Peter Geoghegan
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
Comments in-line On 10/5/21 16:24, Peter Geoghegan wrote: > On Fri, Oct 1, 2021 at 1:06 PM Jeff Holt <jeff.holt@method-r.com> wrote: >> Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list. >> >> I've come to this point because I see many roadblocks for users who want to see a detailed "receipt" for their responsetime. > I have heard of method R. Offhand it seems roughly comparable to > something like the Top-down Microarchitecture Analysis Method that low > level systems programmers sometimes use, along with Intel's pmu-tools > -- at least at a very high level. The point seems to be to provide a > workflow that can plausibly zero in on low-level bottlenecks, by > providing high level context. Many tricky real world problems are in > some sense a high level problem that is disguised as a low level > problem. And so all of the pieces need to be present on the board, so > to speak. > > Does that sound accurate? Yes, that is pretty accurate. It is essentially the same method described in the "High Performance Computing" books. The trick is to figure what the process is waiting for and then reduce the wait times. All computers wait at the same speed. > One obvious issue with much of the Postgres instrumentation is that it > makes it hard to see how things change over time. I think that that is > often *way* more informative than static snapshots. > > I can see why you'd emphasize the need for PostgreSQL to more or less > own the end to end experience for something like this. It doesn't > necessarily follow that the underlying implementation cannot make use > of infrastructure like eBPF, though. Fast user space probes provably > have no overhead, and can be compiled-in by distros that can support > it. There hasn't been a consistent effort to make that stuff > available, but I doubt that that tells us much about what is possible. > The probes that we have today are somewhat of a grab-bag, that aren't > particularly useful -- so it's a chicken-and-egg thing. Not exactly. There already is a very good extension for Postgres called pg_wait_sampling: https://github.com/postgrespro/pg_wait_sampling What is missing here is mostly the documentation. This extension should become a part of Postgres proper and the events should be documented as they are (mostly) documented for Oracle. Oracle uses trace files instead. However, with Postgres equivalence of files and tables, this is not a big difference. > > It would probably be helpful if you could describe what you feel is > missing in more general terms -- while perhaps giving specific > practical examples of specific scenarios that give us some sense of > what the strengths of the model are. ISTM that it's not so much a lack > of automation in PostgreSQL. It's more like a lack of a generalized > model, which includes automation, but also some high level top-down > theory. I am not Jeff and my opinion is not as valuable and doesn't carry the same weight, by far. However, I do believe that we may not see Jeff Holt again on this group so I am providing my opinion instead. At least I would, in Jeff's place, be reluctant to return to this group. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Comments in-line
On 10/5/21 16:24, Peter Geoghegan wrote:
> On Fri, Oct 1, 2021 at 1:06 PM Jeff Holt <jeff.holt@method-r.com> wrote:
>> Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list.
>>
>> I've come to this point because I see many roadblocks for users who want to see a detailed "receipt" for their response time.
> I have heard of method R. Offhand it seems roughly comparable to
> something like the Top-down Microarchitecture Analysis Method that low
> level systems programmers sometimes use, along with Intel's pmu-tools
> -- at least at a very high level. The point seems to be to provide a
> workflow that can plausibly zero in on low-level bottlenecks, by
> providing high level context. Many tricky real world problems are in
> some sense a high level problem that is disguised as a low level
> problem. And so all of the pieces need to be present on the board, so
> to speak.
>
> Does that sound accurate?
Yes, that is pretty accurate. It is essentially the same method
described in the "High Performance Computing" books. The trick is to
figure what the process is waiting for and then reduce the wait times.
All computers wait at the same speed.
> One obvious issue with much of the Postgres instrumentation is that it
> makes it hard to see how things change over time. I think that that is
> often *way* more informative than static snapshots.
>
> I can see why you'd emphasize the need for PostgreSQL to more or less
> own the end to end experience for something like this. It doesn't
> necessarily follow that the underlying implementation cannot make use
> of infrastructure like eBPF, though. Fast user space probes provably
> have no overhead, and can be compiled-in by distros that can support
> it. There hasn't been a consistent effort to make that stuff
> available, but I doubt that that tells us much about what is possible.
> The probes that we have today are somewhat of a grab-bag, that aren't
> particularly useful -- so it's a chicken-and-egg thing.
Not exactly. There already is a very good extension for Postgres called
pg_wait_sampling:
https://github.com/postgrespro/pg_wait_sampling
What is missing here is mostly the documentation. This extension should
become a part of Postgres proper and the events should be documented as
they are (mostly) documented for Oracle. Oracle uses trace files
instead. However, with Postgres equivalence of files and tables, this is
not a big difference.
>
> It would probably be helpful if you could describe what you feel is
> missing in more general terms -- while perhaps giving specific
> practical examples of specific scenarios that give us some sense of
> what the strengths of the model are. ISTM that it's not so much a lack
> of automation in PostgreSQL. It's more like a lack of a generalized
> model, which includes automation, but also some high level top-down
> theory.
I am not Jeff and my opinion is not as valuable and doesn't carry the
same weight, by far. However, I do believe that we may not see Jeff Holt
again on this group so I am providing my opinion instead. At least I
would, in Jeff's place, be reluctant to return to this group.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On 10/5/21 20:02, Tim wrote: > Jeff Holt is probably pretty embarrassed there's some blowhard making > a scene using his name in a casual mailing list thread. Wow! What a contribution to the discussion! Calling me a blowhard, all while top-posting at the same time. Your post will be remembered for generations to come. Or not. Laurenz will probably tell you that we don't top-post in Postgres community. He's good with rules, regulations and the way things are done in Postgres community. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On 10/5/21 13:24, Peter Geoghegan wrote: > On Fri, Oct 1, 2021 at 1:06 PM Jeff Holt <jeff.holt@method-r.com> wrote: >> Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list. >> >> I've come to this point because I see many roadblocks for users who want to see a detailed "receipt" for their responsetime. > > It would probably be helpful if you could describe what you feel is > missing in more general terms -- while perhaps giving specific > practical examples of specific scenarios that give us some sense of > what the strengths of the model are. ISTM that it's not so much a lack > of automation in PostgreSQL. It's more like a lack of a generalized > model, which includes automation, but also some high level top-down > theory. Back in my oracle days, I formally used method-R on a few consulting gigs while working with Hotsos (RIP Gary). Method-R is brilliant, and I referenced it in my PostgreSQL user group talk about wait events in PG. https://www.slideshare.net/ardentperf/wait-whats-going-on-inside-my-database-173880246 I'm not the author of Method-R, but I myself would describe it as a methodical approach to consistently solve business problems rooted in database performance faster than any other methodical approach, built on a foundation of wait events, queuing theory and tracing (aka logging). But the most brilliant part is how Cary Millsap's tireless efforts to simplify, automate and educate have made it accessible to ordinary data analysts and project managers all over the world who speak SQL but not C. PostgreSQL added wait events starting in 9.6 and the last thing that's missing is an integrated way to trace or log them. A simple starting point could be a session-level GUC that enables a hook in pgstat_report_wait_start() and pgstat_report_wait_end() to just drop messages in the log. These log messages could then easily be processed to generate the similar profiles to the ones we used with other databases. Basically I agree 100% with Jeff that while you can do these things with perf probes or eBPF, there are massive advantages to having it baked in the database. With the right tools, this makes session profiling available to regular users (who do their day jobs with excel rather than eBPF). However, one problem to watch out for will be whether the existing PostgreSQL logging infrastructure can handle this. Probably need higher precision timestamps (I need to check what csvlog has), and it could still be a lot of volume with some lightweight locks. Whereas Oracle had each individual process write the wait event trace messages to its own file, today PostgreSQL only supports either the single-system-wide-file logging collector, or syslog which I think can only split to 8 destinations (and may be lossy). There's another use case where high logging bandwidth could also be useful - temporarily logging all SQL statements to capture workload. Next time I see someone take down their production database because the pgBadger doc said "log_min_duration_statement = 0" ... WHY PGBADGER WHY? Anyway I do hope there will be some improvements in this area with PostgreSQL. I'm not much of a C coder but maybe I'll take a swing at it some day! Anyway, Jeff, nice to see you here - and this is a topic I've thought about a lot too. PostgreSQL is a pretty cool bit of software, and an even cooler group of people around it. Hope to see you around some more. :) -Jeremy PS. "tracing versus sampling" was the perpetual debate amongst performance engineers... we could have some good fun debating along those lines too. hold my beer -- http://about.me/jeremy_schneider
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On 10/7/21 22:15, Jeremy Schneider wrote: > On 10/5/21 13:24, Peter Geoghegan wrote: >> On Fri, Oct 1, 2021 at 1:06 PM Jeff Holt <jeff.holt@method-r.com> wrote: >>> Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list. >>> >>> I've come to this point because I see many roadblocks for users who want to see a detailed "receipt" for their responsetime. >> It would probably be helpful if you could describe what you feel is >> missing in more general terms -- while perhaps giving specific >> practical examples of specific scenarios that give us some sense of >> what the strengths of the model are. ISTM that it's not so much a lack >> of automation in PostgreSQL. It's more like a lack of a generalized >> model, which includes automation, but also some high level top-down >> theory. > Back in my oracle days, I formally used method-R on a few consulting > gigs while working with Hotsos (RIP Gary). Method-R is brilliant, and I > referenced it in my PostgreSQL user group talk about wait events in PG. > > https://www.slideshare.net/ardentperf/wait-whats-going-on-inside-my-database-173880246 > > I'm not the author of Method-R, but I myself would describe it as a > methodical approach to consistently solve business problems rooted in > database performance faster than any other methodical approach, built on > a foundation of wait events, queuing theory and tracing (aka logging). > But the most brilliant part is how Cary Millsap's tireless efforts to > simplify, automate and educate have made it accessible to ordinary data > analysts and project managers all over the world who speak SQL but not C. > > PostgreSQL added wait events starting in 9.6 and the last thing that's > missing is an integrated way to trace or log them. A simple starting > point could be a session-level GUC that enables a hook in > pgstat_report_wait_start() and pgstat_report_wait_end() to just drop > messages in the log. These log messages could then easily be processed > to generate the similar profiles to the ones we used with other > databases. Basically I agree 100% with Jeff that while you can do these > things with perf probes or eBPF, there are massive advantages to having > it baked in the database. With the right tools, this makes session > profiling available to regular users (who do their day jobs with excel > rather than eBPF). > > However, one problem to watch out for will be whether the existing > PostgreSQL logging infrastructure can handle this. Probably need higher > precision timestamps (I need to check what csvlog has), and it could > still be a lot of volume with some lightweight locks. Whereas Oracle had > each individual process write the wait event trace messages to its own > file, today PostgreSQL only supports either the single-system-wide-file > logging collector, or syslog which I think can only split to 8 > destinations (and may be lossy). > > There's another use case where high logging bandwidth could also be > useful - temporarily logging all SQL statements to capture workload. > Next time I see someone take down their production database because the > pgBadger doc said "log_min_duration_statement = 0" ... WHY PGBADGER WHY? > > Anyway I do hope there will be some improvements in this area with > PostgreSQL. I'm not much of a C coder but maybe I'll take a swing at it > some day! > > Anyway, Jeff, nice to see you here - and this is a topic I've thought > about a lot too. PostgreSQL is a pretty cool bit of software, and an > even cooler group of people around it. Hope to see you around some more. :) > > -Jeremy > > > PS. "tracing versus sampling" was the perpetual debate amongst > performance engineers... we could have some good fun debating along > those lines too. hold my beer > > Hi Jeremy, There is an extension which does wait event sampling: https://github.com/postgrespro/pg_wait_sampling It's one of the Postgres Pro extensions, I like it a lot. Postgres Pro is getting very popular on the Azure cloud. It's essentially Microsoft response to Aurora. Also EnterpriseDB has the event interface and the views analogous to Oracle: edb$session_wait_history, edb$session_waits and edb$system_waits views are implementing the event interface in Edb. You can look them up in the documentation, the documentation is available on the web. The foundation is already laid, what is needed are the finishing touches, like the detailed event documentation. I am currently engaged in a pilot porting project, porting an application from Oracle to Postgres. I was looking into the event interface in detail. And we are testing the EDB as well. As an Oraclite to Oraclite, I have to commend EDB, it's an excellent piece of software, 75% cheaper than Oracle. I agree with you about the logging capacity. Postgres is very loquacious when it comes to logging. I love that feature because pgBadger reports are even better than the AWR reports. Oracle is very loquacious and verbose too. $ORACLE_BASE/diag/rdbms/.../trace is chock full of trace files plus the alert log, of course. That is why the adrci utility has parameters for the automatic cleanup of the traceand core dump files. Sometimes they did fill the file system. As for the "tracing vs. sampling" debate, Oracle has both. V$ACTIVE_SESSION_HISTORY is a sampling view. Sampling views are more practical, especially when there are pooled connections. Personally, I would prefer sampling. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Oct 7, 2021, at 19:38, Bruce Momjian <bruce@momjian.us> wrote: > > On Thu, Oct 7, 2021 at 07:15:39PM -0700, Jeremy Schneider wrote: >> PostgreSQL added wait events starting in 9.6 and the last thing that's >> missing is an integrated way to trace or log them. A simple starting >> point could be a session-level GUC that enables a hook in >> pgstat_report_wait_start() and pgstat_report_wait_end() to just drop >> messages in the log. These log messages could then easily be processed >> to generate the similar profiles to the ones we used with other >> databases. Basically I agree 100% with Jeff that while you can do these >> things with perf probes or eBPF, there are massive advantages to having >> it baked in the database. With the right tools, this makes session >> profiling available to regular users (who do their day jobs with excel >> rather than eBPF). > > Our wait events reported in pg_stat_activity are really only a first > step --- I always felt it needed an external tool to efficiently collect > and report those wait events. I don't think the server log is the right > place to collect them. What would you think about adding hooks to the functions I mentioned, if someone wrote an open source extension that coulddo things with the wait event start/stop times in a preload library? But we could use parameters too, that’s another gap. For example - which buffer, object, etc for buffer_content? Which filenodeand block for an IO? Which relation OID for a SQL lock? Then you can find which table, whether the hot block is aroot or leaf of a btree, etc. This can be done by extending the wait infra to accept two or three arbitrary “informational”parameters, maybe just numeric for efficiency, or maybe string, and each individual wait event can decidewhat to do with them. We’d want to pass that info out over the hooks too. This is another reason to support wait eventtracing in the DB - sometimes it might be difficult to get all the relevant context with a kernel probe on an externaltool. -Jeremy Sent from my TI-83
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Thu, Oct 7, 2021 at 11:35:16PM -0400, Mladen Gogala wrote: > > On 10/7/21 22:15, Jeremy Schneider wrote: > There is an extension which does wait event sampling: > > https://github.com/postgrespro/pg_wait_sampling > > It's one of the Postgres Pro extensions, I like it a lot. Postgres Pro is > getting very popular on the Azure cloud. It's essentially Microsoft response > to Aurora. Also EnterpriseDB has the event interface and the views analogous > to Oracle: edb$session_wait_history, edb$session_waits and edb$system_waits > views are implementing the event interface in Edb. You can look them up in > the documentation, the documentation is available on the web. The foundation > is already laid, what is needed are the finishing touches, like the detailed > event documentation. I am currently engaged in a pilot porting project, Ah, this is exactly what I wanted to know --- what people are using the event waits for. Can you tell if these are done all externally, or if they need internal database changes? > I agree with you about the logging capacity. Postgres is very loquacious > when it comes to logging. I love that feature because pgBadger reports are > even better than the AWR reports. Oracle is very loquacious and verbose too. Nice, I had not heard that before. > As for the "tracing vs. sampling" debate, Oracle has both. > V$ACTIVE_SESSION_HISTORY is a sampling view. Sampling views are more > practical, especially when there are pooled connections. Personally, I would > prefer sampling. Yes, slide 101 here: https://momjian.us/main/writings/pgsql/administration.pdf#page=101 shows the Postgres monitoring options for reporting and alterting/aggegation. Yes, both are needed for wait event, and right now we really don't have either for wait events --- just the raw information. However, I also need to ask how the wait event information, whether tracing or sampling, can be useful for Postgres because that will drive the solution. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
Bruce Momjian schrieb am 08.10.2021 um 17:21: > However, I also need to ask how the wait event information, whether > tracing or sampling, can be useful for Postgres because that will drive > the solution. I guess everyone will use that information in a different way. We typically use the AWR reports as a post-mortem analysis tool if something goes wrong in our application (=customer specific projects) E.g. if there was a slowdown "last monday" or "saving something took minutes yesterday morning", then we usually request an AWR report from the time span in question. Quite frequently this already reveals the culprit. If not, we ask them to poke in more detail into v$session_history. So in our case it's not really used for active monitoring, but for finding the root cause after the fact. I don't know how representative this usage is though. Thomas
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Fri, Oct 8, 2021 at 05:28:37PM +0200, Thomas Kellerer wrote: > Bruce Momjian schrieb am 08.10.2021 um 17:21: > > However, I also need to ask how the wait event information, whether > > tracing or sampling, can be useful for Postgres because that will drive > > the solution. > > I guess everyone will use that information in a different way. > > We typically use the AWR reports as a post-mortem analysis tool if > something goes wrong in our application (=customer specific projects) > > E.g. if there was a slowdown "last monday" or "saving something took minutes yesterday morning", > then we usually request an AWR report from the time span in question. Quite frequently > this already reveals the culprit. If not, we ask them to poke in more detail into v$session_history. > > So in our case it's not really used for active monitoring, but for > finding the root cause after the fact. > > I don't know how representative this usage is though. OK, that's a good usecase, and something that certainly would apply to Postgres. Don't you often need more than just wait events to find the cause, like system memory usage, total I/O, etc? -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Fri, Oct 8, 2021 at 11:40 PM Bruce Momjian <bruce@momjian.us> wrote: > > On Fri, Oct 8, 2021 at 05:28:37PM +0200, Thomas Kellerer wrote: > > > > We typically use the AWR reports as a post-mortem analysis tool if > > something goes wrong in our application (=customer specific projects) > > > > E.g. if there was a slowdown "last monday" or "saving something took minutes yesterday morning", > > then we usually request an AWR report from the time span in question. Quite frequently > > this already reveals the culprit. If not, we ask them to poke in more detail into v$session_history. > > > > So in our case it's not really used for active monitoring, but for > > finding the root cause after the fact. > > > > I don't know how representative this usage is though. > > OK, that's a good usecase, and something that certainly would apply to > Postgres. Don't you often need more than just wait events to find the > cause, like system memory usage, total I/O, etc? You usually need a variety of metrics to be able to find what is actually causing $random_incident, so the more you can aggregate in your performance tool the better. Wait events are an important piece of that puzzle. As a quick example for wait events, I recently had to diagnose some performance issue, which turned out to be some process reaching the 64 subtransactions with the well known consequences. I had pg_wait_sampling aggregated metrics available so it was really easy to know that the slowdown was due to that. Knowing what application exactly reached those 64 subtransactions is another story.
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
Bruce Momjian schrieb am 08.10.2021 um 17:40: >> I guess everyone will use that information in a different way. >> >> We typically use the AWR reports as a post-mortem analysis tool if >> something goes wrong in our application (=customer specific projects) >> >> E.g. if there was a slowdown "last monday" or "saving something took minutes yesterday morning", >> then we usually request an AWR report from the time span in question. Quite frequently >> this already reveals the culprit. If not, we ask them to poke in more detail into v$session_history. >> >> So in our case it's not really used for active monitoring, but for >> finding the root cause after the fact. >> >> I don't know how representative this usage is though. > > OK, that's a good usecase, and something that certainly would apply to > Postgres. Don't you often need more than just wait events to find the > cause, like system memory usage, total I/O, etc? Yes, the AWR report contains that information as well. e.g. sorts that spilled to disk, shared memory at the start and end, top 10 statements sorted by total time, individual time, I/O, number of executions, segments (tables) that received the highest I/O (read and write) and so on. It's really huge.
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On 10/8/21 11:21, Bruce Momjian wrote: > Ah, this is exactly what I wanted to know --- what people are using the > event waits for. Can you tell if these are done all externally, or if > they need internal database changes? Well, the methodology goes like this: we get the slow queries from pgBadger report and then run explain (analyze, timing, buffers) on the query. If we still cannot figure out how to improve things, we check the events and see what the query is waiting for. After that we may add an index, partition the table, change index structure or do something like that. Unrelated to this discussion, I discovered Bloom extension. Bloom indexes are phenomenally useful. I apologize for the digression. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
TLDR; If I spend the time necessary to instrument the many functions that are the equivalent of the Oracle counterparts, would anyone pull those changes and use them? Specifically, for those who know Oracle, I'm talking about implementing:
- The portion of the ALTER SESSION that enables extended SQL trace
- Most of the DBMS_MONITOR and DBMS_APPLICATION_INFO packages
- Instrument the thousand or so functions that are the equivalent of those found in Oracle's V$EVENT_NAME
- Dynamic performance view V$DIAG_INFO
For the last 35 years, I've made my living helping people solve Oracle performance problems by looking at it, which means:
Trace a user experience and profile the trace file to (a) reveal where the time has gone and its algorithm and (b) make it easy to imagine the cost of possible solutions as well as the savings in response time or resources.
I've even submitted change requests to improve Oracle's tracing features while working for them and since those glorious five years.
Now looking closely at postgreSQL, I see an opportunity to more quickly implement Oracle's current feature list.
I've come to this point because I see many roadblocks for users who want to see a detailed "receipt" for their response time. The biggest roadblock is that without a lot of automation, a user of any kind must log into the server and attempt to get the data that are now traditionally child's play for Oracle. The second biggest roadblock I see is the recompilation that is required for the server components (i.e., postgreSQL, operating system). My initial attempts to get anything useful out of postgreSQL were dismal failures and I think it should be infinitely easier.
Running either dtrace and eBPF scripts on the server should not be required. The instrumentation and the code being instrumented should be tightly coupled. Doing so will allow anyone on any platform for any PostgreSQL version to get a trace file just as easily as people do for Oracle.
Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle
On Mon, 2021-10-11 at 00:09 +0200, Franck Pachot wrote: > And people will dislike it because it mentions Oracle. I don't think so. While "Oracle has it" is not a good enough reason for a feature, it is certainly no counter-indication. > Or people will dislike it because they think this should be reserved to commercial forks. That is conceivable, but I think most vendors would prefer to have that in standard PostgreSQL rather than having to maintain it on their own. > Or because it may introduce too much dependency on the OS. That is possible. But I think gettimeofday(2) is portable enough. Yours, Laurenz Albe