Discussion: Proposal to allow setting cursor options on Portals
Greetings,
My main driver here is to allow the creation of Holdable portals at the protocol level for drivers. Currently the only way to create a holdable cursor is at the SQL level.
DECLARE liahona CURSOR WITH HOLD FOR SELECT * FROM films;
The JDBC driver has an option in the API to have result sets survive commits; see https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#createStatement-int-int-int-
Doing this at the protocol level is the correct approach, as modifying the SQL to create a cursor is very cumbersome and we already have existing code to create a portal. Adding the ability to specify options
Looking for feedback.
Dave Cramer
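To make the proposal above concrete, here is a minimal sketch of what a Bind message carrying cursor options might look like on the wire. The layout of the standard Bind fields follows the documented protocol 3.0 format. The trailing Int32 and the `CURSOR_OPT_HOLD` value are assumptions made for illustration only; the actual patch may place or encode the field differently. `build_bind` and `cstring` are hypothetical helper names.

```python
import struct

# ASSUMPTION: illustrative flag value, not taken from the patch.
CURSOR_OPT_HOLD = 0x0020


def cstring(s):
    """Encode a protocol String: UTF-8 bytes plus a terminating NUL."""
    return s.encode("utf-8") + b"\x00"


def build_bind(portal, statement, params, cursor_options=None):
    """Build a Bind ('B') message with text-format parameters and results.

    cursor_options is the ASSUMED optional trailing Int32. Passing None
    omits it, which yields a standard protocol 3.0 Bind message.
    """
    body = cstring(portal) + cstring(statement)
    body += struct.pack("!h", 0)            # no parameter format codes: all text
    body += struct.pack("!h", len(params))  # number of parameter values
    for p in params:
        if p is None:
            body += struct.pack("!i", -1)   # NULL is encoded as length -1
        else:
            raw = p.encode("utf-8")
            body += struct.pack("!i", len(raw)) + raw
    body += struct.pack("!h", 0)            # no result format codes: all text
    if cursor_options is not None:
        body += struct.pack("!i", cursor_options)  # hypothetical new field
    # The length field counts itself (4 bytes) but not the type byte.
    return b"B" + struct.pack("!i", len(body) + 4) + body


plain = build_bind("", "s1", ["42"])
held = build_bind("liahona", "s1", ["42"], CURSOR_OPT_HOLD)
```

A client that never passes `cursor_options` produces exactly the bytes it produces today, which is the property the rest of the thread calls a "no-op implementation".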
Hi,
I did not look into this patch in detail yet, but I am +1 for being able to create cursors at the protocol level. I think this should be allowed for regular cursors as well. One big use-case I see is allowing postgres_fdw to create and fetch from cursors at the protocol level rather than SQL (DECLARE CURSOR, FETCH, etc.)
--
Sami Imseih
Amazon Web Services (AWS)
On Sun, 7 Dec 2025 at 15:38, Dave Cramer <davecramer@gmail.com> wrote:
> My main driver here is to allow the creation of Holdable portals at the protocol level for drivers.
Overall seems like a sensible feature to want. A somewhat random collection of thoughts:
1. We still have fairly limited experience with protocol options, so afaik not everyone agrees what we should use a version bump for vs a protocol extension.
2. I think I like the idea of optional fields that a client can add to the existing messages. That way "implementing" the new protocol version is a no-op for clients.
3. I think we should mark optional fields more clearly in the docs somehow. e.g. Make the docs say <term>Optional Int32</term> and explain what Optional means in the "Message Data Types" section.
4. I think the server should be strict that it only receives this optional field for the expected protocol version.
5. Do we really need to add the CURSOR_BINARY flag? Seems confusing with our other way of indicating binary support, i.e. what does it mean to say text as the format code but then specify CURSOR_BINARY.
6. What is the benefit of PQsendQueryPreparedWithCursorOptions? I understand the use case for PQsendBindWithCursorOptions, but not for PQsendQueryPreparedWithCursorOptions.
7. The server should check that no unknown flags are passed.
8. Docs need to be added for the new libpq function(s).
I have one question about your intended usage: I expect you intend to make using this opt-in for the users of pgjdbc? (i.e. by using some flag/different method to use this HOLD behaviour)
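Points 4 and 7 above describe two server-side checks. A minimal sketch of them follows, under stated assumptions: the flag values, the set of accepted flags, and the idea that the field first appears in protocol 3.3 are all hypothetical and chosen only for illustration. `check_cursor_options` and `ProtocolViolation` are invented names, not server code.

```python
# ASSUMPTION: illustrative flag values (SCROLL, NO SCROLL, HOLD).
KNOWN_CURSOR_FLAGS = 0x0002 | 0x0004 | 0x0020
# ASSUMPTION: the optional field is only legal from minor version 3 (3.3) on.
MIN_MINOR_FOR_CURSOR_OPTIONS = 3


class ProtocolViolation(Exception):
    """Raised when a message does not match the negotiated protocol."""


def check_cursor_options(negotiated_minor, cursor_options):
    """Validate the optional field; return the flags, or 0 when absent."""
    if cursor_options is None:
        return 0
    # Point 4: be strict about which protocol versions may send the field.
    if negotiated_minor < MIN_MINOR_FOR_CURSOR_OPTIONS:
        raise ProtocolViolation(
            "cursor options sent on protocol 3.%d" % negotiated_minor)
    # Point 7: reject any bit the server does not know about.
    unknown = cursor_options & ~KNOWN_CURSOR_FLAGS
    if unknown:
        raise ProtocolViolation("unknown cursor option bits: %#x" % unknown)
    return cursor_options
```

Rejecting unknown bits up front keeps the remaining bits free for later use, since no deployed client can come to depend on a server silently ignoring them.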
On Mon, Dec 8, 2025 at 4:43 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> On Sun, 7 Dec 2025 at 15:38, Dave Cramer <davecramer@gmail.com> wrote:
> > My main driver here is to allow the creation of Holdable portals at the protocol level for drivers.
> Overall seems like a sensible feature to want. A somewhat random
> collection of thoughts:
> 1. We still have fairly limited experience with protocol options, so
> afaik not everyone agrees what we should use a version bump for vs a
> protocol extension.
> 2. I think I like the idea of optional fields that a client can add to
> the existing messages. That way "implementing" the new protocol
> version is a no-op for clients.
> 3. I think we should mark optional fields more clearly in the docs
> somehow. e.g. Make the docs say <term>Optional Int32</term> and
> explain what Optional means in the "Message Data Types" section.
> 4. I think the server should be strict that it only receives this
> optional field for the expected protocol version.
> 5. Do we really need to add the CURSOR_BINARY flag? Seems confusing
> with our other way of indicating binary support, i.e. what does it
> mean to say text as the format code but then specify CURSOR_BINARY.
> 6. What is the benefit of PQsendQueryPreparedWithCursorOptions? I
> understand the use case for PQsendBindWithCursorOptions, but not for
> PQsendQueryPreparedWithCursorOptions.
> 7. The server should check that no unknown flags are passed
> 8. Docs need to be added for the new libpq function(s)
> I have one question about your intended usage: I expect you intend to
> make using this opt-in for the users of pgjdbc? (i.e. by using some
> flag/different method to use this HOLD behaviour)
Thx for the comments. Yes, JDBC has holdable result sets as a standard part of the API.
Dave
On Mon, 8 Dec 2025 at 23:08, Dave Cramer <davecramer@gmail.com> wrote:
> Thx for the comments.
One more comment: It would be good to enable tracing[1][2] for your test, especially because I think you still need to update the tracing logic in libpq for your new packet type.
[1]: https://github.com/postgres/postgres/blob/f00484c170f56199c3eeacc82bd72f8c1e3baf6b/src/test/modules/libpq_pipeline/README#L29-L34
[2]: https://github.com/postgres/postgres/blob/f00484c170f56199c3eeacc82bd72f8c1e3baf6b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl#L39-L42
On Mon, Dec 8, 2025 at 1:43 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> 1. We still have fairly limited experience with protocol options, so
> afaik not everyone agrees what we should use a version bump for vs a
> protocol extension.
I think it'd be helpful for proposals to describe why a minor version bump was chosen over a protocol extension parameter (or vice versa), so that we can begin to develop some consensus.
To me, the conversation on the wire for this feature seems perfect for an extension parameter: "Hello server, do you support this optional thing in this one message type? If not, let me know." Especially since the optional thing is itself an extensible bitmap! With the minor-version strategy, if we added new bits in 3.6, clients who just wanted those new bits would then have to implement support for every feature in versions 3.4, 3.5, and 3.6 just to improve that one use case, and that incentive mismatch leads to more ossification IMO.
= Soapbox Follows =
I've talked about it face-to-face with people, but to go on the public record: I don't think this is a wise use of a minor version upgrade strategy. I prefer protocol architectures that introduce separate extensions first, then periodically bundle the critical and highly-used extensions into a new minor version once they're sure that _everyone_ should support those things.
I know that 3.2 didn't do that. My view of 3.2 is that it was a big compromise to get some things unstuck, so overall I'm glad we have it -- but now that we have it, I'd rather that 3.next be more intentional. Plus I think it's unwise to introduce a 3.3 before we're confident that 3.2 can be widely deployed, and I'm trying to put effort into the latter for 19, so that I'm not just sitting here gatekeeping.
IETF has a bunch of related case studies [1,2,3] that might be useful reading, even if we decide that their experience differs heavily from ours.
--Jacob
[1] https://www.rfc-editor.org/rfc/rfc5218
[2] https://www.rfc-editor.org/rfc/rfc8170
[3] https://www.rfc-editor.org/rfc/rfc9170
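The "Hello server, do you support this optional thing?" conversation described above maps onto the startup packet, where protocol extension parameters use the reserved `_pq_.` prefix. The sketch below builds such a packet. The startup layout and the version encoding (major in the high 16 bits, minor in the low 16 bits) follow the documented protocol; the parameter name `_pq_.bindhold` is only an example name floated later in this thread, not a defined extension, and `build_startup` is a hypothetical helper.

```python
import struct


def protocol_version(major, minor):
    """Encode a protocol version as sent in the startup packet."""
    return (major << 16) | minor


def build_startup(major, minor, params):
    """Build a StartupMessage from a dict of parameter names to values."""
    body = struct.pack("!i", protocol_version(major, minor))
    for key, value in params.items():
        body += key.encode("utf-8") + b"\x00" + value.encode("utf-8") + b"\x00"
    body += b"\x00"  # terminator after the last name/value pair
    # The startup packet has no type byte; the length counts itself.
    return struct.pack("!i", len(body) + 4) + body


# Extension-style request: stay on 3.0 and ask for one optional feature.
# "_pq_.bindhold" is a HYPOTHETICAL parameter name used for illustration.
via_extension = build_startup(3, 0, {"user": "dave", "_pq_.bindhold": "on"})

# Version-bump-style request: ask for a newer minor version instead.
via_version = build_startup(3, 2, {"user": "dave"})
```

In both cases a server that supports NegotiateProtocolVersion can answer with what it does not support, so the two strategies differ mainly in how much the client commits to by asking.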
On Wed, 10 Dec 2025 at 18:41, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
> I think it'd be helpful for proposals to describe why a minor version
> bump was chosen over a protocol extension parameter (or vice versa),
> so that we can begin to develop some consensus.
Agreed.
> With the
> minor-version strategy, if we added new bits in 3.6, clients who just
> wanted those new bits would then have to implement support for every
> feature in versions 3.4, 3.5, and 3.6 just to improve that one use
> case, and that incentive mismatch leads to more ossification IMO.
I think in this optional bitmap field case, there's no work for the client to "implement" it. It can simply request 3.3, but not send the bitmap field. Similarly for my proposed GoAway message, a client can simply ignore that message completely when it receives it.
If we limit the features that are bundled with a protocol version bump to the kind where a client either has to do nothing to implement them, or at worst has to ignore the contents of a new message/field, then implementing support becomes so trivial for clients that I don't think it'd be a hurdle for client authors to implement support for 3.3, 3.4 and 3.5 even if they only wanted a feature from the 3.6 protocol.^1 I'll call these things "no-op implementations" from now on.
> I've talked about it face-to-face with people, but to go on the public
> record: I don't think this is a wise use of a minor version upgrade
> strategy. I prefer protocol architectures that introduce separate
> extensions first, then periodically bundle the critical and
> highly-used extensions into a new minor version once they're sure that
> _everyone_ should support those things.
I think we disagree on this. I think the downside of using protocol extensions for everything is that we then end up with N*N different combinations of features in the wild that servers and clients need to deal with. We have to start to define what happens when features interact, but either of them is not enabled. With incrementing versions you don't have that problem, which results in simpler logic in the spec, servers and clients.
Finally, because we don't have any protocol extensions yet, all clients still need to build infrastructure for them, including libpq. So I'd argue that if we make such "no-op implementation" features use protocol extensions, then it'd be more work for everyone.
> I know that 3.2 didn't do that. My view of 3.2 is that it was a big
> compromise to get some things unstuck, so overall I'm glad we have it
> -- but now that we have it, I'd rather that 3.next be more
> intentional.
> Plus I think it's unwise to introduce a 3.3 before we're
> confident that 3.2 can be widely deployed, and I'm trying to put
> effort into the latter for 19, so that I'm not just sitting here
> gatekeeping.
I'm not sure what you mean with this. People use libpq18 and PG18, and I've heard no complaints about protocol problems. So I think it was a success. Do you mean widely deployed by default? Why exactly does that matter for 3.3? Anything that stands in the way of default deployment for 3.2 will continue to stand in the way of default deployment for 3.3. Personally, if we flip the default in e.g. 5 years from now, I'd much rather have it be flipped to a very nice 3.6 protocol than still only having the single new feature that was added in 3.2.
> IETF has a bunch of related case studies [1,2,3] that might be useful
> reading, even if we decide that their experience differs heavily from
> ours.
I gave them a skim and they seem like a good read (which I'll do later). But I'm not sure which part of them you thought was actionable for the discussion about version bumps vs protocol extensions. (I did see useful stuff for the grease thread, but that seems better to discuss there)
^1: You and I only talked about clients above, but obviously there's also proxies and other servers that implement the protocol to consider. If a feature that is "no-op implementation" on the client is a complicated implementation on the proxy/server then maybe a protocol extension is indeed the better choice. I think for GoAway it's trivial to "no-op implement" too on the proxy/server. For this cursor option proposal it's less clear cut imo. Proxies can probably simply forward the message to the server, although maybe PgBouncer would want to throw an error when a client uses a hold cursor (but it also doesn't do that for SQL level hold cursors, so that seems like an optional enhancement). Other servers might not even support hold cursors, but then they could simply throw a clear error (like pgbouncer would do). If throwing an error is an acceptable server implementation, then I think a "no-op implementation" is again trivial.
[I considered splitting this off into a new thread, but I think Dave has to wait for it to be resolved before much can happen with the patch. Sorry Dave.]
On Wed, Dec 10, 2025 at 3:01 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> If we keep the features that are bundled with a protocol version bump
> of the kind where a client, either has to do nothing to implement it,
> or at worst has to ignore the contents of a new message/field. Then
> implementing support becomes so trivial for clients that I don't think
> it'd be a hurdle for client authors to implement support for 3.3, 3.4,
> 3.5 and if they only wanted a feature from the 3.6 protocol.^1 I'll
> call these things "no-op implementations" from now on.
It's too late for that, isn't it? 3.2's only feature doesn't work that way (and couldn't have been designed that way, as far as I can tell). So I don't have any confidence that all future features will fall in line with this new rule.
NegotiateProtocolVersion is the only in-band tool we have to ratchet the protocol forward. Why go through all this pain of getting NPV packets working, only to immediately limit its power to the most trivial cases?
> I think we disagree on this. I think the downside of using protocol
> extensions for everything is that we then end up with N*N different
> combinations of features in the wild that servers and clients need to
> deal with. We have to start to define what happens when features
> interact, but either of them is not enabled.
In the worst case? Yes. (That worst case doesn't really bother me. Many other protocols regularly navigate extension combinations.)
But! The two extension proposals in flight at the moment -- GoAway and cursor options -- are completely orthogonal, no? Both to each other, and to the functionality in 3.2. There are no combinatorics yet. So it seems strange to optimize for combinatorics out of the gate, by burning through a client-mandatory minor version every year.
> With incrementing
> versions you don't have that problem,
You still have N*M. Implementers have to test each feature of their 3.10 client against server versions 3.0-9, rather than testing against a single server that turns individual extension support on and off. I prefer the latter (but maybe that's just because it's what I'm used to). Middleboxes increase the matrix further, as you point out below.
Paradoxically, if all N features happen to be orthogonal, the testing burden for the extension strategy collapses to... N. Minor-version-per-year is worse for that case.
> which results in simpler logic
> in the spec, servers and clients.
I don't want to dissuade a proof of concept for this, because simpler logic everywhere sounds amazing. But it sounds like magical thinking to me. A bit like telling Christoph that the dpkg dependency graph is too complicated, so it should be a straight line instead -- if that worked, presumably everyone would have done it that way, right? Convince me that you're not just ignoring necessary complexity in an attempt to stamp out unnecessary complexity.
An example of an established network protocol that follows this same strategy would be helpful. How do their clients deal with the minor-version treadmill?
> Finally, because we don't have any protocol extensions yet. All
> clients still need to build infrastructure for them, including libpq.
For clients still on 3.0 (the vast majority of them), they'd have to add infrastructure for sliding minor version ranges, too.
> So I'd argue that if we make such "no-op implementation" features use
> protocol extensions, then it'd be more work for everyone.
Why advertise a protocol extension if you plan to ignore it? Don't advertise it. Do nothing. That's even less work than retrofitting packet parsers to correctly ignore a byte range when minorversion > X.
> > Plus I think it's unwise to introduce a 3.3 before we're
> > confident that 3.2 can be widely deployed, and I'm trying to put
> > effort into the latter for 19, so that I'm not just sitting here
> > gatekeeping.
>
> I'm not sure what you mean with this. People use libpq18 and PG18, and
> I've heard no complaints about protocol problems. So I think it was a
> success. Do you mean widely deployed by default?
Yes. Or even just "deployed". GitHub shows zero hits outside of the Postgres fork graph.
Google's results show that an organization called "cardo" tried max_protocol_version=latest. They had to revert it. :( Time for grease.
> Why exactly does that
> matter for 3.3? Anything that stands default deployment in the way for
> 3.2, will continue to stand default deployment in the way for 3.3.
Exactly. Don't you want to make sure that clients in the ecosystem are able to use this _before_ we rev the version again, and again? We don't ever get these numbers back.
Like, I'm arguing as hard as I can against the very existence of the treadmill. But if I'm outvoted on that, *please* don't start the treadmill before other people can climb on -- otherwise, they won't be able to give us any feedback at all!
> Personally, if we flip the default in e.g. 5 years from now. I'd much
> rather have it be flipped to a very nice 3.6 protocol, than still only
> having the single new feature that was added in 3.2.
Those are not the only two choices. I'd rather we get a bunch of nice features without any flipping at all, if that's possible. It looks possible to me.
> > IETF has a bunch of related case studies [1,2,3] that might be useful
> > reading, even if we decide that their experience differs heavily from
> > ours.
>
> I gave them a skim and they seem like a good read (which I'll do
> later). But I'm not sure part of them you thought was actionable for
> the discussion about version bumps vs protocol extensions. (I did see
> useful stuff for the grease thread, but that seems better to discuss
> there)
For this conversation, I'm focused on RFC 8170. Specifically the concepts of incremental transitions and incentive alignment (cost/benefit to individual community members). I view minor-version-per-year as violating both of those principles. It instead focuses on the ease of the people who are most plugged into this mailing list, and who have the most power to change things on a whim.
> ^1: You and I only talked about clients above, but obviously there's
> also proxies and other servers that implement the protocol to
> consider. If a feature that is "no-op implementation" on the client is
> a complicated implementation on the proxy/server then maybe a protocol
> extension is indeed the better choice. I think for GoAway it's trivial
> to "no-op implement" too on the proxy/server. For this cursor option
> proposal it's less clear cut imo. Proxies can probably simply forward
> the message to the server, although maybe PgBouncer would want to
> throw an error when a client uses a hold cursor (but it also doesn't
> do that for SQL level hold cursors, so that seems like an optional
> enhancement).
I think proposals should attempt to answer those questions as a prerequisite to commit, personally. Or at least, we should be moving in that direction, if that's too harsh on the first authors who are trying to get things moving inside the protocol.
More generally, it bothers me that we still don't have a clear mental model of middlebox extensibility. We're just retreading the discussions from [1] instead of starting from where we stopped, and that's exhausting for me.
(As a reminder: 3.2 broke my testing rig, which relied on implicit assumptions around minor-version extensibility for middleboxes. I didn't speak up until very late, because it was just a testing rig, and I could change it. I should have spoken up immediately, because IIRC, pgpool then broke as well.)
> Other servers might not even support hold cursors, but
> then they could simply throw a clear error (like pgbouncer would do).
> If throwing an error is an acceptable server implementation, then I
> think a "no-op implementation" is again trivial.
A server is always free to decide at the _application_ layer that it will error out for a particular packet that it can parse at the _network_ layer. But it seems a lot more user-friendly to just decline the protocol bit, if it's directly tied to an application-level feature that isn't implemented. I think we should encourage that when possible; otherwise we've traded protocol fragmentation for application fragmentation.
--Jacob
[1] https://postgr.es/m/CAGECzQR5PMud4q8Atyz0gOoJ1xNH33g7g-MLXFML1_Vrhbzs6Q%40mail.gmail.com
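"Declining the protocol bit" happens in the NegotiateProtocolVersion reply, which lists the `_pq_.` options the server did not recognize. The sketch below parses that message and computes which requested extensions remain usable. The message layout (type byte 'v', length, a version Int32, a count, then NUL-terminated option names) follows the protocol documentation; the version field is returned undecoded here because its exact interpretation is not something this sketch relies on. The helper names and the extension names in the usage are hypothetical.

```python
import struct


def parse_negotiate_protocol_version(msg):
    """Return (version_field, [unrecognized option names])."""
    if msg[0:1] != b"v":
        raise ValueError("not a NegotiateProtocolVersion message")
    (length,) = struct.unpack("!i", msg[1:5])
    payload = msg[5:1 + length]  # length counts itself, not the type byte
    version_field, count = struct.unpack("!ii", payload[:8])
    names = []
    pos = 8
    for _ in range(count):
        end = payload.index(b"\x00", pos)  # each name is NUL-terminated
        names.append(payload[pos:end].decode("utf-8"))
        pos = end + 1
    return version_field, names


def enabled_extensions(requested, declined):
    """Extensions the client may use: those requested and not declined."""
    return [name for name in requested if name not in declined]
```

With this shape, a client that asked for a hold-cursor extension learns before sending any Bind message that it must either fall back or report the feature as unavailable, instead of receiving an application-level error later.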
Let me start with this: I agree with you that both HOLD and GoAway would work well as protocol extensions. And if that's what is needed to get stuff to continue moving in the protocol space, then fine, that's what I'll do... But I have some reasons to prefer a protocol version bump, at least for GoAway.
The primary reason I have is that protocol extensions are currently enormously underspecified. Especially in regards to what their values can be and also in regards to their own versioning and compatibility. e.g. if we add _pq_.goaway=true and later we want to add a field to it, how do we do that? I can see a few options:
1. _pq_.goaway=v2,on (first acceptable value is used, this would need immediate and constant grease)
2. _pq_.goaway=true _pq_.goaway_v2=true (have clients specify both a new and old one and specify how the server should behave if it gets both of these)
I feel like this requires significant discussion, design and implementation work. I tried to do that in patch 7 and 8 here[2], but those parts of the patchset got very little review and/or feedback. I think a big reason was that the proposal went in a too complex direction, by trying to handle too many of the possible use cases for the protocol extensions.
I think as long as we limit the discussion to protocol extensions that don't need to be changed on an active connection using something like SetProtocolParameter and those protocol extensions only have an "on/off" style value, then I think we can make some incremental progress here. This would apply to both GoAway (middleware should just not forward GoAway messages to clients) and Hold (server does not send anything different for this feature).
Still this seems like quite a bit more work than "simply" including "no-op implementation" features in a protocol bump. Especially because I think the benefit of protocol parameters for these features is negligible, or even negative because of the secondary reason:
The secondary reason is that I'd really like clients to actually support the longer cancel token feature in 3.2. It's not that hard to implement for client authors, but I don't think many users care about it (because the primary beneficiaries are server implementers, but those only benefit if there are enough clients that implement it, so chicken and egg). By giving people some extra goodies in 3.3 my hope is that clients will actually implement it.
So basically I agree that protocol versions do require some additional work on the client author side, but I (selfishly) think that would be a good thing in this case, because it resolves this chicken and egg problem. To take advice from RFC 8170, I'd like to align incentives better, by having protocol 3.3 contain features that are beneficial for both client and server authors.
[2]: https://www.postgresql.org/message-id/CAGECzQRbAGqJnnJJxTdKewTsNOovUt4bsx3NFfofz3m2j-t7tA%40mail.gmail.com
-- detailed response below (things I did not respond to I agree with) --
On Thu, 11 Dec 2025 at 20:21, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
> NegotiateProtocolVersion is the only in-band tool we have to ratchet
> the protocol forward. Why go through all this pain of getting NPV
> packets working, only to immediately limit its power to the most
> trivial cases?
I think it's a fairly easy test to uphold. To be clear, I'm not saying we should indefinitely limit that power. Eventually we'd probably want to add things that are more difficult to implement for clients (possibly after evaluating them as a protocol extension), but that discussion can be punted to when we get there imo.
> So it
> seems strange to optimize for combinatorics out of the gate, by
> burning through a client-mandatory minor version every year.
To me 2 protocol extensions a year is strictly more complexity added than 1 minor version a year. i.e. IF the changes are "no-op implementable", why not group them together in a single identifier?
> You still have N*M. Implementers have to test each feature of their
> 3.10 client against server versions 3.0-9, rather than testing against
> a single server that turns individual extension support on and off.
I don't understand this argument. If you can have a single server version that turns protocol extensions on and off, then why couldn't you have a single server version that can turn different protocol versions on and off?
> An example of an established network protocol that follows this same
> strategy would be helpful. How do their clients deal with the
> minor-version treadmill?
I agree that it would be helpful, but I'm not sure there's such a network protocol. All protocols I know have infrequent version bumps, which then often results in ossification. So frequent version bumps seem like a good way to avoid that from happening. Using protocol extensions for everything might mean we ossify the protocol version (again).
> > Finally, because we don't have any protocol extensions yet. All
> > clients still need to build infrastructure for them, including libpq.
>
> For clients still on 3.0 (the vast majority of them), they'd have to
> add infrastructure for sliding minor version ranges, too.
Yes, but adding infrastructure for both protocol versions (which we already have now) and protocol extensions is even more work. libpq still has no support for protocol parameters.
> Yes. Or even just "deployed". GitHub shows zero hits outside of the
> Postgres fork graph.
Yeah, that's sad, but unsurprising. Almost no-one cares about security and that's the only end-user feature of 3.2.
> Google's results show that an organization called "cardo" tried
> max_protocol_version=latest. They had to revert it. :( Time for
> grease.
While I totally agree that we need grease, this case actually involved people that did not update their PgBouncer version to a new enough version that supports NegotiateProtocolVersion. [3]
[3]: https://www.cardogis.com/AenderungenIwan7#oktober-2025
> > Why exactly does that
> > matter for 3.3? Anything that stands default deployment in the way for
> > 3.2, will continue to stand default deployment in the way for 3.3.
>
> Exactly. Don't you want to make sure that clients in the ecosystem are
> able to use this _before_ we rev the version again, and again? We
> don't ever get these numbers back.
To me not every protocol version needs to be implemented by every client. If 3.2 is never used by anyone in the wild, then half of the world immediately switches to 3.3, and then the other half implements 3.4, then I'll be extremely happy.
> I'd rather we get a bunch of nice
> features without any flipping at all, if that's possible. It looks
> possible to me.
Me too, but I don't understand how that would work. Sending protocol extensions is just as much of a breaking change for this ungreased middleware as a protocol version bump. So having libpq request _pq_.bindhold=true by default would also need some flip.
> I think proposals should attempt to answer those questions as a
> prerequisite to commit, personally. Or at least, we should be moving
> in that direction, if that's too harsh on the first authors who are
> trying to get things moving inside the protocol.
Agreed, but I don't think that has to come from the author necessarily. I'm happy to provide that input on proposals and explain if and why it would be hard for something like pgbouncer or other servers.
> More generally, it bothers me that we still don't have a clear mental
> model of middlebox extensibility. We're just retreading the
> discussions from [1] instead of starting from where we stopped, and
> that's exhausting for me.
I'm still of the opinion that the requirements for [1] are good enough for middleboxes to handle extensibility. I think those requirements could be extended to allow GoAway too, by adding possibility 3 with "The new message sent by the server can be dropped completely by the middleware to imitate the lower protocol version". Remembering and re-reading the thread and this email, it's unclear to me what your thoughts on this are.
> (As a reminder: 3.2 broke my testing rig, which relied on implicit
> assumptions around minor-version extensibility for middleboxes. I
> didn't speak up until very late, because it was just a testing rig,
> and I could change it. I should have spoken up immediately, because
> IIRC, pgpool then broke as well.)
I'm not sure what exactly you're talking about here. You mean libpq complaining about not receiving a BackendKeyData? If so, I agree that wasn't a great situation. But I don't think it was related to the current protocol being underspecified so much as to the new feature.
> A server is always free to decide at the _application_ layer that it
> will error out for a particular packet that it can parse at the
> _network_ layer. But it seems a lot more user-friendly to just decline
> the protocol bit, if it's directly tied to an application-level
> feature that isn't implemented. I think we should encourage that when
> possible; otherwise we've traded protocol fragmentation for
> application fragmentation.
I agree in principle, but does it really matter in practice in the case of Hold? If the network layer does not support it, then really all that the user's application can do is throw an error. Whether that error is thrown by the database/middleware or by the client doesn't matter much in the end I think. The main case where it would matter is if the client could fall back to something else, but in the case of HOLD that something else would probably be to send HOLD with SQL. And any server that would throw an error for protocol based HOLD probably (should) also throw one for application level HOLD.
> [1] https://postgr.es/m/CAGECzQR5PMud4q8Atyz0gOoJ1xNH33g7g-MLXFML1_Vrhbzs6Q%40mail.gmail.com
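Option 1 in the message above ("first acceptable value is used") can be illustrated with a few lines. This is only a sketch of one possible rule: extension value syntax is, as the message says, currently underspecified, so the comma-separated format, the value names, and the function `pick_extension_value` are all assumptions for illustration.

```python
def pick_extension_value(offered, supported):
    """Return the first client-offered value the server supports, else None.

    offered   -- comma-separated preference list, e.g. "v2,on" (ASSUMED syntax)
    supported -- set of values this server understands
    """
    for value in (v.strip() for v in offered.split(",")):
        if value in supported:
            return value
    return None


old_server = pick_extension_value("v2,on", {"on"})        # falls back to "on"
new_server = pick_extension_value("v2,on", {"on", "v2"})  # picks "v2"
declined = pick_extension_value("v3", {"on", "v2"})       # nothing acceptable
```

The grease concern raised alongside this option follows from the first case: an old server only copes with "v2,on" if it was written from day one to skip values it does not recognize, which is why such lists would need to be exercised constantly.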
On Wed, 10 Dec 2025 at 12:41, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
> On Mon, Dec 8, 2025 at 1:43 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> > 1. We still have fairly limited experience with protocol options, so
> > afaik not everyone agrees what we should use a version bump for vs a
> > protocol extension.
> I think it'd be helpful for proposals to describe why a minor version
> bump was chosen over a protocol extension parameter (or vice versa),
> so that we can begin to develop some consensus.
The reasons I chose a protocol bump include:
1/ I actually think this was an oversight in the original spec. I am not adding any new features to the server, only exposing existing options on a portal/cursor that should have been in the original protocol.
2/ I hope and expect that there will be other additions to the protocol for 3.3, such as returning the LSN after commit and binary return values per session.
> To me, the conversation on the wire for this feature seems perfect for
> an extension parameter: "Hello server, do you support this optional
> thing in this one message type? If not, let me know." Especially since
> the optional thing is itself an extensible bitmap! With the
> minor-version strategy, if we added new bits in 3.6, clients who just
> wanted those new bits would then have to implement support for every
> feature in versions 3.4, 3.5, and 3.6 just to improve that one use
> case, and that incentive mismatch leads to more ossification IMO.
> = Soapbox Follows =
> I've talked about it face-to-face with people, but to go on the public
> record: I don't think this is a wise use of a minor version upgrade
> strategy. I prefer protocol architectures that introduce separate
> extensions first, then periodically bundle the critical and
> highly-used extensions into a new minor version once they're sure that
> _everyone_ should support those things.
> I know that 3.2 didn't do that. My view of 3.2 is that it was a big
> compromise to get some things unstuck, so overall I'm glad we have it
> -- but now that we have it, I'd rather that 3.next be more
> intentional. Plus I think it's unwise to introduce a 3.3 before we're
> confident that 3.2 can be widely deployed, and I'm trying to put
> effort into the latter for 19, so that I'm not just sitting here
> gatekeeping.
pgjdbc already supports 3.2. Unfortunately we have no idea how many people actually use it.
IETF has a bunch of related case studies [1,2,3] that might be useful
reading, even if we decide that their experience differs heavily from
ours.
I read the articles, which sadly gloss over protocol negotiation issues.
Dave
On Thu, 11 Dec 2025 at 14:21, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
[I considered splitting this off into a new thread, but I think Dave
has to wait for it to be resolved before much can happen with the
patch. Sorry Dave.]
No worries, I expected discussion.
On Wed, Dec 10, 2025 at 3:01 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> If we limit the features that are bundled with a protocol version bump
> to the kind where a client either has to do nothing to implement them,
> or at worst has to ignore the contents of a new message/field, then
> implementing support becomes so trivial for clients that I don't think
> it'd be a hurdle for client authors to implement support for 3.3, 3.4,
> and 3.5 if they only wanted a feature from the 3.6 protocol.^1 I'll
> call these things "no-op implementations" from now on.
It's too late for that, isn't it? 3.2's only feature doesn't work that
way (and couldn't have been designed that way, as far as I can tell).
So I don't have any confidence that all future features will fall in
line with this new rule.
NegotiateProtocolVersion is the only in-band tool we have to ratchet
the protocol forward. Why go through all this pain of getting NPV
packets working, only to immediately limit its power to the most
trivial cases?
> I think we disagree on this. I think the downside of using protocol
> extensions for everything is that we then end up with N*N different
> combinations of features in the wild that servers and clients need to
> deal with. We have to start to define what happens when features
> interact, but either of them is not enabled.
In the worst case? Yes. (That worst case doesn't really bother me.
Many other protocols regularly navigate extension combinations.)
But! The two extension proposals in flight at the moment -- GoAway and
cursor options -- are completely orthogonal, no? Both to each other,
and to the functionality in 3.2. There are no combinatorics yet. So it
seems strange to optimize for combinatorics out of the gate, by
burning through a client-mandatory minor version every year.
> With incrementing
> versions you don't have that problem,
You still have N*M. Implementers have to test each feature of their
3.10 client against server versions 3.0-9, rather than testing against
a single server that turns individual extension support on and off. I
prefer the latter (but maybe that's just because it's what I'm used
to). Middleboxes increase the matrix further, as you point out below.
As a client author we test against multiple options all the time. I don't think this should be an argument against changing the protocol, otherwise we will never change it.
Paradoxically, if all N features happen to be orthogonal, the testing
burden for the extension strategy collapses to... N.
Minor-version-per-year is worse for that case.
I had never contemplated features being dependent on one another, only additive and orthogonal.
> which results in simpler logic
> in the spec, servers and clients.
I don't want to dissuade a proof of concept for this, because simpler
logic everywhere sounds amazing. But it sounds like magical thinking
to me. A bit like telling Christoph that the dpkg dependency graph is
too complicated, so it should be a straight line instead -- if that
worked, presumably everyone would have done it that way, right?
Convince me that you're not just ignoring necessary complexity in an
attempt to stamp out unnecessary complexity.
An example of an established network protocol that follows this same
strategy would be helpful. How do their clients deal with the
minor-version treadmill?
> Finally, because we don't have any protocol extensions yet. All
> clients still need to build infrastructure for them, including libpq.
For clients still on 3.0 (the vast majority of them), they'd have to
add infrastructure for sliding minor version ranges, too.
> So I'd argue that if we make such "no-op implementation" features use
> protocol extensions, then it'd be more work for everyone.
Why advertise a protocol extension if you plan to ignore it? Don't
advertise it. Do nothing. That's even less work than retrofitting
packet parsers to correctly ignore a byte range when minorversion > X.
> > Plus I think it's unwise to introduce a 3.3 before we're
> > confident that 3.2 can be widely deployed, and I'm trying to put
> > effort into the latter for 19, so that I'm not just sitting here
> > gatekeeping.
>
> I'm not sure what you mean with this. People use libpq18 and PG18, and
> I've heard no complaints about protocol problems. So I think it was a
> success. Do you mean widely deployed by default?
Yes. Or even just "deployed". GitHub shows zero hits outside of the
Postgres fork graph.
As mentioned pgjdbc supports 3.2. It was trivial to implement.
Google's results show that an organization called "cardo" tried
max_protocol_version=latest. They had to revert it. :( Time for
grease.
> Why exactly does that
> matter for 3.3? Anything that stands default deployment in the way for
> 3.2, will continue to stand default deployment in the way for 3.3.
Exactly. Don't you want to make sure that clients in the ecosystem are
able to use this _before_ we rev the version again, and again? We
don't ever get these numbers back.
Well there are 97 of them. 1 per year is a long time.
Like, I'm arguing as hard as I can against the very existence of the
treadmill. But if I'm outvoted on that, *please* don't start the
treadmill before other people can climb on -- otherwise, they won't be
able to give us any feedback at all!
> Personally, if we flip the default in e.g. 5 years from now. I'd much
> rather have it be flipped to a very nice 3.6 protocol, than still only
> having the single new feature that was added in 3.2.
Those are not the only two choices. I'd rather we get a bunch of nice
features without any flipping at all, if that's possible. It looks
possible to me.
> > IETF has a bunch of related case studies [1,2,3] that might be useful
> > reading, even if we decide that their experience differs heavily from
> > ours.
>
> I gave them a skim and they seem like a good read (which I'll do
> later). But I'm not sure which part of them you thought was actionable for
> the discussion about version bumps vs protocol extensions. (I did see
> useful stuff for the grease thread, but that seems better to discuss
> there)
For this conversation, I'm focused on RFC 8170. Specifically the
concepts of incremental transitions and incentive alignment
(cost/benefit to individual community members).
I view minor-version-per-year as violating both of those principles.
It instead focuses on the ease of the people who are most plugged into
this mailing list, and who have the most power to change things on a
whim.
> ^1: You and I only talked about clients above, but obviously there's
> also proxies and other servers that implement the protocol to
> consider. If a feature that is "no-op implementation" on the client is
> a complicated implementation on the proxy/server then maybe a protocol
> extension is indeed the better choice. I think for GoAway it's trivial
> to "no-op implement" too on the proxy/server. For this cursor option
> proposal it's less clear cut imo. Proxies can probably simply forward
> the message to the server, although maybe PgBouncer would want to
> throw an error when a client uses a hold cursor (but it also doesn't
> do that for SQL level hold cursors, so that seems like an optional
> enhancement).
FWIW, HOLDABLE cursors are not the only option this enables. It enables all of the other cursor options.
I think proposals should attempt to answer those questions as a
prerequisite to commit, personally. Or at least, we should be moving
in that direction, if that's too harsh on the first authors who are
trying to get things moving inside the protocol.
More generally, it bothers me that we still don't have a clear mental
model of middlebox extensibility. We're just retreading the
discussions from [1] instead of starting from where we stopped, and
that's exhausting for me.
(As a reminder: 3.2 broke my testing rig, which relied on implicit
assumptions around minor-version extensibility for middleboxes. I
didn't speak up until very late, because it was just a testing rig,
and I could change it. I should have spoken up immediately, because
IIRC, pgpool then broke as well.)
> Other servers might not even support hold cursors, but
> then they could simply throw a clear error (like pgbouncer would do).
> If throwing an error is an acceptable server implementation, then I
> think a "no-op implementation" is again trivial.
A server is always free to decide at the _application_ layer that it
will error out for a particular packet that it can parse at the
_network_ layer. But it seems a lot more user-friendly to just decline
the protocol bit, if it's directly tied to an application-level
feature that isn't implemented. I think we should encourage that when
possible; otherwise we've traded protocol fragmentation for
application fragmentation.
Are we concerned with servers that are not compatible with Postgres ?
As far as protocol fragmentation goes, I see this more as an evolution toward a more complete, usable implementation. I do see that we will have to be careful with interdependent protocol options.
Dave
On Sun, 14 Dec 2025 at 13:31, Dave Cramer <davecramer@gmail.com> wrote:
>> Exactly. Don't you want to make sure that clients in the ecosystem are
>> able to use this _before_ we rev the version again, and again? We
>> don't ever get these numbers back.
>
> Well there are 97 of them. 1 per year is a long time.
I don't think Jacob was concerned about the actual numbers running
out, but in case he was: it's actually 9997 versions that we still
have (9996 after we'd commit the grease proposal[1]).
[1]: https://commitfest.postgresql.org/patch/6157/
> FWIW, HOLDABLE cursors are not the only option this enables. It enables all of the other cursor options.
As mentioned upthread, I'm not sure BINARY makes sense. For any other
options, the protocol docs should specify which ones are allowed and
what their bits are. Looking at the DECLARE docs[2].
1. I think supporting ASENSITIVE/INSENSITIVE/SENSITIVE bits is
unnecessary, since postgres cursors are always INSENSITIVE.
2. For SCROLL vs NO SCROLL, it would be nice if we could get rid of
the intermediate mode where if neither SCROLL nor NO SCROLL is
specified, it's still SCROLL sometimes. I'm not sure backwards
compatibility would allow that, i.e. can you currently sometimes do a
BACKWARD scan on a portal created with Bind. I guess we could make it
so that if you specify the portal flags, then you have to be explicit
about specifying SCROLL or NO SCROLL
3. All the flags with no SQL variant probably shouldn't be
configurable through the protocol too (e.g. CURSOR_OPT_FAST_PLAN)
[2]: https://www.postgresql.org/docs/18/sql-declare.html
> Are we concerned with servers that are not compatible with Postgres ?
I think there's enough re-implementations of the postgres protocol by
other databases that it would be a shame if we didn't even try to
consider them, but I wouldn't consider it critical to get it right.
Since they can always throw application errors for features they don't
support, just like they do now for SQL that they don't support. They
can always contribute changes to clients to make using unsupported
features opt-in in the rare case where they are not.
On Sun, 14 Dec 2025 at 08:42, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
On Sun, 14 Dec 2025 at 13:31, Dave Cramer <davecramer@gmail.com> wrote:
>> Exactly. Don't you want to make sure that clients in the ecosystem are
>> able to use this _before_ we rev the version again, and again? We
>> don't ever get these numbers back.
>
> Well there are 97 of them. 1 per year is a long time.
I don't think Jacob was concerned about the actual numbers running
out, but in case he was: it's actually 9997 versions that we still
have (9996 after we'd commit the grease proposal[1]).
[1]: https://commitfest.postgresql.org/patch/6157/
> FWIW, HOLDABLE cursors are not the only option this enables. It enables all of the other cursor options.
As mentioned upthread, I'm not sure BINARY makes sense. For any other
options, the protocol docs should specify which ones are allowed and
what their bits are. Looking at the DECLARE docs[2].
Here I was thinking that binary was the one that did make sense. The pgjdbc driver would like the results back in binary, I believe others would as well.
1. I think supporting ASENSITIVE/INSENSITIVE/SENSITIVE bits is
unnecessary, since postgres cursors are always INSENSITIVE.
2. For SCROLL vs NO SCROLL, it would be nice if we could get rid of
the intermediate mode where if neither SCROLL nor NO SCROLL is
specified, it's still SCROLL sometimes. I'm not sure backwards
compatibility would allow that, i.e. can you currently sometimes do a
BACKWARD scan on a portal created with Bind. I guess we could make it
so that if you specify the portal flags, then you have to be explicit
about specifying SCROLL or NO SCROLL
3. All the flags with no SQL variant probably shouldn't be
configurable through the protocol too (e.g. CURSOR_OPT_FAST_PLAN)
[2]: https://www.postgresql.org/docs/18/sql-declare.html
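To make the three points above concrete, here is a minimal Python sketch of how a receiver might validate a portal-options bitmask: reject unknown bits, require an explicit SCROLL or NO SCROLL choice, and expose only options that have an SQL-level equivalent. The names `PortalOption` and `validate_portal_options` are hypothetical, and the bit values are placeholders chosen for this sketch; they are not the wire values from the patch or from the server's `CURSOR_OPT_*` definitions.

```python
from enum import IntFlag


class PortalOption(IntFlag):
    # Placeholder bit assignments, for illustration only.
    SCROLL = 0x01
    NO_SCROLL = 0x02
    HOLD = 0x04


# Only options with an SQL-level DECLARE equivalent are exposed here.
ALLOWED = PortalOption.SCROLL | PortalOption.NO_SCROLL | PortalOption.HOLD


def validate_portal_options(raw: int) -> PortalOption:
    """Validate an options field that the client explicitly sent."""
    unknown = raw & ~int(ALLOWED)
    if unknown:
        raise ValueError(f"unknown portal option bits: {unknown:#x}")
    opts = PortalOption(raw)
    scroll_bits = opts & (PortalOption.SCROLL | PortalOption.NO_SCROLL)
    if scroll_bits == (PortalOption.SCROLL | PortalOption.NO_SCROLL):
        raise ValueError("SCROLL and NO SCROLL are mutually exclusive")
    if not scroll_bits:
        # Avoids the implicit "sometimes SCROLL" mode discussed above.
        raise ValueError("one of SCROLL or NO SCROLL must be specified")
    return opts
```

Whether the explicit SCROLL/NO SCROLL requirement is acceptable for backwards compatibility is exactly the open question raised in point 2; the sketch only shows what the stricter rule would look like.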
> Are we concerned with servers that are not compatible with Postgres ?
I think there's enough re-implementations of the postgres protocol by
other databases that it would be a shame if we didn't even try to
consider them, but I wouldn't consider it critical to get it right.
Since they can always throw application errors for features they don't
support, just like they do now for SQL that they don't support. They
can always contribute changes to clients to make using unsupported
features opt-in in the rare case where they are not.
Fair, but from my POV, we are only concerned with Postgres. I would say it's up to the other implementations to deal with incompatibilities.
Dave
On Sun, 14 Dec 2025 at 14:49, Dave Cramer <davecramer@gmail.com> wrote:
> Here I was thinking that binary was the one that did make sense. The pgjdbc driver would like the results back in binary, I believe others would as well.
I agree drivers would like binary results back, but it's unclear to me
how CURSOR_OPT_BINARY is different from setting the result column
format codes to an array of a single 1? That should also change all
columns to be binary right?
> Fair, but from my POV, we are only concerned with Postgres. I would say it's up to the other implementations to deal with incompatibilities.
I get what you mean, but I feel like we should at least be concerned
with popular ecosystem tools like pgbouncer and pgpool. But then it
quickly becomes an exercise in where we draw the line, what about
postgres forks like Yugabyte? Or things very similar like cockroachdb.
Both of those are distributed, and probably don't use our LSNs. So as
a concrete example, if we add LSNs to the protocol, it would be nice
to work with their version too if it's not too much effort. e.g. by
specifying a length for the commit id in the protocol instead of
forcing it at the protocol level to always be a 64-bit integer.
On Sun, 14 Dec 2025 at 09:04, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
On Sun, 14 Dec 2025 at 14:49, Dave Cramer <davecramer@gmail.com> wrote:
> Here I was thinking that binary was the one that did make sense. The pgjdbc driver would like the results back in binary, I believe others would as well.
I agree drivers would like binary results back, but it's unclear to me
how CURSOR_OPT_BINARY is different from setting the result column
format codes to an array of a single 1? That should also change all
columns to be binary right?
Fair point.
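For reference, the existing mechanism mentioned above (a single result-column format code of 1 applying to every column) looks roughly like this when building a Bind message by hand. This is a minimal Python sketch of the documented Bind layout; the function name `build_bind` is made up for illustration, and all parameters are sent in text format to keep it short.

```python
import struct
from typing import List, Optional


def build_bind(portal: str, statement: str,
               params: List[Optional[bytes]],
               result_formats: List[int]) -> bytes:
    """Build a Bind ('B') message with text-format parameters."""
    body = portal.encode() + b"\x00" + statement.encode() + b"\x00"
    body += struct.pack("!H", 0)            # zero parameter format codes: all text
    body += struct.pack("!H", len(params))  # number of parameter values
    for p in params:
        if p is None:
            body += struct.pack("!i", -1)   # length -1 marks a NULL parameter
        else:
            body += struct.pack("!i", len(p)) + p
    body += struct.pack("!H", len(result_formats))
    for code in result_formats:
        body += struct.pack("!H", code)     # 0 = text, 1 = binary
    # The length field counts itself but not the message type byte.
    return b"B" + struct.pack("!I", len(body) + 4) + body


# One result format code of 1: every result column comes back in binary.
msg = build_bind("", "", [b"42"], [1])
```

Since this already requests binary results for all columns of an unnamed or named portal, a separate BINARY option bit would duplicate it.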
> Fair, but from my POV, we are only concerned with Postgres. I would say it's up to the other implementations to deal with incompatibilities.
I get what you mean, but I feel like we should at least be concerned
with popular ecosystem tools like pgbouncer and pgpool. But then it
quickly becomes an exercise in where we draw the line, what about
postgres forks like Yugabyte? Or things very similar like cockroachdb.
Both of those are distributed, and probably don't use our LSNs. So as
a concrete example, if we add LSNs to the protocol, it would be nice
to work with their version too if it's not too much effort. e.g. by
specifying a length for the commit id in the protocol instead of
forcing it at the protocol level to always be a 64-bit integer.
Agreed. It would make sense to be forward-looking here in case Postgres ever has wider LSNs.
Dave
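The length-prefixed commit id idea discussed above could look something like the following Python sketch. No such field exists in the protocol today; the layout (an Int32 length followed by opaque bytes) and the function names are assumptions made purely for illustration.

```python
import struct
from typing import Tuple


def encode_commit_id(commit_id: bytes) -> bytes:
    """Hypothetical field: Int32 length followed by that many opaque bytes."""
    return struct.pack("!I", len(commit_id)) + commit_id


def decode_commit_id(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Return (commit id bytes, offset of the next field)."""
    (length,) = struct.unpack_from("!I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated commit id")
    return buf[start:end], end


# A 64-bit Postgres LSN fits as 8 bytes; another system could send a wider id.
pg_lsn = struct.pack("!Q", 0x16B3748)
wider_id = bytes(range(16))
```

With this shape, a client treats the id as an opaque token and does not need to know its width, which is what would let forks or a future wider LSN reuse the same field.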