Обсуждение: Use uppercase keywords in foreign key tutorial
While browsing the docs I saw that the foreign key tutorial [1] uses some lowercase keywords which are inconsistent with the rest of the docs. The attached patch fixes that. Should be pushed to all stable branches. [1] https://www.postgresql.org/docs/current/tutorial-fk.html -- Erik Wienhold
Вложения
On Thu, Oct 16, 2025, at 5:37 AM, Erik Wienhold wrote: > While browsing the docs I saw that the foreign key tutorial [1] uses > some lowercase keywords which are inconsistent with the rest of the > docs. The attached patch fixes that. Should be pushed to all stable > branches. > Register your patch in the next CF [1] so it won't be forgotten. These are not the only places that SQL keywords use lowercase. There are various cases (I searched for 'primary key') in dml.sgml, logicaldecoding.sgml, plpgsql.sgml, sepgsql.sgml, and textsearch.sgml that you should address as part of your proposal. I don't think there is an easy way to collect all cases. I also checked the most common keywords and I found a few lowercase cases. The SQL commands are usually inside programlisting tag so I tried the following command (to find the occurrence in the first line): cd doc/srg/sgml grep -r -A 1 '<programlisting' * | grep -E 'create |alter |drop |select |insert into|update |delete ' (This command was an easy way to show there are other cases. It is not intended to be a starting point to collect all cases.) Of course, there are other cases too. For example, "between" in config.sgml is lowercase. <programlisting> CREATE TABLE parent(key integer, ...); CREATE TABLE child1000(check (key between 1000 and 1999)) INHERITS(parent); CREATE TABLE child2000(check (key between 2000 and 2999)) INHERITS(parent); ... SELECT * FROM parent WHERE key = 2400; </programlisting> [1] https://commitfest.postgresql.org/56/ -- Euler Taveira EDB https://www.enterprisedb.com/
On 2025-10-21 04:37 +0300, Euler Taveira wrote: > On Thu, Oct 16, 2025, at 5:37 AM, Erik Wienhold wrote: > > While browsing the docs I saw that the foreign key tutorial [1] uses > > some lowercase keywords which are inconsistent with the rest of the > > docs. The attached patch fixes that. Should be pushed to all stable > > branches. > > > > Register your patch in the next CF [1] so it won't be forgotten. https://commitfest.postgresql.org/patch/6159/ > These are not the only places that SQL keywords use lowercase. There > are various cases (I searched for 'primary key') in dml.sgml, > logicaldecoding.sgml, plpgsql.sgml, sepgsql.sgml, and textsearch.sgml > that you should address as part of your proposal. I don't think there > is an easy way to collect all cases. I also checked the most common > keywords and I found a few lowercase cases. The SQL commands are > usually inside programlisting tag so I tried the following command (to > find the occurrence in the first line): > > cd doc/srg/sgml > grep -r -A 1 '<programlisting' * | grep -E 'create |alter |drop |select |insert into|update |delete ' > > (This command was an easy way to show there are other cases. It is not > intended to be a starting point to collect all cases.) > > Of course, there are other cases too. For example, "between" in > config.sgml is lowercase. > > <programlisting> > CREATE TABLE parent(key integer, ...); > CREATE TABLE child1000(check (key between 1000 and 1999)) INHERITS(parent); > CREATE TABLE child2000(check (key between 2000 and 2999)) INHERITS(parent); > ... > SELECT * FROM parent WHERE key = 2400; > </programlisting> Ah, thanks for the tip. I figured that I can grep all defined keywords using the keyword list in the repository with this Bash script: #!/usr/bin/env bash set -eu # Collect all defined keywords and turn them lowercase for searching kwfile=$(mktemp) ( cat doc/src/sgml/keywords/*.txt | tr [:upper:] [:lower:] ; sed -n 's/PG_KEYWORD("\(\w\+\)".*/\1/p' src/include/parser/kwlist.h ) | sort | uniq > "$kwfile" find_lowercase_keywords() { local infile="$1" # Extract program listings and prefix lines with line numbers of the # input file (filename gets prepended after grepping the keywords). # This also covers non-SQL program listings because there's no language # attribute that we could filter by. awk '/<programlisting/ {flag=1} flag {print NR "\t" $0} /<\/programlisting/ {flag=0}' "$infile"\ | sed 's/--.*//'\ | grep --color=always -Fw -f "$kwfile"\ | sed "s@.*@$infile:&@" } for f in $(find doc/src/sgml -name '*.sgml'); do find_lowercase_keywords "$f" done This of course produces a lot of noise, but I managed to find and fix a couple more places with the attached v2 (including the ones you've listed) that I've missed previously. I've searched for "primary key" across the docs previously but did not spot the matches between the false-positive ones from the running text. -- Erik Wienhold
Вложения
On Fri, 24 Oct 2025 at 09:39, Erik Wienhold <ewie@ewie.name> wrote: > This of course produces a lot of noise, but I managed to find and fix a > couple more places with the attached v2 (including the ones you've > listed) that I've missed previously. I've searched for "primary key" > across the docs previously but did not spot the matches between the > false-positive ones from the running text. I've not reviewed in detail, but on a first-pass read, I think this is worth doing. I personally find queries with upper-case keywords and lowercase identifiers much easier on the eyes. I'm not so sure about the "Should be pushed to all stable branches" part as it seems more of a general improvement than fixing a mistake. I imagine there will be varying opinions on that, however. David
On Thu, Oct 23, 2025, at 7:50 PM, David Rowley wrote: > I'm not so sure about the "Should be pushed to all stable branches" > part as it seems more of a general improvement than fixing a mistake. > I imagine there will be varying opinions on that, however. > I also classify this as an improvement. No need to backpatch. I would apply changes to back branches if the query is wrong; that's not the case. I checked v2 and it looks good to me. Since you are changing the query format, we have a mix of CREATE TABLE foo(a int); CREATE TABLE foo (a int); The latter reads better for me. The same logic applies to VALUES. There is a mix of space or non-space before parenthesis. Consistency is a good goal. -- Euler Taveira EDB https://www.enterprisedb.com/
On Fri, Oct 24, 2025 at 10:40:18AM -0300, Euler Taveira wrote: > On Thu, Oct 23, 2025, at 7:50 PM, David Rowley wrote: >> I'm not so sure about the "Should be pushed to all stable branches" >> part as it seems more of a general improvement than fixing a mistake. >> I imagine there will be varying opinions on that, however. > > I also classify this as an improvement. No need to backpatch. I would apply > changes to back branches if the query is wrong; that's not the case. +1 -- nathan
I noticed the patch also changes some column types to lowercase:
<programlisting>
CREATE TYPE tablefunc_crosstab_N AS (
- row_name TEXT,
- category_1 TEXT,
- category_2 TEXT,
+ row_name text,
+ category_1 text,
+ category_2 text,
.
.
.
- category_N TEXT
+ category_N text
);
</programlisting>
FWIW I tend to use uppercase for those, too, but I'm not sure there is a
preferred style for the docs.
--
nathan
On Tue, Oct 28, 2025 at 04:34:45PM -0500, Nathan Bossart wrote: > I noticed the patch also changes some column types to lowercase: > > <programlisting> > CREATE TYPE tablefunc_crosstab_N AS ( > - row_name TEXT, > - category_1 TEXT, > - category_2 TEXT, > + row_name text, > + category_1 text, > + category_2 text, > . > . > . > - category_N TEXT > + category_N text > ); > </programlisting> > > FWIW I tend to use uppercase for those, too, but I'm not sure there is a > preferred style for the docs. Agreed, uppercase is better for type names. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Do not let urgent matters crowd out time for investment in the future.
Bruce Momjian <bruce@momjian.us> writes:
> On Tue, Oct 28, 2025 at 04:34:45PM -0500, Nathan Bossart wrote:
>> I noticed the patch also changes some column types to lowercase:
>> ...
>> - category_N TEXT
>> + category_N text
>>
>> FWIW I tend to use uppercase for those, too, but I'm not sure there is a
>> preferred style for the docs.
> Agreed, uppercase is better for type names.
"text" is not a keyword according to either us or the SQL standard.
I agree that there's some reason to capitalize things that are
grammar keywords, such as INTEGER or VARCHAR, but it's a big stretch
to go from that to capitalizing everything that is a type name.
Would you capitalize user-defined type names?
A concrete reason for worrying about this distinction is that the
keywords will be interpreted the same regardless of search_path,
but ordinary-identifier names will not.
But having said that, I think 100% consistency about this sort of
stylistic choice is neither achievable (for any long period) nor
particularly desirable. In the worst case it could mislead novices
into thinking that case is significant when it isn't.
regards, tom lane
On Thu, 30 Oct 2025 at 09:48, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > On Tue, Oct 28, 2025 at 04:34:45PM -0500, Nathan Bossart wrote: > >> I noticed the patch also changes some column types to lowercase: > >> ... > >> - category_N TEXT > >> + category_N text > >> > >> FWIW I tend to use uppercase for those, too, but I'm not sure there is a > >> preferred style for the docs. > > > Agreed, uppercase is better for type names. > > "text" is not a keyword according to either us or the SQL standard. > I agree that there's some reason to capitalize things that are > grammar keywords, such as INTEGER or VARCHAR, but it's a big stretch > to go from that to capitalizing everything that is a type name. > Would you capitalize user-defined type names? Going by: git grep -i "\btext\b," we're fairly consistently using lower case, so FWIW, when I looked, I thought Eric's change made sense. How about if Eric just drops the portion of the patch that alters the casing of the types and leaves all the keyword uppercasing stuff in. Any objections to that part? David
On Thu, Oct 30, 2025 at 10:24:30AM +1300, David Rowley wrote: > On Thu, 30 Oct 2025 at 09:48, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Bruce Momjian <bruce@momjian.us> writes: > > > On Tue, Oct 28, 2025 at 04:34:45PM -0500, Nathan Bossart wrote: > > >> I noticed the patch also changes some column types to lowercase: > > >> ... > > >> - category_N TEXT > > >> + category_N text > > >> > > >> FWIW I tend to use uppercase for those, too, but I'm not sure there is a > > >> preferred style for the docs. > > > > > Agreed, uppercase is better for type names. > > > > "text" is not a keyword according to either us or the SQL standard. > > I agree that there's some reason to capitalize things that are > > grammar keywords, such as INTEGER or VARCHAR, but it's a big stretch > > to go from that to capitalizing everything that is a type name. > > Would you capitalize user-defined type names? > > Going by: git grep -i "\btext\b," we're fairly consistently using > lower case, so FWIW, when I looked, I thought Eric's change made > sense. > > How about if Eric just drops the portion of the patch that alters the > casing of the types and leaves all the keyword uppercasing stuff in. > Any objections to that part? Works for me. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Do not let urgent matters crowd out time for investment in the future.
On Wed, Oct 29, 2025 at 05:25:43PM -0400, Bruce Momjian wrote: > On Thu, Oct 30, 2025 at 10:24:30AM +1300, David Rowley wrote: >> How about if Eric just drops the portion of the patch that alters the >> casing of the types and leaves all the keyword uppercasing stuff in. >> Any objections to that part? > > Works for me. +1 -- nathan
On 2025-10-29 22:24 +0100, David Rowley wrote:
> On Thu, 30 Oct 2025 at 09:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Bruce Momjian <bruce@momjian.us> writes:
> > > On Tue, Oct 28, 2025 at 04:34:45PM -0500, Nathan Bossart wrote:
> > >> I noticed the patch also changes some column types to lowercase:
> > >> ...
> > >> - category_N TEXT
> > >> + category_N text
> > >>
> > >> FWIW I tend to use uppercase for those, too, but I'm not sure there is a
> > >> preferred style for the docs.
> >
> > > Agreed, uppercase is better for type names.
> >
> > "text" is not a keyword according to either us or the SQL standard.
> > I agree that there's some reason to capitalize things that are
> > grammar keywords, such as INTEGER or VARCHAR, but it's a big stretch
> > to go from that to capitalizing everything that is a type name.
> > Would you capitalize user-defined type names?
>
> Going by: git grep -i "\btext\b," we're fairly consistently using
> lower case, so FWIW, when I looked, I thought Eric's change made
> sense.
That is also my impression. The docs clearly favor lowercase
identifiers and the various data type subpages consistently introduce
types as lowercase names. Also, if you check the output of
format_type() or pg_get_viewdef() you'll see that we use lowercase
types names there as well, even for the ones mandated by the SQL
standard:
regress=> CREATE VIEW myview AS SELECT 'foo'::VARCHAR AS f1;
CREATE VIEW
regress=> SELECT pg_get_viewdef('myview');
pg_get_viewdef
-----------------------------------------
SELECT 'foo'::character varying AS f1;
(1 row)
But I also agree with Tom that keeping a consistent style is impossible
in the long run. But it also shows that the docs are still written by
humans. As long as we can keep a consistent style within a single
listing (or even an entire page) I'm satisfied as a reader.
> How about if Eric just drops the portion of the patch that alters the
> casing of the types and leaves all the keyword uppercasing stuff in.
> Any objections to that part?
In the attached v3 I've reverted that part (and other similar changes
that I had already in the pipeline after my OCD kicked in ;). With one
exception: datatype.sgml has this change because lowercase bit(3) is
already used in the same listing a couple of lines down:
-CREATE TABLE test (a BIT(3), b BIT VARYING(5));
+CREATE TABLE test (a bit(3), b bit varying(5));
That's the only listing with inconsistently cased type names I could
find.
Besides that I've fixed a couple of more places that had lowercase
keywords or were missing some whitespace that I had missed before.
--
Erik Wienhold
Вложения
On Fri, 31 Oct 2025 at 13:53, Erik Wienhold <ewie@ewie.name> wrote:
> But I also agree with Tom that keeping a consistent style is impossible
> in the long run. But it also shows that the docs are still written by
> humans. As long as we can keep a consistent style within a single
> listing (or even an entire page) I'm satisfied as a reader.
For me, I don't see this as a reason not to try. If we do get things
to a consistent point, then anyone making changes that reference
existing portions of the documentation for inspiration should maintain
consistency. If we're entirely random, then there's no hope for anyone
to figure out what the best practice or perfected casing is.
> Besides that I've fixed a couple of more places that had lowercase
> keywords or were missing some whitespace that I had missed before.
A couple of things.
1) I see you've added a space after "INSERT INTO table" and before the
column list, but not consistently, per:
git grep -E "INSERT INTO \w+\("
2) An identifier casing has been changed here:
-SELECT sub_part, SUM(quantity) as total_quantity
+SELECT sub_part, sum(quantity) AS total_quantity
You could also look at the results of the SQL command that's returned
by the following SQL to see if there's anything else. I do see some
"ROLLUP(", "EXISTS(", "GROUPING(" and "VALUES(" in there. You have
been changing "VALUES(" to "VALUES (", so I assume those ones have
been missed:
select 'git grep -E "\b(' || string_agg(UPPER(word),'|') || ')\("'
from pg_get_keywords();
David
On 2025-10-31 04:03 +0100, David Rowley wrote:
> On Fri, 31 Oct 2025 at 13:53, Erik Wienhold <ewie@ewie.name> wrote:
> > But I also agree with Tom that keeping a consistent style is impossible
> > in the long run. But it also shows that the docs are still written by
> > humans. As long as we can keep a consistent style within a single
> > listing (or even an entire page) I'm satisfied as a reader.
>
> For me, I don't see this as a reason not to try. If we do get things
> to a consistent point, then anyone making changes that reference
> existing portions of the documentation for inspiration should maintain
> consistency. If we're entirely random, then there's no hope for anyone
> to figure out what the best practice or perfected casing is.
Sure. I didn't say we shouldn't try. But from my experience, minor
formatting issues slip through all the time if you can't rely on an
autoformatter or linter. Maybe less so in these docs because the sample
queries are usually short. But it already took me a couple of rounds to
find every lowercase keyword so far and I guess I still missed some.
> > Besides that I've fixed a couple of more places that had lowercase
> > keywords or were missing some whitespace that I had missed before.
>
> A couple of things.
>
> 1) I see you've added a space after "INSERT INTO table" and before the
> column list, but not consistently, per:
>
> git grep -E "INSERT INTO \w+\("
Fixed in the attached v4. Except for one match in dblink.sgml that is
the sample output of dblink_build_sql_insert which actually omits the
space after the table name and VALUES keyword.
> 2) An identifier casing has been changed here:
>
> -SELECT sub_part, SUM(quantity) as total_quantity
> +SELECT sub_part, sum(quantity) AS total_quantity
Reverted back to uppercase SUM. I think this was the only changed
identifier and what remains should only be keyword and whitespace
changes to keep the patch focused on that.
> You could also look at the results of the SQL command that's returned
> by the following SQL to see if there's anything else. I do see some
> "ROLLUP(", "EXISTS(", "GROUPING(" and "VALUES(" in there. You have
> been changing "VALUES(" to "VALUES (", so I assume those ones have
> been missed:
>
> select 'git grep -E "\b(' || string_agg(UPPER(word),'|') || ')\("'
> from pg_get_keywords();
Thanks. Fixed those as well. But only for keywords with non-C catcodes
because I figured that the catcode C keywords mostly cover builtin
function names where extra whitespace before the opening paren doesn't
make sense. Missed the uppercase keywords before because I was only
searching for the lowercase ones.
--
Erik Wienhold
Вложения
On Tue, 4 Nov 2025 at 09:06, Erik Wienhold <ewie@ewie.name> wrote:
> Fixed in the attached v4. Except for one match in dblink.sgml that is
> the sample output of dblink_build_sql_insert which actually omits the
> space after the table name and VALUES keyword.
I went through all these and I think it's mostly good. However...
It seems strange to me that you've made so many changes to transform
"CREATE TABLE foo(" into "CREATE TABLE foo (", but you've done the
opposite for CREATE FUNCTION and CREATE PROCEDURE. Did you go with the
majority rules here? It just seems a bit hard to follow what the
standard is being enforced here and if that's hard to understand now,
what hope is there of people following that in the future?
I'm starting to wonder if adjusting the spacing here is a worthwhile change.
David
On 2025-11-03 23:37 +0100, David Rowley wrote:
> On Tue, 4 Nov 2025 at 09:06, Erik Wienhold <ewie@ewie.name> wrote:
> > Fixed in the attached v4. Except for one match in dblink.sgml that
> > is the sample output of dblink_build_sql_insert which actually omits
> > the space after the table name and VALUES keyword.
>
> I went through all these and I think it's mostly good.
Thanks.
> However...
>
> It seems strange to me that you've made so many changes to transform
> "CREATE TABLE foo(" into "CREATE TABLE foo (", but you've done the
> opposite for CREATE FUNCTION and CREATE PROCEDURE. Did you go with the
> majority rules here?
Yes, CREATE FUNCTION with an extra space after the name is the
exception. On master I count 29 (w/ space) vs. 249 (w/o space) matches:
$ git grep -Ei 'create(\s+or\s+replace)?\s+(function|procedure)\s+\w+(\.\w+)?\s+\(' doc | wc -l
29
$ git grep -Ei 'create(\s+or\s+replace)?\s+(function|procedure)\s+\w+(\.\w+)?\(' doc | wc -l
249
I also see similar counts for CREATE TABLE, but with the
one-space-after-table-name variant being more common:
$ git grep -Ei 'create\s+table(\s+if\s+not\+sexists)?\s+\w+(\.\w+)?\s+\(' doc | wc -l
213
$ git grep -Ei 'create\s+table(\s+if\s+not\+sexists)?\s+\w+(\.\w+)?\(' doc | wc -l
34
The attached v5 fixes a couple of more places that I missed in v4.
> It just seems a bit hard to follow what the standard is being enforced
> here and if that's hard to understand now, what hope is there of
> people following that in the future?
I follow this rule/standard/convention which also matches what I already
find in most parts of the docs:
* No extra space for function calls, e.g.
extract(YEAR FROM now())
* Extra space for everything else, e.g.
VALUES (1, 2);
CREATE TABLE foo ();
> I'm starting to wonder if adjusting the spacing here is a worthwhile
> change.
I think it's worth to have that consistency. If the patch is too broad
I can of course limit it to the listings with inconsistent keyword
casing which is more less patch v2 plus some changes from v3 and later.
Or just fix the keywords for now to get that out of the way and deal
with the spacing in a separate patch/thread.
--
Erik Wienhold
Вложения
On Tue, 4 Nov 2025 at 13:04, Erik Wienhold <ewie@ewie.name> wrote: > > On 2025-11-03 23:37 +0100, David Rowley wrote: > > I'm starting to wonder if adjusting the spacing here is a worthwhile > > change. > > I think it's worth to have that consistency. If the patch is too broad > I can of course limit it to the listings with inconsistent keyword > casing which is more less patch v2 plus some changes from v3 and later. > > Or just fix the keywords for now to get that out of the way and deal > with the spacing in a separate patch/thread. I think just the keyword upper casing is a good idea for now. I'm starting to lose hope that there's enough merit and consistency to the proposed whitespace changes. Maybe there's a subset of it that does make sense to do. David