Обсуждение: psql's \d versus included-index-column feature
I noticed that psql's \d command doesn't do very well with included index columns. Given the regression db's test case, CREATE INDEX tbl_include_reg_idx ON tbl_include_reg (c1, c2) INCLUDE (c3, c4); we get regression=# \d tbl_include_reg_idx Index "public.tbl_include_reg_idx" Column | Type | Definition --------+---------+------------ c1 | integer | c1 c2 | integer | c2 c3 | integer | c4 | box | btree, for table "public.tbl_include_reg" So there are two problems with that: first, where's the definition info for the included columns, and second, how's the user supposed to tell which columns are key columns vs which are included? (No, I don't accept the answer that you're supposed to know that an omitted definition means an included column. For one thing, that approach cannot scale to handle included expression columns. For another, the documentation doesn't say any such thing.) I looked into the reason why the definition is empty, and the answer seems to be somebody's poorly-thought-through decision that pg_get_indexdef_worker's attrsOnly flag could be made to serve two purposes, with the result that pg_get_indexdef_ext with a nonzero column argument will print nothing for an included column. That doesn't seem like a sane (or documented) behavior either. So the attached patch splits it into two flags, attrsOnly and keysOnly, and with that we get regression=# \d tbl_include_reg_idx Index "public.tbl_include_reg_idx" Column | Type | Definition --------+---------+------------ c1 | integer | c1 c2 | integer | c2 c3 | integer | c3 c4 | box | c4 btree, for table "public.tbl_include_reg" which is better, but it's still not very clear what's what. Do we want to consider that good enough, or do we want to mark included columns in this display, and if so how? One idea is to do it like this: regression=# \d tbl_include_reg_idx Index "public.tbl_include_reg_idx" Column | Type | Definition --------+---------+------------ c1 | integer | c1 c2 | integer | c2 c3 | integer | INCLUDE c3 c4 | box | INCLUDE c4 btree, for table "public.tbl_include_reg" If we wanted to have pg_get_indexdef_ext() include "INCLUDE" in its output for an included column, this could be achieved with just a couple more lines of code. But that behavior seems rather ugly to me, and likely to break other use-cases for pg_get_indexdef_ext(). So I think we likely want to do this on the psql side instead, which will take more code but also offers more flexibility for the display layout. For instance, maybe we want something like regression=# \d tbl_include_reg_idx Index "public.tbl_include_reg_idx" Column | Type | Key | Definition --------+---------+------------------ c1 | integer | t | c1 c2 | integer | t | c2 c3 | integer | f | c3 c4 | box | f | c4 btree, for table "public.tbl_include_reg" Thoughts? regards, tom lane diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 065238b..a38aed2 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -320,7 +320,8 @@ static void decompile_column_index_array(Datum column_index_array, Oid relId, static char *pg_get_ruledef_worker(Oid ruleoid, int prettyFlags); static char *pg_get_indexdef_worker(Oid indexrelid, int colno, const Oid *excludeOps, - bool attrsOnly, bool showTblSpc, bool inherits, + bool attrsOnly, bool keysOnly, + bool showTblSpc, bool inherits, int prettyFlags, bool missing_ok); static char *pg_get_statisticsobj_worker(Oid statextid, bool missing_ok); static char *pg_get_partkeydef_worker(Oid relid, int prettyFlags, @@ -1097,7 +1098,9 @@ pg_get_indexdef(PG_FUNCTION_ARGS) prettyFlags = PRETTYFLAG_INDENT; - res = pg_get_indexdef_worker(indexrelid, 0, NULL, false, false, false, + res = pg_get_indexdef_worker(indexrelid, 0, NULL, + false, false, + false, false, prettyFlags, true); if (res == NULL) @@ -1117,8 +1120,10 @@ pg_get_indexdef_ext(PG_FUNCTION_ARGS) prettyFlags = pretty ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) : PRETTYFLAG_INDENT; - res = pg_get_indexdef_worker(indexrelid, colno, NULL, colno != 0, false, - false, prettyFlags, true); + res = pg_get_indexdef_worker(indexrelid, colno, NULL, + colno != 0, false, + false, false, + prettyFlags, true); if (res == NULL) PG_RETURN_NULL(); @@ -1134,10 +1139,13 @@ pg_get_indexdef_ext(PG_FUNCTION_ARGS) char * pg_get_indexdef_string(Oid indexrelid) { - return pg_get_indexdef_worker(indexrelid, 0, NULL, false, true, true, 0, false); + return pg_get_indexdef_worker(indexrelid, 0, NULL, + false, false, + true, true, + 0, false); } -/* Internal version that just reports the column definitions */ +/* Internal version that just reports the key-column definitions */ char * pg_get_indexdef_columns(Oid indexrelid, bool pretty) { @@ -1145,7 +1153,9 @@ pg_get_indexdef_columns(Oid indexrelid, bool pretty) prettyFlags = pretty ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) : PRETTYFLAG_INDENT; - return pg_get_indexdef_worker(indexrelid, 0, NULL, true, false, false, + return pg_get_indexdef_worker(indexrelid, 0, NULL, + true, true, + false, false, prettyFlags, false); } @@ -1158,7 +1168,8 @@ pg_get_indexdef_columns(Oid indexrelid, bool pretty) static char * pg_get_indexdef_worker(Oid indexrelid, int colno, const Oid *excludeOps, - bool attrsOnly, bool showTblSpc, bool inherits, + bool attrsOnly, bool keysOnly, + bool showTblSpc, bool inherits, int prettyFlags, bool missing_ok) { /* might want a separate isConstraint parameter later */ @@ -1297,15 +1308,13 @@ pg_get_indexdef_worker(Oid indexrelid, int colno, Oid keycolcollation; /* - * attrsOnly flag is used for building unique-constraint and - * exclusion-constraint error messages. Included attrs are meaningless - * there, so do not include them in the message. + * Ignore non-key attributes if told to. */ - if (attrsOnly && keyno >= idxrec->indnkeyatts) + if (keysOnly && keyno >= idxrec->indnkeyatts) break; - /* Report the INCLUDED attributes, if any. */ - if ((!attrsOnly) && keyno == idxrec->indnkeyatts) + /* Otherwise, print INCLUDE to divide key and non-key attrs. */ + if (!colno && keyno == idxrec->indnkeyatts) { appendStringInfoString(&buf, ") INCLUDE ("); sep = ""; @@ -1352,13 +1361,12 @@ pg_get_indexdef_worker(Oid indexrelid, int colno, keycolcollation = exprCollation(indexkey); } - if (!attrsOnly && (!colno || colno == keyno + 1)) + /* Print additional decoration for (selected) key columns */ + if (!attrsOnly && keyno < idxrec->indnkeyatts && + (!colno || colno == keyno + 1)) { Oid indcoll; - if (keyno >= idxrec->indnkeyatts) - continue; - /* Add collation, if not default for column */ indcoll = indcollation->values[keyno]; if (OidIsValid(indcoll) && indcoll != keycolcollation) @@ -2197,6 +2205,7 @@ pg_get_constraintdef_worker(Oid constraintId, bool fullCommand, false, false, false, + false, prettyFlags, false)); break;
regression=# \d tbl_include_reg_idx
Index "public.tbl_include_reg_idx"
Column | Type | Key | Definition
--------+---------+------------------
c1 | integer | t | c1
c2 | integer | t | c2
c3 | integer | f | c3
c4 | box | f | c4
btree, for table "public.tbl_include_reg"
+1 for the additional column indicating whether the column is being treated as key data or supplemental included data.
-1 for printing a boolean t/f; would rather spell it out:
CASE WHEN "Key" THEN 'Key' ELSE 'Included' END AS "Data"
We're not hurting for horizontal space here and in any case I'd rather save others the eye strain of having to distinguish between lowercase "f" and "t".
David J.
On 2018-Jul-18, David G. Johnston wrote: > -1 for printing a boolean t/f; would rather spell it out: > > CASE WHEN "Key" THEN 'Key' ELSE 'Included' END AS "Data" +1 -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2018-Jul-18, David G. Johnston wrote: >> -1 for printing a boolean t/f; would rather spell it out: >> >> CASE WHEN "Key" THEN 'Key' ELSE 'Included' END AS "Data" > +1 I can sympathize with the eyestrain argument against t/f, but the above doesn't seem like an improvement --- in particular, "Data" as the column header seems quite content-free. My counterproposal is to keep "Key" as the header and use "Yes"/"No" as the values. I'd be OK with "Key"/"Included" as the values if someone can propose an on-point column header to go with those. regards, tom lane
On 2018-Jul-18, Tom Lane wrote: > I can sympathize with the eyestrain argument against t/f, but the > above doesn't seem like an improvement --- in particular, "Data" > as the column header seems quite content-free. My counterproposal > is to keep "Key" as the header and use "Yes"/"No" as the values. I think "Key: no" is a bit obscure -- using "included" is a bit more self-documenting and lends better to documentation searches. > I'd be OK with "Key"/"Included" as the values if someone can > propose an on-point column header to go with those. "Role"? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jul 18, 2018 at 11:14 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
regression=# \d tbl_include_reg_idx
Index "public.tbl_include_reg_idx"
Column | Type | Key | Definition
--------+---------+------------------
c1 | integer | t | c1
c2 | integer | t | c2
c3 | integer | f | c3
c4 | box | f | c4
btree, for table "public.tbl_include_reg"+1 for the additional column indicating whether the column is being treated as key data or supplemental included data.
+1
And especially I don't think we should place word "INCLUDE" to the definition column.
-1 for printing a boolean t/f; would rather spell it out:
IMHO, t/f have advantage of brevity. From my point of view, covering indexes are not so evident feature. So, users need to spend some time reading documentation before realizing what they are and how to use them. So, I don't expect that short designation of INCLUDE columns as "non-key" (Key == false) columns could be discouraging here.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, Jul 19, 2018 at 1:11 AM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Wed, Jul 18, 2018 at 11:14 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
regression=# \d tbl_include_reg_idx
Index "public.tbl_include_reg_idx"
Column | Type | Key | Definition
--------+---------+------------------
c1 | integer | t | c1
c2 | integer | t | c2
c3 | integer | f | c3
c4 | box | f | c4
btree, for table "public.tbl_include_reg"+1 for the additional column indicating whether the column is being treated as key data or supplemental included data.+1And especially I don't think we should place word "INCLUDE" to the definition column.-1 for printing a boolean t/f; would rather spell it out:IMHO, t/f have advantage of brevity. From my point of view, covering indexes are not so evident feature. So, users need to spend some time reading documentation before realizing what they are and how to use them. So, I don't expect that short designation of INCLUDE columns as "non-key" (Key == false) columns could be discouraging here.
I don't think there is an established practice on how to display this sort of info, but I see that both styles already exist, namely:
=# \dL
List of languages
Name | Owner | Trusted | Description
------------+----------+---------+------------------------------
plpgsql | postgres | t | PL/pgSQL procedural language
plproxy | postgres | f |
...
and
=# \dC
List of casts
Source type | Target type | Function | Implicit?
-----------------------------+-----------------------------+--------------------+---------------
abstime | date | date | in assignment
abstime | integer | (binary coercible) | no
abstime | timestamp without time zone | timestamp | yes
...
I like the second option more, for readability reasons and because it is easier to extend if ever needed.
Given that the documentation refers to included columns as "non-key columns", it seems natural to me to name the \d output column "Key?" and use "yes/no" as the values.
Regards,
--
Alex
Oleksandr Shulgin <oleksandr.shulgin@zalando.de> writes: > I don't think there is an established practice on how to display this sort > of info, but I see that both styles already exist, namely: > =# \dL > List of languages > Name | Owner | Trusted | Description > ------------+----------+---------+------------------------------ > plpgsql | postgres | t | PL/pgSQL procedural language > plproxy | postgres | f | > ... > and > =# \dC > List of casts > Source type | Target type | Function > | Implicit? > -----------------------------+-----------------------------+--------------------+--------------- > abstime | date | date > | in assignment > abstime | integer | (binary > coercible) | no > abstime | timestamp without time zone | timestamp > | yes > ... > I like the second option more, for readability reasons and because it is > easier to extend if ever needed. > Given that the documentation refers to included columns as "non-key > columns", it seems natural to me to name the \d output column "Key?" and > use "yes/no" as the values. WFM, anyone want to argue against? regards, tom lane
On Thursday, July 19, 2018, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Given that the documentation refers to included columns as "non-key
> columns", it seems natural to me to name the \d output column "Key?" and
> use "yes/no" as the values.
WFM, anyone want to argue against?
Works for me as well.
David J.