Обсуждение: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
Hi, In a lot of places in the code we have many structures doing declarations of the type foo[1] to emulate variable length arrays. Attached are a set of patches aimed at replacing that with FLEXIBLE_ARRAY_MEMBER to prevent potential failures that could be caused by compiler optimizations and improve report of static code analyzers of the type Coverity (idea by Tom, patches from me): - 0001 uses FLEXIBLE_ARRAY_MEMBER in a bunch of trivial places (at least check-world does not complain) - 0002 does the same for catalog tables - 0003 changes varlena in c.h. This is done as a separate patch because this needs some other modifications as variable-length things need to be placed at the end of structures, because of: -- rolpassword that should be placed as the last field of pg_authid (and shouldn't there be CATALOG_VARLEN here??) -- inv_api.c uses bytea in an internal structure without putting it at the end of the structure. For clarity I think that we should just use a bytea pointer and do a sufficient allocation for the duration of the lobj manipulation. -- Similarly, tuptoaster.c needed to be patched for toast_save_datum There are as well couple of things that are not changed on purpose: - in namespace.h for FuncCandidateList. I tried manipulating it but it resulted in allocation overflow in PortalHeapMemory - I don't think that the t_bits fields in htup_details.h should be updated either. Regards, -- Michael
Вложения
On 2015-02-17 05:51:22 +0900, Michael Paquier wrote: > - 0002 does the same for catalog tables > - 0003 changes varlena in c.h. This is done as a separate patch > because this needs some other modifications as variable-length things > need to be placed at the end of structures, because of: > -- rolpassword that should be placed as the last field of pg_authid > (and shouldn't there be CATALOG_VARLEN here??) Yes, there should. > -- inv_api.c uses bytea in an internal structure without putting it at > the end of the structure. For clarity I think that we should just use > a bytea pointer and do a sufficient allocation for the duration of the > lobj manipulation. Hm. I don't really see the problem tbh. There's actually is a reason the bytea is at the beginning - the other fields are *are* the data part of the bytea (which is just the varlena header). > -- Similarly, tuptoaster.c needed to be patched for toast_save_datum I'm not a big fan of these two changes. This adds some not that small memory allocations to a somewhat hot path. Without a big win in clarity. And it doesn't seem to have anything to do with with FLEXIBLE_ARRAY_MEMBER. > There are as well couple of things that are not changed on purpose: > - in namespace.h for FuncCandidateList. I tried manipulating it but it > resulted in allocation overflow in PortalHeapMemory Did you change the allocation in FuncnameGetCandidates()? Note the - sizeof(Oid) there. > - I don't think that the t_bits fields in htup_details.h should be > updated either. Why not? Any not broken code should already use MinHeapTupleSize and similar macros. Generally it's worthwhile to check the code you changed for sizeof(changed_struct) and similar things. Because this very well could add bugs. And not all of them will result in a outright visibile error like the FuncnameGetCandidates() one. > --- a/src/include/access/htup_details.h > +++ b/src/include/access/htup_details.h > @@ -150,7 +150,7 @@ struct HeapTupleHeaderData > > /* ^ - 23 bytes - ^ */ > > - bits8 t_bits[1]; /* bitmap of NULLs -- VARIABLE LENGTH */ > + bits8 t_bits[1]; /* bitmap of NULLs */ > > /* MORE DATA FOLLOWS AT END OF STRUCT */ > }; > @@ -579,7 +579,7 @@ struct MinimalTupleData > > /* ^ - 23 bytes - ^ */ > > - bits8 t_bits[1]; /* bitmap of NULLs -- VARIABLE LENGTH */ > + bits8 t_bits[1]; /* bitmap of NULLs */ > > /* MORE DATA FOLLOWS AT END OF STRUCT */ > }; This sees like overeager search/replace. > diff --git a/src/include/nodes/params.h b/src/include/nodes/params.h > index 5b096c5..0ad7ef0 100644 > --- a/src/include/nodes/params.h > +++ b/src/include/nodes/params.h > @@ -71,7 +71,7 @@ typedef struct ParamListInfoData > ParserSetupHook parserSetup; /* parser setup hook */ > void *parserSetupArg; > int numParams; /* number of ParamExternDatas following */ > - ParamExternData params[1]; /* VARIABLE LENGTH ARRAY */ > + ParamExternData params[1]; > } ParamListInfoData; > Eh? > diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h > index 87a3462..73bcefe 100644 > --- a/src/include/catalog/pg_attribute.h > +++ b/src/include/catalog/pg_attribute.h > @@ -157,13 +157,13 @@ CATALOG(pg_attribute,1249) BKI_BOOTSTRAP BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(75) BK > /* NOTE: The following fields are not present in tuple descriptors. */ > > /* Column-level access permissions */ > - aclitem attacl[1]; > + aclitem attacl[FLEXIBLE_ARRAY_MEMBER]; > > /* Column-level options */ > - text attoptions[1]; > + text attoptions[FLEXIBLE_ARRAY_MEMBER]; > > /* Column-level FDW options */ > - text attfdwoptions[1]; > + text attfdwoptions[FLEXIBLE_ARRAY_MEMBER]; > #endif > } FormData_pg_attribute; Ugh. This really isn't C at all anymore - these structs wouldn't compile if you removed the #ifdef. Maybe we should instead a 'textarray' type for genbki's benefit? Generally the catalog changes are much less clearly a benefit imo. > From ad8f54cdb5776146f17d1038bb295b5f13b549f1 Mon Sep 17 00:00:00 2001 > From: Michael Paquier <michael@otacoo.com> > Date: Mon, 16 Feb 2015 03:53:38 +0900 > Subject: [PATCH 3/3] Update varlena in c.h to use FLEXIBLE_ARRAY_MEMBER > > Places using a variable-length variable not at the end of a structure > are changed with workaround, without impacting what those features do. I vote for rejecting most of this, except a (corrected version) of the pg_authid change. Which doesn't really belong to the rest of the patch anyway ;)x > #define VARHDRSZ ((int32) sizeof(int32)) > diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h > index e01e6aa..d8789a5 100644 > --- a/src/include/catalog/pg_authid.h > +++ b/src/include/catalog/pg_authid.h > @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC > int32 rolconnlimit; /* max connections allowed (-1=no limit) */ > > /* remaining fields may be null; use heap_getattr to read them! */ > - text rolpassword; /* password, if any */ > timestamptz rolvaliduntil; /* password expiration time, if any */ > +#ifdef CATALOG_VARLEN > + text rolpassword; /* password, if any */ > +#endif > } FormData_pg_authid; That change IIRC is wrong, because it'll make rolvaliduntil until NOT NULL (any column that's fixed width and has only fixed with columns before it is marked as such). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2015-02-17 05:51:22 +0900, Michael Paquier wrote: >> diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h >> index e01e6aa..d8789a5 100644 >> --- a/src/include/catalog/pg_authid.h >> +++ b/src/include/catalog/pg_authid.h >> @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC >> int32 rolconnlimit; /* max connections allowed (-1=no limit) */ >> >> /* remaining fields may be null; use heap_getattr to read them! */ >> - text rolpassword; /* password, if any */ >> timestamptz rolvaliduntil; /* password expiration time, if any */ >> +#ifdef CATALOG_VARLEN >> + text rolpassword; /* password, if any */ >> +#endif >> } FormData_pg_authid; > That change IIRC is wrong, because it'll make rolvaliduntil until NOT > NULL (any column that's fixed width and has only fixed with columns > before it is marked as such). You were muttering about a BKI_FORCE_NOT_NULL option ... for symmetry, maybe we could add BKI_FORCE_NULL as well, and then use that for cases like this? Also, if we want to insist that these fields be accessed through heap_getattr, I'd be inclined to put them inside the "#ifdef CATALOG_VARLEN" to enforce that. I'm generally -1 on reordering any catalog columns as part of this patch. There should be zero user-visible change from it IMO. However, if we stick both those columns inside the ifdef, we don't need to reorder. regards, tom lane
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Tue, Feb 17, 2015 at 9:02 AM, Andres Freund wrote: > On 2015-02-17 05:51:22 +0900, Michael Paquier wrote: >> -- inv_api.c uses bytea in an internal structure without putting it at >> the end of the structure. For clarity I think that we should just use >> a bytea pointer and do a sufficient allocation for the duration of the >> lobj manipulation. > > Hm. I don't really see the problem tbh. > > There's actually is a reason the bytea is at the beginning - the other > fields are *are* the data part of the bytea (which is just the varlena > header). > >> -- Similarly, tuptoaster.c needed to be patched for toast_save_datum > > I'm not a big fan of these two changes. This adds some not that small > memory allocations to a somewhat hot path. Without a big win in > clarity. And it doesn't seem to have anything to do with with > FLEXIBLE_ARRAY_MEMBER. We could replace those palloc calls with a simple buffer that has a predefined size, but this somewhat reduces the readability of the code. >> There are as well couple of things that are not changed on purpose: >> - in namespace.h for FuncCandidateList. I tried manipulating it but it >> resulted in allocation overflow in PortalHeapMemory > > Did you change the allocation in FuncnameGetCandidates()? Note the - > sizeof(Oid) there. Yeah. Missed it. >> - I don't think that the t_bits fields in htup_details.h should be >> updated either. > > Why not? Any not broken code should already use MinHeapTupleSize and > similar macros. Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and similarly a couple of redo routines in heapam.c using HeapTupleHeaderData in a couple of structures not placing it at the end (compiler complains). We could use for each of them a buffer that has enough room with sizeof(HeapTupleHeaderData) + MaxHeapTupleSize, but wouldn't it reduce the readability of the current code? Opinions welcome. >> diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h > [...] > Generally the catalog changes are much less clearly a benefit imo. OK, let's drop them then. >> From ad8f54cdb5776146f17d1038bb295b5f13b549f1 Mon Sep 17 00:00:00 2001 >> From: Michael Paquier <michael@otacoo.com> >> Date: Mon, 16 Feb 2015 03:53:38 +0900 >> Subject: [PATCH 3/3] Update varlena in c.h to use FLEXIBLE_ARRAY_MEMBER >> >> Places using a variable-length variable not at the end of a structure >> are changed with workaround, without impacting what those features do. > > I vote for rejecting most of this, except a (corrected version) of the > pg_authid change. Which doesn't really belong to the rest of the patch > anyway ;)x Changing bytea to use FLEXIBLE_ARRAY_MEMBER requires those changes, or at least some changes in this area as something with FLEXIBLE_ARRAY_MEMBER can only be placed at the end of a structure. But I guess that we can do fine as well by replacing those structures with some buffers with a pre-defined size. I'll draft an additional patch on top of 0001 with all those less-trivial changes implemented. >> #define VARHDRSZ ((int32) sizeof(int32)) >> diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h >> index e01e6aa..d8789a5 100644 >> --- a/src/include/catalog/pg_authid.h >> +++ b/src/include/catalog/pg_authid.h >> @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC >> int32 rolconnlimit; /* max connections allowed (-1=no limit) */ >> >> /* remaining fields may be null; use heap_getattr to read them! */ >> - text rolpassword; /* password, if any */ >> timestamptz rolvaliduntil; /* password expiration time, if any */ >> +#ifdef CATALOG_VARLEN >> + text rolpassword; /* password, if any */ >> +#endif >> } FormData_pg_authid; > > That change IIRC is wrong, because it'll make rolvaliduntil until NOT > NULL (any column that's fixed width and has only fixed with columns > before it is marked as such). This sounds better as a separate patch... -- Michael
On 2015-02-18 17:15:18 +0900, Michael Paquier wrote: > >> - I don't think that the t_bits fields in htup_details.h should be > >> updated either. > > > > Why not? Any not broken code should already use MinHeapTupleSize and > > similar macros. > > Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and > similarly a couple of redo routines in heapam.c using > HeapTupleHeaderData in a couple of structures not placing it at the > end (compiler complains). The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the middle of a struct but not when when you embed a struct that uses it into the middle another struct. At least gcc doesn't and I think it'd be utterly broken if another compiler did that. If there's a compiler that does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1. Code embedding structs using *either* [FLEXIBLE_ARRAY_MEMBER] or [1] for variable length obviously has to take care where the variable length part goes. And that what the structs you removed where doing - that's where the t_bits et al go: { ...HeapTupleHeaderData header;char data[MaxHeapTupleSize]; ... } the 'data' bit is just the t_bits + data together. And similar in - struct - { - struct varlena hdr; - char data[TOAST_MAX_CHUNK_SIZE]; /* make struct big enough */ - int32 align_it; /* ensure struct is aligned well enough */ - } chunk_data; The 'data' is where the varlena's vl_dat stuff is stored. > >> Places using a variable-length variable not at the end of a structure > >> are changed with workaround, without impacting what those features do. > > > > I vote for rejecting most of this, except a (corrected version) of the > > pg_authid change. Which doesn't really belong to the rest of the patch > > anyway ;)x > > Changing bytea to use FLEXIBLE_ARRAY_MEMBER requires those changes, or > at least some changes in this area as something with > FLEXIBLE_ARRAY_MEMBER can only be placed at the end of a structure. Again, I think you're confusing things that may not be be done in a single struct, and structs that are embedded in other places. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2015-02-16 21:34:57 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2015-02-17 05:51:22 +0900, Michael Paquier wrote: > >> diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h > >> index e01e6aa..d8789a5 100644 > >> --- a/src/include/catalog/pg_authid.h > >> +++ b/src/include/catalog/pg_authid.h > >> @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC > >> int32 rolconnlimit; /* max connections allowed (-1=no limit) */ > >> > >> /* remaining fields may be null; use heap_getattr to read them! */ > >> - text rolpassword; /* password, if any */ > >> timestamptz rolvaliduntil; /* password expiration time, if any */ > >> +#ifdef CATALOG_VARLEN > >> + text rolpassword; /* password, if any */ > >> +#endif > >> } FormData_pg_authid; > > > That change IIRC is wrong, because it'll make rolvaliduntil until NOT > > NULL (any column that's fixed width and has only fixed with columns > > before it is marked as such). > > You were muttering about a BKI_FORCE_NOT_NULL option ... for symmetry, > maybe we could add BKI_FORCE_NULL as well, and then use that for cases > like this? Yea, I guess it'd not be too hard. > Also, if we want to insist that these fields be accessed > through heap_getattr, I'd be inclined to put them inside the "#ifdef > CATALOG_VARLEN" to enforce that. That we definitely should do. It's imo just a small bug that it was omitted here. I'll fix it, but not backpatch unless you prefer? > I'm generally -1 on reordering any catalog columns as part of this patch. > There should be zero user-visible change from it IMO. However, if we > stick both those columns inside the ifdef, we don't need to reorder. Agreed. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2015-02-16 21:34:57 -0500, Tom Lane wrote: >> Also, if we want to insist that these fields be accessed >> through heap_getattr, I'd be inclined to put them inside the "#ifdef >> CATALOG_VARLEN" to enforce that. > That we definitely should do. It's imo just a small bug that it was > omitted here. I'll fix it, but not backpatch unless you prefer? Agreed, that's really independent of FLEXIBLE_ARRAY_MEMBER, so fix it as its own patch. I also agree that back-patching is probably not appropriate. regards, tom lane
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2015-02-18 17:15:18 +0900, Michael Paquier wrote: >> >> - I don't think that the t_bits fields in htup_details.h should be >> >> updated either. >> > >> > Why not? Any not broken code should already use MinHeapTupleSize and >> > similar macros. >> >> Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and >> similarly a couple of redo routines in heapam.c using >> HeapTupleHeaderData in a couple of structures not placing it at the >> end (compiler complains). > > The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the > middle of a struct but not when when you embed a struct that uses it > into the middle another struct. At least gcc doesn't and I think it'd be > utterly broken if another compiler did that. If there's a compiler that > does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1. clang does complain on my OSX laptop regarding that ;) -- Michael
On 2015-02-19 07:10:19 +0900, Michael Paquier wrote: > On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2015-02-18 17:15:18 +0900, Michael Paquier wrote: > >> >> - I don't think that the t_bits fields in htup_details.h should be > >> >> updated either. > >> > > >> > Why not? Any not broken code should already use MinHeapTupleSize and > >> > similar macros. > >> > >> Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and > >> similarly a couple of redo routines in heapam.c using > >> HeapTupleHeaderData in a couple of structures not placing it at the > >> end (compiler complains). > > > > The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the > > middle of a struct but not when when you embed a struct that uses it > > into the middle another struct. At least gcc doesn't and I think it'd be > > utterly broken if another compiler did that. If there's a compiler that > > does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1. > > clang does complain on my OSX laptop regarding that ;) I think that then puts the idea of doing it for those structs pretty much to bed. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Michael Paquier <michael.paquier@gmail.com> writes: > On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the >> middle of a struct but not when when you embed a struct that uses it >> into the middle another struct. At least gcc doesn't and I think it'd be >> utterly broken if another compiler did that. If there's a compiler that >> does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1. > clang does complain on my OSX laptop regarding that ;) I'm a bit astonished that gcc doesn't consider this an error. Sure seems like it should. (Has anyone tried it on recent gcc?) I am entirely opposed to Andreas' claim that we ought to consider compilers that do warn to be broken; if anything it's the other way around. Moreover, if we have any code that is assuming such cases are okay, it probably needs a second look. Isn't this situation effectively assuming that a variable-length array is fixed-length? regards, tom lane
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: >>> The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the >>> middle of a struct but not when when you embed a struct that uses it >>> into the middle another struct. At least gcc doesn't and I think it'd be >>> utterly broken if another compiler did that. If there's a compiler that >>> does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1. > >> clang does complain on my OSX laptop regarding that ;) > > I'm a bit astonished that gcc doesn't consider this an error. Sure seems > like it should. (Has anyone tried it on recent gcc?) Just tried with gcc 4.9.2 on an ArchLinux bix and it does not complain after switching the declaration of varlena declaration from [1] to FLEXIBLE_ARRAY_MEMBER in c.h on HEAD. But it does with clang 3.5.1 on the same box. > I am entirely > opposed to Andreas' claim that we ought to consider compilers that do warn > to be broken; if anything it's the other way around. I'm on board with that. > Moreover, if we have any code that is assuming such cases are okay, it > probably needs a second look. Isn't this situation effectively assuming > that a variable-length array is fixed-length? AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has put a couple of things in light that could be revisited: 1) tuptoaster.c, with this declaration of varlena: struct { struct varlena hdr; char data[TOAST_MAX_CHUNK_SIZE]; /* make struct big enough */ int32 align_it; /* ensure struct is aligned well enough */ } chunk_data; 2) inv_api.c with this thing: struct { bytea hdr; char data[LOBLKSIZE]; /* make struct big enough */ int32 align_it; /* ensure struct is aligned well enough */ } workbuf; 3) heapam.c in three places with HeapTupleHeaderData: struct { HeapTupleHeaderData hdr; char data[MaxHeapTupleSize]; } tbuf; 4) pg_authid.h with its use of relpasswd. 5) reorderbuffer.h with its use of HeapTupleHeaderData: typedef struct ReorderBufferTupleBuf { /* position in preallocated list */ slist_node node; /* tuple, stored sequentially */ HeapTupleData tuple; HeapTupleHeaderData header; char data[MaxHeapTupleSize]; } ReorderBufferTupleBuf; Those issues can be grouped depending on where foo[1] is switched to FLEXIBLE_ARRAY_MEMBER, so I will try to get a set of patches depending on that. Regards, -- Michael
Michael Paquier <michael.paquier@gmail.com> writes: > On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Moreover, if we have any code that is assuming such cases are okay, it >> probably needs a second look. Isn't this situation effectively assuming >> that a variable-length array is fixed-length? > AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has > put a couple of things in light that could be revisited: > 1) tuptoaster.c, with this declaration of varlena: > struct > { > struct varlena hdr; > char data[TOAST_MAX_CHUNK_SIZE]; /* make > struct big enough */ > int32 align_it; /* ensure struct is > aligned well enough */ > } chunk_data; I'm pretty sure that thing ought to be a union, not a struct. > 2) inv_api.c with this thing: > struct > { > bytea hdr; > char data[LOBLKSIZE]; /* make struct > big enough */ > int32 align_it; /* ensure struct is > aligned well enough */ > } workbuf; And probably this too. > 3) heapam.c in three places with HeapTupleHeaderData: > struct > { > HeapTupleHeaderData hdr; > char data[MaxHeapTupleSize]; > } tbuf; And this, though I'm not sure if we'd have to change the size of the padding data[] member. > 5) reorderbuffer.h with its use of HeapTupleHeaderData: Hmm. Andres will have to answer for that one ;-) regards, tom lane
On 19/02/15 15:00, Tom Lane wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Moreover, if we have any code that is assuming such cases are okay, it >>> probably needs a second look. Isn't this situation effectively assuming >>> that a variable-length array is fixed-length? >> AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has >> put a couple of things in light that could be revisited: >> 1) tuptoaster.c, with this declaration of varlena: >> struct >> { >> struct varlena hdr; >> char data[TOAST_MAX_CHUNK_SIZE]; /* make >> struct big enough */ >> int32 align_it; /* ensure struct is >> aligned well enough */ >> } chunk_data; > I'm pretty sure that thing ought to be a union, not a struct. > >> 2) inv_api.c with this thing: >> struct >> { >> bytea hdr; >> char data[LOBLKSIZE]; /* make struct >> big enough */ >> int32 align_it; /* ensure struct is >> aligned well enough */ >> } workbuf; > And probably this too. > >> 3) heapam.c in three places with HeapTupleHeaderData: >> struct >> { >> HeapTupleHeaderData hdr; >> char data[MaxHeapTupleSize]; >> } tbuf; > And this, though I'm not sure if we'd have to change the size of the > padding data[] member. > >> 5) reorderbuffer.h with its use of HeapTupleHeaderData: > Hmm. Andres will have to answer for that one ;-) > > regards, tom lane > > Curious, has this problem been raised with the gcc maintainers? Is this still a problem with gcc 5.0 (which is due to be released soon)? Cheers, Gavin
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Thu, Feb 19, 2015 at 11:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Moreover, if we have any code that is assuming such cases are okay, it >>> probably needs a second look. Isn't this situation effectively assuming >>> that a variable-length array is fixed-length? > >> AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has >> put a couple of things in light that could be revisited: >> 1) tuptoaster.c, with this declaration of varlena: >> struct >> ... >> } chunk_data; > > I'm pretty sure that thing ought to be a union, not a struct. > >> 2) inv_api.c with this thing: >> ... > And probably this too. Sounds good to me, but with an additional VARHDRSZ to give enough room IMO (or toast_save_datum explodes). >> 3) heapam.c in three places with HeapTupleHeaderData: >> struct >> { >> HeapTupleHeaderData hdr; >> char data[MaxHeapTupleSize]; >> } tbuf; > > And this, though I'm not sure if we'd have to change the size of the > padding data[] member. Here I think that we should add sizeof(HeapTupleHeaderData) to ensure that there is enough room >> 5) reorderbuffer.h with its use of HeapTupleHeaderData: > > Hmm. Andres will have to answer for that one ;-) Surely. This impacts decode.c and reorder.c at quick glance. So, attached are a new set of patches: 1) 0001 is more or less the same as upthread, changing trivial places with foo[1]. I have checked as well calls to sizeof for the structures impacted: 1-1) In dumputils.c, I guess that the call of sizeof with SimpleStringListCell should be changed as follows: cell = (SimpleStringListCell *) - pg_malloc(sizeof(SimpleStringListCell) + strlen(val)); + pg_malloc(sizeof(SimpleStringListCell)); 1-2) sizeof(ParamListInfoData) is present in a couple of places, assuming that sizeof(ParamListInfoData) has the equivalent of 1 parameter, like prepare.c, functions.c, spi.c and postgres.c: - /* sizeof(ParamListInfoData) includes the first array element */ paramLI = (ParamListInfo) palloc(sizeof(ParamListInfoData) + - (num_params - 1) * sizeof(ParamExternData)); + num_params * sizeof(ParamExternData)); 1-3) FuncCandidateList in namespace.c (thanks Andres!): newResult = (FuncCandidateList) - palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid) - + effective_nargs * sizeof(Oid)); + palloc(sizeof(struct _FuncCandidateList) + + effective_nargs * sizeof(Oid)); I imagine that we do not want for those palloc calls to use ifdef FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if compiler does not support flexible-array length, right? 2) 0002 fixes pg_authid to use CATALOG_VARLEN, this is needed for 0003. 3) 0003 switches varlena in c.h to use FLEXIBLE_ARRAY_MEMBER, with necessary tweaks added for tuptoaster.c and inv_api.c 4) 0004 is some preparatory work before switching HeapTupleHeaderData and MinimalTupleData, changing the struct declarations to union in heapam.c with enough room ensured for processing. Regards, -- Michael
Вложения
Michael Paquier <michael.paquier@gmail.com> writes: > 1-2) sizeof(ParamListInfoData) is present in a couple of places, > assuming that sizeof(ParamListInfoData) has the equivalent of 1 > parameter, like prepare.c, functions.c, spi.c and postgres.c: > - /* sizeof(ParamListInfoData) includes the first array element */ > paramLI = (ParamListInfo) > palloc(sizeof(ParamListInfoData) + > - (num_params - 1) * sizeof(ParamExternData)); > + num_params * sizeof(ParamExternData)); > 1-3) FuncCandidateList in namespace.c (thanks Andres!): > newResult = (FuncCandidateList) > - palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid) > - + effective_nargs * sizeof(Oid)); > + palloc(sizeof(struct _FuncCandidateList) + > + effective_nargs * sizeof(Oid)); > I imagine that we do not want for those palloc calls to use ifdef > FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if > compiler does not support flexible-array length, right? These are just wrong. As a general rule, we do not want to *ever* take sizeof() a struct that contains a flexible array: the results will not be consistent across platforms. The right thing is to use offsetof() instead. See the helpful comment autoconf provides: /* Define to nothing if C supports flexible array members, and to 1 if it does not. That way, with a declaration like `structs { int n; double d[FLEXIBLE_ARRAY_MEMBER]; };', the struct hack can be used with pre-C99 compilers. When computingthe size of such an object, don't use 'sizeof (struct s)' as it overestimates the size. Use 'offsetof (struct s,d)' instead. Don't use 'offsetof (struct s, d[0])', as this doesn't work with MSVC and with C++ compilers. */ #define FLEXIBLE_ARRAY_MEMBER /**/ This point is actually the main reason we've not done this change long since. People did not feel like running around to make sure there were no overlooked uses of sizeof(). regards, tom lane
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Thu, Feb 19, 2015 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> 1-2) sizeof(ParamListInfoData) is present in a couple of places, >> assuming that sizeof(ParamListInfoData) has the equivalent of 1 >> parameter, like prepare.c, functions.c, spi.c and postgres.c: >> - /* sizeof(ParamListInfoData) includes the first array element */ >> paramLI = (ParamListInfo) >> palloc(sizeof(ParamListInfoData) + >> - (num_params - 1) * sizeof(ParamExternData)); >> + num_params * sizeof(ParamExternData)); >> 1-3) FuncCandidateList in namespace.c (thanks Andres!): >> newResult = (FuncCandidateList) >> - palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid) >> - + effective_nargs * sizeof(Oid)); >> + palloc(sizeof(struct _FuncCandidateList) + >> + effective_nargs * sizeof(Oid)); >> I imagine that we do not want for those palloc calls to use ifdef >> FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if >> compiler does not support flexible-array length, right? > > These are just wrong. As a general rule, we do not want to *ever* take > sizeof() a struct that contains a flexible array: the results will not > be consistent across platforms. The right thing is to use offsetof() > instead. See the helpful comment autoconf provides: > > [...] And I had this one in front of my eyes a couple of hours ago... Thanks. > This point is actually the main reason we've not done this change long > since. People did not feel like running around to make sure there were > no overlooked uses of sizeof(). Thanks for the clarifications and the review. Attached is a new set. -- Michael
Вложения
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Thu, Feb 19, 2015 at 3:57 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Thu, Feb 19, 2015 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Michael Paquier <michael.paquier@gmail.com> writes: >>> 1-2) sizeof(ParamListInfoData) is present in a couple of places, >>> assuming that sizeof(ParamListInfoData) has the equivalent of 1 >>> parameter, like prepare.c, functions.c, spi.c and postgres.c: >>> - /* sizeof(ParamListInfoData) includes the first array element */ >>> paramLI = (ParamListInfo) >>> palloc(sizeof(ParamListInfoData) + >>> - (num_params - 1) * sizeof(ParamExternData)); >>> + num_params * sizeof(ParamExternData)); >>> 1-3) FuncCandidateList in namespace.c (thanks Andres!): >>> newResult = (FuncCandidateList) >>> - palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid) >>> - + effective_nargs * sizeof(Oid)); >>> + palloc(sizeof(struct _FuncCandidateList) + >>> + effective_nargs * sizeof(Oid)); >>> I imagine that we do not want for those palloc calls to use ifdef >>> FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if >>> compiler does not support flexible-array length, right? >> >> These are just wrong. As a general rule, we do not want to *ever* take >> sizeof() a struct that contains a flexible array: the results will not >> be consistent across platforms. The right thing is to use offsetof() >> instead. See the helpful comment autoconf provides: >> >> [...] > > And I had this one in front of my eyes a couple of hours ago... Thanks. > >> This point is actually the main reason we've not done this change long >> since. People did not feel like running around to make sure there were >> no overlooked uses of sizeof(). > > Thanks for the clarifications and the review. Attached is a new set. Grr. Completely forgot to use offsetof in dumputils.c as well. Patch that can be applied on top of 0001 is attached. -- Michael
Вложения
On 2015-02-18 17:29:27 -0500, Tom Lane wrote: > Michael Paquier <michael.paquier@gmail.com> writes: > > On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote: > >> The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the > >> middle of a struct but not when when you embed a struct that uses it > >> into the middle another struct. At least gcc doesn't and I think it'd be > >> utterly broken if another compiler did that. If there's a compiler that > >> does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1. > > > clang does complain on my OSX laptop regarding that ;) > > I'm a bit astonished that gcc doesn't consider this an error. Sure seems > like it should. Why? The flexible arrary stuff tells the compiler that it doesn't have to worry about space for the array - it seems alright that it actually doesn't. There's pretty much no way you can do that sensibly if the variable length array itself is somewhere in the middle of a struct - but if you embed the whole struct somewhere you have to take care yourself. And e.g. the varlena cases Michael has shown do just that? > (Has anyone tried it on recent gcc?) Yes. > Moreover, if we have any code that is assuming such cases are okay, it > probably needs a second look. Isn't this situation effectively assuming > that a variable-length array is fixed-length? Not really. If you have struct varlena hdr; char data[TOAST_MAX_CHUNK_SIZE]; /* make struct big enough */ the variable length part is preallocated in the data? You're right that many of these structs could just be replaced with a union though. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2015-02-18 21:00:43 -0500, Tom Lane wrote: > Michael Paquier <michael.paquier@gmail.com> writes: > > 3) heapam.c in three places with HeapTupleHeaderData: > > struct > > { > > HeapTupleHeaderData hdr; > > char data[MaxHeapTupleSize]; > > } tbuf; > > And this, though I'm not sure if we'd have to change the size of the > padding data[] member. I don't think so. /** MaxHeapTupleSize is the maximum allowed size of a heap tuple, including* header and MAXALIGN alignment padding. Basicallyit's BLCKSZ minus the* other stuff that has to be on a disk page. Since heap pages use no* "special space", there'sno deduction for that. ... #define MaxHeapTupleSize (BLCKSZ - MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))) > > 5) reorderbuffer.h with its use of HeapTupleHeaderData: > > Hmm. Andres will have to answer for that one ;-) That should be fairly uncomplicated to replace. ... /* tuple, stored sequentially */HeapTupleData tuple;HeapTupleHeaderData header;char data[MaxHeapTupleSize]; probably can just be replaced by a union of data and header part - as quoted above MaxHeapTupleSize actually contains space for the header. It's a bit annoying because potentially some output plugin might reference .header - but they can just be changed to reference tuple.t_data instead. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Michael Paquier <michael.paquier@gmail.com> writes: > Thanks for the clarifications and the review. Attached is a new set. I've reviewed and pushed the 0001 patch (you missed a few things :-(). Let's see how unhappy the buildfarm is with this before we start on the rest of them. regards, tom lane
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Fri, Feb 20, 2015 at 2:14 PM, Tom Lane wrote: > Michael Paquier writes: >> Thanks for the clarifications and the review. Attached is a new set. > > I've reviewed and pushed the 0001 patch (you missed a few things :-(). My apologies. I completely forgot to check for any calls of offsetof with the structures changed... -- Michael
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Fri, Feb 20, 2015 at 2:21 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Fri, Feb 20, 2015 at 2:14 PM, Tom Lane wrote: >> Michael Paquier writes: >>> Thanks for the clarifications and the review. Attached is a new set. >> >> I've reviewed and pushed the 0001 patch (you missed a few things :-(). > > My apologies. I completely forgot to check for any calls of offsetof > with the structures changed... Attached are 3 more patches to improve the coverage (being careful this time with calls of offsetof and sizeof...): - 0001 covers varlena in c.h - 0002 covers HeapTupleHeaderData and MinimalTupleData, with things changed in code paths of reorderbuffer and decoder - 0003 changes RecordIOData, used in hstore, rowtypes and json functions Even with this set applied, the following things remain in backend code: $ git grep "VARIABLE LENGTH" | grep "[1]" access/nbtree/nbtutils.c: BTOneVacInfo vacuums[1]; /* VARIABLE LENGTH ARRAY */ access/transam/multixact.c: MultiXactId perBackendXactIds[1]; /* VARIABLE LENGTH ARRAY */ access/transam/twophase.c: GlobalTransaction prepXacts[1]; /* VARIABLE LENGTH ARRAY */ commands/tablespace.c: Oid tblSpcs[1]; /* VARIABLE LENGTH ARRAY */ commands/trigger.c: SetConstraintTriggerData trigstates[1]; /* VARIABLE LENGTH ARRAY */ executor/nodeAgg.c: AggStatePerGroupData pergroup[1]; /* VARIABLE LENGTH ARRAY */ optimizer/plan/setrefs.c: tlist_vinfo vars[1]; /* VARIABLE LENGTH ARRAY */ postmaster/checkpointer.c: CheckpointerRequest requests[1]; /* VARIABLE LENGTH ARRAY */ storage/ipc/pmsignal.c: sig_atomic_t PMChildFlags[1]; /* VARIABLE LENGTH ARRAY */ storage/ipc/procarray.c: int pgprocnos[1]; /* VARIABLE LENGTH ARRAY */ utils/adt/rowtypes.c: ColumnCompareData columns[1]; /* VARIABLE LENGTH ARRAY */ utils/cache/inval.c: SharedInvalidationMessage msgs[1]; /* VARIABLE LENGTH ARRAY */ utils/cache/typcache.c: EnumItem enum_values[1]; /* VARIABLE LENGTH ARRAY */ Regards, -- Michael
Вложения
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Fri, Feb 20, 2015 at 4:59 PM, Michael Paquier wrote: > > Attached are 3 more patches to improve the coverage (being careful > this time with calls of offsetof and sizeof...): > - 0001 covers varlena in c.h > - 0002 covers HeapTupleHeaderData and MinimalTupleData, with things > changed in code paths of reorderbuffer and decoder > - 0003 changes RecordIOData, used in hstore, rowtypes and json functions > > Even with this set applied, the following things remain in backend code: > $ git grep "VARIABLE LENGTH" | grep "[1]" > [stuff] Attached is a new series. 0001 and 0002 are the same, 0003 and 0004 the backend structures listed previously. I noticed as well that indexed_tlist in setrefs.c meritates some attention. Regards, -- Michael
Вложения
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Fri, Feb 20, 2015 at 8:57 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > Attached is a new series. 0001 and 0002 are the same, 0003 and 0004 > the backend structures listed previously. I noticed as well that > indexed_tlist in setrefs.c meritates some attention. And after all those commits attached is a patch changing HeapTupleHeaderData, using the following macro to track the size of the structure: #define SizeofHeapTupleHeader offsetof(HeapTupleHeaderData, t_bits) Regards, -- Michael
Вложения
Michael Paquier <michael.paquier@gmail.com> writes: > And after all those commits attached is a patch changing > HeapTupleHeaderData, using the following macro to track the size of > the structure: > #define SizeofHeapTupleHeader offsetof(HeapTupleHeaderData, t_bits) I've pushed this with a few minor fixes (mostly, using MAXALIGN where appropriate to avoid changing the results of planner calculations). Andres, would you double-check the changes in reorderbuffer.c? There were some weird calculations with offsetof(ReorderBufferTupleBuf, data) - offsetof(HeapTupleHeaderData, t_bits) which Michael simplified in a way that's not 100% equivalent. I think it's probably better this way; it looks like the old coding was maybe wrong, or at least in the habit of misaligning data. But I might be misunderstanding. regards, tom lane
On 2015-02-21 15:16:55 -0500, Tom Lane wrote: > Andres, would you double-check the changes in reorderbuffer.c? > There were some weird calculations with > offsetof(ReorderBufferTupleBuf, data) - offsetof(HeapTupleHeaderData, t_bits) > which Michael simplified in a way that's not 100% equivalent. I think > it's probably better this way; it looks like the old coding was maybe > wrong, or at least in the habit of misaligning data. But I might be > misunderstanding. Hm, yea, that looks/looked slightly wierd. I think it's actually correct though: HeapTupleData's t_len include's HeapTupleHeaderData itself and offsetof(ReorderBufferTupleBuf, data) points to *after* HeapTupleHeaderData. As this is only the length computation, not the copy, I don't see an active issue. Why I wrote it that way, instead of using offsetof(ReorderBufferTupleBuf, header) + t_len - which should be equivalent - I don't know. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
After some more hacking, the only remaining uses of foo[1] in struct declarations are: 1. A couple of places where the array is actually the only struct member; for some unexplainable reason gcc won't let you use flexible array syntax in that case. 2. struct sqlda_struct in ecpg's sqlda-native.h. We have a problem with using [FLEXIBLE_ARRAY_MEMBER] there because (1) pg_config.h isn't (and I think shouldn't be) #included by this file, and (2) there is very possibly application code depending on sizeof() this struct; the risk of breaking such code seems to outweigh any likely benefit. Also, despite that I tried changing [1] to [] and fixing the places I could find that depended on sizeof(struct sqlda_struct), but I apparently can't find them all because the ecpg regression tests fell over :-(. Anyway, barring excitement in the buildfarm I think this project is complete. regards, tom lane
Re: Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
От
Michael Paquier
Дата:
On Sun, Feb 22, 2015 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > After some more hacking, the only remaining uses of foo[1] in struct > declarations are: > > 1. A couple of places where the array is actually the only struct member; > for some unexplainable reason gcc won't let you use flexible array syntax > in that case. Yes. Thanks for adding notes at those places. > 2. struct sqlda_struct in ecpg's sqlda-native.h. We have a problem with > using [FLEXIBLE_ARRAY_MEMBER] there because (1) pg_config.h isn't (and I > think shouldn't be) #included by this file, and (2) there is very possibly > application code depending on sizeof() this struct; the risk of breaking > such code seems to outweigh any likely benefit. Also, despite that > I tried changing [1] to [] and fixing the places I could find that > depended on sizeof(struct sqlda_struct), but I apparently can't find them > all because the ecpg regression tests fell over :-(. Yeah, I think we can live with that. The rest of the backend code is clean now, and the allocation calculations are more understandable. > Anyway, barring excitement in the buildfarm I think this project is > complete. Thanks! -- Michael