Обсуждение: CustomScan support on readfuncs.c

Поиск
Список
Период
Сортировка

CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
Hi,

I tried to define two additional callbacks to support CustomScan
on readfuncs.c.

First of all, we need to pay attention how to treat output of
TextOutCustomScan when additional text output is generated.
Only custom-scan provider knows what shall be displayed, so
all we can do is invoke a new callback: TextReadCustomScan().
Because private fields shall be displayed next to the common
field of CustomScan, this callback is invoked once pg_strtok
pointer reached to the last field of CustomScan. Then, custom
scan provider allocates CustomScan or a structure embedding
CustomScan; only CSP knows exact size to be allocated.
It can fetch some tokens using pg_strtok and reconstruct its
private fields, but no need to restore the common field because
it shall be done by _readCustomScan.
If no callbacks are defined, _readCustomScan allocates
a CustomScan object and read only known fields.

Then, let's look back a bit. Next issue is how to reproduce
the "methods" pointer from the text representation.
I try to lookup the methods table using a pair of library
and symbol name; probably, it is a straightforward way.
The "methods" field is put earlier than all private fields
generated by TextOutCustomScan, so it should be reconstructable
prior to TextReadCustomScan.
To support this feature, I had to add an interface contract
that requires extensions to put library and symbol name on
CustomScanMethods table.
Although INIT_CUSTOM_SCAN_METHODS() macro can set up these
fields, author of extension needs to pay attention.

In addition to these enhancement, a new NodeCopyCustomScan
callback eliminates a restriction; custom-scan provider
cannot define its own structure that embeds CustomScan.
Only CSP knows exact size of the structure, this callback
is intended to allocate a new one and copy the private fields,
but no need to copy the common fields.

These three callbacks (one existing, two new) will make
CustomScan node copyObject, nodeToString and stringToNode
aware.


I also noticed that some static functions are valuable for
extensions, to avoid reinvention of same routines.
How about to expose these functions?
- _outToken
- _outBitmapset
- nullable_string
- _readBitmapset


The attached patch is conceptual, just compilable but not
tested yet. I'll try to make a test case soon, however,
it makes sense to get feedbacks earlier even if it is
based on the concept design.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Вложения

Re: CustomScan support on readfuncs.c

От
Amit Kapila
Дата:
On Fri, Sep 25, 2015 at 6:49 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
Hi,

I tried to define two additional callbacks to support CustomScan
on readfuncs.c.

First of all, we need to pay attention how to treat output of
TextOutCustomScan when additional text output is generated.
Only custom-scan provider knows what shall be displayed, so
all we can do is invoke a new callback: TextReadCustomScan().
Because private fields shall be displayed next to the common
field of CustomScan, this callback is invoked once pg_strtok
pointer reached to the last field of CustomScan. Then, custom
scan provider allocates CustomScan or a structure embedding
CustomScan; only CSP knows exact size to be allocated.
It can fetch some tokens using pg_strtok and reconstruct its
private fields, but no need to restore the common field because
it shall be done by _readCustomScan.

So will the specification of TextReadCustomScan() and
TextOutCustomScan() ensures that they don't access the common
fields (like custom_private or others) of CustomScan?
If they change/overwrite them in some way, then I think _readCustomScan()
won't work because it doesn't take into account that common fields could
have been updated by TextReadCustomScan().



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> On Fri, Sep 25, 2015 at 6:49 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> 
> 
>     Hi,
> 
>     I tried to define two additional callbacks to support CustomScan
>     on readfuncs.c.
> 
>     First of all, we need to pay attention how to treat output of
>     TextOutCustomScan when additional text output is generated.
>     Only custom-scan provider knows what shall be displayed, so
>     all we can do is invoke a new callback: TextReadCustomScan().
>     Because private fields shall be displayed next to the common
>     field of CustomScan, this callback is invoked once pg_strtok
>     pointer reached to the last field of CustomScan. Then, custom
>     scan provider allocates CustomScan or a structure embedding
>     CustomScan; only CSP knows exact size to be allocated.
>     It can fetch some tokens using pg_strtok and reconstruct its
>     private fields, but no need to restore the common field because
>     it shall be done by _readCustomScan.
> 
> 
> 
> So will the specification of TextReadCustomScan() and
> TextOutCustomScan() ensures that they don't access the common
> fields (like custom_private or others) of CustomScan?
> If they change/overwrite them in some way, then I think _readCustomScan()
> won't work because it doesn't take into account that common fields could
> have been updated by TextReadCustomScan().
>
Yes. Even if callback of TextReadCustomScan() wants to change or
overwrite, then it is eventually overwritten by the core.
If we allow custom-scan provide to adjust common fields of CustomScan,
it is a change of interface role because the relevant callbacks (TextOut,
TextRead and NodeCopy) are expected to perform faithfully according to
the supplied node.
If extension really wants to adjust plan-node, probably, something like
plan_tree_mutator() should be the place to do.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Thu, Sep 24, 2015 at 9:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> Then, let's look back a bit. Next issue is how to reproduce
> the "methods" pointer from the text representation.
> I try to lookup the methods table using a pair of library
> and symbol name; probably, it is a straightforward way.
> The "methods" field is put earlier than all private fields
> generated by TextOutCustomScan, so it should be reconstructable
> prior to TextReadCustomScan.
> To support this feature, I had to add an interface contract
> that requires extensions to put library and symbol name on
> CustomScanMethods table.
> Although INIT_CUSTOM_SCAN_METHODS() macro can set up these
> fields, author of extension needs to pay attention.
>
> In addition to these enhancement, a new NodeCopyCustomScan
> callback eliminates a restriction; custom-scan provider
> cannot define its own structure that embeds CustomScan.
> Only CSP knows exact size of the structure, this callback
> is intended to allocate a new one and copy the private fields,
> but no need to copy the common fields.
>
> These three callbacks (one existing, two new) will make
> CustomScan node copyObject, nodeToString and stringToNode
> aware.

Instead of doing this:

+    /* Dump library and symbol name instead of raw pointer */    appendStringInfoString(str, " :methods ");
-    _outToken(str, node->methods->CustomName);
+    _outToken(str, node->methods->methods_library_name);
+    appendStringInfoChar(str, ' ');
+    _outToken(str, node->methods->methods_symbol_name);

Suppose we just make library_name and symbol_name fields in the node
itself, that are dumped and loaded like any others.

Would that be better?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
> Sent: Tuesday, September 29, 2015 12:08 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Amit Kapila; Andres Freund; pgsql-hackers
> Subject: Re: [HACKERS] CustomScan support on readfuncs.c
> 
> On Thu, Sep 24, 2015 at 9:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > Then, let's look back a bit. Next issue is how to reproduce
> > the "methods" pointer from the text representation.
> > I try to lookup the methods table using a pair of library
> > and symbol name; probably, it is a straightforward way.
> > The "methods" field is put earlier than all private fields
> > generated by TextOutCustomScan, so it should be reconstructable
> > prior to TextReadCustomScan.
> > To support this feature, I had to add an interface contract
> > that requires extensions to put library and symbol name on
> > CustomScanMethods table.
> > Although INIT_CUSTOM_SCAN_METHODS() macro can set up these
> > fields, author of extension needs to pay attention.
> >
> > In addition to these enhancement, a new NodeCopyCustomScan
> > callback eliminates a restriction; custom-scan provider
> > cannot define its own structure that embeds CustomScan.
> > Only CSP knows exact size of the structure, this callback
> > is intended to allocate a new one and copy the private fields,
> > but no need to copy the common fields.
> >
> > These three callbacks (one existing, two new) will make
> > CustomScan node copyObject, nodeToString and stringToNode
> > aware.
> 
> Instead of doing this:
> 
> +    /* Dump library and symbol name instead of raw pointer */
>      appendStringInfoString(str, " :methods ");
> -    _outToken(str, node->methods->CustomName);
> +    _outToken(str, node->methods->methods_library_name);
> +    appendStringInfoChar(str, ' ');
> +    _outToken(str, node->methods->methods_symbol_name);
> 
> Suppose we just make library_name and symbol_name fields in the node
> itself, that are dumped and loaded like any others.
> 
> Would that be better?
>
I have no preference here.

Even if we dump library_name/symbol_name as if field in CustomScan,
not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
here, because its 'fldname' assumes these members are direct field of
CustomScan.
 /* Write a character-string (possibly NULL) field */ #define WRITE_STRING_FIELD(fldname) \     (appendStringInfo(str,
":" CppAsString(fldname) " "), \      _outToken(str, node->fldname))
 



One other question I have. Do we have a portable way to lookup
a pair of library and symbol by address?
Glibc has dladdr() functions that returns these information,
however, manpage warned it is not defined by POSIX.
If we would be able to have any portable way, it may make the
interface simpler.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> Instead of doing this:
>>
>> +    /* Dump library and symbol name instead of raw pointer */
>>      appendStringInfoString(str, " :methods ");
>> -    _outToken(str, node->methods->CustomName);
>> +    _outToken(str, node->methods->methods_library_name);
>> +    appendStringInfoChar(str, ' ');
>> +    _outToken(str, node->methods->methods_symbol_name);
>>
>> Suppose we just make library_name and symbol_name fields in the node
>> itself, that are dumped and loaded like any others.
>>
>> Would that be better?
>>
> I have no preference here.
>
> Even if we dump library_name/symbol_name as if field in CustomScan,
> not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
> here, because its 'fldname' assumes these members are direct field of
> CustomScan.
>
>   /* Write a character-string (possibly NULL) field */
>   #define WRITE_STRING_FIELD(fldname) \
>       (appendStringInfo(str, " :" CppAsString(fldname) " "), \
>        _outToken(str, node->fldname))

Well that's exactly what I was suggesting: making them a direct field
of CustomScan.

> One other question I have. Do we have a portable way to lookup
> a pair of library and symbol by address?
> Glibc has dladdr() functions that returns these information,
> however, manpage warned it is not defined by POSIX.
> If we would be able to have any portable way, it may make the
> interface simpler.

Yes: load_external_function.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >> Instead of doing this:
> >>
> >> +    /* Dump library and symbol name instead of raw pointer */
> >>      appendStringInfoString(str, " :methods ");
> >> -    _outToken(str, node->methods->CustomName);
> >> +    _outToken(str, node->methods->methods_library_name);
> >> +    appendStringInfoChar(str, ' ');
> >> +    _outToken(str, node->methods->methods_symbol_name);
> >>
> >> Suppose we just make library_name and symbol_name fields in the node
> >> itself, that are dumped and loaded like any others.
> >>
> >> Would that be better?
> >>
> > I have no preference here.
> >
> > Even if we dump library_name/symbol_name as if field in CustomScan,
> > not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
> > here, because its 'fldname' assumes these members are direct field of
> > CustomScan.
> >
> >   /* Write a character-string (possibly NULL) field */
> >   #define WRITE_STRING_FIELD(fldname) \
> >       (appendStringInfo(str, " :" CppAsString(fldname) " "), \
> >        _outToken(str, node->fldname))
> 
> Well that's exactly what I was suggesting: making them a direct field
> of CustomScan.
>
Let me confirm. Are you suggesting to have library_name/symbol_name
in CustomScan, not CustomScanMethods?
I prefer these fields are in CustomScanMethods because we don't need
to setup them again once PG_init set up these symbols. CustomScan has
to be created every time when it is chosen by planner.

> > One other question I have. Do we have a portable way to lookup
> > a pair of library and symbol by address?
> > Glibc has dladdr() functions that returns these information,
> > however, manpage warned it is not defined by POSIX.
> > If we would be able to have any portable way, it may make the
> > interface simpler.
> 
> Yes: load_external_function.
>
It looks up an address by a pair of library and symbol name....
What I'm looking for is a portable function that looks up a pair
of library and symbol name by an address, like dladdr() of glibc.
I don't know whether other *nix or windows have these infrastructure.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Tue, Sep 29, 2015 at 6:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> >> Instead of doing this:
>> >>
>> >> +    /* Dump library and symbol name instead of raw pointer */
>> >>      appendStringInfoString(str, " :methods ");
>> >> -    _outToken(str, node->methods->CustomName);
>> >> +    _outToken(str, node->methods->methods_library_name);
>> >> +    appendStringInfoChar(str, ' ');
>> >> +    _outToken(str, node->methods->methods_symbol_name);
>> >>
>> >> Suppose we just make library_name and symbol_name fields in the node
>> >> itself, that are dumped and loaded like any others.
>> >>
>> >> Would that be better?
>> >>
>> > I have no preference here.
>> >
>> > Even if we dump library_name/symbol_name as if field in CustomScan,
>> > not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
>> > here, because its 'fldname' assumes these members are direct field of
>> > CustomScan.
>> >
>> >   /* Write a character-string (possibly NULL) field */
>> >   #define WRITE_STRING_FIELD(fldname) \
>> >       (appendStringInfo(str, " :" CppAsString(fldname) " "), \
>> >        _outToken(str, node->fldname))
>>
>> Well that's exactly what I was suggesting: making them a direct field
>> of CustomScan.
>>
> Let me confirm. Are you suggesting to have library_name/symbol_name
> in CustomScan, not CustomScanMethods?
> I prefer these fields are in CustomScanMethods because we don't need
> to setup them again once PG_init set up these symbols. CustomScan has
> to be created every time when it is chosen by planner.

True.  But that doesn't cost much.  I'm not sure it's better, so if
you don't like it, don't worry about it.

>> > One other question I have. Do we have a portable way to lookup
>> > a pair of library and symbol by address?
>> > Glibc has dladdr() functions that returns these information,
>> > however, manpage warned it is not defined by POSIX.
>> > If we would be able to have any portable way, it may make the
>> > interface simpler.
>>
>> Yes: load_external_function.
>>
> It looks up an address by a pair of library and symbol name....
> What I'm looking for is a portable function that looks up a pair
> of library and symbol name by an address, like dladdr() of glibc.
> I don't know whether other *nix or windows have these infrastructure.

No, that doesn't exist, and the chances of us trying add that
infrastructure for this feature are nil.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Saturday, October 03, 2015 5:44 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Amit Kapila; Andres Freund; pgsql-hackers
> Subject: ##freemail## Re: [HACKERS] CustomScan support on readfuncs.c
> 
> On Tue, Sep 29, 2015 at 6:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >> On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >> >> Instead of doing this:
> >> >>
> >> >> +    /* Dump library and symbol name instead of raw pointer */
> >> >>      appendStringInfoString(str, " :methods ");
> >> >> -    _outToken(str, node->methods->CustomName);
> >> >> +    _outToken(str, node->methods->methods_library_name);
> >> >> +    appendStringInfoChar(str, ' ');
> >> >> +    _outToken(str, node->methods->methods_symbol_name);
> >> >>
> >> >> Suppose we just make library_name and symbol_name fields in the node
> >> >> itself, that are dumped and loaded like any others.
> >> >>
> >> >> Would that be better?
> >> >>
> >> > I have no preference here.
> >> >
> >> > Even if we dump library_name/symbol_name as if field in CustomScan,
> >> > not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
> >> > here, because its 'fldname' assumes these members are direct field of
> >> > CustomScan.
> >> >
> >> >   /* Write a character-string (possibly NULL) field */
> >> >   #define WRITE_STRING_FIELD(fldname) \
> >> >       (appendStringInfo(str, " :" CppAsString(fldname) " "), \
> >> >        _outToken(str, node->fldname))
> >>
> >> Well that's exactly what I was suggesting: making them a direct field
> >> of CustomScan.
> >>
> > Let me confirm. Are you suggesting to have library_name/symbol_name
> > in CustomScan, not CustomScanMethods?
> > I prefer these fields are in CustomScanMethods because we don't need
> > to setup them again once PG_init set up these symbols. CustomScan has
> > to be created every time when it is chosen by planner.
> 
> True.  But that doesn't cost much.  I'm not sure it's better, so if
> you don't like it, don't worry about it.
>
Yep, I like library_name and symbol_name are in CustomScanMethods
because the table of callbacks and its identifiers are not separable

> >> > One other question I have. Do we have a portable way to lookup
> >> > a pair of library and symbol by address?
> >> > Glibc has dladdr() functions that returns these information,
> >> > however, manpage warned it is not defined by POSIX.
> >> > If we would be able to have any portable way, it may make the
> >> > interface simpler.
> >>
> >> Yes: load_external_function.
> >>
> > It looks up an address by a pair of library and symbol name....
> > What I'm looking for is a portable function that looks up a pair
> > of library and symbol name by an address, like dladdr() of glibc.
> > I don't know whether other *nix or windows have these infrastructure.
> 
> No, that doesn't exist, and the chances of us trying add that
> infrastructure for this feature are nil.
>
OK, probably, it is the only way to expect extension put correct
values on the pair of library and symbol names.

I try to check whether the current patch workable with this direction
using my extension. Please wait for a few days.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> > On Tue, Sep 29, 2015 at 6:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > >> On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > >> >> Instead of doing this:
> > >> >>
> > >> >> +    /* Dump library and symbol name instead of raw pointer */
> > >> >>      appendStringInfoString(str, " :methods ");
> > >> >> -    _outToken(str, node->methods->CustomName);
> > >> >> +    _outToken(str, node->methods->methods_library_name);
> > >> >> +    appendStringInfoChar(str, ' ');
> > >> >> +    _outToken(str, node->methods->methods_symbol_name);
> > >> >>
> > >> >> Suppose we just make library_name and symbol_name fields in the node
> > >> >> itself, that are dumped and loaded like any others.
> > >> >>
> > >> >> Would that be better?
> > >> >>
> > >> > I have no preference here.
> > >> >
> > >> > Even if we dump library_name/symbol_name as if field in CustomScan,
> > >> > not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
> > >> > here, because its 'fldname' assumes these members are direct field of
> > >> > CustomScan.
> > >> >
> > >> >   /* Write a character-string (possibly NULL) field */
> > >> >   #define WRITE_STRING_FIELD(fldname) \
> > >> >       (appendStringInfo(str, " :" CppAsString(fldname) " "), \
> > >> >        _outToken(str, node->fldname))
> > >>
> > >> Well that's exactly what I was suggesting: making them a direct field
> > >> of CustomScan.
> > >>
> > > Let me confirm. Are you suggesting to have library_name/symbol_name
> > > in CustomScan, not CustomScanMethods?
> > > I prefer these fields are in CustomScanMethods because we don't need
> > > to setup them again once PG_init set up these symbols. CustomScan has
> > > to be created every time when it is chosen by planner.
> >
> > True.  But that doesn't cost much.  I'm not sure it's better, so if
> > you don't like it, don't worry about it.
> >
> Yep, I like library_name and symbol_name are in CustomScanMethods
> because the table of callbacks and its identifiers are not separable
> 
> > >> > One other question I have. Do we have a portable way to lookup
> > >> > a pair of library and symbol by address?
> > >> > Glibc has dladdr() functions that returns these information,
> > >> > however, manpage warned it is not defined by POSIX.
> > >> > If we would be able to have any portable way, it may make the
> > >> > interface simpler.
> > >>
> > >> Yes: load_external_function.
> > >>
> > > It looks up an address by a pair of library and symbol name....
> > > What I'm looking for is a portable function that looks up a pair
> > > of library and symbol name by an address, like dladdr() of glibc.
> > > I don't know whether other *nix or windows have these infrastructure.
> >
> > No, that doesn't exist, and the chances of us trying add that
> > infrastructure for this feature are nil.
> >
> OK, probably, it is the only way to expect extension put correct
> values on the pair of library and symbol names.
> 
> I try to check whether the current patch workable with this direction
> using my extension. Please wait for a few days.
>
Sorry for my late.

I confirmed that CustomScan support of readfuncs.c works fine using the
attached patch for the PG-Strom tree. It worked as expected - duplication,
serialization and deserialization by:
  plan_dup = stringToNode(nodeToString(copyObject(plan_orig)))

So, the custom-scan-on-readfuncs.v1.path itself is good for me.

In addition, I felt the following functions are useful for extensions.
 * _outToken()
 * _outBitmapset()
 * _readBitmapset()

Do we have none-static version of above functions?
If nothing, extension has to implement them by itself, more or less.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Вложения

Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Wed, Nov 4, 2015 at 10:13 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> Sorry for my late.
>
> I confirmed that CustomScan support of readfuncs.c works fine using the
> attached patch for the PG-Strom tree. It worked as expected - duplication,
> serialization and deserialization by:
>   plan_dup = stringToNode(nodeToString(copyObject(plan_orig)))
>
> So, the custom-scan-on-readfuncs.v1.path itself is good for me.

I don't think this is is what we want.  In particular, I like this:
   Plan trees must be able to be duplicated using <function>copyObject</>,
-   so all the data stored within the <quote>custom</> fields must consist of
-   nodes that that function can handle.  Furthermore, custom scan providers
-   cannot substitute a larger structure that embeds
-   a <structname>CustomScan</> for the structure itself, as would be possible
-   for a <structname>CustomPath</> or <structname>CustomScanState</>.
+   serialized using <function>nodeToString</> and deserialized using
+   <function>stringToNode</>, so so all the data stored within
+   the <quote>custom</> fields must consist of nodes that that function
+   can handle.
+   Furthermore, custom scan providers have to implement <quote>optional</>
+   callbacks if it defines substitute a larger structure that embeds
+   a <structname>CustomScan</> for the structure itself.
+  </para>

The purposes of this patch is supposedly to add readfuncs.c support to
custom scans - I agree with that goal.  This documentation change is
talking about something else, namely using a CustomScan as the fist
field in a larger structure.  I'm not sure I agree with that goal, and
in any case it seems like it ought to be a separate patch.  This patch
would be a lot smaller if it weren't trying to make that change too.

I also don't think that this patch ought to introduce
get_current_library_filename().  There are other cases - e.g. for
launching background workers - where such a thing could be used, but
we haven't introduced it up until now.  It isn't a requirement to get
readfuncs support working.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> On Wed, Nov 4, 2015 at 10:13 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > Sorry for my late.
> >
> > I confirmed that CustomScan support of readfuncs.c works fine using the
> > attached patch for the PG-Strom tree. It worked as expected - duplication,
> > serialization and deserialization by:
> >   plan_dup = stringToNode(nodeToString(copyObject(plan_orig)))
> >
> > So, the custom-scan-on-readfuncs.v1.path itself is good for me.
> 
> I don't think this is is what we want.  In particular, I like this:
> 
>     Plan trees must be able to be duplicated using <function>copyObject</>,
> -   so all the data stored within the <quote>custom</> fields must consist of
> -   nodes that that function can handle.  Furthermore, custom scan providers
> -   cannot substitute a larger structure that embeds
> -   a <structname>CustomScan</> for the structure itself, as would be possible
> -   for a <structname>CustomPath</> or <structname>CustomScanState</>.
> +   serialized using <function>nodeToString</> and deserialized using
> +   <function>stringToNode</>, so so all the data stored within
> +   the <quote>custom</> fields must consist of nodes that that function
> +   can handle.
> +   Furthermore, custom scan providers have to implement <quote>optional</>
> +   callbacks if it defines substitute a larger structure that embeds
> +   a <structname>CustomScan</> for the structure itself.
> +  </para>
> 
> The purposes of this patch is supposedly to add readfuncs.c support to
> custom scans - I agree with that goal.  This documentation change is
> talking about something else, namely using a CustomScan as the fist
> field in a larger structure.  I'm not sure I agree with that goal, and
> in any case it seems like it ought to be a separate patch.  This patch
> would be a lot smaller if it weren't trying to make that change too.
>
Hmm. I might mix up two individual discussions here.
OK, I'll cut off the minimum portion for readfuncs.c support.
The other portion - to use a CustomScan as the first field in a larger
structure - shall be submitted as a separate one.

> I also don't think that this patch ought to introduce
> get_current_library_filename().  There are other cases - e.g. for
> launching background workers - where such a thing could be used, but
> we haven't introduced it up until now.  It isn't a requirement to get
> readfuncs support working.
>
Even though it isn't a minimum requirement (because extension can put
arbitrary cstring here), it can tell reasonably reliable pathname for
extensions if backend tracks filename of the library to be loaded.

Extension knows its name; its Makefile can embed module name somewhere,
for example. However, we cannot ensure use always install the modules
under the $libdir. Even if "$libdir/pg_strom" is expected, we have no
mechanism to prohibit users to install them under the "$libdir/buggy"
directory. In this case, "pg_strom" is not a right name to solve the
library path.

Even though glibc has dladdr(3) to know the filename that contains the
target symbol, it is not a portable way. So, I thought it is the best
way to inform extension by the core, on _PG_init() time where all the
extension get control once on loading.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> -----Original Message-----
> From: Kaigai Kouhei(海外 浩平)
> Sent: Friday, November 06, 2015 11:23 AM
> To: 'Robert Haas'
> Cc: Amit Kapila; Andres Freund; pgsql-hackers
> Subject: RE: [HACKERS] CustomScan support on readfuncs.c
> 
> > On Wed, Nov 4, 2015 at 10:13 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > > Sorry for my late.
> > >
> > > I confirmed that CustomScan support of readfuncs.c works fine using the
> > > attached patch for the PG-Strom tree. It worked as expected - duplication,
> > > serialization and deserialization by:
> > >   plan_dup = stringToNode(nodeToString(copyObject(plan_orig)))
> > >
> > > So, the custom-scan-on-readfuncs.v1.path itself is good for me.
> >
> > I don't think this is is what we want.  In particular, I like this:
> >
> >     Plan trees must be able to be duplicated using <function>copyObject</>,
> > -   so all the data stored within the <quote>custom</> fields must consist of
> > -   nodes that that function can handle.  Furthermore, custom scan providers
> > -   cannot substitute a larger structure that embeds
> > -   a <structname>CustomScan</> for the structure itself, as would be possible
> > -   for a <structname>CustomPath</> or <structname>CustomScanState</>.
> > +   serialized using <function>nodeToString</> and deserialized using
> > +   <function>stringToNode</>, so so all the data stored within
> > +   the <quote>custom</> fields must consist of nodes that that function
> > +   can handle.
> > +   Furthermore, custom scan providers have to implement <quote>optional</>
> > +   callbacks if it defines substitute a larger structure that embeds
> > +   a <structname>CustomScan</> for the structure itself.
> > +  </para>
> >
> > The purposes of this patch is supposedly to add readfuncs.c support to
> > custom scans - I agree with that goal.  This documentation change is
> > talking about something else, namely using a CustomScan as the fist
> > field in a larger structure.  I'm not sure I agree with that goal, and
> > in any case it seems like it ought to be a separate patch.  This patch
> > would be a lot smaller if it weren't trying to make that change too.
> >
> Hmm. I might mix up two individual discussions here.
> OK, I'll cut off the minimum portion for readfuncs.c support.
> The other portion - to use a CustomScan as the first field in a larger
> structure - shall be submitted as a separate one.
> 
> > I also don't think that this patch ought to introduce
> > get_current_library_filename().  There are other cases - e.g. for
> > launching background workers - where such a thing could be used, but
> > we haven't introduced it up until now.  It isn't a requirement to get
> > readfuncs support working.
> >
> Even though it isn't a minimum requirement (because extension can put
> arbitrary cstring here), it can tell reasonably reliable pathname for
> extensions if backend tracks filename of the library to be loaded.
> 
> Extension knows its name; its Makefile can embed module name somewhere,
> for example. However, we cannot ensure use always install the modules
> under the $libdir. Even if "$libdir/pg_strom" is expected, we have no
> mechanism to prohibit users to install them under the "$libdir/buggy"
> directory. In this case, "pg_strom" is not a right name to solve the
> library path.
> 
> Even though glibc has dladdr(3) to know the filename that contains the
> target symbol, it is not a portable way. So, I thought it is the best
> way to inform extension by the core, on _PG_init() time where all the
> extension get control once on loading.
>
I tried to split the previous version into two portions.

- custom-scan-on-readfuncs.v2.patch
It allows to serialize/deserialize CustomScan node as discussed upthread.
Regarding of get_current_library_filename(), I keep this feature as
the previous version right now, because I have no good alternatives.

In this patch, the role of TextReadCustomScan callback is to clean out
any tokens generated by TextOutCustomScan. The CustomScan node itself
can be reconstructed with common portion because we expect custom_exprs
and custom_private have objects which are safe to copyObject().


- custom-scan-on-copyfuncs.v2.patch
This is the portion that supports a larger structure with CustomScan
on its head.
The role of TextReadCustomScan is changed. It reconstruct the private
structure that contains CustomScan by palloc(), and fill-up its private
fields by tokens read. The common field of CustomScan shall be filled
up by the core.
One other callback I added is NodeCopyCustomScan; that enables extension
to allocate a structure larger than CustomScan and fill-up its private
fields. 
Anyway, I'll have further discussion for 2nd patch in another thread.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Вложения

Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Fri, Nov 6, 2015 at 2:02 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> I tried to split the previous version into two portions.
>
> - custom-scan-on-readfuncs.v2.patch
> It allows to serialize/deserialize CustomScan node as discussed upthread.
> Regarding of get_current_library_filename(), I keep this feature as
> the previous version right now, because I have no good alternatives.

Why can't the library just pass its name as a constant string?

> In this patch, the role of TextReadCustomScan callback is to clean out
> any tokens generated by TextOutCustomScan. The CustomScan node itself
> can be reconstructed with common portion because we expect custom_exprs
> and custom_private have objects which are safe to copyObject().

Some of the documentation changes for the embed-the-struct changes are
still present in the readfuncs patch.

Rather than adding TextReadCustomScan, I think we should rip
TextOutputCustomScan out.  It seems like a useless appendage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
> Sent: Saturday, November 07, 2015 1:42 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Amit Kapila; Andres Freund; pgsql-hackers
> Subject: Re: [HACKERS] CustomScan support on readfuncs.c
> 
> On Fri, Nov 6, 2015 at 2:02 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > I tried to split the previous version into two portions.
> >
> > - custom-scan-on-readfuncs.v2.patch
> > It allows to serialize/deserialize CustomScan node as discussed upthread.
> > Regarding of get_current_library_filename(), I keep this feature as
> > the previous version right now, because I have no good alternatives.
> 
> Why can't the library just pass its name as a constant string?
>
Are you suggesting to pass the object name on software build time?
If so, it may load incorrect libraries when DBA renamed its filename.
At least, PostgreSQL cannot prevent DBA to rename library filenames
even if they try to deploy "pg_strom.so", "pg_strom.20151106.so" and
"pg_strom.20151030_bug.so" on same directory.

I don't know portable way to get library name on run-time, so, all we
can use are environment specific.- dladdr(3) : specific to glibc- /proc/<PID>/maps : specific to Linux

Thus, I thought the backend assist is the most reasonable way here.

> > In this patch, the role of TextReadCustomScan callback is to clean out
> > any tokens generated by TextOutCustomScan. The CustomScan node itself
> > can be reconstructed with common portion because we expect custom_exprs
> > and custom_private have objects which are safe to copyObject().
> 
> Some of the documentation changes for the embed-the-struct changes are
> still present in the readfuncs patch.
> 
> Rather than adding TextReadCustomScan, I think we should rip
> TextOutputCustomScan out.  It seems like a useless appendage.
>
OK, I'll once remove TextOut, then add it later.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Sat, Nov 7, 2015 at 6:38 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> Are you suggesting to pass the object name on software build time?

Yes.  That's what test_shm_mq and worker_spi already do:
       sprintf(worker.bgw_library_name, "test_shm_mq");

> If so, it may load incorrect libraries when DBA renamed its filename.
> At least, PostgreSQL cannot prevent DBA to rename library filenames
> even if they try to deploy "pg_strom.so", "pg_strom.20151106.so" and
> "pg_strom.20151030_bug.so" on same directory.

I agree.  But that's not a new problem that needs to be solved by this
patch.  It already exists - see above.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> On Sat, Nov 7, 2015 at 6:38 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > Are you suggesting to pass the object name on software build time?
> 
> Yes.  That's what test_shm_mq and worker_spi already do:
> 
>         sprintf(worker.bgw_library_name, "test_shm_mq");
>
OK, I ripped out the portion around dfmgr.c.

> > If so, it may load incorrect libraries when DBA renamed its filename.
> > At least, PostgreSQL cannot prevent DBA to rename library filenames
> > even if they try to deploy "pg_strom.so", "pg_strom.20151106.so" and
> > "pg_strom.20151030_bug.so" on same directory.
> 
> I agree.  But that's not a new problem that needs to be solved by this
> patch.  It already exists - see above.
>
It still may be a potential problem, even though a corner case.
I'll try to implement an internal API to know "what is my name".

The attached main patch (custom-scan-on-readfuncs.v3.patch) once
removes TextOutCustomScan, thus all the serialized tokens become
known to the core backend, and add _readCustomScan() at readfuncs.c.
It deserializes CustomScan nodes from cstring tokens, especially,
reloads the pointer of CustomScanMethods tables using a pair of
library name and symbol name.
Thus, it also requires custom scan providers to keep the methods
table visible from external objects.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Вложения

Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Mon, Nov 9, 2015 at 9:46 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> The attached main patch (custom-scan-on-readfuncs.v3.patch) once
> removes TextOutCustomScan, thus all the serialized tokens become
> known to the core backend, and add _readCustomScan() at readfuncs.c.
> It deserializes CustomScan nodes from cstring tokens, especially,
> reloads the pointer of CustomScanMethods tables using a pair of
> library name and symbol name.
> Thus, it also requires custom scan providers to keep the methods
> table visible from external objects.

Committed with some kibitzing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> On Mon, Nov 9, 2015 at 9:46 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > The attached main patch (custom-scan-on-readfuncs.v3.patch) once
> > removes TextOutCustomScan, thus all the serialized tokens become
> > known to the core backend, and add _readCustomScan() at readfuncs.c.
> > It deserializes CustomScan nodes from cstring tokens, especially,
> > reloads the pointer of CustomScanMethods tables using a pair of
> > library name and symbol name.
> > Thus, it also requires custom scan providers to keep the methods
> > table visible from external objects.
> 
> Committed with some kibitzing.
>
Thanks,

How do we deal with TextOutCustomScan in the v9.5 tree?
I think we ought to remove this callback also prior to v9.5 release.
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: CustomScan support on readfuncs.c

От
Robert Haas
Дата:
On Thu, Nov 12, 2015 at 7:59 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> On Mon, Nov 9, 2015 at 9:46 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> > The attached main patch (custom-scan-on-readfuncs.v3.patch) once
>> > removes TextOutCustomScan, thus all the serialized tokens become
>> > known to the core backend, and add _readCustomScan() at readfuncs.c.
>> > It deserializes CustomScan nodes from cstring tokens, especially,
>> > reloads the pointer of CustomScanMethods tables using a pair of
>> > library name and symbol name.
>> > Thus, it also requires custom scan providers to keep the methods
>> > table visible from external objects.
>>
>> Committed with some kibitzing.
>>
> Thanks,
>
> How do we deal with TextOutCustomScan in the v9.5 tree?
> I think we ought to remove this callback also prior to v9.5 release.

I'll do that if there's a clear consensus in favor.  I wasn't sure it
really made sense so close to release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: CustomScan support on readfuncs.c

От
Kouhei Kaigai
Дата:
> On Thu, Nov 12, 2015 at 7:59 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >> On Mon, Nov 9, 2015 at 9:46 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >> > The attached main patch (custom-scan-on-readfuncs.v3.patch) once
> >> > removes TextOutCustomScan, thus all the serialized tokens become
> >> > known to the core backend, and add _readCustomScan() at readfuncs.c.
> >> > It deserializes CustomScan nodes from cstring tokens, especially,
> >> > reloads the pointer of CustomScanMethods tables using a pair of
> >> > library name and symbol name.
> >> > Thus, it also requires custom scan providers to keep the methods
> >> > table visible from external objects.
> >>
> >> Committed with some kibitzing.
> >>
> > Thanks,
> >
> > How do we deal with TextOutCustomScan in the v9.5 tree?
> > I think we ought to remove this callback also prior to v9.5 release.
> 
> I'll do that if there's a clear consensus in favor.  I wasn't sure it
> really made sense so close to release.
>
Unless we don't reach a consensus in the "CustomScan in a larger
structure" thread, TextOutCustomScan callback becomes obsoleted in
the v9.6 release.

So, I think this callback shall be dropped once.
Let's wait for a few days for other persons' opinion?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>