Discussion: Row-Level Security

Row-Level Security

From

Stephen Frost

Date:

December 12, 2009, 16:30:12

Greetings,

> I'll start a new thread on this specific topic to hopefully pull out
> anyone whose focus is more on that than on SEPG.

Row-Level security has been implemented in a number of existing
commercial databases.  There exists an implementation of row-level
security for PostgreSQL today in the form of SEPostgres.
I believe there is a significant user base who would like RLS without
SELinux (or perhaps with some other security manager).  As it is a
useful feature independent of SELinux, it should be implemented in a way
which doesn't depend on SELinux in any way.

I've started a wiki page to discuss this here:
http://wiki.postgresql.org/wiki/RLS

I'd like to start a discussion about RLS for PG: design, user-interface,
syntax, capabilities, on-disk format changes, etc.  For starters, I
think we should review the existing RLS implementations.  To that end,
I've added a number of articles about them to the wiki.  I think the
next step is to start summarizing how those operate and important
similarities and differences between them.  Our goal, of course, is to
take the best of what's out there.

Please comment, update the wiki, and let us know you're interested in this.
Thanks!
    Stephen

Re: Row-Level Security

From

KaiGai Kohei

Date:

December 12, 2009, 19:30:36

(2009/12/13 5:30), Stephen Frost wrote:
> Greetings,
> 
>> I'll start a new thread on this specific topic to hopefully pull out
>> anyone whose focus is more on that than on SEPG.
> 
> Row-Level security has been implemented in a number of existing
> commercial databases.  There exists an implementation of row-level
> security for PostgreSQL today in the form of SEPostgres.
> I believe there is a significant user base who would like RLS without
> SELinux (or perhaps with some other security manager).  As it is a
> useful feature independent of SELinux, it should be implemented in a way
> which doesn't depend on SELinux in any way.

Yes, that is my plan as well.
Once PostgreSQL gets row-level granularity in access controls,
it will be quite easy to add SELinux support as a security provider.


> I've started a wiki page to discuss this here:
> http://wiki.postgresql.org/wiki/RLS
> 
> I'd like to start a discussion about RLS for PG: design, user-interface,
> syntax, capabilities, on-disk format changes, etc.  For starters, I
> think we should review the existing RLS implementations.  To that end,
> I've added a number of articles about them to the wiki.  I think the
> next step is to start summarizing how those operate and important
> similarities and differences between them.  Our goal, of course, is to
> take the best of what's out there.
> 
> Please comment, update the wiki, let us know you're interested in this..

Good start. However, could you defer the discussion until after Feb 15?
My hands are full right now with the security framework and SE-PgSQL/Lite. :(

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: Row-Level Security

From

Stephen Frost

Date:

December 12, 2009, 20:11:25

KaiGai,

* KaiGai Kohei (kaigai@kaigai.gr.jp) wrote:
> > Please comment, update the wiki, let us know you're interested in this..
>
> Good start, however, could you defer the discussion after the Feb-15?
> My hands are now full in the security framework and SE-PgSQL/Lite. :(

While I'm glad you're enthusiastic and interested in this too, I don't
believe we need to delay this initial discussion.  To be honest, I think
we really need to get some input and interest from others as well.  I'll
do my best to make sure the wiki is updated with information and links
to any significant threads on the lists.  I don't expect to be writing
any serious code by Feb 15th on this anyway.
Thanks,
    Stephen

Re: Row-Level Security

From

Josh Berkus

Date:

December 12, 2009, 20:41:23

Stephen,

> Please comment, update the wiki, let us know you're interested in this.. 

I blogged about this some time ago.  One issue I can see is that I
believe that the RLS which many users want is different from the RLS
which SEPostgres implements.

Links:

http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-1-30732
http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-2-30757

--Josh Berkus

Re: Row-Level Security

From

Robert Haas

Date:

December 12, 2009, 23:19:00

On Sat, Dec 12, 2009 at 7:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Stephen,
>
>> Please comment, update the wiki, let us know you're interested in this..
>
> I blogged about this some time ago.  One issue I can see is that I
> believe that the RLS which many users want is different from the RLS
> which SEPostgres implements.
>
> Links:
>
> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-1-30732
> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-2-30757
>
> --Josh Berkus

I read these blog entries a while ago but had forgotten about them.
They're very good, and summarize a lot of my thinking on this topic as
well.  I think that we can design a framework for row-level security
which can encompass both constraint-based security and label-based
security.  Both seem to me to be based around doing essentially the
following things:

1. Adding columns to the table to store access control information.
2. Populating those columns with additional information (owner, ACL,
security label, etc.) which can be used to make access control
decisions.
3. Injecting logic into incoming queries which uses the information
inserted by (1) to filter out rows to which the access control policy
does not wish to allow access.

Waving my hands in the air, #1 and #2 seem pretty straightforward.
For constraint-based security, one can imagine just adding a column to
the table and then adding a BEFORE INSERT OR UPDATE FOR EACH ROW
trigger that populates that column.  For label-based MAC, that's not
going to be quite sufficient, because the system needs to ensure that
the trigger that populates the security column must run last; if it
doesn't, some other trigger can come along afterwards and slip in a
value that isn't supposed to be there; plus, it might be inconvenient
to need to define this trigger for every table that needs RLS.
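
The ordering hazard can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL code: triggers are modeled as plain functions applied in order, and `fire_before_row`, `label_trigger`, and `rogue_trigger` are hypothetical names made up for the illustration.

```python
# Toy model of BEFORE ROW triggers: each trigger receives a row (a dict)
# and returns the row that the next trigger will see.
# All names here are hypothetical; nothing is a PostgreSQL API.

def label_trigger(row):
    """Security trigger: stamps the row with the label it is supposed to carry."""
    return {**row, "label": "secret"}


def rogue_trigger(row):
    """Some other trigger that overwrites the security column."""
    return {**row, "label": "public"}


def fire_before_row(triggers, row):
    """Apply the triggers in the given order and return the final row."""
    for trigger in triggers:
        row = trigger(row)
    return row


# The security trigger runs first, so a later trigger can replace its value.
label_first = fire_before_row([label_trigger, rogue_trigger], {"id": 1})

# The security trigger runs last, so its value is the one that gets stored.
label_last = fire_before_row([rogue_trigger, label_trigger], {"id": 1})
```

In this model the stored label depends entirely on which function runs last, which is why the security step needs a guaranteed final position rather than an ordinary trigger slot.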

However, those problems don't seem insurmountable.  Suppose we provide
a hook function that essentially acts like a global BEFORE INSERT OR
UPDATE trigger but which fires after all of the regular triggers.
SE-PostgreSQL can gain control at that point and search through the
columns of the target relation for a column called, say,
sepg_security_label.  If it finds such a column and that column is of
the appropriate type, then (1) if an explicit security label is
provided, it checks whether the specified label is permissible, (2)
otherwise, if the operation is insert, it determines the appropriate
default label for the current security context and inserts it, (3)
otherwise, it just leaves the current label alone.  This might not be
quite the right behavior but the point is whatever behavior you want
to have in terms of assigning/disallowing values for that column
should be possible to implement here.  The upshot is that if the
system administrator creates an sepg_security_label column of the
correct type, row-level security will be enabled for that table.
Otherwise, it will not.
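
The three cases above can be sketched as a small decision function. This is a Python model under stated assumptions, not SE-PostgreSQL code: `apply_label_hook`, `LabelNotPermitted`, and the `permitted` set are hypothetical, and "no explicit label provided" is modeled as the label key being absent or None in the new row.

```python
LABEL_COLUMN = "sepg_security_label"  # column name taken from the text above


class LabelNotPermitted(Exception):
    """Raised when an explicitly supplied label is not allowed (hypothetical)."""


def apply_label_hook(op, columns, new_row, old_row, default_label, permitted):
    """Model of a hook that runs after all regular BEFORE ROW triggers.

    op            -- "INSERT" or "UPDATE"
    columns       -- column names of the target table
    new_row       -- the row about to be stored
    old_row       -- the existing row for UPDATE, or None for INSERT
    default_label -- default label for the current security context
    permitted     -- labels the current user may assign (an assumption)
    """
    # No label column on this table: row-level security is not enabled.
    if LABEL_COLUMN not in columns:
        return dict(new_row)

    explicit = new_row.get(LABEL_COLUMN)
    if explicit is not None:
        # Case 1: an explicit label was provided, so check it.
        if explicit not in permitted:
            raise LabelNotPermitted(explicit)
        return dict(new_row)

    if op == "INSERT":
        # Case 2: insert without a label gets the default for this context.
        return {**new_row, LABEL_COLUMN: default_label}

    # Case 3: update without a label leaves the current label alone.
    return {**new_row, LABEL_COLUMN: old_row[LABEL_COLUMN]}
```

Whether these are exactly the right rules is open, as noted above; the sketch only shows that each of them fits in a single hook invoked at that point.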

#3 seems a little bit trickier.  I don't think the GRANT ... WHERE
syntax is going to be very easy to use.  For constraint-based
row-security, I think we should have something more like:

ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
WHERE where-clause

(This suffers from the same problem as DELETE ... USING, namely that
sometimes you want an outer join between table and othertable.)

This gives the user a convenient way to insert a join against one or
more side tables if they are so inclined.

For security frameworks like SE-PostgreSQL, we might just provide a
hook allowing the incoming query tree to be modified, and let the hook
function check whether each table in the query has row-level security
enabled, and if so perform a modification equivalent to the above.
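
The effect of such a filter, whether declared through the proposed ALTER TABLE syntax or injected by a hook, can be modeled as an extra predicate that joins against a side table. The following Python sketch is only an illustration: the `documents` and `memberships` tables, `row_filter`, and `select` are hypothetical, and no real query rewriting is performed.

```python
# Hypothetical data: a protected table and a side table used by the filter.
documents = [
    {"id": 1, "grp": "eng"},
    {"id": 2, "grp": "hr"},
    {"id": 3, "grp": "eng"},
]
memberships = [
    {"usr": "alice", "grp": "eng"},
    {"usr": "carol", "grp": "hr"},
]


def row_filter(doc, current_user):
    """Row filter: the user must belong to the document's group.

    Plays the role of a WHERE clause joined against the side table.
    """
    return any(
        m["usr"] == current_user and m["grp"] == doc["grp"]
        for m in memberships
    )


def select(table, user_where, current_user):
    """Model of a rewritten query: filter condition AND the user's WHERE."""
    return [
        row for row in table
        if row_filter(row, current_user) and user_where(row)
    ]


alice_rows = select(documents, lambda row: True, "alice")
bob_rows = select(documents, lambda row: True, "bob")
```

A user with no matching row in the side table sees nothing, which corresponds to the inner-join behavior; the outer-join caveat mentioned above is not modeled here.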

None of this addresses the issue of doing RLS on system catalogs,
which seems like a much harder problem, possibly one that we should
just ignore for the first phase of this project.

Thoughts?

...Robert

Re: Row-Level Security

From

KaiGai Kohei

Date:

December 13, 2009, 04:51:05

(2009/12/13 12:18), Robert Haas wrote:
> On Sat, Dec 12, 2009 at 7:41 PM, Josh Berkus<josh@agliodbs.com>  wrote:
>> Stephen,
>>
>>> Please comment, update the wiki, let us know you're interested in this..
>>
>> I blogged about this some time ago.  One issue I can see is that I
>> believe that the RLS which many users want is different from the RLS
>> which SEPostgres implements.
>>
>> Links:
>>
>> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-1-30732
>> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-2-30757
>>
>> --Josh Berkus
>
> I read these blog entries a while ago but had forgotten about them.
> They're very good, and summarize a lot of my thinking on this topic as
> well.  I think that we can design a framework for row-level security
> which can encompass both constraint-based security and label-based
> security.  Both seem to me to be based around doing essentially the
> following things:
>
> 1. Adding columns to the table to store access control information.
> 2. Populating those columns with additional information (owner, ACL,
> security label, etc.) which can be used to make access control
> decisions.
> 3. Injecting logic into incoming queries which uses the information
> inserted by (1) to filter out rows to which the access control policy
> does not wish to allow access.
>
> Waving my hands in the air, #1 and #2 seem pretty straightforward.
> For constraint-based security, one can imagine just adding a column to
> the table and then adding a BEFORE INSERT OR UPDATE FOR EACH ROW
> trigger that populates that column.  For label-based MAC, that's not
> going to be quite sufficient, because the system needs to ensure that
> the trigger that populates the security column must run last; if it
> doesn't, some other trigger can come along afterwards and slip in a
> value that isn't supposed to be there; plus, it might be inconvenient
> to need to define this trigger for every table that needs RLS.

Right, label-based MAC needs its hook to be called after all the BR-Insert
triggers so that it can assign a correct security label, not only perform
access control.  I'd like to point out one more thing.  When we update
tuples, "invisible" tuples have to be filtered out before the trigger
functions are invoked.

> However, those problems don't seem insurmountable.  Suppose we provide
> a hook function that essentially acts like a global BEFORE INSERT OR
> UPDATE trigger but which fires after all of the regular triggers.

Basically, right. In my branch, SE-PgSQL puts its hook after all the BR
trigger invocations.

http://code.google.com/p/sepgsql/source/browse/branches/pgsql-8.4.x/sepgsql/src/backend/executor/execMain.c#1883

But there is another approach. When RelationBuildTriggers() initializes
the TriggerDesc of a Relation, we can inject the security hook as a special
BR-trigger in the last position. If we set it up there, we don't need to
modify the COPY FROM implementation in addition to INSERT.

The reason I didn't take this approach is that it needs more modification
of the core routines, which makes out-of-tree code harder to maintain.

> SE-PostgreSQL can gain control at that point and search through the
> columns of the target relation for a column called, say,
> sepg_security_label.  If it finds such a column and that column is of
> the appropriate type, then (1) if an explicit security label is
> provided, it checks whether the specified label is permissible, (2)
> otherwise, if the operation is insert, it determines the appropriate
> default label for the current security context and inserts it, (3)
> otherwise, it just leaves the current label alone.  This might not be
> quite the right behavior but the point is whatever behavior you want
> to have in terms of assigning/disallowing values for that column
> should be possible to implement here.  The upshot is that if the
> system administrator creates an sepg_security_label column of the
> correct type, row-level security will be enabled for that table.
> Otherwise, it will not.

Basically, right. SE-PgSQL (or another provider) assigns a new tuple either
an explicitly given or a default security label, then checks whether the
client is permitted to insert a tuple with this label.

One point. MAC is "mandatory", so the table owner should not be able to
control whether row-level checks are applied.
So, I used a special-purpose system column to represent the security label.
It is generated for every table, and it consumes no additional storage
when the MAC feature is disabled.

> #3 seems a little bit trickier.  I don't think the GRANT ... WHERE
> syntax is going to be very easy to use.  For constraint-based
> row-security, I think we should have something more like:
>
> ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
> WHERE where-clause
>
> (This suffers from the same problem as DELETE ... USING, namely that
> sometimes you want an outer join between table and othertable.)
>
> This gives the user a convenient way to insert a join against one or
> more side tables if they are so inclined.

Is it reasonably possible to implement the USING clause even when row-level
security is applied to COPY FROM/TO statements?

And isn't it necessary to specify the operations to which the filter applies
(such as select, update and delete)?

> For security frameworks like SE-PostgreSQL, we might just provide a
> hook allowing the incoming query tree to be modified, and let the hook
> function check whether each table in the query has row-level security
> enabled, and if so perform a modification equivalent to the above.

One point we have to pay attention to is that all the row-level filter
conditions have to be evaluated before any user-given conditions, except for
operators pulled up into index accesses.
Otherwise, malicious low-cost functions can leak "invisible" tuples anywhere.
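
A toy model shows why the evaluation order matters. This is a Python sketch, not planner code: `scan`, `make_spy`, and `is_visible` are hypothetical, and the only assumption is that the conditions of a scan are evaluated in some order with short-circuiting, so a user-supplied function placed first sees every tuple.

```python
rows = [
    {"id": 1, "visible": True},
    {"id": 2, "visible": False},   # must stay invisible to the user
    {"id": 3, "visible": True},
]


def is_visible(row):
    """Row-level security condition (stands in for the policy check)."""
    return row["visible"]


def make_spy(log):
    """User-defined function that always returns True but records its input."""
    def spy(row):
        log.append(row["id"])
        return True
    return spy


def scan(table, quals):
    """Evaluate the conditions in order; all() stops at the first False."""
    return [row for row in table if all(qual(row) for qual in quals)]


# User condition first: the spy is called on every tuple, including id 2.
leaked_when_user_first = []
result_user_first = scan(rows, [make_spy(leaked_when_user_first), is_visible])

# Security condition first: the spy only ever sees visible tuples.
leaked_when_policy_first = []
result_policy_first = scan(rows, [is_visible, make_spy(leaked_when_policy_first)])
```

Both orderings return the same result set, so the leak is invisible in the query output; it happens purely as a side effect of the user function.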

> None of this addresses the issue of doing RLS on system catalogs,
> which seems like a much harder problem, possibly one that we should
> just ignore for the first phase of this project.

It is reasonable.

I'd like to point out a few more issues:

* TRUNCATE statement

Truncate is a good feature for cleaning up the contents of a table.
But the table may contain tuples the current user cannot remove. So, the
table to be truncated needs to be scanned once to confirm that all of its
tuples can be removed by the current user. It is a trade-off between
performance and security.
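
The check described here amounts to one pass over the table before the bulk removal. The following Python model is a sketch under stated assumptions: `truncate_checked` and the `can_delete` predicate are hypothetical, and a list stands in for the table.

```python
def truncate_checked(table, can_delete):
    """Remove all rows, but only if the current user may remove every one.

    table      -- list of row dicts standing in for the relation
    can_delete -- row-level permission check for the current user (assumed)
    """
    # One full scan: this is the performance cost traded for security.
    for row in table:
        if not can_delete(row):
            raise PermissionError(
                "table contains a tuple the current user cannot remove"
            )
    table.clear()


def owned_by_alice(row):
    return row["owner"] == "alice"


own_rows = [{"id": 1, "owner": "alice"}, {"id": 2, "owner": "alice"}]
truncate_checked(own_rows, owned_by_alice)   # succeeds, table is now empty

mixed_rows = [{"id": 1, "owner": "alice"}, {"id": 2, "owner": "bob"}]
try:
    truncate_checked(mixed_rows, owned_by_alice)
    truncate_refused = False
except PermissionError:
    truncate_refused = True                  # nothing was removed
```

The scan makes the operation linear in the table size, which is the cost of not simply refusing TRUNCATE on protected tables.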

* Foreign Key constraint(1)

I don't think the upcoming label-based MAC feature will address covert channels.
Even if a PK is invisible, we can infer that it exists from an FK. In fact,
commercial database products (such as Oracle Label Security) do not address
this either.

* Foreign Key constraint(2)

FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
If associated tuples are filtered out, it breaks referential integrity.
So, we have to apply special care. In the SE-PgSQL case, it raises an error
instead of filtering during FK checks. And, unlike the normal case, the
row-level security hook is called last for each tuple.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: Row-Level Security

From

Robert Haas

Date:

December 13, 2009, 08:11:54

On Sun, Dec 13, 2009 at 3:50 AM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote:
> (2009/12/13 12:18), Robert Haas wrote:
>> On Sat, Dec 12, 2009 at 7:41 PM, Josh Berkus<josh@agliodbs.com>  wrote:
>>> I blogged about this some time ago.  One issue I can see is that I
>>> believe that the RLS which many users want is different from the RLS
>>> which SEPostgres implements.
>>>
>>> Links:
>>>
>>> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-1-30732
>>> http://it.toolbox.com/blogs/database-soup/thinking-about-row-level-security-part-2-30757
>>
>> I read these blog entries a while ago but had forgotten about them.
>> They're very good, and summarize a lot of my thinking on this topic as
>> well.  I think that we can design a framework for row-level security
>> which can encompass both constraint-based security and label-based
>> security.  Both seem to me to be based around doing essentially the
>> following things:
>>
>> 1. Adding columns to the table to store access control information.
>> 2. Populating those columns with additional information (owner, ACL,
>> security label, etc.) which can be used to make access control
>> decisions.
>> 3. Injecting logic into incoming queries which uses the information
>> inserted by (1) to filter out rows to which the access control policy
>> does not wish to allow access.
>>
>> Waving my hands in the air, #1 and #2 seem pretty straightforward.
>> For constraint-based security, one can imagine just adding a column to
>> the table and then adding a BEFORE INSERT OR UPDATE FOR EACH ROW
>> trigger that populates that column.  For label-based MAC, that's not
>> going to be quite sufficient, because the system needs to ensure that
>> the trigger that populates the security column must run last; if it
>> doesn't, some other trigger can come along afterwards and slip in a
>> value that isn't supposed to be there; plus, it might be inconvenient
>> to need to define this trigger for every table that needs RLS.
>
> Right, label-based MAC need its hook being called after all the BR-Insert
> triggers to assign a correct security label, not only access controls.
> I'd like to point out one more thing. When we update tuples, "invisible"
> tuples have to be filtered out before trigger functions.
>
>> However, those problems don't seem insurmountable.  Suppose we provide
>> a hook function that essentially acts like a global BEFORE INSERT OR
>> UPDATE trigger but which fires after all of the regular triggers.
>
> Basically, right. In my branch, SE-PgSQL put its hook after all the BR
> trigger invocations.
>
> http://code.google.com/p/sepgsql/source/browse/branches/pgsql-8.4.x/sepgsql/src/backend/executor/execMain.c#1883
>
> But we have another approach. When RelationBuildTriggers() initializes
> TriggerDesc of Relation, we can inject security hook as a special BR-trigger
> at the last. If we initialize it here, we don't need to modify COPY FROM
> implementation, not only INSERT.
>
> The reason why I didn't apply this approach is it needs more modification
> to the core routines, so it makes harder to manage out-of-tree code.

That's definitely something to consider if it's true.  Why did it
require more modification of the core routines?

>> SE-PostgreSQL can gain control at that point and search through the
>> columns of the target relation for a column called, say,
>> sepg_security_label.  If it finds such a column and that column is of
>> the appropriate type, then (1) if an explicit security label is
>> provided, it checks whether the specified label is permissible, (2)
>> otherwise, if the operation is insert, it determines the appropriate
>> default label for the current security context and inserts it, (3)
>> otherwise, it just leaves the current label alone.  This might not be
>> quite the right behavior but the point is whatever behavior you want
>> to have in terms of assigning/disallowing values for that column
>> should be possible to implement here.  The upshot is that if the
>> system administrator creates an sepg_security_label column of the
>> correct type, row-level security will be enabled for that table.
>> Otherwise, it will not.
>
> Basically, right. SE-PgSQL (or others) assign a new tuple either an
> explicitly given or a default security label, then it checks permission
> whether the client can insert a tuple with this label, or not.
>
> One point. MAC is "mandatory", so the table owner should not be able to
> control whether row-level checks are applied, or not.
> So, I used a special purpose system column to represent security label.
> It is generated for each tables, and no additional storage consumption
> when MAC feature is disabled.

My current feeling is that a special-purpose system column is not the
best approach.  I don't see what we gain by doing it that way.  Even
in an SE-PostgreSQL environment, row-level security might not be
desired on every table - after all, we've been told that SE-PostgreSQL
is useful without any row-level security AT ALL, so it's not hard to
think there could be environments where only some tables need to be
protected.  So I think we want to have a way to turn it on and off on
a per-table basis.

Of course, as you point out, we have to make sure that anyone who
tries to turn RLS on or off for a particular table is authorized to
perform that operation.  But that's a separate problem which I
don't think has much to do with row-level security.

>> #3 seems a little bit trickier.  I don't think the GRANT ... WHERE
>> syntax is going to be very easy to use.  For constraint-based
>> row-security, I think we should have something more like:
>>
>> ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
>> WHERE where-clause
>>
>> (This suffers from the same problem as DELETE ... USING, namely that
>> sometimes you want an outer join between table and othertable.)
>>
>> This gives the user a convenient way to insert a join against one or
>> more side tables if they are so inclined.
>
> Is it reasonably possible to implement USING clause, even if row-level
> security is applied on COPY FROM/TO statement?
> And, isn't it necessary to specify condition to apply the filter?
> (such as select, update and delete)

The filter is the WHERE clause.  I would think that the operation
being performed (select, update, delete) wouldn't enter into it.  This
part is just to decide which tuples will actually be accessible AT
ALL.  If you want to further prevent certain tuples that are being
accessed from being updated or deleted, you can use a trigger for that
(possibly one of the global, always-applied-last triggers discussed
above).

For INSERT and COPY, I don't think that the ALTER TABLE ... ADD ROW
FILTER stuff would apply.  If you want to restrict what gets inserted,
that's another job for triggers.

>> For security frameworks like SE-PostgreSQL, we might just provide a
>> hook allowing the incoming query tree to be modified, and let the hook
>> function check whether each table in the query has row-level security
>> enabled, and if so perform a modification equivalent to the above.
>
> One point we have to pay mention is all the row-level filter conditions
> have to be evaluated before all the user given condition, except for
> operators pulled-up to index accesses.
> It allows malicious row-cost functions to leak "invisible" tuples anywhere.

We currently have this problem with DAC as well - it means that VIEWs
don't actually work as a security gateway, if the user has the ability
to define a function and pass a WHERE clause to a query against the
view, they can extract the hidden rows.  Fixing it seems like a hard
problem.

>> None of this addresses the issue of doing RLS on system catalogs,
>> which seems like a much harder problem, possibly one that we should
>> just ignore for the first phase of this project.
>
> It is reasonable.
>
> I'd like to point out a few more issues:
>
> * TRUNCATE statement
>
> Truncate is a good feature to clean up the contents of a table.
> But it may contain unremovable tuples. So, it needs to scan a table to
> be truncated once to confirm all the tuples can be removed by the current
> user. It is a trade-off case between performance and security.

I think we should just disallow TRUNCATE in cases where this might be
an issue.  If you want a slow and painful way to get rid of your table
contents, use DELETE.  Or at least, I'd start by doing it this way and
then we can think about whether there's enough benefit to doing what
you're suggesting later.

> * Foreign Key constraint(1)
>
> I don't think upcoming label-based MAC feature support covert channel issue.
> Even if PK is invisible, we can guess PK exists from FK. In fact, commercial
> database products (such as Oracle Label Security) also does not care about.

While I can't speak for anyone else, I don't have a problem not caring
about this.

> * Foreign Key constraint(2)
>
> FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
> If associated tuples are filtered out, it breaks reference integrity.
> So, we have to apply special care. In SE-PgSQL case, it raises an error
> instead of filtering during FK checks. And, row-level security hook is
> called at the last for each tuples, unlike normal cases.

Perfecting referential integrity here seems like a pretty tough
problem, but it's likely not necessary to solve it in order to get an
implementation of row-level security that is useful for some purposes.

...Robert

Re: Row-Level Security

From

KaiGai Kohei

Date:

December 14, 2009, 01:12:53

Robert Haas wrote:
> On Sun, Dec 13, 2009 at 3:50 AM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote:
>> Basically, right. In my branch, SE-PgSQL put its hook after all the BR
>> trigger invocations.
>>
>> http://code.google.com/p/sepgsql/source/browse/branches/pgsql-8.4.x/sepgsql/src/backend/executor/execMain.c#1883
>>
>> But we have another approach. When RelationBuildTriggers() initializes
>> TriggerDesc of Relation, we can inject security hook as a special BR-trigger
>> at the last. If we initialize it here, we don't need to modify COPY FROM
>> implementation, not only INSERT.
>>
>> The reason why I didn't apply this approach is it needs more modification
>> to the core routines, so it makes harder to manage out-of-tree code.
> 
> That's definitely something to consider if it's true.  Why did it
> require more modification of the core routines?

In my local branch, it just adds two lines as follows:

+   /* SELinux labeling and permission checks */
+   sepgsql_heap_insert(resultRelationDesc,tuple);

That is obviously less than modifying RelationBuildTriggers() to allocate an
additional slot in the TrigDesc array and add an entry.
My reasoning was purely from the perspective of maintaining out-of-tree code;
a different perspective will be necessary to propose a feature upstream.

>> One point. MAC is "mandatory", so the table owner should not be able to
>> control whether row-level checks are applied, or not.
>> So, I used a special purpose system column to represent security label.
>> It is generated for each tables, and no additional storage consumption
>> when MAC feature is disabled.
> 
> My current feeling is that a special-purpose system column is not the
> best approach.  I don't see what we gain by doing it that way.  Even
> in an SE-PostgreSQL environment, row-level security might not be
> desired on every table - after all, we've been told that SE-PostgreSQL
> is useful without any row-level security AT ALL, so it's not hard to
> think there could be environments where only some tables need to be
> protected.  So I think we want to have a way to turn it on and off on
> a per-table basis.
> 
> Of course, as you point out, we have to make sure that anyone who
> tries to turn RLS on or off for a particular table is authorized to
> perform that operation.  But that's a separate problem which I
> don't think has much to do with row-level security.

Yes, it is a separate problem that need not be settled at the moment.
(Perhaps it depends on the security model. In DAC, a per-table basis is preferable.)

So, I'd just like to bring up an issue to be discussed later.
When we build a binary with a label-based MAC, such as SE-PgSQL, it shall
be turned on or off at startup time.
(I don't assume it should be configurable at runtime.)

If we set up a database cluster without any label-based MAC, none of the
tuples will have a security label. If the security label is stored in a
regular column, we first have to modify the schema of every table.
If a system column provides the security label of a tuple, we can dynamically
generate an appropriate security label. In the SELinux case, any unlabeled
object is treated as if it had this pseudo security label:

  system_u:object_r:unlabeled_t:s0

Needless to say, we need to assign appropriate security labels later for
meaningful access controls, but that does not require any schema
changes, even if we repeatedly turn the label-based MAC feature on and off.

When the label-based MAC feature is disabled, this system column can return
a pseudo value such as NULL or an empty string.
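
The behavior of such a system column can be sketched as a computed accessor. This Python model is only an illustration: `security_label` and the `_label` key (standing in for a label kept in the tuple header) are hypothetical, and the choice of None for the disabled case is one of the two options mentioned above.

```python
# Pseudo label SELinux assumes for unlabeled objects (from the text above).
UNLABELED = "system_u:object_r:unlabeled_t:s0"


def security_label(tup, mac_enabled):
    """Value of the security-label system column for one tuple.

    tup         -- dict standing in for a heap tuple; a stored label, if any,
                   lives under the hypothetical key "_label"
    mac_enabled -- whether label-based MAC was turned on at startup
    """
    if not mac_enabled:
        # Feature disabled: return a pseudo value (NULL modeled as None).
        return None
    # Feature enabled: use the stored label, or fall back to the pseudo
    # label for tuples written while MAC was off.
    stored = tup.get("_label")
    return stored if stored is not None else UNLABELED


old_tuple = {"id": 1}                                   # written without MAC
new_tuple = {"id": 2, "_label": "system_u:object_r:sepgsql_table_t:s0"}
```

Neither tuple's user-visible columns change when the feature is toggled, which is the point of the argument: no schema change is needed in either direction.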

>>> #3 seems a little bit trickier.  I don't think the GRANT ... WHERE
>>> syntax is going to be very easy to use.  For constraint-based
>>> row-security, I think we should have something more like:
>>>
>>> ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
>>> WHERE where-clause
>>>
>>> (This suffers from the same problem as DELETE ... USING, namely that
>>> sometimes you want an outer join between table and othertable.)
>>>
>>> This gives the user a convenient way to insert a join against one or
>>> more side tables if they are so inclined.
>> Is it reasonably possible to implement USING clause, even if row-level
>> security is applied on COPY FROM/TO statement?
>> And, isn't it necessary to specify condition to apply the filter?
>> (such as select, update and delete)
> 
> The filter is the WHERE clause.  I would think that the operation
> being performed (select, update, delete) wouldn't enter into it.  This
> part is just to decide which tuples will actually be accessible AT
> ALL.  If you want to further prevent certain tuples that are being
> accessed from being updated or deleted, you can use a trigger for that
> (possibly one of the global, always-applied-last triggers discussed
> above).
> 
> For INSERT and COPY, I don't think that the ALTER TABLE ... ADD ROW
> FILTER stuff would apply.  If you want to restrict what gets inserted,
> that's another job for triggers.

Are you talking about COPY TO as well as COPY FROM?
For INSERT and COPY FROM, I agree with the direction. Access controls
(and labeling) should be applied in the BR trigger functions.

But COPY TO should filter out violating tuples properly, because otherwise it
can be a big bypass of row-level access controls.
If the WHERE clause does not refer to any other relations, it is not
difficult to handle correctly.

>>> For security frameworks like SE-PostgreSQL, we might just provide a
>>> hook allowing the incoming query tree to be modified, and let the hook
>>> function check whether each table in the query has row-level security
>>> enabled, and if so perform a modification equivalent to the above.
>> One point we have to pay mention is all the row-level filter conditions
>> have to be evaluated before all the user given condition, except for
>> operators pulled-up to index accesses.
>> It allows malicious row-cost functions to leak "invisible" tuples anywhere.
> 
> We currently have this problem with DAC as well - it means that VIEWs
> don't actually work as a security gateway, if the user has the ability
> to define a function and pass a WHERE clause to a query against the
> view, they can extract the hidden rows.  Fixing it seems like a hard
> problem.

Yes, we need to consider a reasonable solution to this matter.

>> * TRUNCATE statement
>>
>> Truncate is a good feature to clean up the contents of a table.
>> But it may contain unremovable tuples. So, it needs to scan a table to
>> be truncated once to confirm all the tuples can be removed by the current
>> user. It is a trade-off case between performance and security.
> 
> I think we should just disallow TRUNCATE in cases where this might be
> an issue.  If you want a slow and painful way to get rid of your table
> contents, use DELETE.  Or at least, I'd start by doing it this way and
> then we can think about whether there's enough benefit to doing what
> you're suggesting later.

It seems to me that a uniform disallow is more painful than violation checks
on the table to be truncated. At the least, we should provide an option to
check that the table to be truncated does not contain any unremovable tuples
when row-level checks are activated.

However, as you pointed out, it is not the first issue to be resolved.
It may be a TODO item.
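
A rough sketch of the optional check, assuming a toy in-memory table and a hypothetical can_delete() rule (neither is part of any patch): scan once, and refuse the truncate if any tuple could not be removed by the current user.

```python
# Illustrative model only: the table is a list and can_delete is hypothetical.
def can_delete(row, user):
    # Hypothetical row-level rule: users may delete only their own rows.
    return row["owner"] == user

def checked_truncate(table, user):
    # Scan once; refuse to truncate if any tuple is unremovable for this user.
    blocked = [r["id"] for r in table if not can_delete(r, user)]
    if blocked:
        raise PermissionError(f"cannot truncate: unremovable tuples {blocked}")
    table.clear()

own_rows = [{"id": 1, "owner": "alice"}, {"id": 2, "owner": "alice"}]
checked_truncate(own_rows, "alice")

mixed_rows = [{"id": 1, "owner": "alice"}, {"id": 2, "owner": "bob"}]
try:
    checked_truncate(mixed_rows, "alice")
    truncate_error = None
except PermissionError as exc:
    truncate_error = str(exc)
```

The full scan is where the performance cost comes from. The security benefit is that the truncate is all-or-nothing.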

>> * Foreign Key constraint(2)
>>
>> FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
>> If associated tuples are filtered out, it breaks reference integrity.
>> So, we have to apply special care. In SE-PgSQL case, it raises an error
>> instead of filtering during FK checks. And, row-level security hook is
>> called at the last for each tuples, unlike normal cases.
> 
> Perfecting referential integrity here seems like a pretty tough
> problem, but it's likely not necessary to solve it in order to get an
> implementation of row-level security that is useful for some purposes.

Is the approach in SE-PgSQL suitable for the issue?
It can prevent updating or deleting a tuple referenced by invisible tuples.

We have two modes in row-level security.
The first is filtering-mode. It applies the security policy function prior
to any other user-given conditions, and filters out violating tuples from
the result set.
The second is aborting-mode. It is only used by internal code which does
not provide any malicious function in the condition. It applies the security
policy function after all the WHERE clauses, and raises an error if the
query tries to refer to violating tuples.
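
The two modes can be sketched as follows. This is only a toy model under assumed names (policy, scan_filtering, scan_aborting, PolicyViolation), not the SE-PgSQL implementation.

```python
# Illustrative model only: names and the policy itself are hypothetical.
class PolicyViolation(Exception):
    """Raised in aborting-mode when a query touches a forbidden tuple."""

def policy(row, user):
    # Hypothetical row-level policy: a user may access only rows they own.
    return row["owner"] == user

def scan_filtering(rows, user, user_qual):
    # Filtering-mode: the policy runs before the user-supplied condition,
    # and violating tuples silently drop out of the result set.
    return [r for r in rows if policy(r, user) and user_qual(r)]

def scan_aborting(rows, user, internal_qual):
    # Aborting-mode: the (trusted, internal) condition runs first; the policy
    # runs last and raises an error instead of hiding the tuple.
    result = []
    for r in rows:
        if not internal_qual(r):
            continue
        if not policy(r, user):
            raise PolicyViolation(f"tuple {r['id']} is not accessible")
        result.append(r)
    return result

rows = [
    {"id": 1, "owner": "alice", "ref": 10},
    {"id": 2, "owner": "bob", "ref": 10},
]

visible = scan_filtering(rows, "alice", lambda r: r["ref"] == 10)

# An FK-style check for key 10 on behalf of alice hits bob's row and fails
# rather than pretending that no referencing tuple exists.
try:
    scan_aborting(rows, "alice", lambda r: r["ref"] == 10)
    fk_check_error = None
except PolicyViolation as exc:
    fk_check_error = str(exc)
```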

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Row-Level Security

From

Robert Haas

Date:

December 14, 2009, 01:54:38

2009/12/13 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> Robert Haas wrote:
>> On Sun, Dec 13, 2009 at 3:50 AM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote:
>>> Basically, right. In my branch, SE-PgSQL put its hook after all the BR
>>> trigger invocations.
>>>
>>> http://code.google.com/p/sepgsql/source/browse/branches/pgsql-8.4.x/sepgsql/src/backend/executor/execMain.c#1883
>>>
>>> But we have another approach. When RelationBuildTriggers() initializes
>>> TriggerDesc of Relation, we can inject security hook as a special BR-trigger
>>> at the last. If we initialize it here, we don't need to modify COPY FROM
>>> implementation, not only INSERT.
>>>
>>> The reason why I didn't apply this approach is it needs more modification
>>> to the core routines, so it makes harder to manage out-of-tree code.
>>
>> That's definitely something to consider if it's true.  Why did it
>> require more modification of the core routines?
>
> In my local branch, it just adds two lines as follows:
>  + /* SELinux labeling and permission checks */
>  + sepgsql_heap_insert(resultRelationDesc, tuple);
>
> It is obviously less than modify RelationBuildTriggers() to allocate an
> additional slot for the TrigDesc array and put an entry.
> The reason was just from the perspective to maintain out-of-tree code,
> but a different perspective will be necessary to propose a feature upstream.

Yes, the difference in code impact between those two will not be
relevant for upstream, I think.

>>> One point. MAC is "mandatory", so the table owner should not be able to
>>> control whether row-level checks are applied, or not.
>>> So, I used a special purpose system column to represent security label.
>>> It is generated for each tables, and no additional storage consumption
>>> when MAC feature is disabled.
>>
>> My current feeling is that a special-purpose system column is not the
>> best approach.  I don't see what we gain by doing it that way.  Even
>> in an SE-PostgreSQL environment, row-level security might not be
>> desired on every table - after all, we've been told that SE-PostgreSQL
>> is useful without any row-level security AT ALL, so it's not hard to
>> think there could be environments where only some tables need to
>> protected.  So I think we want to have a way to turn it on and off on
>> a per-table basis.
>>
>> Of course, as you point out, we have to make sure that anyone who
>> tries to turn RLS on or off for a particular table is authorized to
>> perform that operation.  But that's a separate problem which is I
>> don't think has much to do with row-level security.
>
> Yes, it is a separate problem not to be concluded at the moment.
> (Perhaps, it depends on security model. In DAC, per-table basis is preferable.)

Even for MAC, it might be desirable to turn it off on codes tables or
the like, to minimize the performance hit.  But we can defer this
question to another day.

> So, I'd like to bring up just an issue to be discussed later.
> When we build a binary with a label-based MAC, such as SE-PgSQL, it shall
> be turned on/off in the startup time.
> (I don't assume it should be configurable in runtime.)

I don't see any real reason why it couldn't be configurable at
runtime, but I don't have a terribly strong opinion at this point.  I
might have an opinion later when I'm more informed.

> If we set up database cluster without any label-based MAC, all the tuple
> shall not have any security label. If the security label is stored within
> regular column, we have to modify schema for any tables at first.
> If system column provides a security label of tuple, we can dynamically
> generate an appropriate security label. In SELinux case, it assumes any
> unlabeled objects performs as if it has a pseudo security label:
>  system_u:object_r:unlabeled_t:s0
>
> Needless to say, we need to assign appropriate security labels for
> meaningful access controls later, but it does not require any schema
> changes, even if we repeat to turn on/off the label-based MAC feature.
>
> When label-based MAC feature is disabled, this system column can return
> a pseudo value such as NULL or empty string.

I think you are wrong about all of this.  To add security labels to
existing tuples, you're going to need to rewrite the table, period.
Whether you're adding a column in the process or just populating the
contents of a previously-omitted column doesn't seem particularly
relevant.  Similarly you can insert a pseudo security label when the
column is missing just as well as you can when it's present but
unpopulated.
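
As a small illustration of that last point, assuming tuples modeled as dicts and a hypothetical column name "security_label", the fallback to the pseudo label is the same whether the column is absent or merely empty:

```python
# Illustrative model only: tuples are dicts and "security_label" is a
# hypothetical column name. The default label is the one quoted above.
UNLABELED = "system_u:object_r:unlabeled_t:s0"

def label_of(row):
    # Treat a missing column and a present-but-unpopulated column the same.
    label = row.get("security_label")
    return label if label else UNLABELED

no_column = {"id": 1}
unpopulated = {"id": 2, "security_label": None}
# The label below is a made-up example value, not a real policy type.
labeled = {"id": 3, "security_label": "system_u:object_r:example_t:s0"}
```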

>>>> #3 seems a little bit trickier.  I don't think the GRANT ... WHERE
>>>> syntax is going to be very easy to use.  For constraint-based
>>>> row-security, I think we should have something more like:
>>>>
>>>> ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
>>>> WHERE where-clause
>>>>
>>>> (This suffers from the same problem as DELETE ... USING, namely that
>>>> sometimes you want an outer join between table and othertable.)
>>>>
>>>> This gives the user a convenient way to insert a join against one or
>>>> more side tables if they are so inclined.
>>> Is it reasonably possible to implement USING clause, even if row-level
>>> security is applied on COPY FROM/TO statement?
>>> And, isn't it necessary to specify condition to apply the filter?
>>> (such as select, update and delete)
>>
>> The filter is the WHERE clause.  I would think that the operation
>> being performed (select, update, delete) wouldn't enter into it.  This
>> part is just to decide which tuples will actually be accessible AT
>> ALL.  If you want to further prevent certain tuples that are being
>> accessed from being update or deleted, you can use a trigger for that
>> (possibly one of the global, always-applied-last triggers discussed
>> above).
>>
>> For INSERT and COPY, I don't think that the ALTER TABLE ... ADD ROW
>> FILTER stuff would apply.  If you want to restrict what gets inserted,
>> that's another job for triggers.
>
> Are you talking about COPY TO, not only COPY FROM?
> For INSERT and COPY FROM, I agree with the direction. Access controls
> (and labeling) should be applied on the BR trigger functions.

OK.  Yes, that's what I meant.

> But COPY TO should filter violated tuples in proper way, because it
> can be a big bypass for row-level access controls.

Good point.  If a table has row filtering enabled, we'll have to
convert COPY <table> TO to COPY (SELECT * FROM <table>) TO.  That
should be enough to make the row filters kick in.

> If WHERE clause does not refer any other relations, it is not a difficult
> to handle correctly.

Even if it does refer to other relations I think it's fine, under the
above approach.
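
For the COPY TO case, the rewrite could look roughly like this. It is a string-level sketch only: rewrite_copy_to and its arguments are made up for illustration, while COPY (query) TO is existing PostgreSQL syntax.

```python
# Illustrative sketch only: rewrite_copy_to and its arguments are hypothetical,
# not PostgreSQL internals. COPY (query) TO is existing PostgreSQL syntax.
def rewrite_copy_to(table, destination, has_row_filter):
    # Without a row filter, the plain table form can be used as is.
    if not has_row_filter:
        return f"COPY {table} TO {destination}"
    # With a row filter, copy from a subquery so that the filters apply
    # the same way they would for an ordinary SELECT.
    return f"COPY (SELECT * FROM {table}) TO {destination}"

plain = rewrite_copy_to("orders", "STDOUT", has_row_filter=False)
filtered = rewrite_copy_to("orders", "STDOUT", has_row_filter=True)
```

Because the subquery goes through the normal SELECT path, a filter that joins against other relations needs no special handling in COPY itself.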

[snip]

>>> * Foreign Key constraint(2)
>>>
>>> FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
>>> If associated tuples are filtered out, it breaks reference integrity.
>>> So, we have to apply special care. In SE-PgSQL case, it raises an error
>>> instead of filtering during FK checks. And, row-level security hook is
>>> called at the last for each tuples, unlike normal cases.
>>
>> Perfecting referential integrity here seems like a pretty tough
>> problem, but it's likely not necessary to solve it in order to get an
>> implementation of row-level security that is useful for some purposes.
>
> Is the approach in SE-PgSQL suitable for the issue?
> It can prevent to update/delete tuple referenced by invisible tuples.
>
> We have two modes in row-level security.
> The first is filtering-mode. It applies security policy function prior
> to any other user given conditions, and filters out violated tuples from
> the result set.
> The second is aborting-mode. It is only used by internal stuff which does
> not provide any malicious function in the condition. It applies security
> policy function next to all the WHERE clause, and raises an error if the
> query tries to refer violated tuples.

Hmm... the idea of having two modes doesn't sound right off the top of
my head.  But I think we have a long time before we need worry about
this.  We have neither SE-PostgreSQL nor RLS in core, nor are either
one anywhere close to being merged.  So worrying about how the two
will interact when we have both is putting the cart before the horse.
A lot can change between now and then.

...Robert

Re: Row-Level Security

From

KaiGai Kohei

Date:

December 14, 2009, 02:49:32

Robert Haas wrote:
>>>> One point. MAC is "mandatory", so the table owner should not be able to
>>>> control whether row-level checks are applied, or not.
>>>> So, I used a special purpose system column to represent security label.
>>>> It is generated for each tables, and no additional storage consumption
>>>> when MAC feature is disabled.
>>> My current feeling is that a special-purpose system column is not the
>>> best approach.  I don't see what we gain by doing it that way.  Even
>>> in an SE-PostgreSQL environment, row-level security might not be
>>> desired on every table - after all, we've been told that SE-PostgreSQL
>>> is useful without any row-level security AT ALL, so it's not hard to
>>> think there could be environments where only some tables need to
>>> protected.  So I think we want to have a way to turn it on and off on
>>> a per-table basis.
>>>
>>> Of course, as you point out, we have to make sure that anyone who
>>> tries to turn RLS on or off for a particular table is authorized to
>>> perform that operation.  But that's a separate problem which is I
>>> don't think has much to do with row-level security.
>> Yes, it is a separate problem not to be concluded at the moment.
>> (Perhaps, it depends on security model. In DAC, per-table basis is preferable.)
> 
> Even for MAC, it might be desirable to turn it off on codes tables or
> the like, to minimize the performance hit.  But we can defer this
> question to another day.

Yes, I provide a sepgsql_row_level GUC in my local branch to turn its
row-level controls on and off. It allows us to reduce the performance
penalty related to RLS and the storage consumption for security labels.
(They require an additional sizeof(Oid) bytes for each tuple.)
The point is that this GUC option is configurable only by an administrator
who can edit $PGDATA/postgresql.conf.

But it is an implementation detail not to be concluded at the moment.

>> If we set up database cluster without any label-based MAC, all the tuple
>> shall not have any security label. If the security label is stored within
>> regular column, we have to modify schema for any tables at first.
>> If system column provides a security label of tuple, we can dynamically
>> generate an appropriate security label. In SELinux case, it assumes any
>> unlabeled objects performs as if it has a pseudo security label:
>>  system_u:object_r:unlabeled_t:s0
>>
>> Needless to say, we need to assign appropriate security labels for
>> meaningful access controls later, but it does not require any schema
>> changes, even if we repeat to turn on/off the label-based MAC feature.
>>
>> When label-based MAC feature is disabled, this system column can return
>> a pseudo value such as NULL or empty string.
> 
> I think you are wrong about all of this.  To add security labels to
> existing tuples, you're going to need to rewrite the table, period.
> Whether you're adding a column in the process or just populating the
> contents of a previous-omitted column doesn't seem particularly
> relevant.  Similarly you can insert a pseudo security label when the
> column is missing just as well as you can when it's present but
> unpopulated.

For system catalogs, we cannot touch their schema lightly,
even if the active enhanced security provider is switched or turned on/off.
If we define a common system column for all the label-based MACs,
it can be available for both user tables and system catalogs
without any table-rewrite process.

But it is an implementation detail not to be concluded at the moment.

>>>>> #3 seems a little bit trickier.  I don't think the GRANT ... WHERE
>>>>> syntax is going to be very easy to use.  For constraint-based
>>>>> row-security, I think we should have something more like:
>>>>>
>>>>> ALTER TABLE table ADD ROW FILTER filtername USING othertable [, ...]
>>>>> WHERE where-clause
>>>>>
>>>>> (This suffers from the same problem as DELETE ... USING, namely that
>>>>> sometimes you want an outer join between table and othertable.)
>>>>>
>>>>> This gives the user a convenient way to insert a join against one or
>>>>> more side tables if they are so inclined.
>>>> Is it reasonably possible to implement USING clause, even if row-level
>>>> security is applied on COPY FROM/TO statement?
>>>> And, isn't it necessary to specify condition to apply the filter?
>>>> (such as select, update and delete)
>>> The filter is the WHERE clause.  I would think that the operation
>>> being performed (select, update, delete) wouldn't enter into it.  This
>>> part is just to decide which tuples will actually be accessible AT
>>> ALL.  If you want to further prevent certain tuples that are being
>>> accessed from being update or deleted, you can use a trigger for that
>>> (possibly one of the global, always-applied-last triggers discussed
>>> above).
>>>
>>> For INSERT and COPY, I don't think that the ALTER TABLE ... ADD ROW
>>> FILTER stuff would apply.  If you want to restrict what gets inserted,
>>> that's another job for triggers.
>> Are you talking about COPY TO, not only COPY FROM?
>> For INSERT and COPY FROM, I agree with the direction. Access controls
>> (and labeling) should be applied on the BR trigger functions.
> 
> OK.  Yes, that's what I meant.
> 
[..snip..]

>> But COPY TO should filter violated tuples in proper way, because it
>> can be a big bypass for row-level access controls.
> 
> Good point.  If a table has row filtering enabled, we'll have to
> convert COPY <table> TO to COPY (SELECT * FROM <table>) TO.  That
> should be enough to make the row filters kick in.

Wow, that seems like a good idea to me.

>>>> * Foreign Key constraint(2)
>>>>
>>>> FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
>>>> If associated tuples are filtered out, it breaks reference integrity.
>>>> So, we have to apply special care. In SE-PgSQL case, it raises an error
>>>> instead of filtering during FK checks. And, row-level security hook is
>>>> called at the last for each tuples, unlike normal cases.
>>> Perfecting referential integrity here seems like a pretty tough
>>> problem, but it's likely not necessary to solve it in order to get an
>>> implementation of row-level security that is useful for some purposes.
>> Is the approach in SE-PgSQL suitable for the issue?
>> It can prevent to update/delete tuple referenced by invisible tuples.
>>
>> We have two modes in row-level security.
>> The first is filtering-mode. It applies security policy function prior
>> to any other user given conditions, and filters out violated tuples from
>> the result set.
>> The second is aborting-mode. It is only used by internal stuff which does
>> not provide any malicious function in the condition. It applies security
>> policy function next to all the WHERE clause, and raises an error if the
>> query tries to refer violated tuples.
> 
> Hmm... the idea of having two modes doesn't sound right off the top of
> my head.  But I think we have a long time before we need worry about
> this.  We have neither SE-PostgreSQL nor RLS in core, nor are either
> one anywhere close to being merged.  So worrying about how the two
> will interact when we have both is putting the cart before the horse.
> A lot can change between now and then.

IIRC, I've not gotten any opposition to this two-mode design.
Most of the arguments about RLS were about information leaks via covert
channels, which allow a user to infer the existence of an invisible PK/FK.
But we don't define that as a problem to be resolved.

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Row-Level Security

From

KaiGai Kohei

Date:

December 14, 2009, 04:59:17

One more issue I found.

What row-level policy should be applied to inherited tables?

If inconsistent policies are applied to the parent and child tables,
we can see different result sets, although part of the result set
of the parent table comes from the child table.

My idea is to copy row-level policies to inherited tables from the
parent table. We can add additional row-level policies on the inherited
tables, but all the conditions are chained by AND, so there is no
inconsistency.
Even if the inherited table has multiple parents, all the row-level
policies shall be applied, so there is no inconsistency.
(Needless to say, the child table has the same columns, so we can apply
the same row-level policies.)
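
A minimal sketch of this idea, assuming a made-up catalog layout (the parents and own_policies dicts are hypothetical) and assuming every inherited policy refers only to columns the child actually has:

```python
# Illustrative model only: the catalog layout and policies are hypothetical.
parents = {
    "employees": [],
    "contractors": [],
    "temp_staff": ["employees", "contractors"],  # multiple inheritance
}

own_policies = {
    "employees": [lambda r: r["dept"] == "sales"],
    "contractors": [lambda r: r["active"]],
    "temp_staff": [lambda r: r["site"] == "tokyo"],  # extra child policy
}

def effective_policies(table):
    # Collect the policies copied down from every ancestor, then the
    # table's own additional policies.
    collected = []
    for parent in parents[table]:
        collected.extend(effective_policies(parent))
    collected.extend(own_policies[table])
    return collected

def visible(table, row):
    # All collected conditions are chained by AND.
    return all(p(row) for p in effective_policies(table))

row_ok = {"dept": "sales", "active": True, "site": "tokyo"}
row_hidden = {"dept": "sales", "active": False, "site": "tokyo"}
```

Since the conditions are only ever ANDed, a child can narrow what is visible through its parents but can never widen it.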

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Row-Level Security

From

Robert Haas

Date:

December 14, 2009, 07:18:53

2009/12/14 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> Robert Haas wrote:
>>>>> One point. MAC is "mandatory", so the table owner should not be able to
>>>>> control whether row-level checks are applied, or not.
>>>>> So, I used a special purpose system column to represent security label.
>>>>> It is generated for each tables, and no additional storage consumption
>>>>> when MAC feature is disabled.
>>>> My current feeling is that a special-purpose system column is not the
>>>> best approach.  I don't see what we gain by doing it that way.  Even
>>>> in an SE-PostgreSQL environment, row-level security might not be
>>>> desired on every table - after all, we've been told that SE-PostgreSQL
>>>> is useful without any row-level security AT ALL, so it's not hard to
>>>> think there could be environments where only some tables need to
>>>> protected.  So I think we want to have a way to turn it on and off on
>>>> a per-table basis.
>>>>
>>>> Of course, as you point out, we have to make sure that anyone who
>>>> tries to turn RLS on or off for a particular table is authorized to
>>>> perform that operation.  But that's a separate problem which is I
>>>> don't think has much to do with row-level security.
>>> Yes, it is a separate problem not to be concluded at the moment.
>>> (Perhaps, it depends on security model. In DAC, per-table basis is preferable.)
>>
>> Even for MAC, it might be desirable to turn it off on codes tables or
>> the like, to minimize the performance hit.  But we can defer this
>> question to another day.
>
> Yes, I provide sepgsql_row_level guc in my local branch to turn on/off
> its row-level controls. It allows to reduce performance penalty related
> to RLS and reduce storage consumption for security labels. (It requires
> additional sizeof(Oid) bytes for each tuples.)
> The point is this guc option is configurable from the only administrator
> who can edit $PGDATA/postgresql.conf.
>
> But it is an implementation detail not to be concluded at the moment.

Well, that would be a global switch, not per table.

>>> If we set up database cluster without any label-based MAC, all the tuple
>>> shall not have any security label. If the security label is stored within
>>> regular column, we have to modify schema for any tables at first.
>>> If system column provides a security label of tuple, we can dynamically
>>> generate an appropriate security label. In SELinux case, it assumes any
>>> unlabeled objects performs as if it has a pseudo security label:
>>>  system_u:object_r:unlabeled_t:s0
>>>
>>> Needless to say, we need to assign appropriate security labels for
>>> meaningful access controls later, but it does not require any schema
>>> changes, even if we repeat to turn on/off the label-based MAC feature.
>>>
>>> When label-based MAC feature is disabled, this system column can return
>>> a pseudo value such as NULL or empty string.
>>
>> I think you are wrong about all of this.  To add security labels to
>> existing tuples, you're going to need to rewrite the table, period.
>> Whether you're adding a column in the process or just populating the
>> contents of a previous-omitted column doesn't seem particularly
>> relevant.  Similarly you can insert a pseudo security label when the
>> column is missing just as well as you can when it's present but
>> unpopulated.
>
> For system catalogs, we cannot touch its schema with a light heart,
> even if active enhanced security provider is switched or turned on/off.
> If we define a common system column for all the label-based MAC,
> it can be available for both of user tables and system catalogs
> without any table-rewrite process.
>
> But it is an implementation detail not to be concluded at the moment.

Err... well, as I said upthread: "None of this addresses the issue of
doing RLS on system catalogs, which seems like a much harder problem,
possibly one that we should just ignore for the first phase of this
project."  So yeah, I agree: it won't work for system catalogs.

[snip]

>>>>> * Foreign Key constraint(2)
>>>>>
>>>>> FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
>>>>> If associated tuples are filtered out, it breaks reference integrity.
>>>>> So, we have to apply special care. In SE-PgSQL case, it raises an error
>>>>> instead of filtering during FK checks. And, row-level security hook is
>>>>> called at the last for each tuples, unlike normal cases.
>>>> Perfecting referential integrity here seems like a pretty tough
>>>> problem, but it's likely not necessary to solve it in order to get an
>>>> implementation of row-level security that is useful for some purposes.
>>> Is the approach in SE-PgSQL suitable for the issue?
>>> It can prevent to update/delete tuple referenced by invisible tuples.
>>>
>>> We have two modes in row-level security.
>>> The first is filtering-mode. It applies security policy function prior
>>> to any other user given conditions, and filters out violated tuples from
>>> the result set.
>>> The second is aborting-mode. It is only used by internal stuff which does
>>> not provide any malicious function in the condition. It applies security
>>> policy function next to all the WHERE clause, and raises an error if the
>>> query tries to refer violated tuples.
>>
>> Hmm... the idea of having two modes doesn't sound right off the top of
>> my head.  But I think we have a long time before we need worry about
>> this.  We have neither SE-PostgreSQL nor RLS in core, nor are either
>> one anywhere close to being merged.  So worrying about how the two
>> will interact when we have both is putting the cart before the horse.
>> A lot can change between now and then.
>
> IIRC, I've not gotten any opposition about this two-modes design.
> Most of arguments about RLS were information leaks via covert-channels
> which allows us to estimate an existence of invisible PK/FK.
> But we don't define it as a problem to be resolved.

I know that was one of Tom's concerns.  Personally, my concerns are:

1. I want to implement row-level security in a way that is useful for
people who don't care about SE-PostgreSQL.  I think there are lots of
people who would be interested in that.  In fact, as Josh said, there
are probably MORE people who are interested in the constraint-based
approach than there are who want label-based security a la
SE-PostgreSQL.

2. I want to implement row-level security in a way that is very
flexible and allows for a wide range of access control policies.  The
core row-level security mechanism should not care about or prejudge a
particular policy - it should just be a mechanism for enforcing
row-filtering.

3. I want to implement row-level security in a way that allows the
planner maximum flexibility in implementing the row filtering that is
needed in a particular case.  SE-PostgreSQL RLS presumes what is
essentially an additional join against the security table ID for every
table in the query - doing this in a way that allows joins to be
reordered or implemented in multiple ways (straight nestloop, nestloop
with inner indexscan, hash join) will drastically improve performance.
 The original implementation didn't actually implement it as a join,
but rather with special-case code that performed the security ID
lookups as part of the heap scan.  That's not going to work for any
kind of row-level security other than SE-PostgreSQL (so, see points 1
and 2) and it's also going to make the performance much worse than it
needs to be.  Granted, the performance is never going to be GOOD, but
we should try to at least make it not ATROCIOUS.

...Robert

Re: Row-Level Security

From

KaiGai Kohei

Date:

December 14, 2009, 07:52:43

(2009/12/14 20:18), Robert Haas wrote:
>>>>>> * Foreign Key constraint(2)
>>>>>>
>>>>>> FK is implemented as a trigger which internally uses SELECT/UPDATE/DELETE.
>>>>>> If associated tuples are filtered out, it breaks reference integrity.
>>>>>> So, we have to apply special care. In SE-PgSQL case, it raises an error
>>>>>> instead of filtering during FK checks. And, row-level security hook is
>>>>>> called at the last for each tuples, unlike normal cases.
>>>>> Perfecting referential integrity here seems like a pretty tough
>>>>> problem, but it's likely not necessary to solve it in order to get an
>>>>> implementation of row-level security that is useful for some purposes.
>>>> Is the approach in SE-PgSQL suitable for the issue?
>>>> It can prevent to update/delete tuple referenced by invisible tuples.
>>>>
>>>> We have two modes in row-level security.
>>>> The first is filtering-mode. It applies security policy function prior
>>>> to any other user given conditions, and filters out violated tuples from
>>>> the result set.
>>>> The second is aborting-mode. It is only used by internal stuff which does
>>>> not provide any malicious function in the condition. It applies security
>>>> policy function next to all the WHERE clause, and raises an error if the
>>>> query tries to refer violated tuples.
>>>
>>> Hmm... the idea of having two modes doesn't sound right off the top of
>>> my head.  But I think we have a long time before we need worry about
>>> this.  We have neither SE-PostgreSQL nor RLS in core, nor are either
>>> one anywhere close to being merged.  So worrying about how the two
>>> will interact when we have both is putting the cart before the horse.
>>> A lot can change between now and then.
>>
>> IIRC, I've not gotten any opposition about this two-modes design.
>> Most of arguments about RLS were information leaks via covert-channels
>> which allows us to estimate an existence of invisible PK/FK.
>> But we don't define it as a problem to be resolved.
>
> I know that was one of Tom's concerns.  Personally, my concerns are:
>
> 1. I want to implement row-level security in a way that is useful for
> people who don't care about SE-PostgreSQL.  I think there are lots of
> people who would be interested in that.  In fact, as Josh said, there
> are probably MORE people who are interested in the constraint-based
> approach than there are who want label-based security a la
> SE-PostgreSQL.

I also agree that is the right direction. SELinux shall be "one of them"
that make access control decisions at the row level.
From my current standpoint, we add general row-level security first, then
SELinux support provides its access control decision function. OK?

> 2. I want to implement row-level security in a way that is very
> flexible and allows for a wide range of access control policies.  The
> core row-level security mechanism should not care about or prejudge a
> particular policy - it should just be a mechanism for enforcing
> row-filtering.

Ditto.

> 3. I want to implement row-level security in a way that allows the
> planner maximum flexibility in implementing the row filtering that is
> needed in a particular case.  SE-PostgreSQL RLS presumes what is
> essentially an additional join against the security table ID for every
> table in the query - doing this in a way that allows joins to be
> reordered or implemented in multiple ways (straight nestloop, nestloop
> with inner indexscan, hash join) will drastically improve performance.
>   The original implementation didn't actually implement it as a join,
> but rather with special-case code that performed the security ID
> lookups as part of the heap scan.  That's not going to work for any
> kind of row-level security other than SE-PostgreSQL (so, see points 1
> and 2) and it's also going to make the performance much worse than it
> needs to be.  Granted, the performance is never going to be GOOD, but
> we should try to at least make it not ATROCIOUS.

The reason why I put the security hook in ExecScan() is to avoid the
problem that a low-cost user-defined function can be evaluated earlier
than the row-level security policy. (I believed it was already a well-known
problem at that time.) So, I didn't want to append it before optimization.

I also believe this matter should be resolved when we provide the row-level
security stuff, because it is a security feature.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

Re: Row-Level Security

From

Stephen Frost

Date:

December 14, 2009, 09:47:25

KaiGai,

* KaiGai Kohei (kaigai@kaigai.gr.jp) wrote:
> The reason why I put on the security hook in ExecScan() is to avoid the
> problem that low-cost user defined function can be evaluated earlier
> than row-level security policy. (I believed it was a well-known problem
> at that time yet.) So, I didn't want to append it before optimization.

This is a problem which needs to be addressed and fixed independently.

> I also believe this matter should be resolved when we provide row-level
> security stuff, because it is a security feature.

This issue should be fixed first, not as part of some large-scale patch.

If you have thoughts or ideas about how to address this problem as it
relates to views, I think you would find a lot of people willing to
listen and to discuss it.  This must be independent of SELinux,
independent of row-level security, and isn't something based on any of
the patches which have been submitted so far.  None of them that I've
seen resolve this problem in a way that the community is willing to
accept.
Thanks,
    Stephen

Re: Row-Level Security

From

Tom Lane

Date:

December 14, 2009, 11:05:30

KaiGai Kohei <kaigai@ak.jp.nec.com> writes:
> One more issue I found.
> What row-level policy should be applied on inherited tables?

Yup, that seems like an interesting problem.

> Even if the inherited table has multiple parents, all the row-level
> policies shall be applied, so there is no inconsistency.
> (Needless to say, the child table has the same columns, so we can apply
> the same row-level policies.)

I don't think I believe either of those statements.
        regards, tom lane

Re: Row-Level Security

From

KaiGai Kohei

Date:

December 14, 2009, 21:41:35

Stephen Frost wrote:
> KaiGai,
> 
> * KaiGai Kohei (kaigai@kaigai.gr.jp) wrote:
>> The reason I put the security hook in ExecScan() was to avoid the
>> problem that a low-cost user-defined function can be evaluated earlier
>> than the row-level security policy. (I believed it was already a
>> well-known problem at that time.) So I did not want to append the
>> policy before optimization.
> 
> This is a problem which needs to be addressed and fixed independently.
> 
>> I also believe this matter should be resolved when we provide row-level
>> security stuff, because it is a security feature.
> 
> This issue should be fixed first, not as part of some large-scale patch.
> 
> If you have thoughts or ideas about how to address this problem as it
> relates to views, I think you would find a lot of people willing to
> listen and to discuss it.  This must be independent of SELinux,
> independent of row-level security, and isn't something based on any of
> the patches which have been submitted so far.  None of them that I've
> seen resolve this problem in a way that the community is willing to
> accept.

Sorry, I don't have any good ideas at the moment.

IIRC, one headache is that users may supply readily indexable conditions,
such as "SELECT * FROM view_x WHERE id = 1234". In this case, if we strictly
preserve the order of evaluation between the inside and the outside of the
view, the performance penalty will exceed any reasonable tradeoff for better
security.

Someone pointed out that user-given conditions which can be replaced by an
index scan are "trusted", so all we need to focus on are the conditions that
must be checked for each tuple. I also think this is the right direction, as
long as we can appropriately restrict who can define index access methods.

If we can focus on the order of evaluation of the non-indexed conditions,
the key point is order_qual_clauses(), which sorts the given qual list based
on estimated cost. If we can mark a condition node with a flag meaning that
the node comes from inside a view or a row-level policy, it is not difficult
to preserve the evaluation order.
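
As a rough illustration of the flag idea, here is a minimal sketch in Python. It is not the planner's order_qual_clauses(), which is C code; the `Qual` class, the `from_security` flag, and the cost values are all hypothetical and exist only to show the proposed ordering rule: flagged quals first, then cost order within each group.

```python
from dataclasses import dataclass


@dataclass
class Qual:
    name: str
    cost: float
    from_security: bool  # hypothetical flag: qual comes from a view or policy


def order_quals(quals):
    # Security quals sort first (False < True, so "not from_security" puts
    # flagged quals at the front); within each group, cheapest first.
    return sorted(quals, key=lambda q: (not q.from_security, q.cost))


quals = [
    Qual("user_func(x)", 0.1, False),
    Qual("policy_check(owner)", 10.0, True),
    Qual("y > 5", 1.0, False),
    Qual("view_filter(z)", 2.0, True),
]

ordered = [q.name for q in order_quals(quals)]
```

With this rule the cheap `user_func(x)` can no longer run ahead of `policy_check(owner)`, while cost ordering is still used inside each group.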

However, this is just a quick idea; it might miss something.
We need to consider the matter in more detail...
-- 
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

Re: Row-Level Security

From

Stephen Frost

Date:

December 14, 2009, 22:07:02

KaiGai,

* KaiGai Kohei (kaigai@ak.jp.nec.com) wrote:
> IIRC, one headache is that users may supply readily indexable conditions,
> such as "SELECT * FROM view_x WHERE id = 1234". In this case, if we strictly
> preserve the order of evaluation between the inside and the outside of the
> view, the performance penalty will exceed any reasonable tradeoff for better
> security.
>
> Someone pointed out that user-given conditions which can be replaced by an
> index scan are "trusted", so all we need to focus on are the conditions that
> must be checked for each tuple. I also think this is the right direction, as
> long as we can appropriately restrict who can define index access methods.

It sounds like that might help, but I feel that a whole solution will be
more complex than just differentiating between seq scan nodes and index
scan ones.

> If we can focus on the order of evaluation of the non-indexed conditions,
> the key point is order_qual_clauses(), which sorts the given qual list based
> on estimated cost. If we can mark a condition node with a flag meaning that
> the node comes from inside a view or a row-level policy, it is not difficult
> to preserve the evaluation order.

Identifying where this matters is important.  Anyone have suggestions on
how to do that?  There was some discussion on IRC about that but it
didn't really go anywhere.  I don't like the idea of presuming the user
will always want to limit the planner in this way.  Perhaps we can
convince ourselves, once we have an implementation, that it doesn't
poorly affect performance (the primary reason to avoid constraining the
planner), or that it's what our users would really want (I might be able
to buy off on this..), but I doubt it.

A couple of options about how the user could ask us to constrain the
planning to eliminate this issue are, off-hand:

- Global GUC which enables/disables
- Attribute of the view, perhaps indicated as 'CREATE SECURITY VIEW' or
  similar
- Something in the definition of the WHERE clause, eg: select * from x
  where security(q = 50);

Anyone have thoughts about this?  Perhaps it's too early to discuss
this anyway, just trying to keep the discussion moving in some way.

> However, this is just a quick idea; it might miss something.
> We need to consider the matter in more detail...

I agree, this needs more thought and input from others who are very
familiar with the planner, executor, etc.  Additionally, this needs
to be done before we can really go anywhere with row-level security.
Thanks,
    Stephen

Re: Row-Level Security

From

Robert Haas

Date:

December 14, 2009, 23:32:31

2009/12/14 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> IIRC, one headache is that users may supply readily indexable conditions,
> such as "SELECT * FROM view_x WHERE id = 1234". In this case, if we strictly
> preserve the order of evaluation between the inside and the outside of the
> view, the performance penalty will exceed any reasonable tradeoff for better
> security.

If you don't allow the indexable qual to be pushed down into the view
in this situation, performance will be wretched.  I think we need to
distinguish between trusted and untrusted operations.  Everything in
the view definition is trusted.  And some other things... perhaps
access methods and some/most/all system catalog functions... are
trusted.  Other stuff is untrusted, and can't be pushed down.
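
The trusted/untrusted split could be pictured roughly as follows. This is only a sketch in Python under loud assumptions: the `TRUSTED_OPS` set and the `(column, operator, value)` qual format are hypothetical, and which operations actually deserve to be trusted is exactly the open question.

```python
# Hypothetical set of operations considered trusted (for example, built-in
# indexable comparison operators). Anything else is treated as untrusted.
TRUSTED_OPS = {"=", "<", ">", "<=", ">="}


def split_quals(user_quals):
    """Split user-supplied quals into those that may be pushed down into
    the view and those that must be evaluated after the view's own quals."""
    pushed, deferred = [], []
    for column, op, value in user_quals:
        target = pushed if op in TRUSTED_OPS else deferred
        target.append((column, op, value))
    return pushed, deferred


# "id = 1234" uses a trusted operator; "leaky_func" stands in for an
# arbitrary user-defined function.
user_quals = [("id", "=", 1234), ("note", "leaky_func", None)]
pushed, deferred = split_quals(user_quals)
```

Here the indexable `id = 1234` qual is pushed down, so an index scan remains possible, while the user-defined function is held back until the view's own conditions have filtered the rows.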

I think there was a previous discussion of this when Heikki first
posted the issue to -hackers.

...Robert

Re: Row-Level Security

From

Stephen Frost

Date:

December 15, 2009, 08:20:18

* Robert Haas (robertmhaas@gmail.com) wrote:
> I think there was a previous discussion of this when Heikki first
> posted the issue to -hackers.

There was, it's now linked off the http://wiki.postgresql.org/wiki/RLS
page (as well as this thread).  Feel free to add other threads, update
with your thoughts, summarize what the thread conclusions were, etc...
Otherwise I'll have to. ;)  (Seriously, I'm planning to, but if you
could take a peek at what I've put up there so far, I wouldn't
complain).
Thanks,
    Stephen
