Thread: That mode-700 check on DATADIR again

That mode-700 check on DATADIR again

From: Chapman Flack

I have, more or less, this classic question:

https://www.postgresql.org/message-id/4667C403.1070807%40t3go.de

But I have it for a newer reason, where again it seems as if a better
answer than "don't do that" might be worth having.

1. Suppose you are running PG in a VM (named pgvm just for exposition).

2. Suppose DATADIR in this VM is a direct virtio mount of an LVM
   logical volume on the host machine (vg0/pgdata, say).

3. Suppose you've thought of a quick way to snag a copy of this
   volume to build a replica machine:

   pg_start_backup(...)
   virsh domfsfreeze pgvm /var/lib/pgsql/x.x/data
   lvm lvcreate --snapshot vg0/pgdata --name snap_data
   virsh domfsthaw pgvm
   pg_stop_backup()

   and then copy from snap_data at leisure.

4. Suppose the guest OS running in the VM has an SELinux policy
   in force.
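
The sequence in step 3 could be scripted along these lines. This is only a sketch: the command lines are copied from the steps above (with the volume name from step 2), `run`, `start_backup` and `stop_backup` are hypothetical callables the caller supplies, and the nesting is just one way to make sure the filesystem is thawed and the backup ended even when the snapshot fails.

```python
from typing import Callable, Sequence


def snapshot_datadir(
    run: Callable[[Sequence[str]], None],
    start_backup: Callable[[], None],
    stop_backup: Callable[[], None],
    domain: str = "pgvm",
    mountpoint: str = "/var/lib/pgsql/x.x/data",  # x.x left elided as in the text
    origin: str = "vg0/pgdata",
    snap_name: str = "snap_data",
) -> None:
    """Freeze the guest filesystem, snapshot the host volume, then undo.

    `run` executes one host command (for example via subprocess.run);
    `start_backup` / `stop_backup` are assumed to issue pg_start_backup()
    and pg_stop_backup() against the server in whatever way suits you.
    """
    start_backup()
    try:
        run(["virsh", "domfsfreeze", domain, mountpoint])
        try:
            run(["lvm", "lvcreate", "--snapshot", origin, "--name", snap_name])
        finally:
            # Thaw even if the snapshot failed, so the guest is not left frozen.
            run(["virsh", "domfsthaw", domain])
    finally:
        # End backup mode no matter what happened in between.
        stop_backup()
```

In real use `run` might be `lambda argv: subprocess.run(argv, check=True)`; passing a recording function instead makes it easy to dry-run the ordering.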

Those virsh domfsfreeze / domfsthaw operations work by libvirt
RPC-ing into a qemu-ga (qemu guest agent) process running
in the guest; the agent then opens the given mountpoint path
/var/lib/pgsql/x.x/data O_RDONLY, and does an ioctl(FIFREEZE) on it.
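
For illustration, the two steps the agent performs look roughly like this in Python. This is a sketch, not qemu-ga's code: the helper names are made up, and the ioctl numbers are computed assuming the common asm-generic encoding (as on x86 and ARM) of FIFREEZE and FITHAW, which linux/fs.h defines as _IOWR('X', 119, int) and _IOWR('X', 120, int).

```python
import fcntl
import os


def _iowr(type_char: str, nr: int, size: int) -> int:
    """Encode an _IOWR ioctl number using the asm-generic bit layout."""
    # direction (read|write = 3) | argument size | type character | number
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


FIFREEZE = _iowr("X", 119, 4)  # 4 = sizeof(int)
FITHAW = _iowr("X", 120, 4)


def freeze_filesystem(mountpoint: str) -> None:
    """Open the mountpoint read-only and ask the kernel to freeze it.

    WARNING: on success this really freezes the filesystem until FITHAW
    is issued; do not point it at anything you care about.
    """
    # This open() is the call that failed with "Permission denied".
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FIFREEZE, 0)
    finally:
        os.close(fd)
```

The point of interest is that the open() comes first, so any denial of search or read access on the directory stops the agent before the ioctl is ever attempted.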

When you see the "failed to open: ... Permission denied" from qemu-ga
(which runs as root!), you pretty much know SELinux is involved.

Turns out it's involved at two levels.

First is the familiar type-enforcement job SELinux does. The
DATADIR's context has type postgresql_db_t, and the policy doesn't
say a process with type virt_qemu_ga_t gets to do open, read, search,
or ioctl on it, root or no root. That's simple: add a policy module with

   allow virt_qemu_ga_t postgresql_db_t:dir { ioctl open read search };

and the problem is solved.

The second level's more of a head-scratcher, because it involves *two*
interacting linuxisms, SELinux and capabilities. It turns out qemu-ga
is run in a mode where capabilities govern, and its root-ness means
nothing special. It can't touch DATADIR (or its parent or grandparent,
for that matter), for the simple reason that they are all
rwx------ postgres postgres, so a poor nonspecial user named root
has no access at all.

That would not be a problem if qemu-ga could exercise the
CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE capabilities, but the
SELinux policy again steps in to deny that.

One fix would be to make another tweak to the SELinux policy,
allowing qemu-ga to use CAP_DAC_READ_SEARCH. But that's overly
broad, and would let qemu-ga read/search everything everywhere.

Really, all that's needed is to add ACLs on DATADIR and its parent
and grandparent, granting rx (for DATADIR) or x (its ancestors) to
the single specific user named root.
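
As a sketch of what that amounts to, the following Python helper builds the setfacl invocations for DATADIR and its two ancestors without running them. The function name and defaults are illustrative only, and the path keeps the elided x.x from the example above.

```python
from pathlib import PurePosixPath
from typing import List


def acl_commands(datadir: str, user: str = "root", levels: int = 2) -> List[List[str]]:
    """Return setfacl commands granting `user` r-x on datadir and --x above it.

    `levels` is how many ancestor directories need search permission
    (parent and grandparent in the case described here).
    """
    path = PurePosixPath(datadir)
    # Read + search on the data directory itself.
    commands = [["setfacl", "-m", f"u:{user}:rx", str(path)]]
    # Search-only on each ancestor that is also mode 0700.
    for ancestor in list(path.parents)[:levels]:
        commands.append(["setfacl", "-m", f"u:{user}:x", str(ancestor)])
    return commands


for cmd in acl_commands("/var/lib/pgsql/x.x/data"):
    print(" ".join(cmd))
```

Each command adds a single named-user entry, which is much narrower than granting a DAC capability to the whole qemu-ga domain.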

And that works, and qemu-ga can freeze and thaw the filesystem,
and all is well.

Until you happen to stop postgres, and then find out it refuses to
restart.

Because of the way POSIX ACLs work, after adding a "root may r-x" rule
to DATADIR, it looks like this:

[data]# getfacl .
# file: .
# owner: postgres
# group: postgres
user::rwx
user:root:r-x
group::---
mask::r-x
other::---

Seeing the group::--- and other::--- ought to make postgres happy,
right? There is no 'group' access, and no 'other' access, nothing
but 'rwx' for postgres itself, and 'r-x' for root. This ought to
please even the strictest security zealot.

However, when you stat a file with a POSIX ACL, you get shown the
ACL's 'mask' entry (essentially the ceiling of all the 'extra' ACL
entries) in place of the old-fashioned group bits. So in a
non-ACL-aware ls or stat, the above looks like:

[data]# ls -ld
drwxr-x---+ 22 postgres postgres 4096 Dec 11 18:14 .

... and postgres refuses to start because it mistakes the r-x for
'group' permissions. ACLs have added new semantics to POSIX
permissions, and postgres doesn't understand them when it makes this
hey-don't-shoot-your-foot check.
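
To make the mismatch concrete, here is a small Python model of it. It is not PostgreSQL's code (which is C); both function names are invented for the example, and the check is only assumed to be of the "any group or other bit set" kind that the behavior above suggests.

```python
import stat
from typing import Optional


def acl_to_stat_mode(owner: int, group: int, other: int,
                     mask: Optional[int] = None) -> int:
    """Permission bits as stat() reports them for a POSIX ACL.

    Each argument is a 3-bit value (r=4, w=2, x=1). When the ACL has a
    mask entry, the group position of the mode shows the mask, not the
    owning group's entry.
    """
    group_class = mask if mask is not None else group
    return (owner << 6) | (group_class << 3) | other


def fails_mode_700_check(mode: int) -> bool:
    """True if any group or other permission bit is set in `mode`."""
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


# The ACL shown by getfacl above: user::rwx, group::---, mask::r-x, other::---
mode = acl_to_stat_mode(owner=7, group=0, other=0, mask=5)
print(stat.filemode(stat.S_IFDIR | mode))  # drwxr-x---
print(fails_mode_700_check(mode))          # True
```

The owning group's effective access is still nothing (group::--- limited by the mask), but that fact is simply not recoverable from the mode bits alone, which is all a stat-based check gets to see.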

So, it seems there's at least one use case where some kind of
no_really_the_datadir_permissions_are_fine option would be welcome
to get around a well-intended but sometimes broken check.

And really, this is just a specific example of the general principle
that security checks made in an LBYL rather than EAFP manner are
always at risk of drawing wrong conclusions because they can't be
omniscient about all the factors that apply in every environment.
So it's always a good idea to provide an escape hatch for that kind of
check.

Isn't it?

-Chap


Re: That mode-700 check on DATADIR again

From: Stephen Frost

Greetings Chapman,

* Chapman Flack (chap@anastigmatix.net) wrote:
> I have, more or less, this classic question:
>
> https://www.postgresql.org/message-id/4667C403.1070807%40t3go.de

[...]

> So, it seems there's at least one use case where some kind of
> no_really_the_datadir_permissions_are_fine option would be welcome
> to get around a well-intended but sometimes broken check.

There are multiple use cases for this, and some efforts are being made to
specifically address these cases.

> So it's always a good idea to provide an escape hatch for that kind of
> check.
>
> Isn't it?

Patches are in the works (the ground-work having been committed earlier
this cycle...) to be more flexible in this area.  The unfortunate part
is that this is all PG11 work at this point, but, with a bit of luck and
some hard work, we'll have this improved for that release.

This effort may not address all use-cases, of course, but the plan is to
at least address standard unix group privileges, to allow a non-root,
non-PG-superuser user to run a file-level backup of PG.  If there
are other reasonable use-cases which still need to be addressed beyond
that, then hopefully we can work out a sensible way to build on what's
been done for those as well.

If you have specific questions or comments on this, I'd suggest chatting
with David Steele, who is working on this, and whom I've CC'd here.

Thanks!

Stephen

Re: That mode-700 check on DATADIR again

From: David Steele

On 12/11/17 9:41 PM, Chapman Flack wrote:
> I have, more or less, this classic question:
> 
> https://www.postgresql.org/message-id/4667C403.1070807%40t3go.de

<snip>

> However, when you stat a file with a POSIX ACL, you get shown the
> ACL's 'mask' entry (essentially the ceiling of all the 'extra' ACL
> entries) in place of the old-fashioned group bits. So in a
> non-ACL-aware ls or stat, the above looks like:
> 
> [data]# ls -ld
> drwxr-x---+ 22 postgres postgres 4096 Dec 11 18:14 .
> 
> ... and postgres refuses to start because it mistakes the r-x for
> 'group' permissions. ACLs have added new semantics to POSIX
> permissions, and postgres doesn't understand them when it makes this
> hey-don't-shoot-your-foot check.

I'm working on a patch that allows $PGDATA to have group r-x so that a 
non-privileged user in the group can do a file-level backup.

However, it looks like it would work for your case as well based on your 
ACL.

I'm planning to have the patch ready sometime next week and I'll respond 
here once it goes into the CF.  Reviews would be welcome!

Thanks,
-- 
-David
david@pgmasters.net