Discussion: Allowing users to create objects in version controlled schema
I want to get some feedback on the idea of allowing users to create objects in
a database whose schema is already managed with version controlled migrations.

One of my teammates is displeased with our setup because he cannot create his
own tables without bypassing version control. This is a bad idea IMO even if
it's technically possible to give users a reserved shared namespace that is
never touched by versioned migrations.

The following issues come to my mind. Issues 1 and 2 are solvable, while issues
3 and 4 are just sources of unnecessary frustration on my part.

1. There are multiple users, each with a dedicated database user. Consequently,
   table owners will vary, but those tables should be accessible to any user.
   The only solution I can find (besides giving superuser privileges, yikes!)
   is to use an event trigger on ddl_command_end to handle CREATE TABLE and
   change ownership to a group.

   Is there something like CREATE SCHEMA AUTHORIZATION which applies to objects
   created afterwards with separate DDL? Or should I just leave the owner as it
   is and resort to DEFAULT PRIVILEGES instead?

2. Users must not create views and procedures that depend on objects managed by
   versioned migrations. Otherwise migrations may fail, or procedures may break
   unexpectedly, due to untracked dependencies. Event triggers can prevent that,
   as documented for table rewrites[1].

3. Reinventing the wheel and data duplication if anyone can haphazardly create
   new objects. It is also difficult to enforce best practices in that case.

4. Harder to reproduce bugs if the complete database schema cannot be recreated
   from version control.

My background is in web applications where you have a nice separation of the
database layer and the application layer. In our case, users work directly with
a PostGIS-enabled database via QGIS and pgAdmin, hence the "need" (or wish) to
create their own tables, because it looks like a quick and easy solution to them.

I still have to get accustomed to this setup where people access the database
directly instead of through a layer of abstraction on top.

Does anybody have experience with such a setup? More arguments against it are
appreciated. Solutions are also welcome.

[1] https://www.postgresql.org/docs/15/event-trigger-table-rewrite-example.html

--
Erik
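[Editor's note: for issue 1, a minimal sketch of the ddl_command_end idea,
assuming a shared scratch schema "sandbox", a group role "gis_users", and an
end-user role "alice" (all placeholder names, not from the thread). Creating
event triggers requires superuser, so this would be installed by an admin, not
by the end users.]

-- Hand every table created in the sandbox schema over to the group role so
-- that any member can ALTER or DROP it. The creating role must itself be a
-- member of gis_users, otherwise ALTER TABLE ... OWNER TO fails.
CREATE OR REPLACE FUNCTION sandbox_tables_to_group()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN
        SELECT *
        FROM pg_event_trigger_ddl_commands()
        WHERE command_tag = 'CREATE TABLE'
          AND schema_name = 'sandbox'
    LOOP
        -- object_identity is already schema-qualified and properly quoted.
        EXECUTE format('ALTER TABLE %s OWNER TO gis_users', obj.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER sandbox_tables_to_group
    ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE')
    EXECUTE FUNCTION sandbox_tables_to_group();

-- The DEFAULT PRIVILEGES alternative keeps the original owner and only
-- pre-grants DML on future tables; it does not cover ALTER/DROP by others.
ALTER DEFAULT PRIVILEGES FOR ROLE alice IN SCHEMA sandbox
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO gis_users;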
There is a CREATE privilege on schemas: GRANT CREATE ON SCHEMA myschema TO whomever.
But ALTER/DROP requires that the owner role be granted to such a user.
Erik Wienhold wrote on 11/7/2022 12:28 PM:
> I want to get some feedback on the idea of allowing users to create objects in
> a database whose schema is already managed with version controlled migrations.
> [...]

Regards,
Michael Vitale
703-600-9343
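
[Editor's note: a minimal sketch of the privilege setup Michael describes,
again with placeholder names (group role gis_users, user roles alice and bob,
schema sandbox).]

-- Let members of the group create objects in a shared scratch schema.
CREATE ROLE gis_users NOLOGIN;
GRANT gis_users TO alice, bob;                       -- placeholder end-user roles

CREATE SCHEMA sandbox;
GRANT USAGE, CREATE ON SCHEMA sandbox TO gis_users;

-- ALTER/DROP still requires ownership, so either create objects as the group
-- role directly ...
SET ROLE gis_users;
CREATE TABLE sandbox.scratch (id int);
RESET ROLE;
-- ... or reassign ownership afterwards, e.g. with an event trigger as sketched
-- after the original post above.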
Why does a developer(?) need to create his own tables in a production(?)
database?

On 11/7/22 11:28, Erik Wienhold wrote:
> I want to get some feedback on the idea of allowing users to create objects in
> a database whose schema is already managed with version controlled migrations.
>
> One of my teammates is displeased with our setup because he cannot create his
> own tables without bypassing version control. This is a bad idea IMO even if
> it's technically possible to give users a reserved shared namespace that is
> never touched by versioned migrations.
> [...]

--
Angular momentum makes the world go 'round.
> On 07/11/2022 21:18 CET Ron <ronljohnsonjr@gmail.com> wrote:
>
> On 11/7/22 11:28, Erik Wienhold wrote:
> > I want to get some feedback on the idea of allowing users to create objects in
> > a database whose schema is already managed with version controlled migrations.
> >
> > One of my teammates is displeased with our setup because he cannot create his
> > own tables without bypassing version control. This is a bad idea IMO even if
> > it's technically possible to give users a reserved shared namespace that is
> > never touched by versioned migrations.
>
> Why does a developer(?) need to create his own tables in a production(?)
> database?

Not developers as in "software developer". They work in QGIS to modify data and
create maps/documents. They also use pgAdmin or psql to export CSV, and for that
purpose they may write queries. On the production database, of course.

There's also a web application on top where I have more control, but I still
have to deal with the database schema.

As far as I can tell, their own tables are "necessary" to prepare geometric data
for layouts. Beats me why those tables have to be created that way and where the
data is coming from. Or why it has to be a table and not a view.

Maybe it's the mindset and habit. The team started before I joined. Nobody had
any experience in software development, much less database development. It all
started with databases where everybody was superuser. A dumping ground for CSV
imports, as I like to call it. I still see a lot of mistakes[1] being made.
That's why I am pushing for version control and proper role management.

[1] https://wiki.postgresql.org/wiki/Don%27t_Do_This

--
Erik
Feels like more of a process question and less of a technical postgres question.

In our world, each DB user has their own schema they have full access to, and
can grant privileges as required.

All schema definitions tables/views/etc are versioned using git in a structured
repository of flat sql files. It works well. Table defs etc get code reviewed
and deployed by more DB-minded engineers, and people are cognizant of writing
optimized SQL.

Maybe I misunderstand your situation, though. postgres itself offers no
"version control".
On Mon, Nov 7, 2022 at 5:10 PM Erik Wienhold <ewie@ewie.name> wrote:
> Not developers as in "software developer". They work in QGIS to modify data and
> create maps/documents. They also use pgAdmin or psql to export CSV, and for that
> purpose they may write queries. On the production database, of course.
> [...]

--
Wells Oliver
wells.oliver@gmail.com
> On 08/11/2022 02:19 CET Wells Oliver <wells.oliver@gmail.com> wrote:
>
> Feels like more of a process question and less of a technical postgres
> question.

You're right. But I'm wondering what's possible on the technical front.

> In our world, each DB user has their own schema they have full access to, and
> can grant privileges as required.

Right, the standard approach with search_path = "$user".

> All schema definitions tables/views/etc are versioned using git in a structured
> repository of flat sql files. It works well. Table defs etc get code reviewed
> and deployed by more DB-minded engineers,

That's what I meant by version controlled migrations. And I actually don't
want users to bypass that process to create and reference their own database
structures when those structures should rather be part of the common application
schema.

> and people are cognizant of writing optimized SQL.

That's where I have my doubts when it comes to inexperienced "devs" and no
guiding code reviews.

--
Erik
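[Editor's note: a minimal sketch of that per-user schema setup, assuming a user
role alice and the placeholder group role gis_users from the earlier sketches.]

-- Run by an admin (or by alice herself): creates a schema named "alice" owned
-- by the role alice.
CREATE SCHEMA AUTHORIZATION alice;

-- With this search_path, unqualified names resolve to the user's own schema
-- first, falling back to public.
ALTER ROLE alice SET search_path = "$user", public;

-- alice can still share individual objects with the group when needed.
GRANT USAGE ON SCHEMA alice TO gis_users;
GRANT SELECT ON alice.scratch_layer TO gis_users;   -- scratch_layer is a placeholder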
Erik -

I am both a (PostgreSQL) database and QGIS developer and occasionally a DB
admin. From that point of view:

There are a ton of use cases where having a private schema for each QGIS user
makes sense. For example, GIS analysis will often create datasets with limited
lifetime, i.e. being part of a process. These datasets have to be stored
somewhere. If it's not in a private schema, you'll have them splattered all
over your entire filesystems in different file-based GIS formats, including
some on the user's local hard drive.

Further, having even temporary data stored in your database gives the user the
opportunity to off-load some heavy-duty GIS analysis to the database using the
PostGIS extension. Think of your database as a "file system", a "scratch pad".

This kind of work is fundamentally different from a traditional 3-tier setup
where your normal user probably is not even aware of the underlying data
structure. And structure changes can be "hidden" in the business logic layer.

> All schema definitions tables/views/etc are versioned using git in a structured
> repository of flat sql files. It works well. Table defs etc get code reviewed
> and deployed by more DB-minded engineers,

This is a good idea. When some kind of analysis workflow is worked out /
stabilised, the resulting tables/views etc. can be quality-checked and
transferred to the "proper" database.

You could establish a separate database (including a separate server?) and let
QGIS users access the central database using Foreign Data Wrappers.

Med venlig hilsen / Best regards

Bo Victor Thomsen

Den 08-11-2022 kl. 03:29 skrev Erik Wienhold:
> You're right. But I'm wondering what's possible on the technical front.
> [...]
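[Editor's note: a hedged sketch of the Foreign Data Wrapper setup Bo suggests,
with placeholder connection details: the migration-managed schema "app" on a
central server is exposed as foreign tables in "app_mirror" on the users'
scratch database.]

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Connection to the "proper", migration-managed database.
CREATE SERVER central FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'central.example.org', dbname 'gis');

GRANT USAGE ON FOREIGN SERVER central TO gis_users;

CREATE USER MAPPING FOR alice SERVER central
    OPTIONS (user 'alice', password 'changeme');     -- placeholder credentials

-- Mirror the managed schema as foreign tables; users keep their own local
-- schemas for scratch data, so their objects never block central migrations.
CREATE SCHEMA app_mirror;
IMPORT FOREIGN SCHEMA app FROM SERVER central INTO app_mirror;

The trade-off is that app_mirror has to be refreshed whenever a migration
changes the central schema.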
Ok, regardless of the use case, if an object is managed by a process, use the process for those objects. So, to your point, schema separation with no access to modify the database or managed schemas would be the minimal abstraction I would consider (no superuser access). I would probably create another server and FDW after reading the other replies.
In my opinion, it is impossible to speak to security, data integrity, or reliability of the system without sane permissions. At that point, you are one bad statement away from a headache.
Perhaps requesting the phone numbers of every superuser so you can add them to an on-call rotation / alerts would resolve the preference for that access level.
Best regards,
Dan Smith
On Tue, Nov 8, 2022, 00:42 Bo Victor Thomsen <bo.victor.thomsen@gmail.com> wrote:
> Erik -
>
> There are a ton of use cases where having a private schema for each QGIS user
> makes sense. For example, GIS analysis will often create datasets with limited
> lifetime, i.e. being part of a process. These datasets have to be stored
> somewhere.
> [...]
> You could establish a separate database (including a separate server?) and let
> QGIS users access the central database using Foreign Data Wrappers.
> On 08/11/2022 05:42 CET Bo Victor Thomsen <bo.victor.thomsen@gmail.com> wrote:
>
> There are a ton of use cases where having a private schema for each QGIS user
> makes sense. For example, GIS analysis will often create datasets with limited
> lifetime, i.e. being part of a process. These datasets have to be stored
> somewhere.
> [...]
>
> You could establish a separate database (including a separate server?) and let
> QGIS users access the central database using Foreign Data Wrappers.

Thanks a lot. I don't have my teammate's requirement yet, so I can't say if
private schemas are an option. If those tables are used as layers, I assume
they are meant to be shared with other users. I think in that case it is also
better to have them manage privileges manually instead of automating that part
as I've outlined in my original post.

It still gives me headaches when I think about users who may create objects
with dependencies on the versioned part of the schema. Those dependencies may
prevent migrations that alter dependent objects. But this can indeed be
isolated via FDW, with some overhead to keep the schemas in sync.

--
Erik
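[Editor's note: for the dependency concern in the last paragraph, a hedged
sketch of an event trigger that rejects user-created views depending on objects
in the managed schema. The schema name "app" and the trigger name are
placeholders, the check covers plain views only, and as before installing event
triggers requires superuser.]

CREATE OR REPLACE FUNCTION guard_managed_dependencies()
RETURNS event_trigger
LANGUAGE plpgsql
AS $$
DECLARE
    cmd record;
BEGIN
    FOR cmd IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        -- Skip objects created in the managed schema itself (i.e. by migrations).
        IF cmd.command_tag = 'CREATE VIEW'
           AND cmd.schema_name IS DISTINCT FROM 'app' THEN
            -- A view's dependencies are recorded through its rewrite rule, so
            -- follow pg_rewrite -> pg_depend to the referenced relations.
            IF EXISTS (
                SELECT 1
                FROM pg_rewrite r
                JOIN pg_depend d ON d.classid = 'pg_rewrite'::regclass
                                AND d.objid = r.oid
                JOIN pg_class c ON d.refclassid = 'pg_class'::regclass
                                AND c.oid = d.refobjid
                JOIN pg_namespace n ON n.oid = c.relnamespace
                WHERE r.ev_class = cmd.objid
                  AND n.nspname = 'app'
                  AND c.oid <> cmd.objid   -- ignore the view's self-dependency
            ) THEN
                RAISE EXCEPTION 'view % must not depend on objects in schema app',
                    cmd.object_identity;
            END IF;
        END IF;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER guard_managed_dependencies
    ON ddl_command_end
    WHEN TAG IN ('CREATE VIEW')
    EXECUTE FUNCTION guard_managed_dependencies();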