Thread: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
Achilleas Mantzios
Date:
Hello, and merry xmas,
Traditionally we have been using JBoss JCA pools with PostgreSQL by having every connection authenticate
as the static postgres user, or one of 3 other predefined static users.
This worked OK, but it made debugging hard: there was no way to confidently correlate entries in the JBoss log
with entries in the PostgreSQL log, or to find out which user caused a faulty situation. The pgsql user was also one of those 4.
Lately I have been thinking of authenticating the connections in the JBoss JCA pools as the active web user.
I configured it, and it actually works in some tests I made. I find this more attractive, since it will give much better statistics
(I would trust pgsql stats over JBoss stats).
What I'd like to ask is whether any of you has done this and had a good experience. The reason I ask is that
the JBoss documentation is minimal on this issue, and the usual practice is to use the same
user when creating connections in the pool. I should ask the JBoss people as well, but my experiences in their camp
were not very encouraging (I never got any serious feedback from there).

Well, I'd just like to hear your thoughts on that.

--
Achilleas Mantzios

Re: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
John R Pierce
Date:
On 12/28/10 2:00 AM, Achilleas Mantzios wrote:
> Hello, and merry xmas,
> Traditionally we have been using JBoss JCA pools with PostgreSQL by having every connection authenticate
> as the static postgres user, or one of 3 other predefined static users.
> This worked OK, but it made debugging hard: there was no way to confidently correlate entries in the JBoss log
> with entries in the PostgreSQL log, or to find out which user caused a faulty situation. The pgsql user was also one of those 4.
> Lately I have been thinking of authenticating the connections in the JBoss JCA pools as the active web user.
> I configured it, and it actually works in some tests I made. I find this more attractive, since it will give much better statistics
> (I would trust pgsql stats over JBoss stats).
> What I'd like to ask is whether any of you has done this and had a good experience. The reason I ask is that
> the JBoss documentation is minimal on this issue, and the usual practice is to use the same
> user when creating connections in the pool. I should ask the JBoss people as well, but my experiences in their camp
> were not very encouraging (I never got any serious feedback from there).
>
> Well, I'd just like to hear your thoughts on that.

I can't address jboss specific issues, but in general in postgres, all
the connections in a given pool have to be authenticated as the same
postgres user, otherwise you're reconnecting to postgres to authenticate
different users each time which defeats the whole purpose of a
connection pool....   so, are you thinking of creating a pool for each
web user?!?  that could get messy fast.

Re: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
Achilleas Mantzios
Date:
On Tuesday 28 December 2010 12:15:22, John R Pierce wrote:
> On 12/28/10 2:00 AM, Achilleas Mantzios wrote:
> > Hello, and merry xmas,
> > Traditionally we have been using JBoss JCA pools with PostgreSQL by having every connection authenticate
> > as the static postgres user, or one of 3 other predefined static users.
> > This worked OK, but it made debugging hard: there was no way to confidently correlate entries in the JBoss log
> > with entries in the PostgreSQL log, or to find out which user caused a faulty situation. The pgsql user was also one of those 4.
> > Lately I have been thinking of authenticating the connections in the JBoss JCA pools as the active web user.
> > I configured it, and it actually works in some tests I made. I find this more attractive, since it will give much better statistics
> > (I would trust pgsql stats over JBoss stats).
> > What I'd like to ask is whether any of you has done this and had a good experience. The reason I ask is that
> > the JBoss documentation is minimal on this issue, and the usual practice is to use the same
> > user when creating connections in the pool. I should ask the JBoss people as well, but my experiences in their camp
> > were not very encouraging (I never got any serious feedback from there).
> >
> > Well, I'd just like to hear your thoughts on that.
>
> I can't address jboss specific issues, but in general in postgres, all
> the connections in a given pool have to be authenticated as the same
> postgres user, otherwise you're reconnecting to postgres to authenticate
> different users each time which defeats the whole purpose of a
> connection pool....   so, are you thinking of creating a pool for each
> web user?!?  that could get messy fast.
>

Yes, that's the logic. One connection pool per user.
I find it more clever, fine-grained, and just better from many perspectives (security/auditing/diagnosis/debugging being the main ones).

We have LDAP-based authentication in JBoss, and I use the very same LDAP server for PostgreSQL authentication as well.
So I have one point of authentication for both JBoss and PostgreSQL: the very same LDAP server.
A user logs in, and his credentials are delegated to PostgreSQL with *no* involvement of any application code.
So from a one-user monolithic declarative model we can now move to a fully multi-SQL-user model, with *no changes* in the app
or the .EAR configuration!
As we speak I am testing it with a maximum of 4 connections per user.
Both systems (PostgreSQL, JBoss) support this as is, with just recompilation (PostgreSQL) / configuration (JBoss).
The main question is twofold: first, whether anyone has any theoretical objections to this model,
and second, whether anyone has encountered any problems using this or something similar.
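
To make the moving parts concrete, here is roughly what the two sides look like. The file and element names are the ones already mentioned in this thread (postgres-xa-ds.xml, <security-domain>, user/password); the JNDI name, host and LDAP details below are placeholders, so read this as a sketch rather than our exact configuration:

    <!-- deploy/postgres-xa-ds.xml : caller credentials instead of a static <user-name>/<password> -->
    <datasources>
      <xa-datasource>
        <jndi-name>PostgresDS</jndi-name>                                    <!-- placeholder -->
        <xa-datasource-class>org.postgresql.xa.PGXADataSource</xa-datasource-class>
        <xa-datasource-property name="ServerName">db.example.com</xa-datasource-property>
        <xa-datasource-property name="DatabaseName">dynacom</xa-datasource-property>
        <!-- <user-name>/<password> stay commented out; each connection is created
             with the credentials delivered by this JAAS security domain -->
        <security-domain>somesecdomain</security-domain>
      </xa-datasource>
    </datasources>

    # pg_hba.conf : LDAP simple bind against the same directory (option names as in the 8.4/9.0 docs)
    host  dynacom  all  10.0.0.0/24  ldap  ldapserver=ldap.example.com ldapprefix="uid=" ldapsuffix=",ou=people,dc=example,dc=com"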



--
Achilleas Mantzios

Re: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
Lew
Date:
Achilleas Mantzios wrote:
> Yes that's the logic. One connection pool per user.
> I find it more clever, fine grained and just better from
> many perspectives (security/auditing/diagnosis/debugging being the main ones).

Too clever by half.  What do you do with the pool when a user is logged off
for a few days, then comes back?  How many pools will you have when a thousand
users log off and another thousand log on?  How many connections will you keep
in each pool?  How will the DB engine keep up with all that?

I predict that you will get no increase in security or auditing capability
that couldn't be handled with the conventional "one user per application"
approach, and that diagnosis and debugging will be complicated in some other
way, and that you'll have performance and resource issues.

> We have LDAP based authentication in Jboss. And i [sic] use the very same LDAP server for postgresql authentication as well.
> So i [sic] have one point of authentication for both Jboss, PostgreSQL, the very same LDAP server.
> User logs in, and his credentials are delegated to PostgreSQL with *no* involvement of any application code.
> So from a one-user monolithic declarative model we can go now on fully multi sql [sic] user, with *no changes* in the app,
> or the .EAR configuration!!!
> As we speak i [sic] am testing it with max 4 connections/user.
> Both systems (postgresql [sic], jboss [sic]) support this as is, with just recompilation (postgresql) / configuration (jboss).
> The main question is twofold: If any one has any theoretical objections to this model,
> and 2nd if anyone has encountered any problems by using this or smth [sic] similar.

--
Lew
Ceci n'est pas une pipe.

Re: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
Achilleas Mantzios
Date:
On Tuesday 28 December 2010 15:26:11, Lew wrote:
> Achilleas Mantzios wrote:
> > Yes that's the logic. One connection pool per user.
> > I find it more clever, fine grained and just better from
> > many perspectives (security/auditing/diagnosis/debugging being the main ones).
>
> Too clever by half.  What do you do with the pool when a user is logged off
> for a few days, then comes back?

Personally, I do not do anything ;) It's the JBoss JCA pool that does the magic.
To answer your question: simply defining min-pool-size=0 means that when the user
goes home, all his connections will be closed after a defined timeout.
When he comes back tomorrow, his connection(s) will be created automatically.
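
Purely as an illustration of those two knobs (the element names are the standard JBoss *-ds.xml pool settings; the values are made up):

    <min-pool-size>0</min-pool-size>                 <!-- a user's pool may shrink to nothing when he goes home -->
    <max-pool-size>4</max-pool-size>                 <!-- per-user cap; 4 is the figure used in this thread -->
    <idle-timeout-minutes>15</idle-timeout-minutes>  <!-- idle connections are closed after this timeout -->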

> How many pools will you have when a thousand
> users log off and another thousand log on?  How many connections will you keep
> in each pool?  How will the DB engine keep up with all that?

Each pool will have the minimum number of connections needed for the app to work OK;
e.g. if the maximum number of simultaneous connections opened by my code is 4, then I will define that number.
The maximum number of users is known.
>
> I predict that you will get no increase in security or auditing capability
> that couldn't be handled with the conventional "one user per application"
> approach, and that diagnosis and debugging will be complicated in some other
> way, and that you'll have performance and resource issues.
>

You are most probably referring to a low-complexity system.
Let's see this scenario. Imagine you want to identify the cause of a variety of problems. Examples:
a) High load: you want to find what/who causes a very high load on the server.
b) Peculiar errors in the pgsql log files, with no straightforward source or explanation
(imagine lots of points of access to the db).

What do you do?
You identify the SQL either by looking in pg_stat_activity or by looking at the error in the pgsql log.
Then you grep the application to find the programs which might build and call such SQL.
(Unfortunately, in the case of dynamic SQL generation our task becomes an unpleasant procedure of guessing.)
Then you look in the JBoss server logs and in the JBoss access logs (hint: actual HTTP GET/POST requests with users are
*ONLY* recorded in the access logs) to see which user might be calling that program/page. Imagine that the system gets many hits/second.
At this point you are simply lost; in the best case you might have some clue about who ran the program, but you would not be
able to prove anything for sure.
Now compare this with a nice line from the pgsql log file, which looks like:
dynacom amantzio 92607 4d19fd48.169bf 2010-12-28 17:08:01.379 EET ERROR:  permission denied for relation vessels
(where amantzio is the user)
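
To illustrate the kind of per-user visibility this buys, two queries against pg_stat_activity (column names are the pre-9.2 ones, procpid/current_query, matching the era of this thread), plus my guess at a log_line_prefix that would produce a line like the one above:

    -- connections per database/user right now
    SELECT datname, usename, count(*) AS connections
      FROM pg_stat_activity
     GROUP BY datname, usename
     ORDER BY connections DESC;

    -- what a particular web user is executing at the moment
    SELECT procpid, query_start, current_query
      FROM pg_stat_activity
     WHERE usename = 'amantzio';

    # postgresql.conf (guessed): database, user, pid, session id, timestamp
    # log_line_prefix = '%d %u %p %c %m '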

--
Achilleas Mantzios

Re: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
Samuel Gendler
Date:

On Tue, Dec 28, 2010 at 7:13 AM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:

You are most probably referring to a low-complexity system.
Let's see this scenario. Imagine you want to identify the cause of a variety of problems. Examples:
a) High load: you want to find what/who causes a very high load on the server.

Unfortunately the most likely culprit in your system is going to be the huge number of connections that postgres is having to deal with.  You should read up on postgresql performance tuning and learn about how having large numbers of connections to the server can impact postgresql performance.  

If you are looking for systemic problems that are related to authorization, you are only going to create more of those kinds of problems by having such fine-grained authorization configuration within your database - forget to grant an authority to user y on table x when some app state changes and you suddenly have authorization errors.  

If you are looking for data issues like 'who was responsible for setting column x to value y', there are likely much better mechanisms for logging user behaviour than via the postgres logs.  Depending upon what frameworks you are using in your app, it isn't usually very difficult to put code in place which can call a stored proc in the db with the current web user's id/username at the start of any web request or db transaction.  From there, it is relatively simple to implement audit logging in the db - triggers on update/insert/delete which can grab the web user id associated with the current backend process id and write relevant info into audit logging tables or some other mechanism.

Basic mechanism I use is as follows:

- when a connection gets assigned to a request, call a stored proc with a unique identifier for the web user. The stored proc associates the user id with the backend process id in a table.  I handle this via an aspect that gets applied to my service layer, but you could handle it just as easily via a wrapper around your connection pool/session factory.  The only trick is having a mechanism for resolving the user's identity from within your aspect/wrapper.  I use a static ContextHolder pattern which ensures that the relevant info is accessible from thread-local storage as soon as the request starts processing inside the security layer.  If you're using spring-security, the SecurityContextHolder likely has everything you need already available.

- Add triggers to tables on update/insert/delete, as appropriate, which look up the current user (via another stored proc that does the lookup by backend process id), then insert the user id, the type of operation, and NEW.* or OLD.* (for delete) into an audit log table that has the same schema, without unique constraints and with extra columns for holding audit info (timestamp, operation type, user_id, app name, anything else you choose to make available)
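
A minimal sketch of that mechanism, for illustration only; the table, function and column names (app_session_user, set_app_user, the vessels columns) are hypothetical and error handling is omitted:

    -- map backend PID -> web user, maintained by the stored proc the app calls
    CREATE TABLE app_session_user (
        backend_pid integer PRIMARY KEY,
        app_user    text NOT NULL,
        set_at      timestamptz NOT NULL DEFAULT now()
    );

    -- called once per request/transaction by the aspect or pool wrapper
    CREATE OR REPLACE FUNCTION set_app_user(p_user text) RETURNS void AS $$
    BEGIN
        DELETE FROM app_session_user WHERE backend_pid = pg_backend_pid();
        INSERT INTO app_session_user (backend_pid, app_user)
        VALUES (pg_backend_pid(), p_user);
    END;
    $$ LANGUAGE plpgsql;

    -- lookup used from inside the audit triggers
    CREATE OR REPLACE FUNCTION current_app_user() RETURNS text AS $$
        SELECT app_user FROM app_session_user WHERE backend_pid = pg_backend_pid();
    $$ LANGUAGE sql STABLE;

    -- audit table: same columns as the audited table plus the audit metadata
    CREATE TABLE vessels_audit (
        id integer, name text, flag text,              -- hypothetical vessels columns
        audited_at timestamptz, op text, app_user text
    );

    CREATE OR REPLACE FUNCTION vessels_audit_trg() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO vessels_audit VALUES (OLD.id, OLD.name, OLD.flag,
                                              now(), TG_OP, current_app_user());
            RETURN OLD;
        ELSE
            INSERT INTO vessels_audit VALUES (NEW.id, NEW.name, NEW.flag,
                                              now(), TG_OP, current_app_user());
            RETURN NEW;
        END IF;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER trg_vessels_audit
        AFTER INSERT OR UPDATE OR DELETE ON vessels
        FOR EACH ROW EXECUTE PROCEDURE vessels_audit_trg();

The application-side wrapper then just issues SELECT set_app_user('amantzio') (or whatever identifier you use) when it hands a connection to a request, and the triggers pick the value up via current_app_user().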

If you've got a huge transaction rate, this may not be your best solution, but a well configured postgres install uses pooling to keep active connections to a small number, so the user/backend association table will always be small and in cache, and insert into a table with no constraints is also quick unless you are already io-bound.  It does mean every insert/update/delete results in that operation plus another insert.  If you are read-mostly, this has effectively no real impact on performance. If you have high load and are write-mostly, then it could easily be an unacceptable increase (doubling) in load.

Re: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
Achilleas Mantzios
Date:
On Tuesday 28 December 2010 20:08:59, you wrote:
> On Tue, Dec 28, 2010 at 7:13 AM, Achilleas Mantzios <
> achill@matrix.gatewaynet.com> wrote:
>
> >
> > You most probably are referring to a low-complexity system.
> > Let's see this scenario. Imagine you want to identify the cause of a variety
> > of problems. Examples:
> > a) High load, you want to find what/who causes a very high load on the
> > server
> >
>
> Unfortunately the most likely culprit in your system is going to be the huge
> number of connections that postgres is having to deal with.  You should read
> up on postgresql performance tuning and learn about how having large numbers
> of connections to the server can impact postgresql performance.
On the practical side, increasing max_connections to 2000 and creating 1000 connections
on my test FreeBSD system did not produce any visible degradation in performance.
Now on the theoretical side, it is true that connections eat up (some small amount of) shared memory,
it is true that memory usage will increase through work_mem when simultaneous users perform sorting, etc.,
and it is also true that after several hundred connections the task of opening a new connection becomes
a rather expensive and resource-intensive operation, but the end result is nowhere near as bad as you seem to believe.
Anyway, I am not endorsing any of this.
All I am saying is that if someone has an accurate approximation of the distribution of connections
and execution of statements in the system, then he can deploy a setup like the one described
without killing his performance.
Our system accepts about 100 distinct users per day on average, with the maximum observed during the last 10 months being
123 distinct users/day (they are ordinary office employees).
So simply setting max_connections=400 would solve the problem (the current value is 100), even in the extreme case
where all 100 users run the most complex program in the app (requiring 4 connections) within the same time span (as defined
by idle-timeout-minutes).
(Let me remind you once again that we are talking about JBoss here.)
Of course I will not have to do that. Most probably I will leave the current value of max_connections,
but I will have to play with idle-timeout-minutes (decrease it, to make room for new connections faster).
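
The arithmetic above as a config sketch (numbers taken straight from this thread, not a tuning recommendation):

    # postgresql.conf -- extreme case: 100 users x 4 connections/user = 400
    max_connections = 400        # the value currently in place here is 100
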
>
> If you are looking for systemic problems that are related to authorization,
> you are only going to create more of those kinds of problems by having such
> fine-grained authorization configuration within your database - forget to
> grant an authority to user y on table x when some app state changes and you
> suddenly have authorization errors.
This is not a problem; for every new table the initial auth script simply runs again and everything will be all right.
(Since GRANT ALL ON ALL TABLES IN SCHEMA <schema> TO <user> does not
have any effect on future tables; otherwise not even that would be needed.)
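
A sketch of what such a script boils down to; the schema and the group role are placeholders, and the bulk GRANT forms are the 9.0 ones (older versions need one GRANT per table):

    GRANT USAGE ON SCHEMA public TO webusers;
    GRANT ALL ON ALL TABLES IN SCHEMA public TO webusers;
    GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO webusers;
    -- must be re-run after new tables are created; ALTER DEFAULT PRIVILEGES (also new in 9.0)
    -- could cover future tables instead
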
BTW, while we are at it, would you prefer "forgetting" to grant a user the right to perform a certain task,
or giving anyone the power to exploit any SQL holes in the application run by the one and only postgres superuser? ;D
>
> If you are looking for data issues like 'who was responsible for setting
[snip]
> If you
> have high load and are write-mostly, then it could easily be an unacceptable
> increase (doubling) in load.
>

Not only that, but I think that this is a userland solution. I prefer things to be done at the system level.
As an example, if I deploy the setup as I present it here, and it indeed turns out to be a disaster,
it will cost me about 10 seconds to go into ../server/default/deploy/postgres-xa-ds.xml,
comment out the <security-domain>somesecdomain</security-domain> tag, uncomment user and password,
and we are back to the same good old OnePool (JBossManagedConnectionPool).

Now what would be ultra cool is if we had the means to re-authenticate inside the same connection.
As far as I have tested (with a limited scope, using the psql client), I didn't manage to do that.
If we could re-authenticate in the same connection, then we would have one standard traditional pool
with many shared connections, and the user would be authenticated within the existing connection.

I do not know whether this notion is supported in the PostgreSQL protocol, JDBC, or the JBoss pool
in a way that would make it a viable alternative.
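
The closest built-in thing I am aware of is SET ROLE, which is authorization switching rather than true re-authentication; a sketch, assuming the pool's single login role has been granted membership in each web-user role:

    -- the shared pool logs in once as a generic role, e.g. pooluser (hypothetical name)
    GRANT amantzio TO pooluser;      -- repeat for every web-user role

    -- when a connection is handed to amantzio's request:
    SET ROLE amantzio;               -- current_user and privileges now reflect the web user
    -- ... run the request's SQL ...
    RESET ROLE;                      -- on check-in, back to pooluser

    -- caveat: no password is checked, and the session user shown by log_line_prefix %u
    -- and pg_stat_activity.usename remains pooluser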

If something like this existed, it would bring a long-lost (mainframe-era) functionality to the
J2EE world!
--
Achilleas Mantzios

Re: PostgreSQL Authentication and Pooling in JBoss : static 'postgres' user VS dynamic jboss user

From
Samuel Gendler
Date:


On Wed, Dec 29, 2010 at 3:25 AM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:
On Tuesday 28 December 2010 20:08:59, you wrote:
> On Tue, Dec 28, 2010 at 7:13 AM, Achilleas Mantzios <
> achill@matrix.gatewaynet.com> wrote:
>
> >
> > You most probably are referring to a low-complexity system.
> > Let's see this scenario. Imagine you want to identify the cause of a variety
> > of problems. Examples:
> > a) High load, you want to find what/who causes a very high load on the
> > server
> >
>
> Unfortunately the most likely culprit in your system is going to be the huge
> number of connections that postgres is having to deal with.  You should read
> up on postgresql performance tuning and learn about how having large numbers
> of connections to the server can impact postgresql performance.
On the practical side, increasing max_connections to 2000 and creating 1000 connections
on my test FreeBSD system did not produce any visible degradation in performance.
Now on the theoretical side, it is true that connections eat up (some small amount of) shared memory,
it is true that memory usage will increase through work_mem when simultaneous users perform sorting, etc.,
and it is also true that after several hundred connections the task of opening a new connection becomes
a rather expensive and resource-intensive operation, but the end result is nowhere near as bad as you seem to believe.

1000 idle connections may not be much of a problem, but 1000 connections all performing work simultaneously can be very problematic as you compete for both CPU and I/O resources.  But if that's not the nature of your load, then it wouldn't be a problem, certainly.

Would you prefer "forgetting" to grant a user the right to perform a certain task,
or giving anyone the power to exploit any SQL holes in the application run by the one and only postgres superuser? ;D

Well, I wouldn't generally endorse having an application connect as a postgres super user, or even as a user with anything more than the bare minimum required for the application to function. You can potentially isolate one customer's data from another, depending upon your schema, but there's no reason to think that a single user will have any more authorization to do malicious things to the database than a user in your multi-user scenario.

 
Not only that, but I think that this is a userland solution. I prefer things to be done at the system level.
As an example, if I deploy the setup as I present it here, and it indeed turns out to be a disaster,
it will cost me about 10 seconds to go into ../server/default/deploy/postgres-xa-ds.xml,
comment out the <security-domain>somesecdomain</security-domain> tag, uncomment user and password,
and we are back to the same good old OnePool (JBossManagedConnectionPool).

Granted, though it wouldn't take any longer than that to disable my mechanism by eliminating the aspect which causes the user to be set in the db.  If coded such that no stored user means no work in the trigger, then the triggers become very low overhead, and it isn't much more difficult to disable the triggers entirely.  Yes, it is a userland solution, but it also gives you visibility into userland state that is difficult to get in any other way (we track remote address of the user as well as user identity, for example, and we also get a full audit log of every change made to important tables including the ability to roll back an individual customer's state to an earlier point in time without playing complex games with backups).  But it is certainly not a solution for everyone.  I was just throwing it out there as an option that might solve the problem you are trying to solve.