Обсуждение: pgcrypto and database encryption
Going on with my evaluation of pgcrypto as a possible solution for our (EU) privacy law problems, I have to report the following facts: 1) pgcrypto (understandably) supplies just a small collections of server-side functions, not a general-purpose database-encryption system. 2) This means that pgcrypto does not supply you with any password management tool. All pgcrypto functions expect to receive the password from the calling SQL SELECT code as a parameter. 3) This means that you have to manage by yourself all the details of the password one-time entry at the RDBMS start-up time, its storing (in RAM, only) and its passage to the SQL SELECTs. IMHO, this is still far to satisfy our need for a standard and reliable solution to our "privacy law and DB encryption" problem. 4) What could actually solve our problem is something like the following scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL parameter called "pw". This parameter would contain a sequence of comma-separated databasename/encryption-password pairs. I mean, something like this: postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner" Imagine that postmaster stores these passwords in memory (and only in memory) as database-specific and database-wide string variables. Now, pgcrypto functions could read the required password directly from memory (like an environment variable). In this way, we could have an encrypted database and a simple way to manage its passwords. Any comment? 5) There is also a problem related to what pgcrypto can encrypt and what it cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and TIME data types because it would mess up them and make them unacceptable by the RDBMS engine. We would need specific encrypted data types like ENCDATA and ENCTIME to handle these cases. Any comment? PS: I apologize for bothering americans with these all-EU details but it looks like this mailing list is the only one I can use for communicating with others EU pgsql-admins affected by this problem. ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it
On Sun, 7 Mar 2004, Silvana Di Martino wrote: > 3) This means that you have to manage by yourself all the details of the > password one-time entry at the RDBMS start-up time, its storing (in RAM, > only) and its passage to the SQL SELECTs. IMHO, this is still far to satisfy > our need for a standard and reliable solution to our "privacy law and DB > encryption" problem. > > 4) What could actually solve our problem is something like the following > scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL > parameter called "pw". This parameter would contain a sequence of > comma-separated databasename/encryption-password pairs. I mean, something > like this: > > postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner" > > Imagine that postmaster stores these passwords in memory (and only in memory) > as database-specific and database-wide string variables. Now, pgcrypto > functions could read the required password directly from memory (like an > environment variable). > > In this way, we could have an encrypted database and a simple way to manage > its passwords. > > Any comment? > > 5) There is also a problem related to what pgcrypto can encrypt and what it > cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and > TIME data types because it would mess up them and make them unacceptable by > the RDBMS engine. We would need specific encrypted data types like ENCDATA > and ENCTIME to handle these cases. > > Any comment? Who has access to the unencrypted versions of the data? Are there people who will have access to the database who might have access to some of the encrypted data but not all or someone who has access to the database but none of the encrypted data? If so, then you presumably need a system where the decryption will only work for the appropriate people and that needs to be built in. Also, what manipulation do you want to do with these values in the database? Are you going to need to subtract two encrypted dates or determine that one is greater than the other? What about substring or ordering for encrypted strings? Does this have to be automatic or is explicit description of the fact that you want to decrypt them and then do something sufficient? Finally, there are some questions about reliability of the system as a whole. In a system like the -pw above, where do those passwords come from, is it from a human typing at a console? If so, you need to consider downtime and recovery time issues based on who has access to the passwords. This doesn't so much affect the law in question but it may affect contracts that you have if they have downtime requirements.
Silvana Di Martino wrote: > 4) What could actually solve our problem is something like the following > scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL > parameter called "pw". This parameter would contain a sequence of > comma-separated databasename/encryption-password pairs. I mean, something > like this: > > postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner" But you mentioned earlier that the DBA cannot know the passwords, so who is going to type all that in? Does the law require protection from a determined DBA, or just casual viewing by the DBA? *If* it's the latter, you could do something like this: 1. Export an environment variable , say PGMASTERPASS containing a hex encoded password, something like: PGMASTERPASS=0102000304 pg_ctl start 2. Use a C function to grab the value of the environment variable -- one exists in PL/R already. You could write your own based on that. 3. Combine the master password with other information to make it sufficiently unique as a key for your various purposes. For example, you might use the md5 hashed password for the current user from pg_shadow. This combining should be done securely -- I'd recommend taking the HMAC of the user password using the master as the key. The result of the HMAC becomes your data encryption/decryption key. > 5) There is also a problem related to what pgcrypto can encrypt and what it > cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and > TIME data types because it would mess up them and make them unacceptable by > the RDBMS engine. We would need specific encrypted data types like ENCDATA > and ENCTIME to handle these cases. Just use bytea for the encrypted stuff, and write plpgsql functions to convert the bytea output of the decrypt function back to its native datatype. Here's a more-or-less complete example of what I mean by all this: --8<-------------------------------------------------------------------- create or replace function text2bytea(text) returns bytea as ' begin return $1; end; ' language plpgsql; create or replace function timestamp2bytea(timestamp with time zone) returns bytea as ' begin return $1; end; ' language plpgsql; create or replace function encrypt_timestamp(timestamp with time zone) returns bytea as ' declare v_in alias for $1; v_masterpass bytea; v_userpass bytea; v_key bytea; v_data bytea; v_iv bytea; --skip for simplicity at the moment begin select into v_masterpass decode(value,''hex'') from plr_environ() where name=''PGMASTERPASS''; select into v_userpass text2bytea(substr(passwd,4)) from pg_shadow where usename = current_user; v_key := hmac(v_userpass, v_masterpass, ''sha1''); v_data := timestamp2bytea(v_in); return encrypt(v_data, v_key, ''aes''); end; ' language plpgsql; create or replace function decrypt_timestamp(bytea) returns timestamp with time zone as ' declare v_in alias for $1; v_masterpass bytea; v_userpass bytea; v_key bytea; v_data bytea; v_iv bytea; --skip for simplicity at the moment begin select into v_masterpass decode(value,''hex'') from plr_environ() where name=''PGMASTERPASS''; select into v_userpass text2bytea(substr(passwd,4)) from pg_shadow where usename = current_user; v_key := hmac(v_userpass, v_masterpass, ''sha1''); v_data := decrypt(v_in, v_key, ''aes''); return v_data; end; ' language plpgsql; -- here you can see the master password regression=# select decode(value,'hex') from plr_environ() where name='PGMASTERPASS'; decode ---------------------- \001\002\000\003\004 (1 row) -- here is an encrypted timestamp regression=# select encrypt_timestamp(now()); encrypt_timestamp -------------------------------------------------------------------------------------------------- \340\333*\0221r\177\022e\011_]X \374\302Y\201\364\264\362\351e\331\006\266\\\331\236\300\256\335 (1 row) -- and to prove that it works, this example goes full circle regression=# select decrypt_timestamp(encrypt_timestamp(now())); decrypt_timestamp ------------------------------- 2004-03-07 10:16:56.192193-08 (1 row) --8<-------------------------------------------------------------------- As I said above (and others in this thread too), if the DBA (or anyone with root access on the database server) is sufficiently determined, they can get around this scheme and view whatever data they want. If you're really concerned about that scenario, the data should be encrypted in your application before it ever gets sent to the database, using a key that is unavailable on the database server. HTH, Joe
> Does the law require protection from a determined DBA, or just casual > viewing by the DBA? *If* it's the latter, you could do something like > this: The EU directive (and therefore the laws of indiviual countries) requires that if someone gets access to your *DATABASE* they cannot get personal details of individuals out of it. That is all. It is intended to protect people against the kind of idiotic organisations that put their customer lists in an Excel sheet on their extranet without a password. This thread has covered many interesting and diverting issues, but the fundamental issue of legal compliance is more than satisfied by: 1) Encrypting 'personal information' stored in a DB 2) Keeping the keys on a different server than the DB 3) Making reasonable efforts [1] to keep those keys secrets M [1] As far as I can tell from discussions with the Data Protection Registrar, you do not have to protect them against someone rooting the app server (since that is essentially impossible without silly investments in specialised hardware or other excessive costs).
Alle 17:40, domenica 7 marzo 2004, Stephan Szabo ha scritto: > Who has access to the unencrypted versions of the data? Are there people > who will have access to the database who might have access to some of the > encrypted data but not all or someone who has access to the database but > none of the encrypted data? If so, then you presumably need a system > where the decryption will only work for the appropriate people and that > needs to be built in. Yes: each "authorized operator" must be able to access data belonging to him, only, even if the database contains data belonging to many different operators. Moreover, SysAdmin (that are not "authorized operator") must not be able to access any data, even if they are able to access the database (as a whole) for maintenance. So, decryption must work for authorized operator only. > Also, what manipulation do you want to do with these values in the > database? Are you going to need to subtract two encrypted dates or > determine that one is greater than the other? Yes: we will need to perform basic operations on encrypted dates, times and so on. Of course, it would be great to be able to delegate the on-the-fly decryption/encryption work to the encryption system. > What about substring or > ordering for encrypted strings? Does this have to be automatic or is > explicit description of the fact that you want to decrypt them and then do > something sufficient? Same as above: we would need to perform substring-wise operations and ordering on encrypted strings. The encryption/decryption mechanism should be automatic. > > Finally, there are some questions about reliability of the system as a > whole. In a system like the -pw above, where do those passwords come from, > is it from a human typing at a console? If so, you need to consider > downtime and recovery time issues based on who has access to the > passwords. This doesn't so much affect the law in question but it may > affect contracts that you have if they have downtime requirements. The password should come from a human typing at the console (not our choice: law imposes it). Downtime will be affected: this is out of our control and must be accepted by our customers. Thanks for your comments. ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it
Alle 22:16, domenica 7 marzo 2004, matt@ymogen.net ha scritto: > The EU directive (and therefore the laws of indiviual countries) requires > that if someone gets access to your *DATABASE* they cannot get personal > details of individuals out of it. That is all. It is intended to protect > people against the kind of idiotic organisations that put their customer > lists in an Excel sheet on their extranet without a password. > > This thread has covered many interesting and diverting issues, but the > fundamental issue of legal compliance is more than satisfied by: > 1) Encrypting 'personal information' stored in a DB Fine! Now tell me: how do you perform such encryption on PostgreSQL? Using pgcrypto and suppling your password from external (PHP, Python) code? Where do you store this password? In a config.php file? How many different encryption/decryption programs/implementations/logics are you willing to have on your "n" PostgreSQL servers? How do you maintain them (after the original programmers are gone)? Using a few commercial RDBMS, it is just a matter to switch the encryption feature on and supply the required password each time you start the RDBMS service up. Unfortunately, PostgreSQL does not supply us with such a comfortable feature. So, how many commercial licenses of your favorite commercial RDBMS are you willing (or can you afford) to buy to replace all your PostgreSQL servers? How much time (programmer's working hours, each at 50 Euro average cost) are you willing to invest in converting your PostgreSQL databases to SQL Server, for example? I'm perfectly aware that law is clear and simple. Nevertheless, its implementation isn't. > 2) Keeping the keys on a different server than the DB Fine. How and when do you supply the password to the encryption/decryption process? On demand? At postmaster init time? Using which channel/method? XML-RPC? SOAP? How do you protect them from a hacker's program that tries to impersonate the legitimate encrypting program and ask for it? Once again, Devil is in the details... > 3) Making reasonable efforts [1] to keep those keys secrets > [1] As far as I can tell from discussions with the Data Protection > Registrar, you do not have to protect them against someone rooting the app > server (since that is essentially impossible without silly investments in > specialised hardware or other excessive costs). What does "reasonable" mean? We defined already that we are not forced to replace Police in fighting organized crime but we still have to define a lot of details. For example: 15.000 euro for a new database license and the porting of data is a "silly investemen" or a "reasonable investement"? Read the italian law and you will get surprised from the answer. Again, I'm sorry to bother you all with such details. Just give me a solution (that is: a PostgreSQL database encryption method I can actually use) and I will let you alone. See you. ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it
Alle 18:19, domenica 7 marzo 2004, Joe Conway ha scritto: > Silvana Di Martino wrote: > > 4) What could actually solve our problem is something like the following > > scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL > > parameter called "pw". This parameter would contain a sequence of > > comma-separated databasename/encryption-password pairs. I mean, something > > like this: > > > > postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner" > > But you mentioned earlier that the DBA cannot know the passwords, so who > is going to type all that in? Accordingly to law, a "authorized operator". A piece of paper transforms a generic user/sysadmin into a trusted person who can perform such operations (again, not our choice: law imposes it). > Does the law require protection from a determined DBA, or just casual > viewing by the DBA? *If* it's the latter, you could do something like this: > > 1. Export an environment variable , say PGMASTERPASS containing a hex > encoded password, something like: > > PGMASTERPASS=0102000304 pg_ctl start > > 2. Use a C function to grab the value of the environment variable -- one > exists in PL/R already. You could write your own based on that. > > 3. Combine the master password with other information to make it > sufficiently unique as a key for your various purposes. For example, > you might use the md5 hashed password for the current user from > pg_shadow. This combining should be done securely -- I'd recommend > taking the HMAC of the user password using the master as the key. The > result of the HMAC becomes your data encryption/decryption key. This looks like a viable solution. Thanks for the idea. > > 5) There is also a problem related to what pgcrypto can encrypt and what > > it cannot. For example: pgcrypto encrypt functions cannot be applied to > > DATE and TIME data types because it would mess up them and make them > > unacceptable by the RDBMS engine. We would need specific encrypted data > > types like ENCDATA and ENCTIME to handle these cases. > > Just use bytea for the encrypted stuff, and write plpgsql functions to > convert the bytea output of the decrypt function back to its native > datatype. Here's a more-or-less complete example of what I mean by all > this: Thanks again! I will use it! > As I said above (and others in this thread too), if the DBA (or anyone > with root access on the database server) is sufficiently determined, > they can get around this scheme and view whatever data they want. If > you're really concerned about that scenario, the data should be > encrypted in your application before it ever gets sent to the database, > using a key that is unavailable on the database server. Yes, sure: the encryption key must be unavailable on the db server (at least, unavailable on its hard disk). Using Blowfish or Rijandel and a 128 - 1024 character-long password it would be almost impossible to decript our data with widely available hardware in human-reasonable time. Nevertheless, this does not mean that the on-the-fly encryption/decryption process must necessarly take place out of the DB server/host (a hypothesys that would complicate our work a lot). Even a determined and skilled sysadmin will have big problems in finding, reading and decrypting a password stored in the volatile memory of a running (and strictly surveilled) host by a RDBMS application (a task similar to finding a mimetic computer virus in memory). Adding to this the fact that the original password could be supplied by a remote user (or process) operating on a SSH channel (encrypted channel + user/process PKI authentication on both channel ends), it could be even hard to get it by impersonating the legitimate RDBMS program and ask for it to the password server (a "man-in-the-middle" attack). Thanks again for your help. ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it
On Mon, Mar 08, 2004 at 08:37:37 +0000, Silvana Di Martino <silvanadimartino@tin.it> wrote: > Alle 18:19, domenica 7 marzo 2004, Joe Conway ha scritto: > > Silvana Di Martino wrote: > > > 4) What could actually solve our problem is something like the following > > > scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL > > > parameter called "pw". This parameter would contain a sequence of > > > comma-separated databasename/encryption-password pairs. I mean, something > > > like this: > > > > > > postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner" > > > > But you mentioned earlier that the DBA cannot know the passwords, so who > > is going to type all that in? > > Accordingly to law, a "authorized operator". A piece of paper transforms a > generic user/sysadmin into a trusted person who can perform such operations > (again, not our choice: law imposes it). This sounds like the real solution to your problem. You should do what you need to to make the system administrator an authorized user and then use more practical security methods to secure the data.
On Mon, 8 Mar 2004, Silvana Di Martino wrote: > Alle 17:40, domenica 7 marzo 2004, Stephan Szabo ha scritto: > > Who has access to the unencrypted versions of the data? Are there people > > who will have access to the database who might have access to some of the > > encrypted data but not all or someone who has access to the database but > > none of the encrypted data? If so, then you presumably need a system > > where the decryption will only work for the appropriate people and that > > needs to be built in. > > Yes: each "authorized operator" must be able to access data belonging to him, > only, even if the database contains data belonging to many different > operators. Moreover, SysAdmin (that are not "authorized operator") must not > be able to access any data, even if they are able to access the database (as > a whole) for maintenance. So, decryption must work for authorized operator > only. That's what I figured, but given a system where the password is given on startup to the database, you would need another layer for the security, have you though about how that would work? A not terrible option would be to compose the key from something given by the user (perhaps at session start with a query) with something already stored, but I don't know if the law would allow that. Also, there's a question about multiple databases in one instance that might want separate encryption esp where the authorized users might be non-overlapping sets. It might be better to allow a human to use a query to initialize the passwords. > > Also, what manipulation do you want to do with these values in the > > database? Are you going to need to subtract two encrypted dates or > > determine that one is greater than the other? > > Yes: we will need to perform basic operations on encrypted dates, times and so > on. Of course, it would be great to be able to delegate the on-the-fly > decryption/encryption work to the encryption system. > > > What about substring or > > ordering for encrypted strings? Does this have to be automatic or is > > explicit description of the fact that you want to decrypt them and then do > > something sufficient? > > Same as above: we would need to perform substring-wise operations and ordering > on encrypted strings. The encryption/decryption mechanism should be > automatic. As above, automatic runs into questions about who can do it and maint issues, like if encryption happens on insert and the admin cannot see the decrypted form, the admin cannot backup the database using something like pg_dump. This is probably best solved by doing whatever is necessary to make the admin authorized. It should be reasonably easy to make types that took input and encrypted it and had say appropriate output function that checked and decrypted along with implicit casts that do the same thing. It'd be CPU consuming, but workable. Here you run into the question of how it's configured again.
Alle 14:47, lunedì 8 marzo 2004, Bruno Wolff III ha scritto: > > Accordingly to law, a "authorized operator". A piece of paper transforms > > a generic user/sysadmin into a trusted person who can perform such > > operations (again, not our choice: law imposes it). > > This sounds like the real solution to your problem. You should do what > you need to to make the system administrator an authorized user and > then use more practical security methods to secure the data. I'm going to think very carefully over this possibility... ;-) ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it
Alle 15:08, lunedì 8 marzo 2004, Stephan Szabo ha scritto: > That's what I figured, but given a system where the password is given on > startup to the database, you would need another layer for the security, > have you though about how that would work? A not terrible option would be > to compose the key from something given by the user (perhaps at session > start with a query) with something already stored, but I don't know if the > law would allow that. > Also, there's a question about multiple databases in one instance that > might want separate encryption esp where the authorized users might be > non-overlapping sets. It might be better to allow a human to use a query > to initialize the passwords. I'm not sure (nobody can actually read the mind of our lawmakers) but I think that using the same "password's password" (aka "global server password") for all of the databases managed by a single instance of the PostgreSQL server would be fine if we are able to guarantee that each user has his own password. So, we should have this scenario: - a single "global server password", supplied just once at the server start-up by a trusted admin via SSH/SSL and used to encrypt/decrypt all other passwords. This password should never be stored on disk/db on the same host as the RDBMS. It must be encrypted and kept in volatile memory (RAM), only. If the server is shut down, the trusted admin will have to supply it again. - a set of database-specific passwords used to encrypt/decrypt data. These passwords could be encrypted using the "global server password" and safely stored into a table of the database. If we use strong encryption and a long password (Blowfish or Rijandel with a 512-1024 char-long key) to encrypt the keys it should be very hard (almost impossible) to crack them. The user that connect to a database, must authenticate himself using Ident, SSH or Kerberos. Once accepted the user's connection, a set of server-side function (pgcrypto) encrypt/decryp data using the database-specific key. This is almost the same technique used by Oracle. To implement it, PostgreSQL just lacks a way to supply the "global server password" at start-up. A SQL Query cannot be used for this both because the password must be visible to all the database managed by a server and because this password must not be stored inside the db. Note that the trusted administrator _can_ be replaced by a server process if: - the client process that asks for a "global server password" can authenticat itself in a reliable way (SSH, Kerberos, SSH any other PKI) as a trusted PostgreSQL server. - the communication channel is encrypted (SSH/SSL) > As above, automatic runs into questions about who can do it and maint > issues, like if encryption happens on insert and the admin cannot see the > decrypted form, the admin cannot backup the database using something like > pg_dump. This is probably best solved by doing whatever is necessary to > make the admin authorized. Well, data encryption is meant _also_ to protect data from the admin's eye. As long as I can see, there is no need to access or decrypt data for performing backup, restore and any other database-level operation. From the admin point of view, the database can and should be a black box. > It should be reasonably easy to make types that took input and encrypted > it and had say appropriate output function that checked and decrypted > along with implicit casts that do the same thing. It'd be CPU consuming, > but workable. Here you run into the question of how it's configured > again. Specific encrytped data types are coming out to be a real need. Without them, I cannot see how we could perform data sorting, comparison and any other basic operation. IMHO, this topic should be brought to the attention of the PostgreSQL developers with the intent to develop a standard set of reliable encrypted data types. I'm sure that performance would be hugely affected by encryption but, again, this is out of our reach: law imposes it. ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it
On Mon, 8 Mar 2004, Silvana Di Martino wrote: > Alle 15:08, luned� 8 marzo 2004, Stephan Szabo ha scritto: > > That's what I figured, but given a system where the password is given on > > startup to the database, you would need another layer for the security, > > have you though about how that would work? A not terrible option would be > > to compose the key from something given by the user (perhaps at session > > start with a query) with something already stored, but I don't know if the > > law would allow that. > > > Also, there's a question about multiple databases in one instance that > > might want separate encryption esp where the authorized users might be > > non-overlapping sets. It might be better to allow a human to use a query > > to initialize the passwords. > > I'm not sure (nobody can actually read the mind of our lawmakers) but I think > that using the same "password's password" (aka "global server password") for > all of the databases managed by a single instance of the PostgreSQL server > would be fine if we are able to guarantee that each user has his own > password. Except that you then need someone who is authorized to view all the databases correct since that user would potentially be able to get all the keys? That still seems to miss the case where the sets of authorized users have no intersections (which might come up in shared hosting). > The user that connect to a database, must authenticate himself using Ident, > SSH or Kerberos. Once accepted the user's connection, a set of server-side > function (pgcrypto) encrypt/decryp data using the database-specific key. I think there might still be issues with people who have access to some (presumably non-encrypted) data in the database while not having access to the encrypted data unless the SQL permissions system were sufficient. > This is almost the same technique used by Oracle. To implement it, PostgreSQL > just lacks a way to supply the "global server password" at start-up. A SQL > Query cannot be used for this both because the password must be visible to > all the database managed by a server and because this password must not be > stored inside the db. I think C functions with access to a shared memory block wouldn't fail those two considerations. > Note that the trusted administrator _can_ be replaced by a server process if: > - the client process that asks for a "global server password" can authenticat > itself in a reliable way (SSH, Kerberos, SSH any other PKI) as a trusted > PostgreSQL server. > - the communication channel is encrypted (SSH/SSL) This sounds alot better all in all. This also seems like it might be possible to prototype outside the database proper so as to get an idea of what it would look like. > > As above, automatic runs into questions about who can do it and maint > > issues, like if encryption happens on insert and the admin cannot see the > > decrypted form, the admin cannot backup the database using something like > > pg_dump. This is probably best solved by doing whatever is necessary to > > make the admin authorized. > > Well, data encryption is meant _also_ to protect data from the admin's eye. As > long as I can see, there is no need to access or decrypt data for performing > backup, restore and any other database-level operation. From the admin point > of view, the database can and should be a black box. But, given the fact that the form of the backup is a set of SQL statements, if the encryption is automatic on input, either the SQL statements contain encrypted data and you need some way to on restore tell the system to not encrypted them again or they contain decrypted data. If accessing the encrypted data without being properly authorized throws an error rather than giving you the encrypted form, how does the admin do the selects necessary on the table to even get the encrypted form. > > It should be reasonably easy to make types that took input and encrypted > > it and had say appropriate output function that checked and decrypted > > along with implicit casts that do the same thing. It'd be CPU consuming, > > but workable. Here you run into the question of how it's configured > > again. > > Specific encrytped data types are coming out to be a real need. Without them, > I cannot see how we could perform data sorting, comparison and any other > basic operation. IMHO, this topic should be brought to the attention of the > PostgreSQL developers with the intent to develop a standard set of reliable > encrypted data types. I'm sure that performance would be hugely affected by Given that it's possible that the types might be an input function, an output function and one or more type conversion functions, I'd think it'd be possible to prototype them first.
Alle 22:07, lunedì 8 marzo 2004, Stephan Szabo ha scritto: > > I'm not sure (nobody can actually read the mind of our lawmakers) but I > > think that using the same "password's password" (aka "global server > > password") for all of the databases managed by a single instance of the > > PostgreSQL server would be fine if we are able to guarantee that each > > user has his own password. > > Except that you then need someone who is authorized to view all the > databases correct since that user would potentially be able to get all the > keys? That still seems to miss the case where the sets of authorized > users have no intersections (which might come up in shared hosting). Yes, right! The DBAdmin would be authenticated and allowed to access _all_ of the databases living on a server and, as a consequence, the encryption system would decrypt and show him _all_ of the data in clear form. To avoid this, we would need an access control system able to distingush a DBAdmin from a authorized operator and act consquentely. This could be done with pgcrypto and PL/PGSQL functions if we had access to the current login username from within a PL/PGSL function. If not, we could likely do it from within a C function. In any case, we will have to implement the whole system. In any case, there will always be an intersection point among all user groups: the DBA(s). > > The user that connect to a database, must authenticate himself using > > Ident, SSH or Kerberos. Once accepted the user's connection, a set of > > server-side function (pgcrypto) encrypt/decryp data using the > > database-specific key. > > I think there might still be issues with people who have access to some > (presumably non-encrypted) data in the database while not having access to > the encrypted data unless the SQL permissions system were sufficient. While pgcrypto can work at a column level, our hypothetical system will work at a database level (because its dependence from the PostgreSQL authentication system). In order to be able to allow the access to a table/column of a db, while forbidding the access to another table/access of the same db, we should be able to read the privileges of the current user and act consequently. I'm not sure but I think there is not any way to read the privileges of the current user from within a PL/PGSQL or C function so this would be impossible. > > This is almost the same technique used by Oracle. To implement it, > > PostgreSQL just lacks a way to supply the "global server password" at > > start-up. A SQL Query cannot be used for this both because the password > > must be visible to all the database managed by a server and because this > > password must not be stored inside the db. > > I think C functions with access to a shared memory block wouldn't fail > those two considerations. I agree. > > Note that the trusted administrator _can_ be replaced by a server process > > if: - the client process that asks for a "global server password" can > > authenticat itself in a reliable way (SSH, Kerberos, SSH any other PKI) > > as a trusted PostgreSQL server. > > - the communication channel is encrypted (SSH/SSL) > > This sounds alot better all in all. This also seems like it might be > possible to prototype outside the database proper so as to get an idea of > what it would look like. I like this approach much more than any other. A "password server" based on HTTP, SOAP or XML-RPC (and protected by SSL or SSH) would be an interesting general purpose tool per se. Its use with PostgreSQL would "just" require to add to the RDBMS engine the ability to communicate with the password server via HTTP, SOAP or XML-RPC. That is: another interesting general purpose feature of the RDBMS engine per se. > But, given the fact that the form of the backup is a set of SQL > statements, if the encryption is automatic on input, either the SQL > statements contain encrypted data and you need some way to on restore tell > the system to not encrypted them again or they contain decrypted data. If > accessing the encrypted data without being properly authorized throws an > error rather than giving you the encrypted form, how does the admin do the > selects necessary on the table to even get the encrypted form. Yes, right... unfortunatley. We would need a way to tell the system to always backup _encrypted_ data and restore them without re-encrypting. A specific couple of programs like pg_enc_dump and pg_enc_restore would be needed. The internals of such programs could be managed in this way: - always backup (download) data in their original format (encrypted if they are encrypted, clear if they are clear) - from within your SQL dump file, when needed, signal to the RDBMS engine that we are restoring an encrypted DB. This should be used by SQL code (pgcrypto functions) to prevent a second, inopportune encryption of encrypted data. I think that either a SQL query or a specific, hypothetical "sql file header" variable could be used for this task. Again, a specific set of encrypted data types, each including the intelligence needed to handle this conditional encryption, will be of great help. > > Specific encrytped data types are coming out to be a real need. Without > > them, I cannot see how we could perform data sorting, comparison and any > > other basic operation. IMHO, this topic should be brought to the > > attention of the PostgreSQL developers with the intent to develop a > > standard set of reliable encrypted data types. I'm sure that performance > > would be hugely affected by > > Given that it's possible that the types might be an input function, an > output function and one or more type conversion functions, I'd think it'd > be possible to prototype them first. I agree. This could be a good starting point. Pgcrypto could be used as a basis for this code, apparently. See you. ----------------------------------------- Alessandro Bottoni and Silvana Di Martino alessandrobottoni@interfree.it silvanadimartino@tin.it