Re: Anonymized database dumps
From | Bill Moran |
---|---|
Subject | Re: Anonymized database dumps |
Date | |
Msg-id | 20120319175527.5eb99af57c91b932f561f97a@potentialtech.com |
In reply to | Re: Anonymized database dumps (Kiriakos Georgiou <kg.postgresql@olympiakos.com>) |
Responses | Re: Anonymized database dumps |
List | pgsql-general |
In response to Kiriakos Georgiou <kg.postgresql@olympiakos.com>:

> The data anonymizer process is flawed because you are one misstep away from data spillage.

In our case, it's only one layer. Other layers that exist:

* The systems where this test data is instantiated can't send email
* The systems where this data exists have limited access (i.e., not all developers can access it, and it's not used for typical testing -- only for specific testing that requires production-like data)

You are correct, however, in that there's always the danger of spillage if new sensitive data is added and the sanitization script is not properly updated. It's part of the ongoing overhead of maintaining such a system.

> Sensitive data should be stored encrypted to begin. For test databases you or your developers can invoke a process that replaces the real encrypted data with fake encrypted data (for which everybody has the key/password.) Or if the overhead is too much (ie billions of rows), you can have different decrypt() routines on your test databases that return fake data without touching the real encrypted columns.

The thing is, this process has the same potential data spillage issues as sanitizing the data: in both cases, someone has to remember to cover every new sensitive column. I find it intriguing, however, and I'm going to see if there are places where this approach might have advantages over our current one. Since much of our sensitive data is already de-identified, it provides an additional level of protection on that level as well.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
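
One way to reduce the "new sensitive column, script not updated" risk discussed above is to make the sanitizer fail closed: every column must be explicitly classified, and anything unclassified aborts the run instead of passing through. The following is only a minimal sketch of that idea, not the script used in this thread. The table name, column names, the `RULES` mapping, and the helpers `fake_value` and `sanitize_row` are all hypothetical, and the hash-based replacement is a placeholder rather than a vetted anonymization method (hashing low-entropy values can be reversed by guessing).

```python
import hashlib

# Hypothetical classification: every column of every dumped table must
# appear here with an explicit action ("keep" or "fake").
RULES = {
    "patients": {
        "id": "keep",
        "name": "fake",
        "email": "fake",
        "visit_count": "keep",
    },
}


class UnclassifiedColumn(Exception):
    """Raised when a row contains a column that has no rule."""


def fake_value(table, column, value):
    # Placeholder replacement: stable for a given input, so repeated runs
    # and cross-row references stay consistent. Not real anonymization.
    digest = hashlib.sha256(f"{table}.{column}:{value}".encode()).hexdigest()
    return f"{column}_{digest[:12]}"


def sanitize_row(table, row):
    """Return a sanitized copy of row; refuse rows with unknown columns."""
    rules = RULES.get(table, {})
    unknown = sorted(set(row) - set(rules))
    if unknown:
        # Fail closed: a newly added column stops the dump until someone
        # decides whether it is sensitive.
        raise UnclassifiedColumn(
            f"{table}: no rule for column(s) {', '.join(unknown)}"
        )
    clean = {}
    for column, value in row.items():
        action = rules[column]
        if action == "keep" or value is None:
            clean[column] = value
        elif action == "fake":
            clean[column] = fake_value(table, column, value)
        else:
            raise ValueError(f"unknown action {action!r} for {table}.{column}")
    return clean
```

With this shape, adding a column to `patients` without touching `RULES` makes `sanitize_row` raise `UnclassifiedColumn` rather than silently copying the new data into the test dump. The same check would apply to the fake-decrypt approach, where the list of encrypted columns has to be kept current in the same way.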