Re: Creating composite keys from csv
| От | Adrian Klaver |
|---|---|
| Тема | Re: Creating composite keys from csv |
| Дата | |
| Msg-id | 54FDA1EB.5040200@aklaver.com обсуждение исходный текст |
| Ответ на | Creating composite keys from csv (Eli Murray <ejmurra2@illinimedia.com>) |
| Ответы |
Re: Creating composite keys from csv
|
| Список | pgsql-general |
On 03/08/2015 08:49 PM, Eli Murray wrote: > Hi all, > > I'm a student journalist working on a project for our student paper > which lists salaries and positions for every staff member at the > university. We received the data from an FOI request but the university > is refusing to give us primary keys for the data. > > The issue we've run into is that if there are two staff members with the > same name (and there are) our current web app adds their salaries > together and considers them one person. Now, luckily, we can create a > composite key if we combine their name column with their salary column. > Unfortunately, the format of the data we have makes it more difficult > than that (of course!) because some employees can hold multiple paying > positions. > > Here's some example data: > > Name, Position, Salary,Total Salary, ... > Jane Doe, Dean, 100.000, 148.000, ... > John Locke, Custodian, 30.000, 30.000, ... > Jane Doe, Academic Adviser, 48.000, 148.000, ... > Jane Doe, Trainer, 46.000, 46.000, ... > > Basically, what we'd like to do is create a serial primary key but > instead of having it increment every row, it needs to check the name and > total salary columns and only increment if that person doesn't already > exist. If they do exist, it should just assign the previously created > number to the column. Well the above is not going to work, because the id would not be unique across rows and therefore could not be a primary key. If I am following what you want is a staff id that identifies a particular staff member across rows and is derived from the (Name, Total Salary) combination, is that correct? If so you could use a serial column to generate a surrogate primary key for each row without worrying about the names and total salary. Then it becomes an issue of generating the staff id for unique staff members. For that I would see John McKowns answer. However, our team is small and between us we have > very little experience working with databases and we haven't found a way > to accomplish this goal yet. In fact, we may be trying to solve this in > the wrong way entirely. > > So, to put it succinctly, how would you approach this problem? What are > our options? Do we need to write a script to clean the data into > separate csv tables before we import it to postgres, or is this > something we can do in postgres? We'd really appreciate any help you all > may be able to offer. > > Best! > Eli Murray > > -- Adrian Klaver adrian.klaver@aklaver.com
В списке pgsql-general по дате отправления: