Re: "Ungroup" data for import into PostgreSQL
От | Adrian Klaver |
---|---|
Тема | Re: "Ungroup" data for import into PostgreSQL |
Дата | |
Msg-id | 54B8630E.1020203@aklaver.com обсуждение исходный текст |
Ответ на | Re: "Ungroup" data for import into PostgreSQL (Jim Nasby <Jim.Nasby@BlueTreble.com>) |
Список | pgsql-general |
On 01/15/2015 04:56 PM, Jim Nasby wrote: > On 1/15/15 9:43 AM, George Weaver wrote: >> Hi List, >> >> I need to import data from a large Excel spreadsheet into a PostgreSQL >> table. I have a program that uses ODBC to connect to Excel and >> extract data using SQL queries. The program then inserts the data >> into a PostgreSQL table. >> >> The challenge with this particular spreadsheet is that it is arrayed >> thus: >> >> Model No 1 Product Code 15 >> Serial No 1 No on Hand >> Serial No 2 No on Hand >> Serial No 3 No on Hand >> Model No 4 Product Code 9 >> Serial No 12 No on Hand >> Model No 5 Product Code 27 >> Serial No 6 No on Hand >> Serial No 14 No on Hand >> >> etc. >> >> I need the data in PostgreSQL arrayed thus >> >> Model No 1 Product Code 15 Serial No 1 No on Hand >> Model No 1 Product Code 15 Serial No 2 No on Hand >> Model No 1 Product Code 15 Serial No 3 No on Hand >> Model No 4 Product Code 9 Serial No 12 No on Hand >> Model No 5 Product Code 27 Serial No 6 No on Hand >> Model No 5 Product Code 27 Serial No 14 No on Hand >> >> I can import the data procedurely using plpgsql to match the >> individual rows to the master for each row (import the raw data into a >> buffer table in PostgreSQL and then looping through the rows in the >> buffer table and checking to see when the Model No changes). > > Note that if you're doing that you better be putting the rownumber from > excel into the table... result sets are NOT guaranteed to be in insert > order! > >> I'm wondering if there is a more elegant way to do this using straight >> sql from Excel? > > Well, that's really an excel question, not a Postgres question... > > If you load the whole spreadsheet into a single table and have a way to > differentiate between the different rows then you might be able to do > something with CTE's to relate a serial number to the product code. That > might be faster than plpgsql. > > You might also be able to do something creative with formulas in excel > to copy the product code data to the serial # rows. You could then > import the whole thing and re-normalize it. > > There's probably some stuff you could do with VBA too. If you care about > performance you don't want to execute SQL statements for each > spreadsheet row. Or if you really want to slice and dice and you use Python, then take a look at Pandas: http://pandas.pydata.org/ In particular the IO functions: http://pandas.pydata.org/pandas-docs/stable/io.html -- Adrian Klaver adrian.klaver@aklaver.com
В списке pgsql-general по дате отправления: