Обсуждение: Creating a DB

Поиск
Список
Период
Сортировка

Creating a DB

От
Sami Pietilä
Дата:
Hi,

I am trying to create local copy of DB from NCBI DB.
ftp://ftp.ncbi.nih.gov/snp/database/README.create_local_dbSNP.txt

The FTP site provides sql files and bcp -files.

I have started by trying to create tables as instructed:
psql -f dbSNP_main_table.sql dbSNP_main

Unfortunately there seems to be some syntax related issues:

psql:dbSNP_main_table.sql:579: ERROR:  syntax error at or near "["
LINE 1: CREATE TABLE [Allele]

How can I get postresql to accept provided sql -files?

BR,
Sami


Re: Creating a DB

От
Thom Brown
Дата:
On 16 September 2012 10:02, Sami Pietilä <sami.pietila@gmail.com> wrote:
> Hi,
>
> I am trying to create local copy of DB from NCBI DB.
> ftp://ftp.ncbi.nih.gov/snp/database/README.create_local_dbSNP.txt
>
> The FTP site provides sql files and bcp -files.
>
> I have started by trying to create tables as instructed:
> psql -f dbSNP_main_table.sql dbSNP_main
>
> Unfortunately there seems to be some syntax related issues:
>
> psql:dbSNP_main_table.sql:579: ERROR:  syntax error at or near "["
> LINE 1: CREATE TABLE [Allele]
>
> How can I get postresql to accept provided sql -files?

The file appears to be formatted for Microsoft SQL Server. You will
need to do a find and replace for '[' and ']', and replace them with
nothing. Also the keyword GO isn't valid in PostgreSQL, so I suggest
doing a case-sensitive replace of those with a semi-colon so that it
terminates every statement correctly.

Some of those tables use smalldatetime which is non-standard. Replace
all instances of these with timestamp. The same goes for tinyint.
Replace those with int.

It looks like everything else in the CREATE TABLE statements should
work. Note that PostgreSQL case-folds unquoted mixed-case object names
to lower-case, so if you want to preserve the letter casing of object
names, you'll need to quote them with double-quotes.  However, if you
do this, you'll have to double-quote the object name every time you
refer to it in a query.

In the dbSNP_main_index.sql.gz file, there's also additional changes
to be made.  Again, replace '[' and ']' with nothing, replace 'GO'
with ';' and replace all instances of "NONCLUSTERED" and "CLUSTERED"
with nothing.  There's a mistake in the file though (at least as far
as PostgreSQL goes); the index 'i_allele_id' appears twice, but apply
to different tables, so you may wish to alter the name of one of them
so they aren't the same.

As for the rest, it might be worth reading up about the differences
between SQL Server and PostgreSQL syntax and convert the rest based on
your findings.

--
Thom


Re: Creating a DB

От
Sami Pietilä
Дата:
Hi,

Thanks for detailed reply.

However, are there any utilities in postgres to help with migrations.
Perhaps something that would do syntax conversion automatically?

BR,
Sami

2012/9/16 Thom Brown <thom@linux.com>:
> On 16 September 2012 10:02, Sami Pietilä <sami.pietila@gmail.com> wrote:
>> Hi,
>>
>> I am trying to create local copy of DB from NCBI DB.
>> ftp://ftp.ncbi.nih.gov/snp/database/README.create_local_dbSNP.txt
>>
>> The FTP site provides sql files and bcp -files.
>>
>> I have started by trying to create tables as instructed:
>> psql -f dbSNP_main_table.sql dbSNP_main
>>
>> Unfortunately there seems to be some syntax related issues:
>>
>> psql:dbSNP_main_table.sql:579: ERROR:  syntax error at or near "["
>> LINE 1: CREATE TABLE [Allele]
>>
>> How can I get postresql to accept provided sql -files?
>
> The file appears to be formatted for Microsoft SQL Server. You will
> need to do a find and replace for '[' and ']', and replace them with
> nothing. Also the keyword GO isn't valid in PostgreSQL, so I suggest
> doing a case-sensitive replace of those with a semi-colon so that it
> terminates every statement correctly.
>
> Some of those tables use smalldatetime which is non-standard. Replace
> all instances of these with timestamp. The same goes for tinyint.
> Replace those with int.
>
> It looks like everything else in the CREATE TABLE statements should
> work. Note that PostgreSQL case-folds unquoted mixed-case object names
> to lower-case, so if you want to preserve the letter casing of object
> names, you'll need to quote them with double-quotes.  However, if you
> do this, you'll have to double-quote the object name every time you
> refer to it in a query.
>
> In the dbSNP_main_index.sql.gz file, there's also additional changes
> to be made.  Again, replace '[' and ']' with nothing, replace 'GO'
> with ';' and replace all instances of "NONCLUSTERED" and "CLUSTERED"
> with nothing.  There's a mistake in the file though (at least as far
> as PostgreSQL goes); the index 'i_allele_id' appears twice, but apply
> to different tables, so you may wish to alter the name of one of them
> so they aren't the same.
>
> As for the rest, it might be worth reading up about the differences
> between SQL Server and PostgreSQL syntax and convert the rest based on
> your findings.
>
> --
> Thom