Re: Beyond the 1600 columns limit on windows

From: Evandro's mailing lists (Please, don't send personal messages to this address)
Subject: Re: Beyond the 1600 columns limit on windows
Date:
Msg-id: ffac56890511090751p4a1b2b21k75764e47d1e60037@mail.gmail.com
In reply to: Re: Beyond the 1600 columns limit on windows  ("John D. Burger" <john@mitre.org>)
Responses: Re: Beyond the 1600 columns limit on windows
Re: Beyond the 1600 columns limit on windows
List: pgsql-general
Yes, it is exactly that. I will follow your advice and create an abstraction layer for data access that will return the sparse dataset using the standard dataset as input.
 
There is just one thing I disagree with. You said that the performance is not good, which is right. However, it is practical! Nothing is easier and more practical than keeping the sparse representation inside the database for my application.

 
On 11/8/05, John D. Burger <john@mitre.org> wrote:
Evandro's mailing lists (Please, don't send personal messages to this
address) wrote:

> It has nothing to do with normalisation. It is a program for
> scientific applications.
> Data values are broken into columns to allow multiple linear regression
> and multivariate regression tree computations.

Having done similar things in the past, I wonder if your current DB
design includes a column for every feature-value combination:

instanceID  color=red  color=blue  color=yellow  ...  height=71  height=72
---------------------------------------------------------------------------
42          True       False       False
43          False      True        False
44          False      False       True
...

This is likely to be extremely sparse, and you might use a sparse
representation accordingly.  As several folks have suggested, the
representation in the database needn't be the same as in your code.
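
As a rough illustration of the sparse layout suggested above, here is a minimal sketch that stores one row per feature value that is actually present, instead of one column per feature-value combination. The table name `observation`, its columns, and the helper `features_of` are hypothetical, the height value is invented sample data, and Python's built-in sqlite3 stands in for Postgres only so that the example is self-contained.

```python
import sqlite3

# sqlite3 is used here only as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per (instance, feature) pair that is set.
conn.execute(
    "CREATE TABLE observation ("
    " instance_id INTEGER NOT NULL,"
    " feature TEXT NOT NULL,"
    " value TEXT NOT NULL,"
    " PRIMARY KEY (instance_id, feature))"
)

# Sample data: the colors follow the table above, the height is invented.
rows = [
    (42, "color", "red"),
    (43, "color", "blue"),
    (44, "color", "yellow"),
    (42, "height", "71"),
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?)", rows)


def features_of(conn, instance_id):
    """Return the stored features of one instance as a dict."""
    cur = conn.execute(
        "SELECT feature, value FROM observation"
        " WHERE instance_id = ? ORDER BY feature",
        (instance_id,),
    )
    return dict(cur.fetchall())
```

With this layout the number of columns stays fixed at three no matter how many feature-value combinations exist, so the 1600-column limit never comes into play.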

> Even SPSS, the most well-known statistics software, uses the same
> approach and data structure that my software uses.
> Probably I should use another data structure, but it would not be as
> efficient and practical as the one I use now.

The point is that, if you want to use Postgres, this is not in fact
efficient and practical.  In fact, it might be the case that mapping
from a sparse DB representation to your internal data structures is
=more= efficient than naively using the same representation in both
places.
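
The mapping step mentioned above could look roughly like the following sketch, which expands sparse (instance, feature, value) triples into the wide 0/1 indicator matrix that a regression routine expects. The function name `to_indicator_matrix` and the triple format are assumptions made for illustration, not something taken from the original software.

```python
def to_indicator_matrix(triples):
    """Expand (instance_id, feature, value) triples into a dense 0/1 matrix.

    Returns the sorted instance ids, the sorted "feature=value" column
    names, and a list of rows with one 0/1 entry per column.
    """
    instance_ids = sorted({iid for iid, _, _ in triples})
    columns = sorted({f"{feat}={val}" for _, feat, val in triples})
    row_index = {iid: r for r, iid in enumerate(instance_ids)}
    col_index = {name: c for c, name in enumerate(columns)}

    # Start from all zeros and set only the cells that are present.
    matrix = [[0] * len(columns) for _ in instance_ids]
    for iid, feat, val in triples:
        matrix[row_index[iid]][col_index[f"{feat}={val}"]] = 1
    return instance_ids, columns, matrix


# Sample triples matching the colors in the table above.
triples = [(42, "color", "red"), (43, "color", "blue"), (44, "color", "yellow")]
ids, cols, matrix = to_indicator_matrix(triples)
```

The work done is proportional to the number of stored triples plus the size of the output matrix, and the wide form exists only in application memory, never as a table in the database.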

- John D. Burger
  MITRE



--
Evandro M Leite Jr
PhD Student & Software developer
University of Southampton, UK
Personal website: http://evandro.org
Academic website: http://www.soton.ac.uk/~evandro
Please, use Jr(at)evandro.org for personal messages
