Table and View Planning for Census Data

Поиск

Список

Период

Сортировка

От	Lee Hachadoorian
Тема	Table and View Planning for Census Data
Дата	28 мая 2008 г. 15:28:17
Msg-id	5ab13580805281128j60a29102vaf023288be8d184d@mail.gmail.com обсуждение исходный текст
Список	pgsql-novice

Дерево обсуждения

I have a question about how to efficiently set up tables and views to
work with census data.  The census releases data partitioned (rows) by
state and segmented (columns) by arbitrary groupings of matrices, or
"subject tables".  For example matrices P1 (total population), P2
(urban and rural), P3 (population by race), etc are all grouped in
segment 1.  Because the census data set is huge, I am downloading and
importing data as needed, rather than all at once.  Currently I am the
only user, although it could possibly be used by several coworkers in
the future.  It's highly unlikely that more than a few people would be
hitting the database at the same time.

1) Often I will be querying for a specific geographic level (e.g.
counties) across state lines (e.g New York metro area, which includes
data from three states).  How much would I gain from trying to set up
partitioning, as opposed to combining the states into one big table?
Or, since I would rarely be returning multiple geographic levels (like
census tracts *and* counties) at the same time, should I combine the
states but partition by geographic level?

2) The segments are arbitrary groupings, so I don't really want to
have to write a query for data from matrix P3 where I have to also
know that matrix P3 is in database table segment1.  I can think of
three ways to deal with this.  Please let me know which one is best,
or suggest something else I haven't thought of.
  a) Create a view which combines all columns, then query for desired
columns.  (But does a view have the same row size limitation as a
table?  If so this wouldn't work, because the number of columns will
be in the thousands.)
  b) Create a view for each matrix.  Then I can query by matrix and
column (which is easier for my brain).
  c) Break each matrix out into its own database table.  (Most work,
how much of a performance gain will there be?)

Any input would be be appreciated.

Lee Hachadoorian
PhD Student, Geography
Program in Earth & Environmental Sciences
CUNY Graduate Center

В списке pgsql-novice по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Table and View Planning for Census Data