Re: Homegrown Data Warehouse

Поиск
Список
Период
Сортировка
От Keith
Тема Re: Homegrown Data Warehouse
Дата
Msg-id CAHw75vsFtXTADfz3vS7MGqjV1UHvbJgiaoK210-Cp9YNadKepw@mail.gmail.com
обсуждение исходный текст
Ответ на Homegrown Data Warehouse  (Binand Sethumadhavan <binand@gmail.com>)
Список pgsql-novice


On Mon, Jul 11, 2016 at 11:01 PM, Binand Sethumadhavan <binand@gmail.com> wrote:
We have a postgresql 9.2 instance pair (master+slave) supporting our
online application.
We have a postgresql 9.5 instance which contains a dump of the above database.

The primary purpose of the dump database is to facilitate analysis. It
is updated from the slave by way of a perl script and psql. Master has
archival policies etc. in place but dump contains complete historic
data.

My problem is that when the dump job runs, queries pause or when heavy
analysis queries run, dump process pauses. This creates significant
problems. One particular use-case has a production application picking
analyzed data from the dump database and running a campaign based on
that.

It is basically a hack to implement data warehouse, ETL, datamarts,
analysis & reporting and campaign management via a set of
perl/shell/sql scripts. It used to work fine, but now with analysis
workloads increasing it is creating problems.

I'd like to know where to go next; because at the moment I'm out of
ideas. I do not have the budget for a paid solution.

Basically, I'm looking for suggestions on architecture and
implementation of such a system. If there are any web references
please mention them too.

TIA,

Binand


--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

We use a tool that I've been working on called mimeo to do logical (per-table) replication to move data to a data warehouse system like you have. Mimeo has several replication methods available, the most commonly used being trigger based replication (DML replay). Some other replication methods that can be more efficient are available as well, but only work in specific situations.

https://github.com/omniti-labs/mimeo

There's another extension in the works called pglogical, which uses the newer logical WAL streaming method available in 9.4. It's much more efficient, but not quite as flexible and also requires all systems involved be on the same version.

https://2ndquadrant.com/en/resources/pglogical/

Keith
https://www.keithf4.com/

В списке pgsql-novice по дате отправления:

Предыдущее
От: Kenneth Marshall
Дата:
Сообщение: Re: Homegrown Data Warehouse
Следующее
От: Nigel Straightgrain
Дата:
Сообщение: How to upgrade from PostgreSQL v9.1.2 to v9.5.3?