Re: On columnar storage

From	Tom Lane
Subject	Re: On columnar storage
Date	13 June 2015, 15:41:24
Msg-id	16527.1434210072@sss.pgh.pa.us
In response to	Re: On columnar storage (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses	Re: On columnar storage
List	pgsql-hackers

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> I can't help thinking that this could tie in with the storage level API
>> that I was waving my arms about last year.  Or maybe not --- the goals
>> are substantially different --- but I think we ought to reflect on that
>> rather than just doing a narrow hack for column stores used in the
>> particular way you're describing here.

> I can't seem to remember this proposal you mention.  Care to be more
> specific?  Perhaps a link to archives is enough.

I never got to the point of having a concrete proposal, but there was a
discussion about it at last year's PGCon unconference; were you there?

Anyway the idea was to try to cut a clearer line between heap storage
and the upper levels of the system, particularly the catalog/DDL code
that we have so much of.  Based on Salesforce's experience so far,
it's near impossible to get rid of HeapTuple as the lingua franca for
representing rows in the upper system levels, so we've not really tried;
but it would be nice if the DDL code weren't so much in bed with
heap-specific knowledge, like the wired-into-many-places assumption that
row insert and update actions require index updates but deletions don't.
We're also not very happy with the general assumption that a TID is an
adequate row identifier (as our storage engine does not have TIDs), so
I'm a bit disappointed to see you doubling down on that restriction
rather than trying to lift it.

Now much of this pain only comes into play if one is trying to change
the underlying storage format for system catalogs, which I gather is
not considered in your proposal.  But if you want one format for catalogs
and another for user tables then you have issues like how do you guarantee
atomic commit and crash safety across multiple storage engines.  That way
lies a mess, especially if you're trying to keep the engines at arm's
length, which is what a pluggable architecture implies.  MySQL is a
cautionary example we should be keeping in mind while thinking about this.

I don't really have any answers in this area, just questions.

        regards, tom lane
