Re: PoC: adding CustomJoin, separate from CustomScan
From | Paul A Jungwirth |
---|---|
Subject | Re: PoC: adding CustomJoin, separate from CustomScan |
Date | |
Msg-id | CA+renyVXciqFe193nfQrHuC2CenEbDoQb0s3hgzjs-WULzuQAQ@mail.gmail.com |
In reply to | Re: PoC: adding CustomJoin, separate from CustomScan (Tomas Vondra <tomas@vondra.me>) |
List | pgsql-hackers |
On Fri, Jul 25, 2025 at 9:23 AM Tomas Vondra <tomas@vondra.me> wrote:

> I don't think CustomScans help with parser/grammar at all. It's just a
> planner/executor node, it has no way to interact with parser. Maybe some
> sort of "custom operator" would work, not sure. Not sure what exactly
> you need in the planner.

I agree, although if other people are using CustomScans I wonder what the user does to prompt their appearance, since the grammar is not extendable.

If my SupportRequestInlineSRF patch got in, you could have functions that replace themselves with CustomScans/CustomJoins. That would be cool, both for temporal joins and for another thing I've wanted to build for years: a pandas-like dataframe library, or at least a way to pass "predicates" to an array without reifying them at every step. I have a couple extensions, aggs_for_arrays and aggs_for_vecs, that would combine really nicely with that.

> For planning, CustomScan joins are simply part of the regular join
> search. We generate all the various paths for a joinrel, and then give
> the set_join_pathlist_hook hook a chance to add some more. AFAIK it
> doesn't affect the join order search, or anything like that. At least
> not directly.

I finally found some research on the algebraic properties of temporal operators. Snodgrass and some others edited a collection of academic papers in 1993 and published it as *Temporal Databases: Theory, Design, and Implementation*. On page 175 they have:

**Theorem 12** The following equivalences hold for the valid-time algebra.

    Q ∪̂ R ≡ R ∪̂ Q
    Q ⨯̂ R ≡ R ⨯̂ Q
    σ̂F1(σ̂F2(Q)) ≡ σ̂F2(σ̂F1(Q))
    Q ∪̂ (R ∪̂ S) ≡ (Q ∪̂ R) ∪̂ S
    Q ⨯̂ (R ⨯̂ S) ≡ (Q ⨯̂ R) ⨯̂ S
    Q ⨯̂ (R ∪̂ S) ≡ (Q ⨯̂ R) ∪̂ (Q ⨯̂ S)
    σ̂F(Q ∪̂ R) ≡ σ̂F(Q) ∪̂ σ̂F(R)
    σ̂F(Q −̂ R) ≡ σ̂F(Q) −̂ σ̂F(R)
    π̂X(Q ∪̂ R) ≡ π̂X(Q) ∪̂ π̂X(R)

**Theorem 13** The distributive property of Cartesian product over difference, or Q ⨯̂ (R −̂ S) ≡ (Q ⨯̂ R) −̂ (Q ⨯̂ S), does not hold for the valid-time algebra.
If that gets garbled in transmission, it might be more legible here:

https://illuminatedcomputing.com/posts/2017/12/temporal-databases-bibliography/

I haven't read the whole chapter yet, but at first glance it seems applicable to how people do valid-time tables today.

It's interesting that most identities are still valid for temporal tables, but not all. I don't know what the Postgres planner takes advantage of, but maybe it is *better* to use CustomScan, to prevent invalid transformations. But then you are giving up some optimization potential.

Yours,

--
Paul ~{:-)
pj@illuminatedcomputing.com
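One rough way to experiment with identities like those quoted from Theorem 12 is a toy model in Python. The sketch below is an assumption-heavy illustration, not the algebra defined in the book: it assumes each row carries a set of integer chronons as its valid time, that union merges the timestamps of value-equivalent rows, that difference subtracts them, and that Cartesian product intersects them. All names (`rel`, `t_union`, `t_diff`, `t_product`, `t_select`, `check_identities`) and the sample data are hypothetical. Because the operator definitions are guesses, the sketch says nothing about Theorem 13; it only spot-checks a few of the equivalences on one small data set.

```python
from itertools import product as cartesian

# ASSUMPTION: a valid-time relation is {row: chronons}, where row is a
# frozenset of (attribute, value) pairs and chronons is a frozenset of ints.
# This is a toy model, not the book's valid-time algebra.

def rel(*entries):
    """Build a relation from (attribute_dict, iterable_of_chronons) pairs."""
    out = {}
    for attrs, times in entries:
        row = frozenset(attrs.items())
        merged = out.get(row, frozenset()) | frozenset(times)
        if merged:  # rows that are never valid are not stored
            out[row] = merged
    return out

def t_union(q, r):
    """Value-equivalent rows merge; their timestamps are unioned."""
    out = dict(q)
    for row, times in r.items():
        out[row] = out.get(row, frozenset()) | times
    return out

def t_diff(q, r):
    """Keep each row of q for the chronons where r does not hold it."""
    out = {}
    for row, times in q.items():
        left = times - r.get(row, frozenset())
        if left:
            out[row] = left
    return out

def t_product(q, r):
    """Concatenate rows, valid where both inputs are valid.
    Assumes q and r have disjoint attribute names."""
    out = {}
    for (qrow, qt), (rrow, rt) in cartesian(q.items(), r.items()):
        both = qt & rt
        if both:
            out[qrow | rrow] = both
    return out

def t_select(pred, q):
    """Filter rows on their non-temporal attributes only."""
    return {row: t for row, t in q.items() if pred(dict(row))}

# Hypothetical sample data.
Q = rel(({"emp": "ann"}, range(1, 10)), ({"emp": "bob"}, range(5, 15)))
T = rel(({"emp": "ann"}, range(8, 12)))
R = rel(({"dept": "ops"}, range(3, 8)))
S = rel(({"dept": "ops"}, range(6, 12)), ({"dept": "dev"}, range(1, 4)))
U = rel(({"proj": "x"}, range(4, 7)))

def check_identities():
    """Spot-check some Theorem 12 style identities on the sample data."""
    is_ann = lambda row: row.get("emp") == "ann"
    return {
        "union commutes": t_union(Q, T) == t_union(T, Q),
        "product commutes": t_product(Q, R) == t_product(R, Q),
        "product associates":
            t_product(Q, t_product(R, U)) == t_product(t_product(Q, R), U),
        "product distributes over union":
            t_product(Q, t_union(R, S))
            == t_union(t_product(Q, R), t_product(Q, S)),
        "select distributes over union":
            t_select(is_ann, t_union(Q, T))
            == t_union(t_select(is_ann, Q), t_select(is_ann, T)),
        "select distributes over difference":
            t_select(is_ann, t_diff(Q, T))
            == t_diff(t_select(is_ann, Q), t_select(is_ann, T)),
    }
```

Calling `check_identities()` returns a dict mapping each identity name to whether it held on the sample relations. A passing check on one data set is only a sanity test, not a proof, and a planner would still need the real operator definitions before relying on any of these rewrites.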