Обсуждение: distibuted transactions, SQL+XPath+XTree

Поиск
Список
Период
Сортировка

distibuted transactions, SQL+XPath+XTree

От
Тюрин Дмитрий
Дата:
Hi list,

I see the following business opportunity for Postgres:
I) Simple man can't program middleware to connect XML-client and
Postgres.
II) Request into several databases does not exist.
III) Notebooks need several switching-on and switching-off during
transaction.
IV) Distance between strings are not supported, that makes
aproximate searching impossible.
V) There is no possibility to hide some (not all) records of table,
granted to other users, from these users

Proposed solutions to these opportunities:
I) DBMS inserts into tables and selects from tables along correlation
Primary Key - Foreign Key.
II) Databases get nicknames, groups of databases get name of group.
These names are used in requests.
III) Operator to freeze transaction.
IV) Operator to order records on base of distance between strings.
V) Subdivide records of all tables into classes, specify class
number in record.

I ask you to implement these solutions, that Postgres get
advantage before other DBMS-es. I have prepered several drawing
http://sql50.euro.ru/sql5.11.3.ppt to explain ideas.
More details are described below.

===

1) [slides 2-12] Problem: Browser is very widespread client in epoch of internet.
Non-programmers can master 'insert', 'select', 'update', 'delete',
but are not capable to use sophisticated syntax of proprietary
web-server for input of XML. It's necessary to exclude this
syntax, and give possibility to install DBMS and immediately
use it, like user install and use Teleport, FlashGet, browser
and so on.
 Solution: DBMS itself must communicate via HTTP, accept XML, and
place data from it into tables under some agreement. My proposal
about agreement:
*) xml-element is written into table with identical name (i.e.
tag name coincides with table name), xml-attribute is written
into field with identical name (i.e. attribute name coincides
with field name);
*) tables are bound into tree by values of Primary Keys and
Foreign Keys;
*) value of primary keys of new record is assigned by trigger. If user uses simple scheme, than this is enough! An
ambiguity
can exist in complex scheme because of several refering fields,
than user must append symbol '#' and name of necessary refering
field to end of name of sending XML-tag (it looks like new
tag name with symbol '#' inside name). Let's name this by
term 'determination' [symbol '$' is used for list to have
possibility to solve ambiguity for list simultaneously with
ambiguity for enclosed XML-element, i.e. to append two refering
field to name of sending XML-tag].
 P.S. [slides 13-21] Of course, we spead decision to manual 'insert'.

2) [slides 22-31] Problem: Usage of both SQL/XML-functions, and syntax of proprietary
web-server give very bulky code to extract tree as XML. This
makes more difficulties for contact of DBMS on CML, GML,
HumanML, OPML, RCML, SBML, ebXML, MDDL, RIXML, XBRL, xCBL and
other (turing all relational fields into XML-elements is
suitable for browser, but not suitable for other cases).
 Solution: To avoid sophisticated programing, 'select' itself must return
data to client (if only 'select' is not used inside 'insert ...
select ...'). I propose laconic 'select a.b.c' to select data
from tables 'a', 'b', 'c'. Let's name this by term 'XTree' -
in analogy with 'XPath'. If user uses simple scheme, than this is enough! An ambiguity
can exist in complex scheme because of several refering fields,
than user must append symbol '#' and name of necessary refering
field to end of table name (it looks like new table name with
symbol '#' inside name). Let's name this by term 'refinement'.
 P.S. [slides 32-39] All possible compositions of determinations in XML-tree and
all possible compositions of refinement in 'select' are
considered, appropriate XML- and SQL-syntaxes are proposed. P.S. [slides 40-49] Examples of usage of refinement are
demonstrated.

3) [slides 50-58] Problem: Non-predictable/non-repeated input data (XML-elements) is
written into XML-field of relational table. XQuery is offered
to process data in these xml-fields. But user is not capable
to manipulate records by SQL and XML-elements by XQuery in
one request (even in case of refusal from relational storage
in favour of XML-database, that means in favour of
non-relational 'engine', enclosed cycles of XQuery create very
bulky code, in which user is not orientated).
 Solution: I propose to append XPath into SQL, that SQL can process
XML-elements and attributes (i.e. to avoid XQuery). Thus SQL
can process records and XML-elements simultaneously.
 P.S. [slides 59-72] Of course, we generalize XPath and XTree upto XLang, and
consider all possible use cases.

---

4) [slides 73-83-116] Problem: SQL would more flexible and convenient for distributed
request (gethering data from several databases and scattering
them into several databases), than branded programs; including
SQL is more convenient for replication, than branded programs.
But there is no necessary syntax.
 Solution: Each database has nickname. Nicknames are specified in
requests as prefix before table name. Group of databases is named society. Name of society also
can be specified as prefix before table name, and means
nicknames of all databases of group. Thus one SQL-statement,
containing society name, means a great number of SQL-statements
with nicknames. That nicknames, several societies or several
mentions of one society don't specify the same database
simultaneously, we place symbol '%' before them (let's name so
prefix as restricted prefix). That several mentions of one
society synchronously specify the same database, we place any
(identical) word and symbol '%' before this mentions (let's
name this word as marker, and this prefix as marked prefix). 'Default' database is database, in which all nicknames
and
societies are stored. And prefix 'all' means all databases,
known for default database.  Nickname can has numeric parameter NID (nick identifier).
It is not accessable to change in requests to process data,
and is designated as '%%'.
 In purpose of security, distributed requests must satisfy
some requirements. I propose whole mistrust to DBMS:
*) database does not store login of other database (that to  not give foreign login at crack)
*) database does not edit (update, insert, delete) other  database (that temporaty access to other DBMS, got at  crack,
can'tdestroy other database)
 
*) database does not get data from other database, if it’s  possible (that data from other database not become
accessabletoo at crack)
 
And i also propose to expect quite simplicity of client:
*) client does not know SQL (it can’t simplify or decompose  SQL).
So DBMS-1 can't create and enter SQL-command into DBMS-2
directly or indirectly (asking client to forward command).
And client can't derive SQL-command on base of entered
SQL-command (all, what it has, is last SQL-command, stored
in own stack). I propose to DBMS-1 to transfer __XML__-commands to client,
which force client to make simple (string) transformation of
SQL-command, stored in client stack, and send result of
transformation into DBMS-2. Transformations must be so limited,
that to not allow appearance of SQL-command, harm for DBMS-2.
I propose to arrange these XML-commands as <?name?> to
distinguish them from XML-data (traditionaly arranged as
<name>).

5) [slide 118] Problem: User makes transaction from notebook, and needs to switch-off
notebook without commit or rollback of transaction to continue
transaction from left stage at next switching-on.
 Solution: I propose command 'freeze', similar to command 'disconnet',
which save transaction in current state; and command 'unfreeze',
similar to command 'connect', which continue frozen transaction
(instead to start new transaction). 'Freeze' returns identifier
of frozen transaction, which should be used in 'Unfreeze'.
 P.S. [slide 119] Now savepoint can be used only to rollback to it. I propose
command 'commit savepoint' (commit all actions, made before
savepoint), that is useful in much cases before command
'freeze'. 
6) [slide 120] Problem: Commiting of distrubited transaction (being executed in
several databases) is not fails in all databases at once.
It's not reasonable to rollback transaction in databases,
which remain healthy, to begin transaction in them from
very beginning - it's reasonable to wait repearing of failed
databases to commit transaction together with them.
So we need to freeze command 'commit' in healthy databases in
process of executing it (as well as in case of freezing
transaction, let client messages will be SQL-commands, and
server messages will be XML-commands).
 Solution: I propose to enter client message 'postpone' to freeze commit
on second phase, and client message 'adjourn' to freeze on third
phase.

7) [slide 124] Problem: At stream processing (when new records enter quickly,
in much quantity), it's necessary to execute aggregate only
on several last entered records (to organize slip slot), but
creating index on field, sequencing records, sorting on this
index with purpose to cut only needed quantity of records
brakes processing of stream.
 Solution: Limit quantity of records in a table, make queue of records,
automatically delete records from beginning of queue at
arrival of new records - and start aggregates for all records
of such specially orginized table. To save records,
automatically deleted from beginning of the queue, it's
possible to copy them automatically into other usual table
(which will save them permanently).

8) [slides 125-129] Problem: Distance between strings are not supported, that makes
search of similar strings and ordering by degree of
resemblance.
 Solution: Method of calculation of distance between strings and
operator, ordering records by this factor.

9) [slide 132] Problem: No possibility to make some (personal) records unaccessable
for other users.
 Solution: Subdivide records of all tables into not crossing classes,
specify number of class in special field of records.




Dmitry Turin
HTML6     (6. 5.4)  http://html60.euro.ru
SQL5      (5.11.3)  http://sql50.euro.ru
Unicode7  (7. 2.1)  http://unicode70.euro.ru
Computer2 (2. 0.2)  http://computer20.euro.ru



Re: distibuted transactions, SQL+XPath+XTree

От
Richard Huxton
Дата:
Тюрин Дмитрий wrote:
> Hi list,

Hi Dmitry, nice to have you back again.

> I see the following business opportunity for Postgres:
> I) Simple man can't program middleware to connect XML-client and
> Postgres.

Aha! still trying to push an XML command system and http server into the
backend.

> II) Request into several databases does not exist.

Well, there are middleware layers that'll do so, or various
case-specific solutions requiring dblink. The key problem is what you do
with cross-database dependencies. How were you thinking of dealing with
this?

> III) Notebooks need several switching-on and switching-off during
> transaction.

How are you dealing with the locking issues?

> IV) Distance between strings are not supported, that makes
> aproximate searching impossible.

Would that be "not supported" in the sense of "contrib/fuzzystrmatch"?

> V) There is no possibility to hide some (not all) records of table,
> granted to other users, from these users

Apart from views or the veil pgfoundry project of course.

> I ask you to implement these solutions, that Postgres get
> advantage before other DBMS-es. I have prepered several drawing
> http://sql50.euro.ru/sql5.11.3.ppt to explain ideas.
> More details are described below.

Were you looking to hire developers, or do you have customers who are
looking to hire developers?

--   Richard Huxton  Archonet Ltd


Re: distibuted transactions, SQL+XPath+XTree

От
Dmitry Turin
Дата:
Richard,

>> I see the following business opportunity for Postgres:
>> I) Simple man can't program middleware to connect XML-client and
>> Postgres.
RH> Aha! still trying to push an XML command system and http server into the
RH> backend.

Yes.

>> II) Request into several databases does not exist.
RH> what you do with cross-database dependencies

As i understand, you are saying about inter-database FK

connect         address www.default.bz;
create nick db1 address www.site.com;
create nick db2 address www.data.net;

connect addr=“www.data.net” user=“Smith” pwd=“qxuwb”;
create table b ( b1 xml, b2 number, b3 xml
);

connect addr=“www.site.com” user=“Tomson” pwd=“ncwhif”;
create table a ( a1 number                          references db1:b(@b1/k/m/@m1), a2 xml,  foreign key (@a2/p/q/@q1)
referencesdb1:b(@b2), a3 xml,  foreign key (@a3/p/q/@q1) references db1:b(@b3/k/m/@m1)
 
);

e.g. slides #75-76 in http://sql50.euro.ru/sql5.11.3.ppt ?
There are two case, which i'm naming 'set' ( a<= b, slide #93-94) and 'relay-rece' ( a => b , slide #95-98).
I propose to not try all "second" database, i.e. to not use
reference from "second" database to "first" database (case of 'set').
Case 'relay-race' works clearly.

Of course, i raise question about designation of interbase FK in
modellers (slide #105).

>> III) Notebooks need several switching-on and switching-off during
>> transaction.
RH> How are you dealing with the locking issues?

I propose to freeze transaction (#118):

create user u identified by p waited 1.0/0; -- yy.mm.dd/hh.mm.ss;
freeze; -- like disconnect
-----
<?res code=0 frozen=7482 ?> <!-- from ‘default.edu’ -->
<?res code=0 frozen=8726 ?> <!-- from ‘site.com’ -->
<?res code=0 frozen=9278 ?> <!-- from ‘data.net’ -->
<?res code=0 frozen=3825 ?> <!-- from ‘data.net’: second transaction -->
<?res code=0 frozen=6384 ?> <!-- from ‘store.org’ -->
<?res code=6 ?>             <!-- from ‘place.ws’: database is broken -->

unfreeze addr=site.com user=Tomson pwd=ncwhif safe=8726; -- like ‘connect’
unfreeze addr=data.net user=Smith  pwd=qxuwb  safe=9278;

P.S.
Of course, i propose possibility to freeze failed commit (#120).

>> IV) Distance between strings are not supported, that makes
>> aproximate searching impossible.
RH> Would that be "not supported" in the sense of "contrib/fuzzystrmatch"?

I propose
1) to use Levenshtein distance recursively:
for letters in words, for words in phrases (separated by marks of
punctuation), for phrases (e.g. sentances) in string
2) to count convolutions (several words into abbreviation, and back)

>> V) There is no possibility to hide some (not all) records of table,
>> granted to other users, from these users
RH> Apart from views

depending of user, which look in view

RH> or the veil pgfoundry project of course.

I'm not understanding, clarify, please.

>> I ask you to implement these solutions, that Postgres get
>> advantage before other DBMS-es. I have prepered several drawing
>> http://sql50.euro.ru/sql5.11.3.ppt to explain ideas.
>> More details are described below.
RH> Were you looking to hire developers, or do you have customers who are
RH> looking to hire developers?

Neither this, nor that.
I'm looking for volunteers.

P.S.
I'm also seggesting to implement

1) restricted table for stream processing:

create table a ( a1 number, a2 number
) size 5;
create table aa (…);
-- resize a to 10;
-- resize a to infinity;
-- redirect a to aa;

create table a ( a1 number, a2 number
) size 5 direct aa;

2) timer:

create timer TimerName start         yy.mm.dd/hh.mm.ss end          yy.mm.dd/hh.mm.ss schedule (yy.mm.dd/hh.mm.ss,
yy.mm.dd/hh.mm.ss,…) per           yy.mm.dd/hh.mm.ss as begin … end;
 
create timer t1 schedule (01/0, 03/0, 10/0) per 01.00/0 as … ; -- 1-st, 3-rd, 10-th day of each month
create timer t2 schedule (0/0.05, 0/0.10, 0/0.20) per 0/01 as … ; -- 5-, 10-, 20-th minutes of each hour
create timer t3 schedule (0) per 1/0 as … ;  -- each day 


Dmitry Turin
SQL5      (5.11.3)  http://sql50.euro.ru
HTML6     (6. 5.4)  http://html60.euro.ru
Unicode7  (7. 2.1)  http://unicode70.euro.ru
Computer2 (2. 0.2)  http://computer20.euro.ru