Re: advice on schema for multilingual text
От | Michael Glaesemann |
---|---|
Тема | Re: advice on schema for multilingual text |
Дата | |
Msg-id | 81044C87-E88F-48D4-9668-42C5485047B6@myrealbox.com обсуждение исходный текст |
Ответ на | advice on schema for multilingual text ("Daniel McBrearty" <danielmcbrearty@gmail.com>) |
Список | pgsql-general |
On Apr 9, 2006, at 0:31 , Daniel McBrearty wrote: > Hi > > I have a website that has multilingual text stored in the database. > Currently I just have a flat table (lets called it "translations"), > one row per text item, one column per language. This works OK, for > now, but I am looking at a redesign. Mostly I want to keep > information about the languages in teh db as well, so that look > like an extra table, one row per lang. > > The problem that now arises is that there is an expected > correlation between the "languages" and "translations" tables - > there should be a row in languages for each column of translations. > AFAIK (could well be wrong, I am no expert in db theory) there is > no real way to express in the ddl. Not ideal. > > An alternative layout would now be to lose the "translations" > table, and have two tables in place; one called "base_text" > containing the text to be translated, and another called, say, > "tx_text" which contains the translations. Each row of "tx_text" > references both "base_text" and also "languages". > > This looks like a nice layout, as there is an abstract rep of the > languages, and we lose the "translations" table which can get very > wide. It's nice that the schema doesn't actually change to add a > new language. > > BUT there are certain invariants that need to be enforced. The main > one is this: > > There must only be one row in "site_text" for any given language > referencing a given row of "base_text". You'd want a unique index on (base_text, language), like: create table tx_text ( base_text text not null references base_text(base_text) , language text not null references languages (language) , primary key (base_text, language) , tx_text text not null ); > I would also like to have a column in "languages", type boolean, > called "is_base" - this says what the base language is. Here, only > ONE row can have a true value. (Obviously it has default value of > false and is not null). Here you want a partial unique index on languages where is_base is true create table languages ( language text primary key , is_base boolean not null ); create unique index languages_only_one_true_base_idx on languages (is_base) where is_base; > Another invariant now comes in - the language referenced by every > row of "site_text" MUST have "is_base" set to FALSE. I can think of two ways to do this: one is to write a trigger to enforce this, something like: create function non_base_language_translations_check returns trigger() language plpgsql as $$ begin if exists ( select * from tx_text natural join languages where not is_base ) then raise exception 'Language of translation text must not be a base language.'; end if; end; $$; Then use this function on triggers that fire on insert and update on tx_text and on update on languages. Another is to include the is_base column in tx_text (with both language and is_base referencing languages) and use a check constraint to make sure is_base is false: create table tx_text ( base_text text not null references base_text(base_text) , language text not null , is_base boolean not null check not is_base , foreign key (language, is_base) references languages (language, is_base) on update cascade , primary key (base_text, language) , tx_text text not null ); The latter method is denormalized, which is not something I usually recommend. I don't know how the constraint checking overhead of using a trigger compares with using the foreign key and check constraint. A third idea would be to have two languages tables: one with a single row for the base language and another for the target languages (You might even do this using table inheritance, though I haven't thought this completely through). tx_text would reference the target_languages table (or child table, as the case may be). I'm sure others have opinions on this as well. Hope this helps. Michael Glaesemann grzm myrealbox com
В списке pgsql-general по дате отправления:
Следующее
От: "Joshua D. Drake"Дата:
Сообщение: Re: Is difference between PostgreSQL and mySQL licences