Обсуждение: Re: Mysql 321 - Mysql 322 - msql

Поиск
Список
Период
Сортировка

Re: Mysql 321 - Mysql 322 - msql

От
The Hermit Hacker
Дата:
On Fri, 27 Nov 1998, Andrew McNaughton wrote:

> It's possibly stronger on features, but it's slower than mysql.  It is
> speed he's emphasizing.
I've never actually installed mysql, so can't really compare the
two, but I've been using PostgreSQL for everything I need an RDBMS for
since I first took on the project 3 years ago now (wow, time flies)...each
release has gotten progressively faster, but we've pretty much hit a limit
as far as optimizations are concerned, there probably isn't a *noticeable*
difference between v6.3.2 and v6.4...
We are hoping to have the PREPARE statement put into v6.5, which
should give a performance improvement in "repeatative queries", as the
planning for the query can be done beforehand, taking out a step...

> there was some discussion earlier this year on this list about adding
> indexes suitable for fulltext searching  to PostgreSQL.  Did anything
> happen in the end? It's the one feature I'd really like to have.  I
> suspect it would be an important one to James also.
What do you mean by "fulltext searching"?

Marc G. Fournier                                
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql

От
Tom Ivar Helbekkmo
Дата:
The Hermit Hacker <scrappy@hub.org> writes:

>     What do you mean by "fulltext searching"?

He's talking about inverted text indices, where text is indexed such
that a word is the key, and the index returns pointers to all the
places where that word occurs.  Knowledge of word structure is usually
built in, so that "hacks", "hacker", "hackers", "hacking" and so on
are known to be derivatives of "hack", and can match it if requested.
Noise words such as "a", "the" and so forth are usually not indexed.

Inverted indexed text storage tends to take up much space, but there
are ways to reduce this, and the best implementations do it remarkably
well.  A simple example: it is not really necessary to actually store
the original text; it can instead be a sequence of links to the store
of all individual words in the text database.

See http://glimpse.cs.arizona.edu/ for a powerful inverted indexing
engine and various related software.

-tih
-- 
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"


Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql

От
The Hermit Hacker
Дата:
On 27 Nov 1998, Tom Ivar Helbekkmo wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> 
> >     What do you mean by "fulltext searching"?
> 
> He's talking about inverted text indices, where text is indexed such
> that a word is the key, and the index returns pointers to all the
> places where that word occurs.  Knowledge of word structure is usually
> built in, so that "hacks", "hacker", "hackers", "hacking" and so on
> are known to be derivatives of "hack", and can match it if requested.
> Noise words such as "a", "the" and so forth are usually not indexed.
> 
> Inverted indexed text storage tends to take up much space, but there
> are ways to reduce this, and the best implementations do it remarkably
> well.  A simple example: it is not really necessary to actually store
> the original text; it can instead be a sequence of links to the store
> of all individual words in the text database.
> 
> See http://glimpse.cs.arizona.edu/ for a powerful inverted indexing
> engine and various related software.
Just curious, but other then specialized applications like
Glimpse, does anyone actually support/do this?

Marc G. Fournier                                
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql

От
Tom Lane
Дата:
The Hermit Hacker <scrappy@hub.org> writes:
> On 27 Nov 1998, Tom Ivar Helbekkmo wrote:
>> See http://glimpse.cs.arizona.edu/ for a powerful inverted indexing
>> engine and various related software.

>     Just curious, but other then specialized applications like
> Glimpse, does anyone actually support/do this?

I dearly love Glimpse.  (Sample things I use it for: rooting through
nearly 10 years worth of archived email; finding all references to a
particular name in the Postgres sources, almost instantly; ditto for the
even larger Ptolemy sources; looking for files that I can't remember
where I put ... it's great.  And aren't the Postgres mailing list
archive indexes Glimpse-driven?)

I don't currently have any databases that could benefit from full-text
indexes.  But I can think of applications where it'd be important,
particularly after we get rid of the limit on tuple sizes so that it
becomes reasonable to put fair-size chunks of text into database
entries.  For example: would it be useful to put my email archive into
a Postgres database, one message per tuple?  Maybe ... but if I can't
glimpse it afterwards, forgetaboutit.

You could probably glue something like this together from existing
spare parts, say by running a nightly cron job that dumps out the
text fields of your database for indexing by Glimpse.  But it wouldn't
be integrated into SQL --- you'd have to query the index separately
outside of SQL, then use the results to drive a query to fetch the
selected records.

A seamless integration would make Glimpse indexes be a new type of
index associated with a new match operator, something likecreate index index1 on table using glimpse
(text_field);select* from table where glimpse(text_field, 'pattern');
 
I have no idea how hard that would be...
        regards, tom lane


Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql

От
The Hermit Hacker
Дата:
On Fri, 27 Nov 1998, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > On 27 Nov 1998, Tom Ivar Helbekkmo wrote:
> >> See http://glimpse.cs.arizona.edu/ for a powerful inverted indexing
> >> engine and various related software.
> 
> >     Just curious, but other then specialized applications like
> > Glimpse, does anyone actually support/do this?
> 
> I dearly love Glimpse.  (Sample things I use it for: rooting through
> nearly 10 years worth of archived email; finding all references to a
> particular name in the Postgres sources, almost instantly; ditto for the
> even larger Ptolemy sources; looking for files that I can't remember
> where I put ... it's great.  And aren't the Postgres mailing list
> archive indexes Glimpse-driven?)
Nope, I use ht/Dig for it...

> A seamless integration would make Glimpse indexes be a new type of
> index associated with a new match operator, something like
>     create index index1 on table using glimpse (text_field);
>     select * from table where glimpse(text_field, 'pattern');
> I have no idea how hard that would be...
Anyone?  This one I'd love to see...

Marc G. Fournier                                
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql

От
Tom
Дата:
On Fri, 27 Nov 1998, The Hermit Hacker wrote:

>     Just curious, but other then specialized applications like
> Glimpse, does anyone actually support/do this?
 Well, Oracle has their ConText option that does stuff like this.

> Marc G. Fournier                                
> Systems Administrator @ hub.org 
> primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 

Tom



Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql

От
Tom Ivar Helbekkmo
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> I don't currently have any databases that could benefit from full-text
> indexes.  But I can think of applications where it'd be important,
> particularly after we get rid of the limit on tuple sizes so that it
> becomes reasonable to put fair-size chunks of text into database
> entries.  For example: would it be useful to put my email archive into
> a Postgres database, one message per tuple?  Maybe ... but if I can't
> glimpse it afterwards, forgetaboutit.

Another very important application is the keeping of structured
documents in a database system.  Advanced SGML environments do this,
and Philip Greenspun of MIT, the author of the excellent book
"Database Backed Web Sites" (see http://photo.net/ for information)
recommends doing it for HTML and other data for web publishing.  The
web server AOLserver is just one example of an application that can do
this -- and if I'm not mistaken, AOLserver can even use PostgreSQL.

Anyway, once the data is in the database, and much of it is text, it
becomes very interesting to be able to efficiently index and search.

-tih
-- 
Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"


Re: [HACKERS] Re: Mysql 321 - Mysql 322 - msql

От
Bruce Momjian
Дата:
> I don't currently have any databases that could benefit from full-text
> indexes.  But I can think of applications where it'd be important,
> particularly after we get rid of the limit on tuple sizes so that it
> becomes reasonable to put fair-size chunks of text into database
> entries.  For example: would it be useful to put my email archive into
> a Postgres database, one message per tuple?  Maybe ... but if I can't
> glimpse it afterwards, forgetaboutit.
> 
> You could probably glue something like this together from existing
> spare parts, say by running a nightly cron job that dumps out the
> text fields of your database for indexing by Glimpse.  But it wouldn't
> be integrated into SQL --- you'd have to query the index separately
> outside of SQL, then use the results to drive a query to fetch the
> selected records.

We do have contrib/fulltextindex.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026