Обсуждение: How to Handle ltree path Data Type

Поиск
Список
Период
Сортировка

How to Handle ltree path Data Type

От
Don Parris
Дата:
Hi all,

Do any of you have experience with ltree?  Note: this may be more of a general Python3 question, but I thought I would start here, since I use Psycopg to access my DB.

I have implemented a hierarchical category structure using the ltree module in my PostgreSQL db. 

The category table uses ltree like so:
CREATE TABLE category(
cat_id serial pk
path ltree);

The path field takes data in the form of:
Level1.Level2.Level3.Level4.etc.

I can select * from category and get a standard result set.  However, I cannot seem to perform a search for a specific category or branch of the category tree.  At least, I only get a result set when searching on the first branch of categories in the table.  I believe I need to use type casting, but am not sure.

Ultimately, I want the user to be able to search for and select the desired category from the result list, and then use that selection to insert the correct cat_id into the appropriate column in another table.



Below is a more detailed description of what I am trying to do...

I can do this in SQL:
> SELECT path FROM test WHERE path <@ 'Top.Science';

and get a result like this (from the documentationl):
Top.Science
 Top.Science.Astronomy
 Top.Science.Astronomy.Astrophysics
 Top.Science.Astronomy.Cosmology

I get the impression that I need to cast the ltree path field to text, but I am not certain.

In an early iteration of this effort, I got the following result when running my program:
                                                                                                                                                                    
Search for Category:  Income             //Prompt to collect search term
 --------------
Traceback (most recent call last):                                                                                                                                                           
  File "dev/ldinero/ldinero.py", line 74, in <module>                                                                                                                                        
    category_search()
  File "/home/donp/dev/ldinero/ldinero_conn.py", line 29, in category_search
    {'term': search_term})
psycopg2.ProgrammingError: operator does not exist: ltree ~* unknown
LINE 1: SELECT ltree2text(path) FROM category WHERE path ~* 'income'...
                                                         ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

If I remove the asterisk, that appears to resolve the issue.  However, I only get results when searching on the first category in the structure.  So I am still not quite "there".

Thanks,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE

Re: How to Handle ltree path Data Type

От
Daniele Varrazzo
Дата:
On Fri, Apr 5, 2013 at 2:35 AM, Don Parris <parrisdc@gmail.com> wrote:
> Hi all,
>
> Do any of you have experience with ltree?  Note: this may be more of a
> general Python3 question, but I thought I would start here, since I use
> Psycopg to access my DB.

I'm working on a project using them heavily and have no problem with them.


> psycopg2.ProgrammingError: operator does not exist: ltree ~* unknown
> LINE 1: SELECT ltree2text(path) FROM category WHERE path ~* 'income'...
>                                                          ^
> HINT:  No operator matches the given name and argument type(s). You might
> need to add explicit type casts.
>
> If I remove the asterisk, that appears to resolve the issue.  However, I
> only get results when searching on the first category in the structure.  So
> I am still not quite "there".

~* is not an ltree operator (they are listed at
http://www.postgresql.org/docs/9.2/static/ltree.html#SECT2). If you
cast from ltree to text and try to use the text operators you don't
get any gain from it, e.g. from a gist index you could build on the
column.

From my experience so far, you can do pretty much everything just
using the ~ operator (ltree ~ lquery, not the text one). E.g. your
example query can be expressed as "SELECT path FROM test WHERE path ~
'Top.Science.*';" All the other operators are noise, I don't use any
of them (except the || for concatenation, no other search operator).

From psycopg you may need sometimes to use an explicit ::lquery cast
on a placeholder; apart from that you can just work with strings on
the Python side. In a long time I've been using it, I've found no use
for a specific object wrapping ltree/lquery.

    cur.execute("SELECT path FROM test WHERE path ~ %s::lquery",
["Top.Science.*"])

in this case the lquery cast would not have been needed: the query
would have worked anyway. You need it when the lquery is put together
on postgres side instead of python side, such as:

    cur.execute("SELECT path FROM test WHERE path ~ (%(prefix)s ||
'.*')::lquery", {prefix: "Top.Science"})

the cast is mandatory, because postgres can infer the operator from
ltree ~ unknown but (%(prefix)s || '.*') is a text expression and
there is no cast allowing ltree ~ text to work). In these cases the
error message is clean: it couldn't find the operator "ltree ~ text",
so you know you have to cast the text to lquery.

Hope this helps.

-- Daniele


Re: How to Handle ltree path Data Type

От
Don Parris
Дата:
On Fri, Apr 5, 2013 at 6:31 AM, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
On Fri, Apr 5, 2013 at 2:35 AM, Don Parris <parrisdc@gmail.com> wrote:
> Hi all,
>
> Do any of you have experience with ltree?  Note: this may be more of a
> general Python3 question, but I thought I would start here, since I use
> Psycopg to access my DB.

I'm working on a project using them heavily and have no problem with them.


> psycopg2.ProgrammingError: operator does not exist: ltree ~* unknown
> LINE 1: SELECT ltree2text(path) FROM category WHERE path ~* 'income'...
>                                                          ^
> HINT:  No operator matches the given name and argument type(s). You might
> need to add explicit type casts.
>
> If I remove the asterisk, that appears to resolve the issue.  However, I
> only get results when searching on the first category in the structure.  So
> I am still not quite "there".

~* is not an ltree operator (they are listed at
http://www.postgresql.org/docs/9.2/static/ltree.html#SECT2). If you
cast from ltree to text and try to use the text operators you don't
get any gain from it, e.g. from a gist index you could build on the
column.

Thanks Daniele,

I realized that when I looked at the documentation, and changed it.  As I mentioned, I did get the search partially working.  However, I now wonder if I am posing my user-generated query in the wrong way:

search_term = input('Search for Category: ')  # get input from user
cur = con.cursor()
cur.execute("""SELECT * FROM category WHERE path ~ %(term)s;""",  # Run select query against the user's search term
    {'term': search_term})

If I run the above query, using the very first item in the category table as a search term, I will get a result.  If I use any other term below that, I get no result at all.  This closely mirrors my search queries against other tables, but apparently does not work quite the same in this case.  This is why I thought maybe it had to do with the ltree data type.

I can also run the query like so (with the same results as I stated above):
cur.execute("""SELECT * FROM category WHERE path <@ %(term)s;""",  # Run select query against the user's search term
    {'term': search_term}



Regards,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE

Re: How to Handle ltree path Data Type

От
Daniele Varrazzo
Дата:
On Fri, Apr 5, 2013 at 11:51 PM, Don Parris <parrisdc@gmail.com> wrote:

> I realized that when I looked at the documentation, and changed it.  As I
> mentioned, I did get the search partially working.  However, I now wonder if
> I am posing my user-generated query in the wrong way:
>
> search_term = input('Search for Category: ')  # get input from user
> cur = con.cursor()
> cur.execute("""SELECT * FROM category WHERE path ~ %(term)s;""",  # Run
> select query against the user's search term
>     {'term': search_term})
>
> If I run the above query, using the very first item in the category table as
> a search term, I will get a result.  If I use any other term below that, I
> get no result at all.  This closely mirrors my search queries against other
> tables, but apparently does not work quite the same in this case.  This is
> why I thought maybe it had to do with the ltree data type.

You are probably confusing text ~ text (regexp espression) with ltree
~ lquery, which are indeed two different operators. If you want to
search the ltrees with the label 'term' in any position you must match
the lquery '*.term.*'. Try:

    cur.execute("""SELECT * FROM category WHERE path ~ %(query)s;""",
        {'query': '*.%s.*' % search_term})

-- Daniele


Re: How to Handle ltree path Data Type

От
Don Parris
Дата:
On Fri, Apr 5, 2013 at 7:47 PM, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
On Fri, Apr 5, 2013 at 11:51 PM, Don Parris <parrisdc@gmail.com> wrote:

> I realized that when I looked at the documentation, and changed it.  As I
> mentioned, I did get the search partially working.  However, I now wonder if
> I am posing my user-generated query in the wrong way:
>
> search_term = input('Search for Category: ')  # get input from user
> cur = con.cursor()
> cur.execute("""SELECT * FROM category WHERE path ~ %(term)s;""",  # Run
> select query against the user's search term
>     {'term': search_term})
>
> If I run the above query, using the very first item in the category table as
> a search term, I will get a result.  If I use any other term below that, I
> get no result at all.  This closely mirrors my search queries against other
> tables, but apparently does not work quite the same in this case.  This is
> why I thought maybe it had to do with the ltree data type.

You are probably confusing text ~ text (regexp espression) with ltree
~ lquery, which are indeed two different operators. If you want to
search the ltrees with the label 'term' in any position you must match
the lquery '*.term.*'. Try:

    cur.execute("""SELECT * FROM category WHERE path ~ %(query)s;""",
        {'query': '*.%s.*' % search_term})

-- Daniele

Thanks Daniele,

Yes, your suggestion works much better.  My query above found the first record in the table, along with its children, but no other records after that.  In other words, it found the first row, or first few rows - and only those rows.  However, you are correct that I really would like to be able to search for a given word, regardless of its location in the path and see the full path in the result set - along with other rows that share that part of the path label.

While I do understand a fair chunk (most?) of the ltree documentation, I was missing the need for this element: {'query': '*.%s.*' % search_term}.

I'm not sure I would have figured that out any time soon.  :-/


Thanks again!
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE