Обсуждение: Regarding query minimizer (simplifier)

Поиск
Список
Период
Сортировка

Regarding query minimizer (simplifier)

От
"Jung, Jinho"
Дата:


Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested. 

To analyze the root cause of problem, I had to find minimal query that contains same problem. (i.e., same regression issue) The initial query was about 14K and I should manually remove part of query. (it took about 30 minutes to get 500 bytes query) So I am wondering if there are tools to automatically simplify statements to analyze the performance issue.

Thanks,
Jinho Jung

Вложения

Re: Regarding query minimizer (simplifier)

От
Tom Lane
Дата:
"Jung, Jinho" <jinho.jung@gatech.edu> writes:
> Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance
regression.While conducting evaluation, I found an interesting query which makes x80 times slower execution in version
10.5than version 9.4. Please see the attached files, if you are interested. 

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

            regards, tom lane


Re: Regarding query minimizer (simplifier)

От
"Jung, Jinho"
Дата:


Hello Tom, 


Sorry for the misleading. Could you try these two queries? I made the query even slower in latest version of postgres. These are information about how we set up evaluation environment and query result. 


Thanks,

Jinho Jung


Install Multiple version of DBs in one machine
======================================
# Install 10.5
    $ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -        
    $ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
    $ sudo apt update
    $ sudo apt-get install postgresql-10

# Install 9.6
    $ sudo apt-get install postgresql-9.6 

# Install 9.5
    $ sudo apt-get install postgresql-9.5

# Install 9.4 
    $ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4 libpq-dev postgresql-server-dev-9.4

# check 
    $ pg_lsclusters


Original regression query
==========================
explain analyze
select
  1
from
  information_schema.role_usage_grants as ref_2,
  lateral (
    select
      max((null)) over (partition by ref_3.amopfamily) as c8
    from
        pg_catalog.pg_amop as ref_3
    ) as subq_0
;

ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76ms



CORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor = 
    ( select max(ref_2.grantor)
      from
       information_schema.role_usage_grants as ref_2,
       lateral (
         select
           max((null)) over (partition by ref_3.amopfamily) as c8
         from
             pg_catalog.pg_amop as ref_3
         ) as subq_0
      where ref_2.object_catalog = f1.object_catalog
    )
;


CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower




From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, October 13, 2018 5:59:06 PM
To: Jung, Jinho
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Regarding query minimizer (simplifier)
 
"Jung, Jinho" <jinho.jung@gatech.edu> writes:
> Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

                        regards, tom lane
Вложения

Re: Regarding query minimizer (simplifier)

От
Jinho Jung
Дата:
Hello, 

We appreciate you taking time for test! When we do more evaluation, we noticed that the previously attached query made regression only on DBs that we installed from APT manager (i.e., apt-get command) not on DBs that we built from the source code. But we also confirmed that there are many cases that cause regression to all DBs (installed from APT and build from source code)

Hope you can also test these queries too. These are the execution time on our machine.

1.sql
10.5  : 20ms
9.4.19: 1,227ms

4.sql
10.5  : 13ms
9.4.19: 88,721ms

20.sql
10.5  : 271ms
9.4.19: 6,104ms

22.sql
10.5  : 8ms
9.4.19: 105ms

Jinho Jung

On Tue, Oct 23, 2018 at 9:52 AM Jung, Jinho <jinho.jung@gatech.edu> wrote:


Hello Tom, 


Sorry for the misleading. Could you try these two queries? I made the query even slower in latest version of postgres. These are information about how we set up evaluation environment and query result. 


Thanks,

Jinho Jung


Install Multiple version of DBs in one machine
======================================
# Install 10.5
    $ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -        
    $ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
    $ sudo apt update
    $ sudo apt-get install postgresql-10

# Install 9.6
    $ sudo apt-get install postgresql-9.6 

# Install 9.5
    $ sudo apt-get install postgresql-9.5

# Install 9.4 
    $ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4 libpq-dev postgresql-server-dev-9.4

# check 
    $ pg_lsclusters


Original regression query
==========================
explain analyze
select
  1
from
  information_schema.role_usage_grants as ref_2,
  lateral (
    select
      max((null)) over (partition by ref_3.amopfamily) as c8
    from
        pg_catalog.pg_amop as ref_3
    ) as subq_0
;

ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76ms



CORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor = 
    ( select max(ref_2.grantor)
      from
       information_schema.role_usage_grants as ref_2,
       lateral (
         select
           max((null)) over (partition by ref_3.amopfamily) as c8
         from
             pg_catalog.pg_amop as ref_3
         ) as subq_0
      where ref_2.object_catalog = f1.object_catalog
    )
;


CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower




From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, October 13, 2018 5:59:06 PM
To: Jung, Jinho
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Regarding query minimizer (simplifier)
 
"Jung, Jinho" <jinho.jung@gatech.edu> writes:
> Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

                        regards, tom lane
Вложения

Re: Regarding query minimizer (simplifier)

От
Jinho Jung
Дата:
Order is reversed. 

1.sql
9.4.19: 20ms
10.5   : 1,227ms

4.sql
9.4.19: 13ms
10.5   : 88,721ms

20.sql
9.4.19: 271ms
10.5   : 6,104ms

22.sql
9.4.19: 8ms
10.5   : 105ms

On Tue, Oct 23, 2018 at 3:15 PM Jinho Jung <visusee@gmail.com> wrote:
Hello, 

We appreciate you taking time for test! When we do more evaluation, we noticed that the previously attached query made regression only on DBs that we installed from APT manager (i.e., apt-get command) not on DBs that we built from the source code. But we also confirmed that there are many cases that cause regression to all DBs (installed from APT and build from source code)

Hope you can also test these queries too. These are the execution time on our machine.

1.sql
10.5  : 20ms
9.4.19: 1,227ms

4.sql
10.5  : 13ms
9.4.19: 88,721ms

20.sql
10.5  : 271ms
9.4.19: 6,104ms

22.sql
10.5  : 8ms
9.4.19: 105ms

Jinho Jung

On Tue, Oct 23, 2018 at 9:52 AM Jung, Jinho <jinho.jung@gatech.edu> wrote:


Hello Tom, 


Sorry for the misleading. Could you try these two queries? I made the query even slower in latest version of postgres. These are information about how we set up evaluation environment and query result. 


Thanks,

Jinho Jung


Install Multiple version of DBs in one machine
======================================
# Install 10.5
    $ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -        
    $ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" > /etc/apt/sources.list.d/pgdg_xenial.list'
    $ sudo apt update
    $ sudo apt-get install postgresql-10

# Install 9.6
    $ sudo apt-get install postgresql-9.6 

# Install 9.5
    $ sudo apt-get install postgresql-9.5

# Install 9.4 
    $ sudo apt-get install postgresql-9.4 postgresql-contrib-9.4 libpq-dev postgresql-server-dev-9.4

# check 
    $ pg_lsclusters


Original regression query
==========================
explain analyze
select
  1
from
  information_schema.role_usage_grants as ref_2,
  lateral (
    select
      max((null)) over (partition by ref_3.amopfamily) as c8
    from
        pg_catalog.pg_amop as ref_3
    ) as subq_0
;

ORIGINAL querying time
on old version(9.4/9.5): 5.7ms
on latest version(10): 91.76ms



CORRELATED query to maximize error
===================================
explain analyze
select *
from information_schema.role_usage_grants f1
where grantor = 
    ( select max(ref_2.grantor)
      from
       information_schema.role_usage_grants as ref_2,
       lateral (
         select
           max((null)) over (partition by ref_3.amopfamily) as c8
         from
             pg_catalog.pg_amop as ref_3
         ) as subq_0
      where ref_2.object_catalog = f1.object_catalog
    )
;


CORRELATED querying time
on old version(9.4/9.5): 0.6s
on latest version(10): 113s
188 times slower




From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, October 13, 2018 5:59:06 PM
To: Jung, Jinho
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Regarding query minimizer (simplifier)
 
"Jung, Jinho" <jinho.jung@gatech.edu> writes:
> Hello, I am Jinho Jung, Phd Student from GeorgiaTech, and try to find any SQL queries that cause performance regression. While conducting evaluation, I found an interesting query which makes x80 times slower execution in version 10.5 than version 9.4. Please see the attached files, if you are interested.

Hm, testing this in the regression database, it seems pretty speedy
across all supported branches, and indeed slower in 9.4 than later
branches (~25 ms vs ~10 ms).

It seems likely that you're testing in a very different database,
perhaps one with many more tables ... but if you don't explain the
test scenario, we aren't going to have much luck investigating.

                        regards, tom lane