Обсуждение: Why is sorting on two columns so slower than sorting on one column?

Поиск
Список
Период
Сортировка

Why is sorting on two columns so slower than sorting on one column?

От
Jie Li
Дата:
<font size="2">Hi,<br /><br />Here is the test table, <br /><br />postgres=# \d big_wf <br />    Table
"public.big_wf"<br/> Column |  Type   | Modifiers <br />--------+---------+-----------<br /> age    | integer | <br
/> id    | integer | <br /><br />postgres=# \dt+ big_wf <br />                     List of relations<br /> Schema | 
Name | Type  |  Owner   |  Size  | Description <br />--------+--------+-------+----------+--------+-------------<br
/> public| big_wf | table | workshop | 142 MB | <br /><br /><br />The first query sorting on one column:<br
/>postgres=#explain analyze select * from big_wf order by age;<br
/>                                                      QUERY
PLAN                                                       <br />
-------------------------------------------------------------------------------------------------------------------------<br
/> Sort (cost=565525.45..575775.45 rows=4100000 width=8) (actual time=11228.155..16427.149 rows=4100000 loops=1)<br />
  Sort Key: age<br />   Sort Method:  external sort  Disk: 72112kB<br />   ->  Seq Scan on big_wf 
(cost=0.00..59142.00rows=4100000 width=8) (actual time=6.196..4797.620 rows=4100000 loops=1)<br /> Total runtime:
19530.452ms<br /> (5 rows)<br /><br />The second query sorting on two columns:<br />postgres=# explain analyze select *
frombig_wf order by age,id;<br />                                                       QUERY
PLAN                                                       <br />
-------------------------------------------------------------------------------------------------------------------------<br
/> Sort (cost=565525.45..575775.45 rows=4100000 width=8) (actual time=37544.779..48206.702 rows=4100000 loops=1)<br />
  Sort Key: age, id<br />   Sort Method:  external merge  Disk: 72048kB<br />   ->  Seq Scan on big_wf 
(cost=0.00..59142.00rows=4100000 width=8) (actual time=6.796..5518.663 rows=4100000 loops=1)<br /> Total runtime:
51258.000ms<br /> (5 rows)<br /><br />The verision is 9.0.1 and the work_mem is 20MB. One special thing is, the first
column(age)of all the tuples are of the same value, so the second column(id)<span
id="internal-source-marker_0.431774702563559"style="font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;"></span> is
alwaysneeded for comparison.<span id="internal-source-marker_0.431774702563559" style="font-family: Arial; color:
rgb(0,0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">  While the first sorting takes about only 6 seconds, the second one takes over 30 seconds, 
Isthis too much than expected? Is there any </span></font><font size="2"><span
id="internal-source-marker_0.431774702563559"style="font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">possible
</span></font><spanid="internal-source-marker_0.431774702563559" style="font-size: 11pt; font-family: Arial; color:
rgb(0,0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;"><font size="2">optimization ?</font><br /><br />Thanks,<br />Li Jie<br /></span> 

Re: Why is sorting on two columns so slower than sorting on one column?

От
Kenneth Marshall
Дата:
On Thu, Dec 23, 2010 at 02:33:12AM -0500, Jie Li wrote:
> Hi,
> 
> Here is the test table,
> 
> postgres=# \d big_wf
>     Table "public.big_wf"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  age    | integer |
>  id     | integer |
> 
> postgres=# \dt+ big_wf
>                      List of relations
>  Schema |  Name  | Type  |  Owner   |  Size  | Description
> --------+--------+-------+----------+--------+-------------
>  public | big_wf | table | workshop | 142 MB |
> 
> 
> The first query sorting on one column:
> postgres=# explain analyze select * from big_wf order by age;
>                                                        QUERY
> PLAN
>
-------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual
> time=11228.155..16427.149 rows=4100000 loops=1)
>    Sort Key: age
>    Sort Method:  external sort  Disk: 72112kB
>    ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8)
> (actual time=6.196..4797.620 rows=4100000 loops=1)
>  Total runtime: 19530.452 ms
> (5 rows)
> 
> The second query sorting on two columns:
> postgres=# explain analyze select * from big_wf order by age,id;
>                                                        QUERY
> PLAN
>
-------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual
> time=37544.779..48206.702 rows=4100000 loops=1)
>    Sort Key: age, id
>    Sort Method:  external merge  Disk: 72048kB
>    ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8)
> (actual time=6.796..5518.663 rows=4100000 loops=1)
>  Total runtime: 51258.000 ms
> (5 rows)
> 
> The verision is 9.0.1 and the work_mem is 20MB. One special thing is, the
> first column(age) of all the tuples are of the same value, so the second
> column(id) is always needed for comparison.  While the first sorting takes
> about only 6 seconds, the second one takes over 30 seconds,  Is this too
> much than expected? Is there any possible optimization ?
> 
> Thanks,
> Li Jie

Hi Li,

If I understand your description, in the first query the sort does
not actually have to do anything because the column values for "age"
are all degenerate. In the second query, you actually need to sort
the values which is why it takes longer. If the first column values
are the same, then simply sorting by "id" alone would be faster.
You could also bump up work_mem for the query to perform the sort
in memory.

Regards,
Ken



Re: Why is sorting on two columns so slower than sorting on one column?

От
Marti Raudsepp
Дата:
On Thu, Dec 23, 2010 at 09:33, Jie Li <jay23jack@gmail.com> wrote:
> While the first sorting takes
> about only 6 seconds, the second one takes over 30 seconds,  Is this too
> much than expected? Is there any possible optimization ?

If you're doing these queries often, you should:
CREATE INDEX ix_big_wf_age_id ON big_wf (age, id)

If that's still not fast enough, you can physically sort rows in the
table using the newly created index:
CLUSTER big_wf USING ix_big_wf_age_id;

Please post back your results. :)

Regards,
Marti


Re: Why is sorting on two columns so slower than sortingon one column?

От
Kenneth Marshall
Дата:
On Thu, Dec 23, 2010 at 10:19:46PM +0800, Li Jie wrote:
> Hi Ken,
> 
> Thanks for your tips! Yes it is the case, and I run another query sorting on the second column whose values are
random.
> 
> postgres=# explain analyze select * from big_wf order by id;
>                                                       QUERY PLAN
 
 
>
-------------------------------------------------------------------------------------------------------------------------
> Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual time=25681.875..36458.824 rows=4100000 loops=1)
>   Sort Key: id
>   Sort Method:  external merge  Disk: 72048kB
>   ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8) (actual time=8.595..5569.500 rows=4100000
loops=1)
> 
> Now the sorting takes about 20 seconds, so it seems reasonable compared to 30 seconds, right? But one thing I'm
confusedis that, why is additional comparison really so expensive?  Does it incur additional I/O? From the cost model,
itseems not, all the "cost" are the same (575775.45).
 
> 
> Thanks,
> Li Jie

In the first query, the cost is basically the I/O cost to read the
table from disk. The actual sort does not do anything since the
sort values are the same. In the second query, the sort has to
swap things in memory/disk to get them in the correct order for
the result. This actually takes CPU and possibly additional I/O
which is why it is slower. In the case of sorting by just the "id"
column, the size of the sorted values is smaller which would need
fewer batches to complete the sort since the sort is bigger than
the work_mem.

Cheers,
Ken


Re: Why is sorting on two columns so slower thansortingon one column?

От
Kenneth Marshall
Дата:
On Thu, Dec 23, 2010 at 10:42:26PM +0800, Li Jie wrote:
> ----- Original Message ----- 
> From: "Kenneth Marshall" <ktm@rice.edu>
> To: "Li Jie" <jay23jack@gmail.com>
> Cc: "pgsql-hackers" <pgsql-hackers@postgresql.org>
> Sent: Thursday, December 23, 2010 10:30 PM
> Subject: Re: [HACKERS] Why is sorting on two columns so slower thansortingon one column?
> 
> 
> > On Thu, Dec 23, 2010 at 10:19:46PM +0800, Li Jie wrote:
> >> Hi Ken,
> >> 
> >> Thanks for your tips! Yes it is the case, and I run another query sorting on the second column whose values are
random.
> >> 
> >> postgres=# explain analyze select * from big_wf order by id;
> >>                                                       QUERY PLAN
    
 
> >>
-------------------------------------------------------------------------------------------------------------------------
> >> Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual time=25681.875..36458.824 rows=4100000 loops=1)
> >>   Sort Key: id
> >>   Sort Method:  external merge  Disk: 72048kB
> >>   ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8) (actual time=8.595..5569.500 rows=4100000
loops=1)
> >> 
> >> Now the sorting takes about 20 seconds, so it seems reasonable compared to 30 seconds, right? But one thing I'm
confusedis that, why is additional comparison really so expensive?  Does it incur additional I/O? From the cost model,
itseems not, all the "cost" are the same (575775.45).
 
> >> 
> >> Thanks,
> >> Li Jie
> > 
> > In the first query, the cost is basically the I/O cost to read the
> > table from disk. The actual sort does not do anything since the
> > sort values are the same. In the second query, the sort has to
> > swap things in memory/disk to get them in the correct order for
> > the result. This actually takes CPU and possibly additional I/O
> > which is why it is slower. In the case of sorting by just the "id"
> > column, the size of the sorted values is smaller which would need
> > fewer batches to complete the sort since the sort is bigger than
> > the work_mem.
> > 
> > Cheers,
> > Ken
> 
> Hi Ken,
> 
> Thanks for your analysis.
> 
> But in the last query that sorts on "id",  since the query selects all the columns for output, the actual sorted size
isthe same, and the only difference is the comparison cost. The query sorting on two columns needs to do twice the
comparison.Am I right?
 
> 
> Thanks,
> Li Jie

I think you are right. Sorry for the confusion.

Ken


Re: Why is sorting on two columns so slower thansortingon one column?

От
Tom Lane
Дата:
Kenneth Marshall <ktm@rice.edu> writes:
> On Thu, Dec 23, 2010 at 10:42:26PM +0800, Li Jie wrote:
>> But in the last query that sorts on "id",  since the query selects all the columns for output, the actual sorted
sizeis the same, and the only difference is the comparison cost. The query sorting on two columns needs to do twice the
comparison.Am I right?
 

> I think you are right. Sorry for the confusion.

I doubt the cost of comparing two integers is the issue here; rather
it's more likely one of how many merge passes were needed.  You could
find out instead of just speculating by turning on trace_sort and
comparing the log outputs.
        regards, tom lane


Re: Why is sorting on two columns so slower thansortingon one column?

От
"Li Jie"
Дата:
----- Original Message ----- 
From: "Kenneth Marshall" <ktm@rice.edu>
To: "Li Jie" <jay23jack@gmail.com>
Cc: "pgsql-hackers" <pgsql-hackers@postgresql.org>
Sent: Thursday, December 23, 2010 10:30 PM
Subject: Re: [HACKERS] Why is sorting on two columns so slower thansortingon one column?


> On Thu, Dec 23, 2010 at 10:19:46PM +0800, Li Jie wrote:
>> Hi Ken,
>> 
>> Thanks for your tips! Yes it is the case, and I run another query sorting on the second column whose values are
random.
>> 
>> postgres=# explain analyze select * from big_wf order by id;
>>                                                       QUERY PLAN
  
 
>>
-------------------------------------------------------------------------------------------------------------------------
>> Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual time=25681.875..36458.824 rows=4100000 loops=1)
>>   Sort Key: id
>>   Sort Method:  external merge  Disk: 72048kB
>>   ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8) (actual time=8.595..5569.500 rows=4100000
loops=1)
>> 
>> Now the sorting takes about 20 seconds, so it seems reasonable compared to 30 seconds, right? But one thing I'm
confusedis that, why is additional comparison really so expensive?  Does it incur additional I/O? From the cost model,
itseems not, all the "cost" are the same (575775.45).
 
>> 
>> Thanks,
>> Li Jie
> 
> In the first query, the cost is basically the I/O cost to read the
> table from disk. The actual sort does not do anything since the
> sort values are the same. In the second query, the sort has to
> swap things in memory/disk to get them in the correct order for
> the result. This actually takes CPU and possibly additional I/O
> which is why it is slower. In the case of sorting by just the "id"
> column, the size of the sorted values is smaller which would need
> fewer batches to complete the sort since the sort is bigger than
> the work_mem.
> 
> Cheers,
> Ken

Hi Ken,

Thanks for your analysis.

But in the last query that sorts on "id",  since the query selects all the columns for output, the actual sorted size
isthe same, and the only difference is the comparison cost. The query sorting on two columns needs to do twice the
comparison.Am I right?
 

Thanks,
Li Jie

Re: Why is sorting on two columns so slower than sorting on one column?

От
"Li Jie"
Дата:
Hi Marti,

Thanks for your help! I guess I understand what you mean, a clustered index will make sorting as cheap as a seq scan,
right?

But what I meant is, is there any potential optimization for the backend implementation? Intuitively, if sorting on one
columnor two columns will incur the same I/O costs, why should there be so much difference?
 

Thanks,
Li Jie

----- Original Message ----- 
From: "Marti Raudsepp" <marti@juffo.org>
To: "Jie Li" <jay23jack@gmail.com>
Cc: "pgsql-hackers" <pgsql-hackers@postgresql.org>
Sent: Thursday, December 23, 2010 10:17 PM
Subject: Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?


On Thu, Dec 23, 2010 at 09:33, Jie Li <jay23jack@gmail.com> wrote:
> While the first sorting takes
> about only 6 seconds, the second one takes over 30 seconds, Is this too
> much than expected? Is there any possible optimization ?

If you're doing these queries often, you should:
CREATE INDEX ix_big_wf_age_id ON big_wf (age, id)

If that's still not fast enough, you can physically sort rows in the
table using the newly created index:
CLUSTER big_wf USING ix_big_wf_age_id;

Please post back your results. :)

Regards,
Marti

Re: Why is sorting on two columns so slower than sortingon one column?

От
"Li Jie"
Дата:
Hi Ken,

Thanks for your tips! Yes it is the case, and I run another query sorting on the second column whose values are
random.

postgres=# explain analyze select * from big_wf order by id;                                                     QUERY
PLAN                                                       
 

-------------------------------------------------------------------------------------------------------------------------
Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual time=25681.875..36458.824 rows=4100000 loops=1) Sort
Key:id Sort Method:  external merge  Disk: 72048kB ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8)
(actualtime=8.595..5569.500 rows=4100000 loops=1)
 

Now the sorting takes about 20 seconds, so it seems reasonable compared to 30 seconds, right? But one thing I'm
confusedis that, why is additional comparison really so expensive?  Does it incur additional I/O? From the cost model,
itseems not, all the "cost" are the same (575775.45).
 

Thanks,
Li Jie

----- Original Message ----- 
From: "Kenneth Marshall" <ktm@rice.edu>
To: "Jie Li" <jay23jack@gmail.com>
Cc: "pgsql-hackers" <pgsql-hackers@postgresql.org>
Sent: Thursday, December 23, 2010 10:03 PM
Subject: Re: [HACKERS] Why is sorting on two columns so slower than sortingon one column?


> On Thu, Dec 23, 2010 at 02:33:12AM -0500, Jie Li wrote:
>> Hi,
>> 
>> Here is the test table,
>> 
>> postgres=# \d big_wf
>>     Table "public.big_wf"
>>  Column |  Type   | Modifiers
>> --------+---------+-----------
>>  age    | integer |
>>  id     | integer |
>> 
>> postgres=# \dt+ big_wf
>>                      List of relations
>>  Schema |  Name  | Type  |  Owner   |  Size  | Description
>> --------+--------+-------+----------+--------+-------------
>>  public | big_wf | table | workshop | 142 MB |
>> 
>> 
>> The first query sorting on one column:
>> postgres=# explain analyze select * from big_wf order by age;
>>                                                        QUERY
>> PLAN
>>
-------------------------------------------------------------------------------------------------------------------------
>>  Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual
>> time=11228.155..16427.149 rows=4100000 loops=1)
>>    Sort Key: age
>>    Sort Method:  external sort  Disk: 72112kB
>>    ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8)
>> (actual time=6.196..4797.620 rows=4100000 loops=1)
>>  Total runtime: 19530.452 ms
>> (5 rows)
>> 
>> The second query sorting on two columns:
>> postgres=# explain analyze select * from big_wf order by age,id;
>>                                                        QUERY
>> PLAN
>>
-------------------------------------------------------------------------------------------------------------------------
>>  Sort  (cost=565525.45..575775.45 rows=4100000 width=8) (actual
>> time=37544.779..48206.702 rows=4100000 loops=1)
>>    Sort Key: age, id
>>    Sort Method:  external merge  Disk: 72048kB
>>    ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000 width=8)
>> (actual time=6.796..5518.663 rows=4100000 loops=1)
>>  Total runtime: 51258.000 ms
>> (5 rows)
>> 
>> The verision is 9.0.1 and the work_mem is 20MB. One special thing is, the
>> first column(age) of all the tuples are of the same value, so the second
>> column(id) is always needed for comparison.  While the first sorting takes
>> about only 6 seconds, the second one takes over 30 seconds,  Is this too
>> much than expected? Is there any possible optimization ?
>> 
>> Thanks,
>> Li Jie
> 
> Hi Li,
> 
> If I understand your description, in the first query the sort does
> not actually have to do anything because the column values for "age"
> are all degenerate. In the second query, you actually need to sort
> the values which is why it takes longer. If the first column values
> are the same, then simply sorting by "id" alone would be faster.
> You could also bump up work_mem for the query to perform the sort
> in memory.
> 
> Regards,
> Ken
>

Re: Why is sorting on two columns so slower thansortingon one column?

От
Jie Li
Дата:
<br /><br /><div class="gmail_quote">On Thu, Dec 23, 2010 at 10:26 AM, Tom Lane <span dir="ltr"><<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span>wrote:<br /><blockquote class="gmail_quote"
style="margin:0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">Kenneth
Marshall<<a href="mailto:ktm@rice.edu">ktm@rice.edu</a>> writes:<br /> > On Thu, Dec 23, 2010 at 10:42:26PM
+0800,Li Jie wrote:<br /></div><div class="im">>> But in the last query that sorts on "id",  since the query
selectsall the columns for output, the actual sorted size is the same, and the only difference is the comparison cost.
Thequery sorting on two columns needs to do twice the comparison. Am I right?<br /><br /></div><div class="im">> I
thinkyou are right. Sorry for the confusion.<br /><br /></div>I doubt the cost of comparing two integers is the issue
here;rather<br /> it's more likely one of how many merge passes were needed.  You could<br /> find out instead of just
speculatingby turning on trace_sort and<br /> comparing the log outputs.<br /><br />                        regards,
tomlane<br /></blockquote></div><br />I follow your advice and set the trace_sort, here is the two queries result, the
logoutputs looks similar and  they seem to use the same number of tapes. But the time is different. Could you have a
look:<br/><br /><span id="internal-source-marker_0.16665964164891722" style="font-size: 11pt; font-family: Arial;
color:rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">postgres=# explain analyze select * from big_wf order by id;</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">
                                                     QUERY PLAN                                                      
 </span><br/><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent;
font-weight:normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">-------------------------------------------------------------------------------------------------------------------------</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;"> Sort  (cost=565525.45..575775.45
rows=4100000width=8) (actual time=28102.985..39351.309 rows=4100000 loops=1)</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">   Sort Key: id</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;">   Sort Method:  external merge  Disk: 72048kB</span><br /><span style="font-size:
11pt;font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">   ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=4100000
width=8)(actual time=9.190..7262.789 rows=4100000 loops=1)</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;"> Total runtime: 42953.855 ms</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;">(5 rows)</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0,
0);background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">STATEMENT: explain analyze select * from big_wf order by id;</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">LOG:  begin tuple sort: nkeys = 1, workMem = 20480, randomAccess =
f</span><br/><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent;
font-weight:normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze
select* from big_wf order by id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">LOG: switching to external sort with 74 tapes: CPU 0.29s/0.28u sec elapsed 0.71 sec</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf
orderby id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">LOG:  finished
writingrun 1 to tape 0: CPU 0.68s/2.12u sec elapsed 3.41 sec</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf order by id;</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">LOG:  finished writing run 2 to tape 1: CPU
1.22s/4.24usec elapsed 6.53 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">STATEMENT: explain analyze select * from big_wf order by id;</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">LOG:  finished writing run 3 to tape 2: CPU 1.67s/6.38u sec elapsed
9.67sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:
 explainanalyze select * from big_wf order by id;</span><br /><span style="font-size: 11pt; font-family: Arial; color:
rgb(0,0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">LOG:  finished writing run 4 to tape 3: CPU 2.22s/8.61u sec elapsed 12.93 sec</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from
big_wforder by id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">LOG:  finished
writingrun 5 to tape 4: CPU 2.72s/10.80u sec elapsed 16.09 sec</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf order by id;</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">LOG:  finished writing run 6 to tape 5: CPU
3.23s/12.86usec elapsed 19.12 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">STATEMENT: explain analyze select * from big_wf order by id;</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">LOG:  finished writing run 7 to tape 6: CPU 3.81s/15.02u sec elapsed
23.84sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:
 explainanalyze select * from big_wf order by id;</span><br /><span style="font-size: 11pt; font-family: Arial; color:
rgb(0,0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">LOG:  finished writing run 8 to tape 7: CPU 4.36s/17.13u sec elapsed 27.05 sec</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from
big_wforder by id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">LOG:
 performsortstarting: CPU 4.38s/17.20u sec elapsed 27.15 sec</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf order by id;</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">LOG:  finished writing run 9 to tape 8: CPU
4.39s/17.88usec elapsed 27.97 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">STATEMENT: explain analyze select * from big_wf order by id;</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">LOG:  finished writing final run 10 to tape 9: CPU 4.39s/17.88u sec
elapsed27.97 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:
 explainanalyze select * from big_wf order by id;</span><br /><span style="font-size: 11pt; font-family: Arial; color:
rgb(0,0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">LOG:  performsort done (except 10-way final merge): CPU 4.39s/17.98u sec elapsed 28.10
sec</span><br/><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent;
font-weight:normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze
select* from big_wf order by id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">LOG: external sort ended, 9006 disk blocks used: CPU 8.01s/27.02u sec elapsed 42.92 sec</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf
orderby id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;"></span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;"></span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">postgres=# explain analyze select * from big_wf order by
age,id;</span><br/><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;"> Sort
 (cost=565525.45..575775.45rows=4100000 width=8) (actual time=43709.851..57048.645 rows=4100000 loops=1)</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;">   Sort Key: age, id</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">   Sort Method:  external merge  Disk:
72048kB</span><br/><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">   ->  Seq
Scanon big_wf  (cost=0.00..59142.00 rows=4100000 width=8) (actual time=10.090..7075.208 rows=4100000 loops=1)</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;"> Total runtime: 60721.824 ms</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;"></span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf order by
age,id;</span><br/><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">LOG:  begin
tuplesort: nkeys = 2, workMem = 20480, randomAccess = f</span><br /><span style="font-size: 11pt; font-family: Arial;
color:rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">STATEMENT:  explain analyze select * from big_wf order by age,id;</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">LOG:  switching to external sort with 74 tapes:
CPU0.28s/0.30u sec elapsed 0.67 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">STATEMENT: explain analyze select * from big_wf order by age,id;</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">LOG:  finished writing run 1 to tape 0: CPU 0.71s/3.63u sec elapsed
4.90sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:
 explainanalyze select * from big_wf order by age,id;</span><br /><span style="font-size: 11pt; font-family: Arial;
color:rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">LOG:  finished writing run 2 to tape 1: CPU 1.30s/7.53u sec elapsed 10.30 sec</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from
big_wforder by age,id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">LOG: finished writing run 3 to tape 2: CPU 1.88s/11.36u sec elapsed 15.39 sec</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf
orderby age,id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">LOG:  finished
writingrun 4 to tape 3: CPU 2.43s/15.20u sec elapsed 20.51 sec</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf order by age,id;</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">LOG:  finished writing run 5 to tape 4: CPU
3.05s/18.96usec elapsed 25.44 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">STATEMENT: explain analyze select * from big_wf order by age,id;</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">LOG:  finished writing run 6 to tape 5: CPU 3.68s/22.74u sec elapsed
30.47sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:
 explainanalyze select * from big_wf order by age,id;</span><br /><span style="font-size: 11pt; font-family: Arial;
color:rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">LOG:  finished writing run 7 to tape 6: CPU 4.24s/26.63u sec elapsed 36.61 sec</span><br
/><spanstyle="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight:
normal;font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from
big_wforder by age,id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">LOG: finished writing run 8 to tape 7: CPU 4.78s/30.41u sec elapsed 41.56 sec</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf
orderby age,id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">LOG:
 performsortstarting: CPU 4.81s/30.56u sec elapsed 41.75 sec</span><br /><span style="font-size: 11pt; font-family:
Arial;color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration:
none;vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf order by age,id;</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">LOG:  finished writing run 9 to tape 8: CPU
4.84s/32.07usec elapsed 43.56 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">STATEMENT: explain analyze select * from big_wf order by age,id;</span><br /><span style="font-size: 11pt;
font-family:Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal;
text-decoration:none; vertical-align: baseline;">LOG:  finished writing final run 10 to tape 9: CPU 4.84s/32.07u sec
elapsed43.56 sec</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color:
transparent;font-weight: normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:
 explainanalyze select * from big_wf order by age,id;</span><br /><span style="font-size: 11pt; font-family: Arial;
color:rgb(0, 0, 0); background-color: transparent; font-weight: normal; font-style: normal; text-decoration: none;
vertical-align:baseline;">LOG:  performsort done (except 10-way final merge): CPU 4.85s/32.16u sec elapsed 43.70
sec</span><br/><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent;
font-weight:normal; font-style: normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze
select* from big_wf order by age,id;</span><br /><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0);
background-color:transparent; font-weight: normal; font-style: normal; text-decoration: none; vertical-align:
baseline;">LOG: external sort ended, 9006 disk blocks used: CPU 8.60s/41.93u sec elapsed 60.73 sec</span><br /><span
style="font-size:11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: normal;
font-style:normal; text-decoration: none; vertical-align: baseline;">STATEMENT:  explain analyze select * from big_wf
orderby age,id;</span><br /> 

Re: Why is sorting on two columns so slower thansortingon one column?

От
Simon Riggs
Дата:
On Fri, 2010-12-24 at 00:27 -0500, Jie Li wrote:

>         I doubt the cost of comparing two integers is the issue here;
>         rather
>         it's more likely one of how many merge passes were needed.
>          You could
>         find out instead of just speculating by turning on trace_sort
>         and
>         comparing the log outputs.

> postgres=# explain analyze select * from big_wf order by id;

> LOG:  switching to external sort with 74 tapes: CPU 0.29s/0.28u sec
> elapsed 0.71 sec
> LOG:  external sort ended, 9006 disk blocks used: CPU 8.01s/27.02u
> sec 

> elapsed 42.92 sec
> STATEMENT:  explain analyze select * from big_wf order by id;


> STATEMENT:  explain analyze select * from big_wf order by age,id;

> LOG:  begin tuple sort: nkeys = 2, workMem = 20480, randomAccess = f
> STATEMENT:  explain analyze select * from big_wf order by age,id;
> LOG:  switching to external sort with 74 tapes: CPU 0.28s/0.30u sec
> elapsed 0.67 sec
> LOG:  external sort ended, 9006 disk blocks used: CPU 8.60s/41.93u sec
> elapsed 60.73 sec
> STATEMENT:  explain analyze select * from big_wf order by age,id;

I think the answer is that only the first column comparison is
optimised. Second and subsequent comparisons are not optimised.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Why is sorting on two columns so slower thansortingon one column?

От
Robert Haas
Дата:
On Mon, Dec 27, 2010 at 3:58 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I think the answer is that only the first column comparison is
> optimised. Second and subsequent comparisons are not optimised.

What sort of optimization are you referring to here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company