On Mon, Jun 15, 2009 at 2:23 PM, Mike Toews<mwtoews@sfu.ca> wrote:
> Hi,
>
> I'm having difficulty constructing a query that will find breaks where data
> change in a time-series. I've done some searching for this too, but I
> haven't found anything.
>
> Here is my example situation, consider my source table:
> date bin
> 2009-01-01 red
> 2009-01-02 red
> 2009-01-03 blue
> 2009-01-04 blue
> 2009-01-05 blue
> 2009-01-06 red
> 2009-01-07 blue
> 2009-01-08 blue
> 2009-01-09 red
> 2009-01-10 red
>
>
> I would like to get the first and last of each consecutive series based on
> column "bin". My result for the table would look like:
> first last bin
> 2009-01-01 2009-01-02 red
> 2009-01-03 2009-01-05 blue
> 2009-01-06 2009-01-06 red
> 2009-01-07 2009-01-08 blue
> 2009-01-09 2009-01-10 red
>
>
> This is easy to compute using a spreadsheet or in R, but how would I do this
> with SQL? I'm using 8.3. Advice is appreciated.
(Written in email and untested- also, someone will probably provide a
better way, I hope, but this should at least work)
select date as first,
(select date from table t3 where t3.date<(select date from table t5
where t5.date>t1.date and t5.bin<>t1.bin order by date asc limit 1)
order by date desc limit 1) as last,
bin
from table t1 where (select bin from table t2 where t2.date<t1.order
order by date desc limit 1)<>t1.bin;
Ugly, and I'm pretty sure there's a much better way, but my brain is
failing me right now- hopefully this'll at least get you started,
though.
--
- David T. Wilson
david.t.wilson@gmail.com