Re: Find inconsistencies in data with date range

Поиск
Список
Период
Сортировка
От Jason Aleski
Тема Re: Find inconsistencies in data with date range
Дата
Msg-id 550111A0.2070105@gmail.com
обсуждение исходный текст
Ответ на Re: Find inconsistencies in data with date range  (s d <daku.sandor@gmail.com>)
Список pgsql-sql
In case anyone else needs similar code, I was able to get this working.  Below is the code that pulls the missing dates using a cursor and returns the information into a table. I'm sure there may be a way to make the code more efficient, but considering this will only get ran maybe once a quarter (for quarterly reports), it works for me.  With 700+ stores, it takes about 30 minutes to fully run from a reporting server.  I have a JAVA program that queries the function "SELECT * FROM eod_missing_dates();"  Then sends all the missing dates to a RabbitMQ server to with a worker program to try to rebuild the missing eod summaries and if not, it will send a message to the store managers.  Hopefully this code will help someone else!


CREATE OR REPLACE FUNCTION eod_missing_dates()
  RETURNS TABLE(store_id uuid, location character varying, missing_ts timestamp with time zone) AS
$BODY$
DECLARE
  location_cursor CURSOR FOR SELECT * FROM locations ORDER BY store_id;
  store_rec location%ROWTYPE;
BEGIN

  CREATE TEMP TABLE dr_temptable(dr_ts, dr_dow) AS (SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
  FROM (SELECT date as GenDate
        FROM generate_series('1950-01-01'::date,CURRENT_TIMESTAMP::date,'1 day'::interval) date
       ) AS t1 
  WHERE extract(dow from GenDate) NOT IN (0,6));

  OPEN location_cursor;
  LOOP
    FETCH location_cursor INTO store_rec;
    EXIT WHEN store_rec IS NULL;

    IF NOT FOUND THEN
      EXIT;
    END IF;
   
    RAISE INFO '%', 'Checking data/dates for ' || store_rec.location;
    RETURN QUERY SELECT store_rec.row_id as store_id, store_rec.location AS location, dr_ts AS missing_ts FROM dr_temptable AS t1
                 WHERE dr_ts NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
                      JOIN location AS t2 ON t1.store_id = t2.row_id
                      WHERE t2.location = store_rec.location)                     
                      AND dr_ts > (SELECT start_date FROM locations WHERE location=store_rec.location);   
  END LOOP;
  CLOSE location_cursor; 
  DROP TABLE dr_temptable;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
;


Jason Aleski / IT Specialist


On 6 March 2015 at 22:38, Jason Aleski <jason.aleski@gmail.com> wrote:
I know I can do this Java, but I'd rather have this running as a Stored Procedure.  What I am wanting to do is identify and potentially correct the summary data for date inconsistencies.  We have policies/red flag reports in place to keep this from happening, but we are now cleaning up history.  The query below works on a per store basis, but I'd like to be able to run this for all stores in the location table.

I've looked at some procedure codes regarding looping, but everything I try to create seems to give me problems.  THe code I'm trying is also below.  Does anyone have any suggestions on how to accomplish this?



Working Tables
  locations - table contains store information, startup date, address, etc
  daily_salessummary - table holds daily sales summary by store (summary should be updated nightly).  eod_ts is End of Day Timestamp.

Query
WITH datelist AS(
  SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
  FROM (SELECT date as GenDate
        FROM generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1 day'::interval) date
       ) AS t1 
)
SELECT gendate FROM datelist AS t1
WHERE gendate NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
                      JOIN locations AS t2 ON t1.location_id = t2.row_id
                      WHERE t2.locationCode = 'US_FL_TAMPA_141')
                     
  AND gendate > (SELECT start_date FROM locations WHERE locationCode = 'US_FL_TAMPA_141')


Desired Output - could output to an exceptions table
StoreCode 'US_FL_TAMA_141' missing daily summary for 1998-01-01
StoreCode 'MX_OAXACA_SALINA_8344' missing daily summary for 2011-06-05


ProcedureSQL (contains unknown errors)
DECLARE
  CURSOR location_table IS
    SELECT locationCode FROM locations;
  BEGIN
    FOR thisSymbol IN ticker_tables LOOP
      EXECUTE IMMEDIATE 'WITH datelist AS(
                                              SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
                                                FROM (SELECT date as GenDate
                                                      FROM generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1 day'::interval) date
                                                     ) AS t1 
                                              )
                                              SELECT gendate FROM datelist AS t1
                                              WHERE gendate NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
                                                                    JOIN locations AS t2 ON t1.location_id = t2.row_id
                                                                    WHERE t2.locationCode = '' || location_table.locationCode  || '')                   
                                             AND gendate > (SELECT start_date FROM locations WHERE locationCode = '' || location_table.locationCode  || '')';
    END LOOP;
  END;



-- 
Jason Aleski / IT Specialist


В списке pgsql-sql по дате отправления: