Re: Find inconsistencies in data with date range
| От | Jason Aleski |
|---|---|
| Тема | Re: Find inconsistencies in data with date range |
| Дата | |
| Msg-id | 550111A0.2070105@gmail.com обсуждение исходный текст |
| Ответ на | Re: Find inconsistencies in data with date range (s d <daku.sandor@gmail.com>) |
| Список | pgsql-sql |
In case anyone else needs similar code, I was able to get this working. Below is the code that pulls the missing dates using a cursor and returns the information into a table. I'm sure there may be a way to make the code more efficient, but considering this will only get ran maybe once a quarter (for quarterly reports), it works for me. With 700+ stores, it takes about 30 minutes to fully run from a reporting server. I have a JAVA program that queries the function "SELECT * FROM eod_missing_dates();" Then sends all the missing dates to a RabbitMQ server to with a worker program to try to rebuild the missing eod summaries and if not, it will send a message to the store managers. Hopefully this code will help someone else!
CREATE OR REPLACE FUNCTION eod_missing_dates()
RETURNS TABLE(store_id uuid, location character varying, missing_ts timestamp with time zone) AS
$BODY$
DECLARE
location_cursor CURSOR FOR SELECT * FROM locations ORDER BY store_id;
store_rec location%ROWTYPE;
BEGIN
CREATE TEMP TABLE dr_temptable(dr_ts, dr_dow) AS (SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
FROM (SELECT date as GenDate
FROM generate_series('1950-01-01'::date,CURRENT_TIMESTAMP::date,'1 day'::interval) date
) AS t1
WHERE extract(dow from GenDate) NOT IN (0,6));
OPEN location_cursor;
LOOP
FETCH location_cursor INTO store_rec;
EXIT WHEN store_rec IS NULL;
IF NOT FOUND THEN
EXIT;
END IF;
RAISE INFO '%', 'Checking data/dates for ' || store_rec.location;
RETURN QUERY SELECT store_rec.row_id as store_id, store_rec.location AS location, dr_ts AS missing_ts FROM dr_temptable AS t1
WHERE dr_ts NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
JOIN location AS t2 ON t1.store_id = t2.row_id
WHERE t2.location = store_rec.location)
AND dr_ts > (SELECT start_date FROM locations WHERE location=store_rec.location);
END LOOP;
CLOSE location_cursor;
DROP TABLE dr_temptable;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
;
CREATE OR REPLACE FUNCTION eod_missing_dates()
RETURNS TABLE(store_id uuid, location character varying, missing_ts timestamp with time zone) AS
$BODY$
DECLARE
location_cursor CURSOR FOR SELECT * FROM locations ORDER BY store_id;
store_rec location%ROWTYPE;
BEGIN
CREATE TEMP TABLE dr_temptable(dr_ts, dr_dow) AS (SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
FROM (SELECT date as GenDate
FROM generate_series('1950-01-01'::date,CURRENT_TIMESTAMP::date,'1 day'::interval) date
) AS t1
WHERE extract(dow from GenDate) NOT IN (0,6));
OPEN location_cursor;
LOOP
FETCH location_cursor INTO store_rec;
EXIT WHEN store_rec IS NULL;
IF NOT FOUND THEN
EXIT;
END IF;
RAISE INFO '%', 'Checking data/dates for ' || store_rec.location;
RETURN QUERY SELECT store_rec.row_id as store_id, store_rec.location AS location, dr_ts AS missing_ts FROM dr_temptable AS t1
WHERE dr_ts NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
JOIN location AS t2 ON t1.store_id = t2.row_id
WHERE t2.location = store_rec.location)
AND dr_ts > (SELECT start_date FROM locations WHERE location=store_rec.location);
END LOOP;
CLOSE location_cursor;
DROP TABLE dr_temptable;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
;
Jason Aleski / IT Specialist
On 6 March 2015 at 22:38, Jason Aleski <jason.aleski@gmail.com> wrote:I know I can do this Java, but I'd rather have this running as a Stored Procedure. What I am wanting to do is identify and potentially correct the summary data for date inconsistencies. We have policies/red flag reports in place to keep this from happening, but we are now cleaning up history. The query below works on a per store basis, but I'd like to be able to run this for all stores in the location table.
I've looked at some procedure codes regarding looping, but everything I try to create seems to give me problems. THe code I'm trying is also below. Does anyone have any suggestions on how to accomplish this?
Working Tables
locations - table contains store information, startup date, address, etc
daily_salessummary - table holds daily sales summary by store (summary should be updated nightly). eod_ts is End of Day Timestamp.
Query
WITH datelist AS(
SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
FROM (SELECT date as GenDate
FROM generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1 day'::interval) date
) AS t1
)
SELECT gendate FROM datelist AS t1
WHERE gendate NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
JOIN locations AS t2 ON t1.location_id = t2.row_id
WHERE t2.locationCode = 'US_FL_TAMPA_141')
AND gendate > (SELECT start_date FROM locations WHERE locationCode = 'US_FL_TAMPA_141')
Desired Output - could output to an exceptions table
StoreCode 'US_FL_TAMA_141' missing daily summary for 1998-01-01
StoreCode 'MX_OAXACA_SALINA_8344' missing daily summary for 2011-06-05
ProcedureSQL (contains unknown errors)
DECLARE
CURSOR location_table IS
SELECT locationCode FROM locations;
BEGIN
FOR thisSymbol IN ticker_tables LOOP
EXECUTE IMMEDIATE 'WITH datelist AS(
SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
FROM (SELECT date as GenDate
FROM generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1 day'::interval) date
) AS t1
)
SELECT gendate FROM datelist AS t1
WHERE gendate NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
JOIN locations AS t2 ON t1.location_id = t2.row_id
WHERE t2.locationCode = '' || location_table.locationCode || '')
AND gendate > (SELECT start_date FROM locations WHERE locationCode = '' || location_table.locationCode || '')';
END LOOP;
END;-- Jason Aleski / IT Specialist
В списке pgsql-sql по дате отправления: