Purpose: find and count anomalies in feature correlations for shops
Path on DS-instance: /home/ubuntu/DusFolder/anomaly_research/fea_conv/feature_anomaly_detection
change parameters in conv_config.py and run eshop_anomaly_count.py
OR
change parameters in kpi_config.py and run eshop_anomaly_kpi.py
OR
change parameters in kpi_config.py and run eshop_anomaly_kpi_basic.py for the fixed weekly report
ACCOUNT_ID - shop id
SLICE - batch size in sessions (for eshop_anomaly_count only)
BEG_DATE - begin date of timeframe
END_DATE - end date of timeframe
LQ - left quantile, i.e. lower bound for anomalies
RQ - right quantile, i.e. upper bound for anomalies
AN_BORD - minimum anomaly count required for selection into tops_kpi (for eshop_anomaly_kpi only)
--to_sql - flag to export result to db
--no_sql - flag to skip export to db (default)
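A minimal sketch of how the two export flags could be wired with argparse so that --no_sql stays the default. The function name build_parser and the attribute name to_sql are assumptions for illustration; the real scripts may parse their flags differently.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the flag names --to_sql / --no_sql come from
    # the notes above, everything else is an assumption.
    parser = argparse.ArgumentParser(description="export flags (sketch)")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--to_sql", dest="to_sql", action="store_true",
                       help="export result to db")
    group.add_argument("--no_sql", dest="to_sql", action="store_false",
                       help="do not export result to db (default)")
    # With neither flag given, nothing is exported.
    parser.set_defaults(to_sql=False)
    return parser
```

With this layout, build_parser().parse_args(["--to_sql"]).to_sql is True, and passing both flags at once is rejected by argparse.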
marked anomalies in feature correlations. The matrix contains the periods whose correlation values fall outside the selected quantile bounds (LQ, RQ)
anomaly count aggregated for each period of fixed size
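The marking and counting described above can be sketched in pure Python as follows. This is an illustration, not the code of eshop_anomaly_count.py: it assumes that the LQ/RQ quantiles are computed per feature pair over all periods, and the names quantile, mark_anomalies and count_anomalies are hypothetical.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def mark_anomalies(corr_by_period, lq, rq):
    """Map each period to True if its correlation lies outside the quantile bounds."""
    values = list(corr_by_period.values())
    low, high = quantile(values, lq), quantile(values, rq)
    return {p: (v < low or v > high) for p, v in corr_by_period.items()}


def count_anomalies(marks_per_feature_pair):
    """Sum the anomaly marks over all feature pairs for each period."""
    counts = {}
    for marks in marks_per_feature_pair.values():
        for period, is_anomaly in marks.items():
            counts[period] = counts.get(period, 0) + int(is_anomaly)
    return counts


# Toy data: one correlation value per period for a single feature pair.
corr = {period: float(period) for period in range(9)}
marks = mark_anomalies(corr, lq=0.25, rq=0.75)
counts = count_anomalies({"pair_a": marks, "pair_b": marks})
```

Here marks flags periods 0, 1, 7 and 8, and counts holds the number of flagged feature pairs per period.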
for each KPI in ['bounce_rate', 'conversion_rate', 'med_duration'] anomalies are identified both overall and split by channel
attribute // lower threshold // upper threshold - quantiles computed on the given dataset
for each Source the best combination of MCID for each of the 3 KPIs is derived. The main condition is that a combination of MCID has more than 10 sessions for that source. The second condition is that each source has 3+ alternative MCID
results from tops_kpi.csv for the last day in text form, with the initial parameters
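The two MCID selection conditions for the tops can be sketched like this. Several things are assumptions and not taken from the scripts: the row layout, the function name best_mcid_per_source, that the 3+ alternatives are counted after the session filter, and that "best" means lowest bounce_rate and highest conversion_rate and med_duration.

```python
# Assumed direction of "best" per KPI (not stated in the notes).
KPI_HIGHER_IS_BETTER = {
    "bounce_rate": False,
    "conversion_rate": True,
    "med_duration": True,
}
MIN_SESSIONS = 10      # a combination needs more than this many sessions
MIN_ALTERNATIVES = 3   # a source needs at least this many MCID candidates


def best_mcid_per_source(rows):
    """rows: dicts with keys source, mcid, sessions and one key per KPI."""
    by_source = {}
    for row in rows:
        if row["sessions"] > MIN_SESSIONS:          # main condition
            by_source.setdefault(row["source"], []).append(row)

    result = {}
    for source, candidates in by_source.items():
        if len(candidates) < MIN_ALTERNATIVES:      # second condition
            continue
        result[source] = {}
        for kpi, higher in KPI_HIGHER_IS_BETTER.items():
            pick = max if higher else min
            result[source][kpi] = pick(candidates, key=lambda r: r[kpi])["mcid"]
    return result
```

A source with fewer than three qualifying MCID candidates is simply left out of the result.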
same as eshop_anomaly_kpi.py but much faster
timeframe is fixed to the previous week and an_bord = 0; the only parameters to change are account_id, RQ and LQ.
for every source the top-5/bot-5 candidates of each KPI metric are shown if possible
same result as for eshop_anomaly_kpi.py but with said changes
very different format compared to result for eshop_anomaly_kpi.py
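A rough sketch of the two fixed pieces of the weekly report: the previous-week timeframe and the top-5/bot-5 pick. It assumes a week runs Monday to Sunday and that candidates are ranked by the raw KPI value; previous_week and top_and_bottom are hypothetical names, not functions from eshop_anomaly_kpi_basic.py.

```python
from datetime import date, timedelta


def previous_week(today):
    """Return (BEG_DATE, END_DATE) of the full Monday-Sunday week before `today`."""
    this_monday = today - timedelta(days=today.weekday())
    beg = this_monday - timedelta(days=7)
    end = this_monday - timedelta(days=1)
    return beg, end


def top_and_bottom(rows, kpi, n=5):
    """Return the n highest and n lowest rows by `kpi` (fewer if not enough rows)."""
    ordered = sorted(rows, key=lambda r: r[kpi], reverse=True)
    return ordered[:n], ordered[-n:]


beg_date, end_date = previous_week(date(2024, 5, 15))
```

With fewer than 10 candidates for a source the two lists overlap, which is one way to read "if possible" above.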