feature_anomaly_detection

Purpose: find and count anomalies in feature correlations for shops
Path on DS-instance: /home/ubuntu/DusFolder/anomaly_research/fea_conv/feature_anomaly_detection

how to run main code

change the parameters in conv_config.py, then run eshop_anomaly_count.py
OR
change the parameters in kpi_config.py, then run eshop_anomaly_kpi.py
OR
change the parameters in kpi_config.py, then run eshop_anomaly_kpi_basic.py for the fixed weekly report

parameters

ACCOUNT_ID - shop id
SLICE - batch size in sessions (eshop_anomaly_count.py only)
BEG_DATE - start date of the timeframe
END_DATE - end date of the timeframe
LQ - left quantile, i.e. the lower bound; values below it count as anomalies
RQ - right quantile, i.e. the upper bound; values above it count as anomalies
AN_BORD - minimum anomaly count required for selection into tops_kpi (eshop_anomaly_kpi.py only)
--to_sql - flag to export the result to the db
--no_sql - flag to skip the export to the db (default)
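The two export flags could be wired up roughly as follows. This is a minimal sketch using the standard argparse module, not the scripts' actual argument handling; the function name build_parser and the attribute name to_sql are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: --to_sql and --no_sql toggle one boolean."""
    parser = argparse.ArgumentParser(description="export flags sketch")
    # The two flags contradict each other, so passing both is an error.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--to_sql", dest="to_sql", action="store_true",
                       help="export the result to the db")
    group.add_argument("--no_sql", dest="to_sql", action="store_false",
                       help="do not export the result to the db (default)")
    # With neither flag given, behave like --no_sql.
    parser.set_defaults(to_sql=False)
    return parser


# Explicit argument lists are used here instead of sys.argv.
default_args = build_parser().parse_args([])
export_args = build_parser().parse_args(["--to_sql"])
skip_args = build_parser().parse_args(["--no_sql"])
```

With this layout a plain run behaves the same as a run with --no_sql, matching the default stated above.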

result tables

eshop_anomaly_count.py :

anomaly_matrix.csv

marked anomalies in feature correlations. the matrix contains the periods whose correlation values fall outside the selected quantile bounds (LQ, RQ)
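The quantile rule can be illustrated with a small standard-library sketch. It assumes a list of correlation values, one per period, for a single feature pair; the helper names quantile and mark_anomalies and the linear-interpolation quantile are assumptions, and the real script may compute its bounds differently.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def mark_anomalies(values, lq, rq):
    """Return one flag per value: True if it lies outside [LQ, RQ] bounds."""
    low = quantile(values, lq)
    high = quantile(values, rq)
    return [v < low or v > high for v in values]


# Hypothetical correlation values for ten consecutive periods.
correlations = [0.5, 0.52, 0.48, 0.51, 0.49, 0.95, 0.05, 0.5, 0.5, 0.5]
flags = mark_anomalies(correlations, lq=0.1, rq=0.9)
anomalous_periods = [i for i, flagged in enumerate(flags) if flagged]
```

In this made-up series only the periods holding 0.95 and 0.05 end up flagged.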

batch_counts.csv (sql data.feature_anomaly_batch_counts)

aggregated anomaly count for each period of fixed size
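One way to read this aggregation, sketched under the assumption that the anomaly matrix has one row per period and one boolean column per feature pair. The function name, the period labels and the feature-pair names below are hypothetical.

```python
def count_anomalies_per_period(anomaly_matrix):
    """Count flagged feature pairs in each period (row) of the matrix."""
    return {
        period: sum(1 for flagged in row.values() if flagged)
        for period, row in anomaly_matrix.items()
    }


# Hypothetical matrix: period label -> {feature pair -> anomaly flag}.
example_matrix = {
    "batch_0": {"price~duration": False, "pages~duration": False},
    "batch_1": {"price~duration": True, "pages~duration": False},
    "batch_2": {"price~duration": True, "pages~duration": True},
}
counts = count_anomalies_per_period(example_matrix)
```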

eshop_anomaly_kpi.py :

anomaly_table.csv (sql data.eshop_anomaly_table)

for each KPI in ['bounce_rate', 'conversion_rate', 'med_duration'], anomalies are identified both overall and broken down by channel

metric_lines.csv

attribute // lower threshold // upper threshold, where the thresholds are quantiles computed on the given dataset

tops_kpi.csv (sql data.eshop_anomaly_tops_kpi)

for each source, the best MCID combination is derived for each of the 3 KPIs. main condition: the MCID combination must have more than 10 sessions for that source. second condition: the source must have 3 or more alternative MCIDs
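The two conditions can be sketched as a filter followed by a per-source selection. This is an illustration only: the row layout, the function name best_mcid_per_source, and the choice to count alternatives after the session filter are assumptions, as is treating "best" as the highest value (a KPI such as bounce_rate would presumably use the lowest).

```python
from collections import defaultdict


def best_mcid_per_source(rows, kpi, min_sessions=10, min_alternatives=3,
                         higher_is_better=True):
    """Pick the best MCID per source among sufficiently large candidates."""
    by_source = defaultdict(list)
    for row in rows:
        # Main condition: more than min_sessions sessions for this source.
        if row["sessions"] > min_sessions:
            by_source[row["source"]].append(row)
    pick = max if higher_is_better else min
    return {
        source: pick(candidates, key=lambda r: r[kpi])["mcid"]
        for source, candidates in by_source.items()
        # Second condition: at least min_alternatives MCIDs to choose from.
        if len(candidates) >= min_alternatives
    }


# Hypothetical input rows.
rows = [
    {"source": "google", "mcid": "a", "sessions": 50, "conversion_rate": 0.02},
    {"source": "google", "mcid": "b", "sessions": 30, "conversion_rate": 0.05},
    {"source": "google", "mcid": "c", "sessions": 12, "conversion_rate": 0.03},
    {"source": "google", "mcid": "d", "sessions": 5, "conversion_rate": 0.90},
    {"source": "bing", "mcid": "x", "sessions": 40, "conversion_rate": 0.01},
    {"source": "bing", "mcid": "y", "sessions": 20, "conversion_rate": 0.02},
]
tops = best_mcid_per_source(rows, "conversion_rate")
```

Here MCID "d" is ignored despite its high rate because it has too few sessions, and "bing" is skipped because it has only two alternatives.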

cache_log.txt

results from tops_kpi.csv for the last day, as text, together with the initial parameters

eshop_anomaly_kpi_basic.py

same as eshop_anomaly_kpi.py but much faster
the timeframe is fixed to the previous week and an_bord = 0; the only parameters to change are account_id, LQ and RQ.
for every source, the top-5/bottom-5 candidates for each KPI metric are shown where possible
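A fixed "previous week" timeframe might be derived as in the sketch below. It assumes the week runs Monday to Sunday and is taken relative to the run date; the script may define the week differently (for example, as the last 7 days), and the function name previous_week is hypothetical.

```python
from datetime import date, timedelta


def previous_week(today):
    """Return (BEG_DATE, END_DATE) of the last full Monday-to-Sunday week."""
    this_monday = today - timedelta(days=today.weekday())
    beg_date = this_monday - timedelta(days=7)
    end_date = this_monday - timedelta(days=1)
    return beg_date, end_date


# Example run date: Wednesday, 15 May 2024.
beg_date, end_date = previous_week(date(2024, 5, 15))
```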

tops_kpi.csv

same result as for eshop_anomaly_kpi.py, with the changes described above

cache_log.txt

the format differs substantially from the cache_log.txt produced by eshop_anomaly_kpi.py
