This data analysis notebook demonstrates lossless and lossy visualization techniques and classification methods. The analyses run on scientific data, and the datasets are hot-swappable.
Supported datasets are numeric tabular datasets with a 'class' column, provided as .csv files in the datasets folder. For testing purposes we focus on fisher_iris.csv; other datasets are included.
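As a rough illustration of the dataset format described above, the sketch below splits a CSV into numeric features and class labels using only the standard library. The function name `load_dataset`, the header names, and the three sample rows are assumptions made for this example; the notebook itself may load data differently.

```python
import csv
import io


def load_dataset(handle, label_column="class"):
    """Split an open CSV file into feature names, numeric rows, and class labels."""
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    if label_column not in fieldnames:
        raise ValueError(f"missing required column: {label_column!r}")
    feature_names = [name for name in fieldnames if name != label_column]
    X, y = [], []
    for row in reader:
        # Every non-label column is expected to be numeric.
        X.append([float(row[name]) for name in feature_names])
        y.append(row[label_column])
    return feature_names, X, y


# Illustrative rows only; the header names are assumed, not taken from the real file.
SAMPLE = """sepal_length,sepal_width,petal_length,petal_width,class
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

feature_names, X, y = load_dataset(io.StringIO(SAMPLE))
```

Because the function takes any open file-like object, swapping datasets only means opening a different .csv file, provided it keeps the 'class' column.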
The notebook is more easily viewed with a Jupyter viewer, web options include:
- Pairplot
Lossless Visualizations:
- Parallel coordinates
- Parallel hulls
- Andrew's curves
- Star plot
- GLC-Linear
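To give a sense of how one of the lossless techniques listed above works, here is a minimal sketch of Andrews' curves: each sample becomes a finite Fourier series whose coefficients are the sample's feature values. The function names are hypothetical and this is not necessarily how the notebook computes the plot.

```python
import math


def andrews_curve(x, t):
    """Evaluate f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ..."""
    value = x[0] / math.sqrt(2)
    for i, xi in enumerate(x[1:], start=1):
        harmonic = (i + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            value += xi * math.sin(harmonic * t)
        else:
            value += xi * math.cos(harmonic * t)
    return value


def sample_andrews_curve(x, n_points=100):
    """Sample the curve at evenly spaced angles on [-pi, pi] for plotting."""
    step = 2 * math.pi / (n_points - 1)
    angles = [-math.pi + k * step for k in range(n_points)]
    return [(t, andrews_curve(x, t)) for t in angles]


# One sample with four features, traced as a single curve.
curve = sample_andrews_curve([5.1, 3.5, 1.4, 0.2])
```

Every feature value appears as its own coefficient in the series, so in principle the sample can be recovered from its curve.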
Lossy Visualizations:
- Radviz
- t-SNE
- PCA
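The following sketch illustrates why a projection such as Radviz is lossy. It places one anchor per feature on the unit circle and positions each sample at the weighted average of the anchors, assuming the features are already scaled to [0, 1]. The function name `radviz_point` is hypothetical, and the notebook may rely on a plotting library instead.

```python
import math


def radviz_point(features):
    """Project one sample (features scaled to [0, 1]) onto 2-D Radviz coordinates."""
    n = len(features)
    total = sum(features)
    if total == 0:
        return (0.0, 0.0)  # no feature pulls the point, so it rests at the center
    x = y = 0.0
    for i, value in enumerate(features):
        angle = 2 * math.pi * i / n  # anchor i sits on the unit circle
        x += value * math.cos(angle)
        y += value * math.sin(angle)
    return (x / total, y / total)


# Two different samples that land on the same 2-D location.
a = radviz_point([1.0, 1.0, 1.0, 1.0])
b = radviz_point([0.5, 0.5, 0.5, 0.5])
```

Both samples map to the center of the circle, so the original values cannot be recovered from the plot alone.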
Classification Methods:
- Associative Rules (no reduction, single pass only)
- Parallel coordinates interval visualization
- LDA
- Decision Tree with feature importance
- Support Vector Machine
- Optimal parameter search
- Gaussian Naive Bayes
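As an example of one of the classification methods listed above, here is a minimal standard-library sketch of Gaussian Naive Bayes. It estimates a prior plus per-feature mean and variance for each class, then picks the class with the highest log posterior. The function names, the `var_floor` parameter, and the toy data are assumptions for illustration; the notebook may use a library implementation instead.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate per-class priors, feature means, and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance positive when a feature is constant.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(params):
        prior, means, variances = params
        total = math.log(prior)
        for v, m, var in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


# Toy data for illustration only: two well-separated classes.
X_toy = [[1.0, 2.0], [1.2, 1.8], [5.0, 6.0], [5.2, 6.1]]
y_toy = ["a", "a", "b", "b"]
model = fit_gaussian_nb(X_toy, y_toy)
prediction = predict_gaussian_nb(model, [1.1, 1.9])
```

Working in log space avoids numerical underflow when the likelihoods of many features are multiplied together.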
This repository and all of its contents are freely available for personal and commercial use under the MIT License; see the LICENSE file for full license details.