Commit

Wrapping up
- Some bug fixes and code improvements
- Update README.md
- Add latest data
caominhduy committed Aug 10, 2020
1 parent c864c0a commit 172774a
Showing 26 changed files with 829,904 additions and 810,916 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@

.DS_Store
*.pyc
demo.py
test.py
88 changes: 29 additions & 59 deletions README.md
@@ -23,88 +23,58 @@ Follow these instructions to get the project up and running on your local machin

These are what you **must** install before using our project.

1. [NumPy](https://pypi.org/project/numpy/) and [Matplotlib](https://pypi.org/project/matplotlib/)
1. [NumPy](https://pypi.org/project/numpy/), [Pandas](https://pandas.pydata.org/) and [Matplotlib](https://pypi.org/project/matplotlib/)

2. [Scikit-learn](https://scikit-learn.org/stable/install.html)

3. [TensorFlow](https://www.tensorflow.org/install) (release ≥ 2.0.0)
Optional:

1. [TensorFlow](https://www.tensorflow.org/install) (release ≥ 2.0.0) and [TensorFlow_docs](https://github.com/tensorflow/docs)

2. [Plotly](https://plotly.com/) including: plotly, chart_studio, and [Plotly Orca](https://github.com/plotly/orca)
3. [psutil](https://pypi.org/project/psutil/)

Your local machine must also have Python 3 (≥ 3.7) installed beforehand.
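
As a quick sanity check of the prerequisites above, a short script along these lines could report what is missing before you run the project. This is only a sketch and is not part of the repository: the helper names `missing_modules` and `python_is_supported` are hypothetical, and the import names (for example `sklearn` for Scikit-learn) are assumptions about how each package is imported.

```python
import importlib.util
import sys

# Assumed import names; they do not always match the package names
# (Scikit-learn, for instance, is imported as "sklearn").
REQUIRED = ['numpy', 'pandas', 'matplotlib', 'sklearn']
OPTIONAL = ['tensorflow', 'plotly', 'psutil']


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 7)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


if __name__ == '__main__':
    if not python_is_supported():
        print('Python 3.7 or newer is required.')
    print('Missing required modules:', missing_modules(REQUIRED) or 'none')
    print('Missing optional modules:', missing_modules(OPTIONAL) or 'none')
```

`importlib.util.find_spec` only looks a module up without importing it, so the check stays fast even when a heavy package such as TensorFlow is installed.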

### Run
To run project, first clone this repository, then run this command
To run this project, first clone this repository.
```
python3 pandemic-central
git clone https://github.com/solveforj/pandemic-central.git
```
Alternatively, you can download the zip package from the latest release **(we recommend cloning and pulling, since the repository contains the latest data files and hot fixes).**
<br>

For more details, please also read `USAGE.md`.
For basic usage, run this command:
```
python covid.py -d
```
or
```
python covid.py --default
```
This command should download the data from sources, preprocess them, train, and export predictions.
<br><br><br>
For the full list of available commands, use:
```
python covid.py --help
```

### GitHub
Make sure you always clone and pull the lastest data from Pandemic Central.
**Notice that our repository can always be found at https://github.com/solveforj/pandemic-central.**


## Project Structure
This is not complete project structure, read USAGE.md for more details.
```
pandemic-central/
├── __init__.py
├── __main__.py
├── data/
├── raw_data/
├── processed_data/
├── models/
├── generate_data.py
├── LICENSE.txt
├── predict.py
├── preprocess.py
├── README.md
├── tf_predict.py
├── train.py
└── USAGE.md
```
In which:
- `raw_data/` contains the raw mobility datasets (in csv or txt formats) for preprocessing.

- `processed_data/` contains processed and merged mobility data that is ready for training.

- `data/` contains other necessary raw or processed datasets such as census or epidemiology.

- `models/` contains saved TensorFlow models from training and for later deployment.

- `preprocess.py` preprocesses raw Google and Apple mobility data (among other tasks) for eventual integration into training datasets.

- `generate_data.py` processes and merges all mobility, socioeconomic, and health data into the final training datasets.

- `train.py` trains a Random Forest Regression model using Scikit-Learn and appends predictions to the dataset. *This is currently the default model.*

- `tf_predict.py` trains a TensorFlow Neural Network model. *This currently an experimental model.*

- `predict.py` generates predictions for each county for the last 5 weeks, generating the latest detailed predictions, which we add to this repository daily.

- `LICENSE.txt` is MIT license.

- `README.md` is what you are reading now.

- `USAGE.md` is a detailed manual for specific use case.
Make sure you always clone and pull the latest version from Pandemic Central.
**Our repository can always be found at https://github.com/solveforj/pandemic-central.**

## Authors
* [**Joseph Galasso**](https://github.com/solveforj/)
* [**Duy Cao**](https://github.com/caominhduy/)
* [**Kimberly Diwa**](https://github.com/kdiwa/)

## Support
Since this is still in its earliest versions, bugs and incompletions are unavoidable. Please feel free to comment or make a pull request.
Since this project is still in its earliest versions, bugs and incomplete features are unavoidable. Please feel free to comment or contact our developers!
Your contributions are very valuable to us and this project.

For technical support, please email our developers:
[jgalasso@itsonit.com](mailto:jgalasso@itsonit.com) (Joseph) or [dcao@udallas.edu](mailto:dcao@udallas.edu) (Duy). Thank you for your patience.

## Versioning
Our latest version is v1.0.2. For version details, see **Releases** tags.
Our latest version is v2.0.0. For version details, see **Releases** tags.

## Credits
Our project could not have been completed without these great sources. We do not own any data; all input data we use are open-source or permission-granted. More details about how we process this data may be found in `generate_data.py` and `preprocess.py`.
@@ -128,6 +98,6 @@ Our latest version is v1.0.2. For version details, see **Releases** tags.
9. [US Census Population Data](https://www.census.gov/data/tables/time-series/demo/popest/2010s-counties-detail.html)
10. [USDA FIPS Code List](https://www.ers.usda.gov/data-products/rural-urban-commuting-area-codes/)

We also thank the TensorFlow and Python communities for very detailed and helpful official documentations.
We also thank the Plotly, TensorFlow, and Python communities for their detailed and helpful documentation.

**Please check out these resources for yourself!**
220 changes: 0 additions & 220 deletions USAGE.md

This file was deleted.

7 changes: 5 additions & 2 deletions covid.py
@@ -6,7 +6,7 @@
__author__ = 'Duy Cao, Joseph Galasso'
__copyright__ = '© Pandamic Central, 2020'
__license__ = 'MIT'
__status__ = 'beta'
__status__ = 'release'
__url__ = 'https://github.com/solveforj/pandemic-central'
__version__ = '2.0.0'

@@ -24,7 +24,9 @@ def main(args):
    if args.map:
        from data.graphics.draw import draw_map
        draw_map()

    if args.ag:
        merge(apple_google_mobility=True)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='COVID-19 County Prediction\n',\
                                    usage='use "-h" or "--help" for more instructions')
@@ -34,5 +36,6 @@ def main(args):
    parser.add_argument('-o', '--predict', action='store_true', help='Predict and export predictions only')
    parser.add_argument('--map', action='store_true', help='Render map for existing predictions')
    parser.add_argument('--tf', action='store_true', help=argparse.SUPPRESS)
    parser.add_argument('--ag', action='store_true', help=argparse.SUPPRESS)
    args = parser.parse_args()
    main(args)
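
Note on the hidden flags in this hunk: passing `help=argparse.SUPPRESS` keeps an option working while leaving it out of the `--help` output, which appears to be the intent for `--tf` and `--ag`. The standalone sketch below illustrates that behavior. It is not the project's `covid.py`; `build_parser` is a hypothetical helper, and only the two flags copied from the diff are registered.

```python
import argparse


def build_parser():
    """Build a tiny parser with one documented flag and one hidden flag."""
    parser = argparse.ArgumentParser(description='COVID-19 County Prediction')
    # Documented flag: appears in the --help listing.
    parser.add_argument('--map', action='store_true',
                        help='Render map for existing predictions')
    # Hidden flag: still parsed, but omitted from usage and help text.
    parser.add_argument('--ag', action='store_true', help=argparse.SUPPRESS)
    return parser


parser = build_parser()
args = parser.parse_args(['--ag'])
help_text = parser.format_help()
print('ag flag set:', args.ag)
print('--ag listed in help:', '--ag' in help_text)
```

Because both flags use `action='store_true'`, each defaults to `False` when omitted, so `main` can test `args.ag` and `args.map` directly.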
10 changes: 8 additions & 2 deletions data/CCVI/preprocess.py
@@ -1,11 +1,17 @@
"""
This module preprocesses CCVI Index.
Data source: https://docs.google.com/spreadsheets/d/1qEPuziEpxj-VG11IAZoa5RWEr4GhNoxMn7aBdU76O5k/edit#gid=549685106
"""

import pandas as pd

__author__ = 'Duy Cao, Joseph Galasso'
__copyright__ = '© Pandamic Central, 2020'
__license__ = 'MIT'
__version__ = '2.0.0'
__status__ = 'beta'
__status__ = 'release'
__url__ = 'https://github.com/solveforj/pandemic-central'
__version__ = '2.0.0'

def preprocess_disparities():
    print('• Processing CCVI Data')