This project focuses on analyzing Amazon Prime user data to uncover insights related to user demographics, subscription behavior, and engagement trends. The dataset is processed using Python, and the analysis results are visualized for better understanding.
- Amazon_prime_analysis.xlsx: Contains raw and processed Amazon Prime user data.
- Cleaned_amazon_prime_users.csv: Cleaned dataset used for further analysis.
- Updated_amazon_prime_users.csv: Updated dataset with refined data.
- Amazon_Prime_Final_EDA_Output.ipynb: Jupyter notebook containing exploratory data analysis (EDA) and key insights.
Ensure you have the following installed:
- Python 3.x
- Jupyter Notebook
- Pandas
- Matplotlib / Seaborn (for visualization)
- Clone this repository:
git clone https://github.com/rohitblaze10/amazon-prime-analysis.git
cd amazon-prime-analysis
- Install the required Python libraries:
pip install pandas matplotlib seaborn jupyter
- Open Jupyter Notebook:
jupyter notebook
- Run Amazon_Prime_Final_EDA_Output.ipynb to process and analyze the data.
- View results in Amazon_prime_analysis.xlsx or the cleaned CSV files.
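To check the cleaned data outside the notebook, a minimal sketch along these lines may help. It assumes the script is run from the repository root and that Cleaned_amazon_prime_users.csv is a plain comma-separated file; the helper name `load_users` is hypothetical and not part of this project.

```python
from pathlib import Path

import pandas as pd


def load_users(csv_path="Cleaned_amazon_prime_users.csv"):
    """Load the cleaned dataset, or return None if the file is missing.

    `load_users` is a hypothetical helper written for this example.
    """
    path = Path(csv_path)
    if not path.exists():
        return None
    return pd.read_csv(path)


users = load_users()
if users is None:
    print("Cleaned_amazon_prime_users.csv not found; run this from the repository root.")
else:
    # Row/column counts and column types give a quick sanity check.
    print(users.shape)
    print(users.dtypes)
```

Returning None for a missing file keeps the script from crashing when it is run from the wrong directory.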
- Subscription Plans:
- The majority of users subscribe to the Monthly plan, followed by Annual and Family plans.
- User Engagement:
- Users with auto-renewal enabled tend to have higher engagement metrics.
- Smart TVs are the most used devices for accessing Prime services.
- Popular Purchase Categories:
- The top three purchase categories among Prime users are Electronics, Books, and Clothing.
- Customer Support Interactions:
- Users with low feedback ratings tend to contact customer support more frequently.
- Demographics:
- The dataset includes users across various age groups and locations, with balanced gender representation.
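Findings like the ones above are typically produced with simple pandas aggregations. The sketch below shows the general idea on a few made-up rows. The column names (`subscription_plan`, `auto_renewal`, `engagement_score`) are assumptions and may not match the headers in the project's CSV files, and the numbers are illustrative only, not results from the dataset.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the project's dataset.
# Column names are hypothetical and may differ from the real CSV headers.
toy = pd.DataFrame(
    {
        "subscription_plan": ["Monthly", "Monthly", "Annual", "Monthly", "Family", "Annual"],
        "auto_renewal": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
        "engagement_score": [8, 7, 3, 4, 9, 6],
    }
)


def plan_share(df, plan_col="subscription_plan"):
    """Return each plan's share of users, largest share first."""
    return df[plan_col].value_counts(normalize=True)


def mean_engagement_by_renewal(df, renewal_col="auto_renewal", score_col="engagement_score"):
    """Return the average engagement score for each auto-renewal group."""
    return df.groupby(renewal_col)[score_col].mean()


shares = plan_share(toy)
by_renewal = mean_engagement_by_renewal(toy)
print(shares)
print(by_renewal)
```

To apply this to the real data, replace `toy` with the loaded DataFrame and pass the actual column names as arguments.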
Below are some key visualizations from our analysis:
- Subscription Plan Distribution – Highlights the most popular plans.
- Device Usage for Prime Access – Shows preferred devices for streaming.
- Most Purchased Categories – Displays frequently bought product categories.
And more...
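A chart such as the subscription plan distribution can be drawn with Matplotlib roughly as follows. This is a sketch rather than the notebook's actual plotting code: the counts are made up, and the function name `plot_plan_distribution` and the output file name are hypothetical.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up counts for illustration only; not results from the dataset.
plan_counts = pd.Series({"Monthly": 50, "Annual": 30, "Family": 20})


def plot_plan_distribution(counts, out_path="plan_distribution.png"):
    """Draw a bar chart of users per plan and save it as an image file."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(list(counts.index), list(counts.values))
    ax.set_xlabel("Subscription plan")
    ax.set_ylabel("Number of users")
    ax.set_title("Subscription Plan Distribution")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)  # release the figure's memory once it is saved
    return Path(out_path)


saved = plot_plan_distribution(plan_counts)
print(f"Chart written to {saved}")
```

Inside a Jupyter notebook, the `matplotlib.use("Agg")` line can be dropped so that figures render inline.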
- Improve data cleaning and preprocessing techniques.
- Implement predictive modeling to forecast user retention.
- Add more advanced visualizations.
Feel free to fork this repository and make improvements. Pull requests are welcome!
[Specify a license, e.g., MIT, Apache 2.0]