GitHub - danushkhanna/Statistical-Analysis-of-National-Medical-Exam-Performance-Using-ML: National Eligibility cum Entrance Test (NEET) Rank Estimation and College Clustering using Linear Regression and K-means Algorithms.

Introduction

With the power of data-driven growth, let’s help reimagine what's possible and empower the next generation of medical professionals. We solve two problems:

Challenges in providing accurate counseling due to missing NEET scores and ranks in student profiles.
Lack of insights into college preferences and attrition rates hinders effective guidance for students' academic choices.

Data Preparation

Problem 1: Incomplete Data in Student Scores and Ranks. Around 40% of the dataset has missing NEET scores and ranks.

Our study commences with an in-depth examination of data analytics, aiming to reveal valuable insights capable of influencing academic counseling practices. We scrutinized a sizable dataset comprising 100,000 entries, noting that 40% of the data contained gaps, posing a notable challenge for analysis.
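As a rough illustration of how the share of incomplete entries can be measured, the sketch below counts records with a missing rank or score. The field names (`neet_rank`, `neet_score`) and the sample records are hypothetical and are not taken from the project's dataset.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"student_id": 1, "neet_rank": 1200, "neet_score": 640},
    {"student_id": 2, "neet_rank": 5400, "neet_score": None},
    {"student_id": 3, "neet_rank": None, "neet_score": None},
    {"student_id": 4, "neet_rank": 300, "neet_score": 690},
    {"student_id": 5, "neet_rank": 9100, "neet_score": 560},
]


def missing_share(rows, fields):
    """Return the fraction of rows where at least one of `fields` is None."""
    if not rows:
        return 0.0
    incomplete = sum(
        1 for row in rows if any(row.get(field) is None for field in fields)
    )
    return incomplete / len(rows)


share = missing_share(records, ["neet_rank", "neet_score"])
print(f"{share:.0%} of records have a missing rank or score")
```

With a tabular library such as pandas, the same check is usually a one-liner over the relevant columns; the plain-Python version is used here only to keep the example dependency-free.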

Rank Estimation with Linear Regression

To address the problem of missing NEET scores and ranks in student profiles, our project used Linear Regression to estimate the missing values from the entries where both are known.
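A minimal sketch of rank-based estimation with linear regression follows, assuming a single-feature model that predicts a score from a rank. The helper `fit_line` and the sample pairs are illustrative assumptions, not the project's actual code or data; the repository may well use a library model such as scikit-learn's `LinearRegression` instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (rank, score) pairs; None marks a missing score.
students = [
    (100, 700), (1000, 680), (5000, 640), (10000, 600),
    (3000, None), (8000, None),
]

# Fit only on the rows where the score is known.
known = [(rank, score) for rank, score in students if score is not None]
slope, intercept = fit_line([r for r, _ in known], [s for _, s in known])

# Fill each missing score with the model's prediction for that rank.
filled = [
    (rank, score if score is not None else round(slope * rank + intercept))
    for rank, score in students
]
print(filled)
```

The fitted slope is negative here because a larger (worse) rank corresponds to a lower score. A straight line is only an approximation of the rank-to-score relationship, so estimates far outside the range of known ranks should be treated with caution.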

College Clustering with K-means Algorithm

Problem 2: Colleges experience varying levels of attrition (dropout rates) among students, and the factors influencing student attrition are poorly understood.

To address this, our project delved into College Clustering using the k-means algorithm. By applying this technique to a dataset of 400 colleges, we clustered them based on Round 1 closings and attrition rates. To refine our clustering process, we employed methods like the elbow curve and silhouette score. Our goal was to gain insights into the factors contributing to student attrition.
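The clustering step described above can be sketched as follows, assuming two features per college (Round 1 closing rank and attrition percentage). The data values and helper names are hypothetical, and the code uses only the standard library so that it runs without dependencies; it is not the project's implementation, which may rely on scikit-learn's `KMeans` and `silhouette_score`.

```python
import random


def scale(points):
    """Min-max scale each column to [0, 1] so no feature dominates distances."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - lo) / sp for v, lo, sp in zip(p, lows, spans)) for p in points]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, iters=100):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(sq_dist(p, q) ** 0.5 for q in own) / (len(own) - 1)
        b = min(
            sum(sq_dist(p, q) ** 0.5 for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical colleges: (Round 1 closing rank, attrition %).
colleges = [
    (500, 2.0), (800, 3.0), (1200, 2.5),
    (15000, 10.0), (16000, 12.0), (17000, 11.0),
    (40000, 30.0), (42000, 28.0), (45000, 32.0),
]
X = scale(colleges)

# Elbow curve: inertia per k. Silhouette: higher means better-separated clusters.
for k in range(2, 6):
    labels, _, inertia = kmeans(X, k)
    print(k, round(inertia, 4), round(silhouette(X, labels), 3))
```

The features are scaled first because closing ranks and attrition percentages live on very different numeric ranges. A typical way to choose k is to look for the point where inertia stops dropping sharply (the elbow) and confirm it with a high silhouette score. Because k-means depends on its random starting centroids, running it with several seeds and keeping the lowest-inertia result is a common safeguard.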

Visualizations

Continuing our analysis, we computed the "Attraction Index" for colleges, providing insights into their appeal. This index, derived from analyzing 324 colleges, revealed a mean score of 94.45, highlighting the prestige of various institutions.

Summary Statistics

With the Attraction Index computed, we used K-means clustering, an unsupervised machine learning algorithm, to categorize the colleges based on their characteristics. This approach sets the stage for developing tailored counseling strategies that cater to the unique dynamics of each college.

Next, we validated our models and analyzed regional trends. Box plots show how attrition rates differ across states, providing insights into the educational landscape. Our Linear Regression model achieved an R-squared value of 0.8925 during validation, meaning it explains about 89% of the variance in Expected Scores predicted from NEET Ranks.
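For readers unfamiliar with the metric, the sketch below shows how an R-squared value is computed from held-out actual and predicted scores. The numbers are made up for illustration and do not reproduce the 0.8925 result reported above.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical validation scores and the model's predictions for them.
actual = [700, 680, 640, 600]
predicted = [695, 670, 650, 605]

score = r_squared(actual, predicted)
print(f"R-squared: {score:.4f}")
```

A value of 1.0 means the predictions match the actual scores exactly, while 0.0 means the model does no better than always predicting the mean score.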

Usage

If you're excited to leverage the potential of our project, follow these steps:

Clone this repository to your local machine.
Install the required libraries and dependencies listed in the requirements.txt file.
Execute the provided scripts to run the rank estimation and college clustering yourself.

Contributions

Your contributions are invaluable to us! Whether it's enhancing existing features, introducing novel insights, or refining the codebase, we welcome your input. Feel free to submit issues or pull requests to be a part of this project.


Table of Contents:

Introduction

Data Preparation

Rank Estimation with Linear Regression

College Clustering with K-means Algorithm

Visualizations

Summary Statistics

Usage

Contributions
