National Eligibility cum Entrance Test (NEET) Rank Estimation and College Clustering using Linear Regression and K-means Algorithms.

danushkhanna/Statistical-Analysis-of-National-Medical-Exam-Performance-Using-ML

Introduction

This project uses data analysis to support academic counseling for the next generation of medical professionals. It addresses two problems:

  • Challenges in providing accurate counseling due to missing NEET scores and ranks in student profiles.
  • Lack of insights into college preferences and attrition rates hinders effective guidance for students' academic choices.

Data Preparation

Problem 1: Incomplete Data in Student Scores and Ranks. Around 40% of the dataset is missing NEET scores or ranks.

We began by examining a dataset of 100,000 entries. About 40% of those entries contained gaps, which is the main obstacle to accurate analysis and counseling.
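As a rough illustration of how the share of incomplete records can be measured, here is a minimal sketch. The field names (`neet_rank`, `neet_score`) and the sample rows are hypothetical and are not taken from the project's dataset.

```python
# Hypothetical student records; None marks a missing value.
records = [
    {"student_id": 1, "neet_rank": 1200, "neet_score": 655},
    {"student_id": 2, "neet_rank": 48000, "neet_score": None},
    {"student_id": 3, "neet_rank": None, "neet_score": 512},
    {"student_id": 4, "neet_rank": 90500, "neet_score": 430},
    {"student_id": 5, "neet_rank": 15000, "neet_score": 601},
]


def missing_share(rows, fields):
    """Return the fraction of rows in which at least one of `fields` is None."""
    if not rows:
        return 0.0
    incomplete = sum(1 for row in rows if any(row.get(f) is None for f in fields))
    return incomplete / len(rows)


share = missing_share(records, ["neet_rank", "neet_score"])
print(f"{share:.0%} of records have a missing rank or score")
```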

Rank Estimation with Linear Regression

To address Problem 1 (missing NEET scores and ranks in student profiles), we used Linear Regression. The model relates NEET Ranks to Expected Scores, so a value missing from a student's profile can be estimated from the value that is present.
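The idea can be sketched as follows: fit a straight line on the records that have both a rank and a score, then use it to fill in the missing scores. This is a minimal, dependency-free sketch and not the project's actual code; the data points are invented, and a single straight line is an assumption made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical (rank, score) pairs; None marks a missing score.
students = [
    (1000, 695), (5000, 692), (20000, 655), (50000, None),
    (80000, 545), (120000, None), (150000, 398),
]

# Fit only on the complete records.
known = [(rank, score) for rank, score in students if score is not None]
intercept, slope = fit_line([r for r, _ in known], [s for _, s in known])

# Fill each missing score with the model's estimate; keep observed scores as-is.
filled = {
    rank: (score if score is not None else intercept + slope * rank)
    for rank, score in students
}

for rank, score in filled.items():
    print(f"rank {rank:>6}: score {score:.1f}")
```

A better rank corresponds to a smaller rank number, so the fitted slope is expected to be negative.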

College Clustering with K-means Algorithm

Problem 2: Colleges experience varying levels of attrition (dropout rates) among students, and the factors that influence attrition are poorly understood.

To address this, we clustered colleges with the k-means algorithm. Working with a dataset of 400 colleges, we grouped them by their Round 1 closings and attrition rates. To choose the number of clusters, we used the elbow curve and the silhouette score. Our goal was to gain insight into the factors contributing to student attrition.
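The clustering step can be sketched as below. This is a small standard-library illustration under stated assumptions, not the project's implementation: the nine colleges are invented, the features are standardized before clustering, and a deterministic farthest-first initialisation replaces the random initialisation that k-means libraries usually use. The inertia values are what an elbow curve plots, and the silhouette score is computed for each candidate `k`.

```python
import math
from statistics import mean, pstdev


def standardize(points):
    """Rescale each feature to zero mean and unit variance so neither dominates."""
    columns = list(zip(*points))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(p, mus, sds)) for p in points]


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm with farthest-first initialisation (a simplification)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(max_iters):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(tuple(mean(col) for col in zip(*members)) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    labels = assign(points, centers)
    # Inertia: sum of squared distances to the assigned center (the elbow-curve y-axis).
    inertia = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = mean(own)
        b = min(mean(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
                for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return mean(scores)


# Hypothetical colleges: (Round 1 closing rank, attrition rate in %).
colleges = [
    (800, 2.0), (1500, 3.5), (2500, 3.0),
    (44000, 14.0), (46000, 16.0), (45000, 15.5),
    (88000, 29.0), (92000, 31.0), (90000, 30.5),
]
scaled = standardize(colleges)

results = {}
for k in range(2, 5):
    labels, inertia = kmeans(scaled, k)
    results[k] = (labels, inertia, silhouette(scaled, labels))
    print(f"k={k}: inertia={inertia:.3f}, silhouette={results[k][2]:.3f}")

best_k = max(results, key=lambda k: results[k][2])
print("k with the highest silhouette score:", best_k)
```

On real data the elbow and the silhouette peak are rarely this clear-cut, so both are best read together rather than treated as a single automatic answer.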

Visualizations

Continuing our analysis, we computed the "Attraction Index" for colleges, providing insights into their appeal. This index, derived from analyzing 324 colleges, revealed a mean score of 94.45, highlighting the prestige of various institutions.

Summary Statistics

Alongside the Attraction Index, we used K-means clustering, an unsupervised machine learning algorithm, to group the colleges by their characteristics. These groups provide a basis for counseling strategies tailored to the dynamics of each type of college.

Next, we validated the models and analyzed regional trends. Box plots show how attrition rates differ across states. The Linear Regression model achieved an R-squared value of 0.8925 during validation, meaning it explains roughly 89% of the variance in Expected Scores when predicting them from NEET Ranks.
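For readers unfamiliar with the metric, the following sketch shows how an R-squared value is computed on held-out validation data. The training and validation points are hypothetical, so the number it prints is unrelated to the 0.8925 reported above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical (rank, score) pairs, split into training and validation sets.
train = [(1000, 695), (5000, 692), (20000, 655), (80000, 545), (150000, 398)]
valid = [(10000, 676), (50000, 601), (120000, 462)]

# Fit on the training set only, then score the model on unseen validation rows.
intercept, slope = fit_line([r for r, _ in train], [s for _, s in train])
predicted = [intercept + slope * rank for rank, _ in valid]
r2 = r_squared([score for _, score in valid], predicted)
print(f"Validation R-squared: {r2:.4f}")
```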

Usage

To run the project locally, follow these steps:

  1. Clone this repository to your local machine.
  2. Install the required libraries and dependencies listed in the requirements.txt file.
  3. Execute the provided scripts to run the rank estimation and college clustering analyses.

Contributions

Your contributions are invaluable to us! Whether it's enhancing existing features, introducing novel insights, or refining the codebase, we welcome your input. Feel free to submit issues or pull requests to be a part of this project.
