
This project predicts and analyzes student scores from selected features using a linear regression machine learning algorithm. It covers most of linear regression theory.


HirudikaAnupama/Student-Score-Prediction-Linear-Regression


Student Score Prediction Using Linear Regression

Most of the linear regression theory is covered in this project.

Using these theories, we can predict students' marks through this program.


Introduction

  • To train this linear regression model so that it is accurate and fast, we need to follow several steps, including:

    Data Collecting
    Data Preprocessing
    Data Analysis
    Split the Data into training and testing
    Evaluate the Model
    Check model performance
    Fine-tune the Model
  • Each step is introduced in the sections below, and the code for each step can be found in the project's source.


Data Collecting

  • We must collect the data that the problem requires.
  • The target variable (the dependent variable we want to predict, y) determines which other data we need to collect as features (independent variables).

Data Preprocessing

  • After collecting the data, we need to clean it:

    Find missing data and fill it in
    Drop duplicate data
    Turn categorical data into numerical or Boolean data
    Rename columns so they are easy to understand
    Separate the target value from the features
  • We can use an encoding method (for example, label encoding) or dummy variables (one-hot encoding) to convert categorical data into numerical or Boolean data.
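The cleaning steps above can be sketched with pandas as follows. This is a minimal illustration, not the repository's actual code: the column names (`Hrs`, `Gender`, `Marks`) and all values are made up, and filling missing values with the column mean is only one of several reasonable choices.

```python
import pandas as pd

# Hypothetical raw data; column names and values are invented for illustration.
raw = pd.DataFrame({
    "Hrs": [2.0, 5.0, None, 5.0, 8.0],
    "Gender": ["F", "M", "M", "M", "F"],
    "Marks": [35.0, 60.0, 52.0, 60.0, 88.0],
})

df = raw.copy()

# 1. Find missing data and fill it (here: with the column mean).
df["Hrs"] = df["Hrs"].fillna(df["Hrs"].mean())

# 2. Drop exact duplicate rows.
df = df.drop_duplicates().reset_index(drop=True)

# 3. Turn the categorical column into a 0/1 dummy column.
#    drop_first=True keeps one column per binary category.
df = pd.get_dummies(df, columns=["Gender"], drop_first=True, dtype=int)

# 4. Rename columns so they are easy to understand.
df = df.rename(columns={
    "Hrs": "study_hours",
    "Marks": "score",
    "Gender_M": "is_male",
})

# 5. Separate the target (y) from the features (X).
y = df["score"]
X = df.drop(columns=["score"])

print(X)
print(y)
```

The order matters here: filling missing values before dropping duplicates means rows that only differed by a missing entry are compared on their filled values.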

Data Analysis

  • We can analyze the relationships between the target and the features using plots, graphs, etc.
  • The sample plots below illustrate how these relationships can be identified.

(Figures: six sample plots.)
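Alongside plots, a correlation coefficient gives a quick numeric summary of how strongly each feature is linearly related to the target. The sketch below uses NumPy; the feature names and values are hypothetical, and `pearson_r` is a helper defined only for this example.

```python
import numpy as np

# Hypothetical values for illustration only.
study_hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
sleep_hours = np.array([7.0, 6.0, 8.0, 7.0, 6.0, 8.0])
score = np.array([20.0, 32.0, 41.0, 55.0, 61.0, 74.0])


def pearson_r(x, y):
    """Pearson correlation between two 1-D arrays (value in [-1, 1])."""
    return float(np.corrcoef(x, y)[0, 1])


r_study = pearson_r(study_hours, score)
r_sleep = pearson_r(sleep_hours, score)

print(f"study_hours vs score: r = {r_study:.3f}")
print(f"sleep_hours vs score: r = {r_sleep:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship, which makes the feature a good candidate for a linear regression model; a value near 0 suggests little linear relationship.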


Split the Data into training and testing

  • We need to separate the data set into training and testing sets.
  • The training data is used to train the model, and the testing data is used to measure the model's accuracy.
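A shuffled split can be sketched in a few lines of NumPy. The helper `split_train_test` and its parameters are hypothetical names for this example, and the 80/20 ratio is an assumption; scikit-learn's `train_test_split` serves the same purpose.

```python
import numpy as np


def split_train_test(X, y, test_size=0.2, seed=0):
    """Shuffle the rows, then split them into training and testing parts."""
    X = np.asarray(X)
    y = np.asarray(y)
    n = len(y)
    n_test = max(1, int(round(n * test_size)))

    # A fixed seed makes the split reproducible.
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)

    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


# Toy data: 10 rows with 2 features each; row i has target i.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10, dtype=float)

X_train, X_test, y_train, y_test = split_train_test(X, y, test_size=0.2, seed=42)
print(X_train.shape, X_test.shape)
```

The same index order is applied to `X` and `y`, so each feature row stays paired with its target value.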

Evaluate the Model

  • Then we can train the linear regression model using the training data.
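Training a linear regression model amounts to finding the intercept and weights that minimize the squared error on the training data. The sketch below does this with ordinary least squares in NumPy; `fit_linear_regression` and `predict` are hypothetical helpers, and the data is constructed so that the score is exactly 2 + 10 × hours. The repository may well use a library estimator instead.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares: returns (intercept, weights)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]


def predict(X, intercept, weights):
    """Apply the fitted line (or hyperplane) to new feature rows."""
    return intercept + np.asarray(X, dtype=float) @ weights


# Hypothetical training data: score = 2 + 10 * study hours.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([12.0, 22.0, 32.0, 42.0])

intercept, weights = fit_linear_regression(X_train, y_train)
print(intercept, weights)
print(predict([[5.0]], intercept, weights))
```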

Check model performance

  • We use the testing data set for this process.
  • We can use mean squared error (MSE), mean absolute error (MAE), and the R² score to check performance.
  • MSE measures the average of the squares of the errors, that is, the average squared difference between the actual and predicted values.
  • Lower MSE values indicate a better fit. However, since MSE is in squared units of the response variable, it can be harder to interpret directly.
  • R², or the coefficient of determination, indicates the proportion of the variance in the dependent variable that is predictable from the independent variables.
  • R² usually lies between 0 and 1. An R² of 1 indicates that the model perfectly explains the variability of the response data around its mean, while an R² of 0 indicates that the model does not explain any of the variability. On test data, R² can be negative when the model predicts worse than always predicting the mean.
  • We can compare these metrics between the training and test data.
  • We can also use a residual plot to assess the trained model's performance.
  • Additional information - https://www.geeksforgeeks.org/regression-metrics/
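The three metrics above can be written out directly from their definitions. The helpers `mse`, `mae`, and `r2` below are defined for this sketch only, and the actual and predicted scores are hypothetical.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical actual and predicted test scores.
y_true = np.array([50.0, 60.0, 70.0, 80.0])
y_pred = np.array([52.0, 58.0, 73.0, 79.0])

print(f"MSE = {mse(y_true, y_pred):.3f}")  # 4.500
print(f"MAE = {mae(y_true, y_pred):.3f}")  # 2.000
print(f"R2  = {r2(y_true, y_pred):.3f}")   # 0.964

# Residuals (actual minus predicted) are what a residual plot displays.
residuals = y_true - y_pred

# A model that predicts worse than the mean gives a negative R2.
bad_pred = y_true[::-1]
print(f"R2 of reversed predictions = {r2(y_true, bad_pred):.3f}")  # -3.000
```

In a residual plot, residuals scattered evenly around zero with no visible pattern suggest that a linear model is a reasonable fit.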



Fine-tune the Model

  • Fine-tuning a Linear Regression (LR) model involves optimizing the model parameters and improving its performance by making adjustments based on the evaluation of its results. Here are some steps and techniques for fine-tuning a Linear Regression model:

Feature Engineering


Regularization

  • Apply techniques to prevent overfitting by adding a penalty to the model's complexity

    1. Ridge Regression: Adds an L2 penalty to the loss function.
    2. Lasso Regression: Adds an L1 penalty to the loss function.
    3. Elastic Net: Combines both L1 and L2 penalties.

  • Additional information - https://www.geeksforgeeks.org/regularization-in-machine-learning/
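To make the effect of an L2 penalty concrete, the sketch below fits Ridge Regression in closed form with NumPy. It assumes the common convention of minimizing the sum of squared errors plus `alpha` times the squared weights, with the intercept left unpenalized; `fit_ridge` and `alpha` are names chosen for this example. Lasso and Elastic Net have no closed-form solution and are not shown.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Ridge regression in closed form: returns (intercept, weights).

    Minimizes sum((y - b - X @ w) ** 2) + alpha * sum(w ** 2).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center the data so the intercept is not penalized.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean

    n_features = X.shape[1]
    weights = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ weights
    return intercept, weights


# Hypothetical data: score = 2 + 10 * study hours.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([12.0, 22.0, 32.0, 42.0])

b0, w0 = fit_ridge(X, y, alpha=0.0)  # no penalty: ordinary least squares
b5, w5 = fit_ridge(X, y, alpha=5.0)  # penalty shrinks the weight toward 0

print(b0, w0)
print(b5, w5)
```

With `alpha=0.0` the fit recovers the unpenalized slope of 10; with `alpha=5.0` the slope shrinks to 5, which shows how the penalty trades a worse fit on the training data for smaller weights.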

Hyperparameter optimization


Cross-Validation



More information on functionality


Key Features

  • Implementation of linear regression machine learning algorithm
  • Data cleaning
  • Data visualization
  • Data analysis
  • Other

Contact
