Predicting a student's score based on their study hours
Author: Lafir
Introduction
In this project, we will train a machine learning model to predict the percentage of marks scored by a student based on their study hours.
This is a simple linear regression problem since it involves only two variables: Hours and Scores. The Hours column represents the number of study hours, and the Scores column represents the percentage of marks scored by the student.
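As a brief sketch of what "simple linear regression" means here, the model assumes the score is roughly a straight-line function of study hours. The symbols \beta_0 (intercept), \beta_1 (slope), and \varepsilon (error term) are notation introduced for this illustration and do not appear in the dataset:

```latex
\[
\text{Scores} = \beta_0 + \beta_1 \cdot \text{Hours} + \varepsilon
\]
```

Training the model amounts to estimating \beta_0 and \beta_1 from the observed (Hours, Scores) pairs.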
In this notebook, we will use the LinearRegression class from scikit-learn's linear_model module to train our model. We will also use libraries such as Pandas, NumPy, Matplotlib, and Seaborn to perform exploratory data analysis and gather insights for machine learning. The project involves the following activities:
- Download the Dataset
  - Install and import required libraries
  - Download data from Github
  - Load dataset with Pandas
- Explore the Dataset
  - Basic info about dataset
  - Exploratory data analysis & visualization
- Prepare Dataset for Training
  - Split into training and validation sets
  - Extract inputs and outputs (targets)
- Model Training
- Model Validation
- Model Application
- Inferences and Conclusion
- References
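Before going through each step in detail, the activities listed above can be sketched end to end in a few lines. This is only a minimal illustration under stated assumptions: the data is synthetic (generated with a made-up slope of 10 and intercept of 2), not the project's actual dataset, and the 80/20 split and the mean absolute error metric are choices made for this sketch rather than the notebook's confirmed settings.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption: NOT the project's data).
# Scores are generated as roughly 10 * Hours + 2, plus random noise.
rng = np.random.default_rng(42)
hours = rng.uniform(1.0, 9.5, size=30)
scores = 10.0 * hours + 2.0 + rng.normal(0.0, 3.0, size=30)
df = pd.DataFrame({"Hours": hours, "Scores": scores})

# Prepare dataset: split into training and validation sets (80/20 here).
train_df, val_df = train_test_split(df, test_size=0.2, random_state=42)

# Extract inputs (a 2-D frame with one column) and targets (a 1-D series).
X_train, y_train = train_df[["Hours"]], train_df["Scores"]
X_val, y_val = val_df[["Hours"]], val_df["Scores"]

# Model training: fit a straight line to the training data.
model = LinearRegression()
model.fit(X_train, y_train)

# Model validation: average absolute error on data the model has not seen.
val_mae = mean_absolute_error(y_val, model.predict(X_val))

# Model application: predict the score for a hypothetical 6 hours of study.
new_student = pd.DataFrame({"Hours": [6.0]})
predicted_score = float(model.predict(new_student)[0])

print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"validation MAE={val_mae:.2f}")
print(f"predicted score for 6 hours={predicted_score:.1f}")
```

Because the data is synthetic, the fitted slope should land close to the value used to generate it; with the real dataset, the coefficients and error will differ.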
1. Download the Dataset
Install and import required libraries
# Install all required libraries
!pip install jovian pandas matplotlib seaborn scikit-learn --upgrade --quiet