
Lesson 2

Classification Problems vs Regression Problems

Problems where each input must be assigned a discrete category (also called a label or class) are known as classification problems. For example, rainfall prediction assigns each input to one of two classes: "will rain" or "will not rain".
Classification problems can be binary (yes/no) or multiclass (picking one of many classes).

Problems where a continuous numeric value must be predicted for each input are known as regression problems.

Linear regression is a commonly used technique for solving regression problems. In a linear regression model, the target is modeled as a linear combination (or weighted sum) of input features. The predictions from the model are evaluated using a loss function like the Root Mean Squared Error (RMSE).

Logistic Regression

Logistic regression is a commonly used technique for solving binary classification problems. In a logistic regression model:

- we take a linear combination (or weighted sum) of the input features
- we apply the sigmoid function to the result to obtain a number between 0 and 1
- this number represents the probability of the input being classified as "Yes"
- instead of RMSE, the cross entropy loss function is used to evaluate the results
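
The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the model trained later in the tutorial: the feature values, weights, and bias are made-up numbers, and the helper names (`sigmoid`, `predict_proba`, `cross_entropy`) are hypothetical names chosen for this example.

```python
import math

def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    # Two branches so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(features, weights, bias):
    """Weighted sum of the input features, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Made-up toy data (assumption): two numeric features per input,
# with label 1 meaning "Yes" and 0 meaning "No".
inputs = [[0.9, 1.2], [0.3, -0.5], [0.6, 0.1]]
labels = [1, 0, 1]

# Made-up parameters (assumption); training would normally learn these.
weights = [2.0, 1.0]
bias = -1.5

probs = [predict_proba(x, weights, bias) for x in inputs]
loss = cross_entropy(labels, probs)

print("Predicted probability of 'Yes':", probs)
print("Cross entropy loss:", loss)
```

Each probability can be turned into a "Yes"/"No" prediction by comparing it with a threshold such as 0.5. A lower cross entropy means the predicted probabilities agree more closely with the labels.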

Classification and regression are both supervised machine learning problems, because they use labeled data. Machine learning applied to unlabeled data is known as unsupervised learning.

In this tutorial, we'll train a logistic regression model using the Rain in Australia dataset to predict whether or not it will rain at a location tomorrow, using today's data. This is a binary classification problem.

Download Data