DEMYSTIFIED: What’s the difference between Classification and Regression?

For today’s #futurefridays I’m going to answer a question that confuses a lot of people trying to learn Data Science and Machine Learning.

The question is “What’s the difference between Classification and Regression?”

Let me take a shot at this with a simple explanation and example.

Think about the output that you want to achieve.

If you have a dataset, and the output you want to get is a label or category, then it’s classification.

However, if the output you want to get is a numerical value on a continuous scale, computed from the dataset, then it’s regression.
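To make the contrast concrete, here is a minimal sketch in plain Python. The data (hours studied, pass/fail labels, exam scores) is made up for illustration, and the two techniques (1-nearest-neighbour for classification, a least-squares line for regression) are just simple stand-ins for each family, not the only way to do either.

```python
# Toy data (made up for illustration): hours studied by five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]

# Classification target: one label per student.
passed = ["fail", "fail", "pass", "pass", "pass"]

# Regression target: one number per student.
score = [52.0, 60.0, 68.0, 76.0, 84.0]


def classify(x):
    """1-nearest-neighbour: return the label of the closest known example."""
    nearest = min(range(len(hours)), key=lambda i: abs(hours[i] - x))
    return passed[nearest]


def fit_line(xs, ys):
    """Least-squares fit for one feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, score)


def predict_score(x):
    """Regression: return a number on a continuous scale."""
    return slope * x + intercept


print(classify(3.6))       # the output is a label
print(predict_score(3.6))  # the output is a number
```

Same input, two different kinds of answer: `classify` can only ever return one of the labels it has seen, while `predict_score` can return any number along the fitted line.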

Let me give you a specific example.

Facebook uses both classification and regression in their algorithms.

The difference is where they are used.

Classification

When you post on Facebook and upload an image that has faces of people who are Facebook users, the algorithm is able to determine who these people are and tag their names accordingly. Because the output is a label (a person’s name), what’s happening there is called Classification.
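Here is a toy sketch of the tagging idea in Python. It is not how Facebook actually does it. It assumes each face has already been turned into a short list of numbers (an "embedding"), and the names, the numbers, and the `tag_face` helper are all hypothetical. The point is only that the output is one name out of a fixed set.

```python
import math

# Hypothetical three-number "face embeddings" for known users (made-up values).
known_faces = {
    "Alice": [0.9, 0.1, 0.3],
    "Ben": [0.2, 0.8, 0.5],
    "Carla": [0.4, 0.4, 0.9],
}


def tag_face(embedding):
    """Return the known name whose stored embedding is closest to the input."""
    return min(
        known_faces,
        key=lambda name: math.dist(known_faces[name], embedding),
    )


# A new face whose numbers sit closest to Alice's stored embedding.
print(tag_face([0.85, 0.15, 0.35]))
```

Whatever face you pass in, the answer is always one of the known names, which is what makes this a classification problem.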

Regression

Now, when people like, comment on, and share the post, the algorithm uses those interactions to estimate how “viral” the post is, and whether it should be shown in other people’s feeds or not. This is usually based on count data and a corresponding weight for each behavior or action. Because the output is a number (a score), what’s happening there is called Regression.
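The "counts times weights" idea can be sketched in a few lines of Python. The weights and the `virality_score` name below are assumptions for illustration only. In a trained regression model the weights would be learned from historical data rather than set by hand.

```python
# Hypothetical weight for each action (made-up values). A real regression
# model would learn these from past data instead of hard-coding them.
WEIGHTS = {"likes": 1.0, "comments": 2.0, "shares": 3.0}


def virality_score(counts):
    """Weighted sum of action counts; missing actions count as zero."""
    return sum(WEIGHTS[action] * counts.get(action, 0) for action in WEIGHTS)


post = {"likes": 120, "comments": 30, "shares": 10}
print(virality_score(post))  # 120*1.0 + 30*2.0 + 10*3.0 = 210.0
```

The output here is a number that can go up or down smoothly as the counts change, not a label, and that is what makes it a regression-style output.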

Ending note

Of course, this is an oversimplification of Classification and Regression. There’s more to it than that, such as contextual analytics, quality of content, determining if it’s a human or a bot, etc.

But my objective here was to demystify things and help shed some light on the topic.

I hope you learned something new today.

If you want to learn how to become a Data Scientist, some of the best learning tracks I’ve seen are the Data Science learning paths from DataCamp.

You can start learning today even for free. Check it out! Click the link below.

Data Science at DataCamp

What did you learn that applies to you? What will you implement moving forward?