City College, Fall 2019

Intro to Data Science

Week 6: Regression vs. Classification

October 16, 2019

Today's Agenda
  1. Regression vs. Classification
  2. Logistic Regression
  3. Measuring Performance for Classification Models
Week 5 Recap
  1. Linear Regression
  2. Assumptions for Linear Models
  3. Measuring Performance for Linear Models
Data Science Models
Generalized Linear Models

A generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows the response variable to have an error distribution other than the normal distribution.
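
One way to write this definition down, as a sketch: a link function g connects the expected response to a linear combination of the inputs. The symbols g, \mu, \beta and x are notation assumed here for illustration; they do not appear on the slides.

```latex
% General form: a link function g applied to the mean response
g(\mu) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p,
\qquad \mu = \mathbb{E}[\, y \mid x \,]

% Linear regression: identity link, normally distributed errors
g(\mu) = \mu

% Logistic regression: logit link, Bernoulli (0/1) response
g(\mu) = \log\frac{\mu}{1 - \mu}
```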

Regression vs. Classification

Is this a good forecast?

Regression analysis estimates the conditional expectation of the dependent variable given the independent variables.


Classification is the problem of identifying to which of a set of categories a new observation belongs.
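
To make the contrast concrete, here is a minimal sketch using the standard library only. The data (hours studied, exam scores, pass/fail labels) is hypothetical, and the nearest-centroid rule is just a simple stand-in classifier chosen for brevity, not a method from the slides. The regression model returns a number; the classifier returns a category.

```python
from statistics import mean

# Hypothetical toy data: one feature (hours studied) for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 65.0, 71.0, 79.0]  # continuous target -> regression
passed = [0, 0, 1, 1, 1]                 # categorical target -> classification

def fit_line(x, y):
    """Ordinary least squares with one feature; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def fit_centroids(x, labels):
    """Nearest-centroid classifier: store the mean feature value per class."""
    return {c: mean([xi for xi, li in zip(x, labels) if li == c])
            for c in set(labels)}

intercept, slope = fit_line(hours, scores)
centroids = fit_centroids(hours, passed)

def predict_score(h):
    """Regression output: an estimate of the expected score (a number)."""
    return intercept + slope * h

def predict_pass(h):
    """Classification output: the class whose centroid is closest (a label)."""
    return min(centroids, key=lambda c: abs(h - centroids[c]))

print(predict_score(6.0))  # a continuous value
print(predict_pass(6.0))   # a category: 0 or 1
```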

Regression vs. Classification
Logistic Regression

The coefficients have no closed-form solution, so they are fit by numerical optimization, typically gradient descent.
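
A minimal sketch of that optimization, assuming NumPy is available. The toy data, the learning rate, and the step count are all illustrative choices rather than values from the lecture, and plain batch gradient descent on the mean log loss is only one of several ways libraries fit this model.

```python
import numpy as np

def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(X, y, w):
    """Mean negative log-likelihood of the 0/1 labels y under weights w."""
    p = sigmoid(X @ w)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit_logistic(X, y, learning_rate=0.1, n_steps=5000):
    """Batch gradient descent on the mean log loss."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        p = sigmoid(X @ w)                 # current predicted probabilities
        gradient = X.T @ (p - y) / len(y)  # gradient of the mean log loss
        w -= learning_rate * gradient      # step downhill
    return w

# Hypothetical toy data: one feature, labels that overlap slightly.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept

w = fit_logistic(X, y)
loss_before = log_loss(X, y, np.zeros(2))  # loss with all-zero weights
loss_after = log_loss(X, y, w)             # loss after fitting
print(w, loss_before, loss_after)
```

With all-zero weights every prediction is 0.5, so the starting loss equals log 2; the fitted weights should give a lower loss and a positive slope on this data.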

Regression vs. Classification
Logistic Regression Output

Logistic regression and many other classification models output a continuous value between 0 and 1.
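
A short sketch of how that continuous output becomes a class label. The raw scores are made-up inputs, and the 0.5 cutoff is a common default assumed here, not a value stated on the slides; in practice the threshold is a choice.

```python
import math

def sigmoid(z):
    """Logistic function: squashes a raw model score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def to_label(probability, threshold=0.5):
    """Turn a predicted probability into a 0/1 class label."""
    return 1 if probability >= threshold else 0

raw_scores = [-2.0, 0.0, 3.0]  # hypothetical linear-model outputs
probabilities = [sigmoid(z) for z in raw_scores]
labels = [to_label(p) for p in probabilities]

print(probabilities)
print(labels)
```

Raising the threshold makes the model more reluctant to predict the positive class, which trades recall for precision.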

Measuring Classification Performance
  1. Confusion Matrix
  2. Precision
  3. Recall
  4. Accuracy
(as explained by the zombie apocalypse)
Confusion Matrix
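
A confusion matrix tallies predictions against true labels in four cells: true positives, false positives, false negatives, and true negatives. A minimal sketch in plain Python follows; the labels are hypothetical, with 1 standing for "zombie" and 0 for "human".

```python
def confusion_counts(actual, predicted, positive=1):
    """Count the four cells of a binary confusion matrix."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1  # predicted zombie, really a zombie
        elif p == positive:
            counts["fp"] += 1  # predicted zombie, really a human
        elif a == positive:
            counts["fn"] += 1  # predicted human, really a zombie
        else:
            counts["tn"] += 1  # predicted human, really a human
    return counts

# Hypothetical labels: 1 = zombie, 0 = human.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

cells = confusion_counts(actual, predicted)
print(cells)
```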
Precision

zombie apocalypse use case: you're hunting zombies, and you need to kill as many zombies as possible without killing any humans
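
Precision is the share of predicted positives that are truly positive: TP / (TP + FP). A small sketch, using made-up counts: of 10 targets the model flagged as zombies, 9 really were zombies and 1 was a human. Returning 0.0 when nothing was flagged is a convention chosen for this example.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are actually positive."""
    flagged = tp + fp
    if flagged == 0:
        return 0.0  # nothing was predicted positive (convention used here)
    return tp / flagged

# Hypothetical counts: 9 zombies and 1 human among the flagged targets.
hunt_precision = precision(tp=9, fp=1)
print(hunt_precision)
```

High precision matters when a false positive is costly, as it is for the human in this scenario.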

Recall

zombie apocalypse use case: you discover a cure for zombies, but can only apply it to k infected people
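
Recall is the share of actual positives that the model finds: TP / (TP + FN). A small sketch with made-up counts: 10 people are truly infected, the model identifies 6 of them and misses 4. Returning 0.0 when there are no actual positives is a convention chosen for this example.

```python
def recall(tp, fn):
    """Fraction of actual positives that the model identified."""
    actual_positives = tp + fn
    if actual_positives == 0:
        return 0.0  # no positives exist to be found (convention used here)
    return tp / actual_positives

# Hypothetical counts: 6 infected people found, 4 missed.
cure_recall = recall(tp=6, fn=4)
print(cure_recall)
```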

Accuracy

zombie apocalypse use case: zombies have infected roughly half the population, and you're throwing them a party. You are putting together an invite list and want to make sure you invite an equal number of zombies and humans.
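
Accuracy is the share of all predictions that are correct: (TP + TN) / total. It is most informative when the classes are roughly balanced, as in the half-zombie population above. A minimal sketch with hypothetical labels (1 = zombie, 0 = human):

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

# Hypothetical balanced population: four zombies, four humans.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0]

guest_list_accuracy = accuracy(actual, predicted)
print(guest_list_accuracy)
```

When one class is rare, a model that always predicts the majority class can score high accuracy while finding no positives, which is why precision and recall are reported alongside it.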


Assignment 6: Due Thursday, October 24 by 11:59pm

DataCamp's Machine Learning with Tree-Based Models in Python

  • The course should appear as a single assignment within your existing DataCamp account.
  • Course claims to take 5 hours, but I found it shorter than some of the past courses. Nonetheless, use your time wisely.