Week 5: Intro to Linear Models

October 7, 2019

- Types of Data
- Useful Statistical Distributions
- Important Summary Statistics
- Key Theorems

- Modeling Theory
- Linear Regression
- Assumptions for Linear Models
- Measuring Performance for Linear Models

"Entities should not be multiplied unnecessarily."

*Simple models are preferable over more complex models.*

See also: Wikipedia, and a brief history.

*All models are wrong...*

*but some are useful.*

*A model is linear when each term is either a constant or the product of a parameter and a predictor variable.*
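As a minimal sketch of this definition, the snippet below fits a one-predictor model of the form y = b0 + b1 * x by ordinary least squares. The function name `fit_simple_ols` and all of the data are made up for illustration. The second fit uses x squared as the predictor, which still counts as linear because the model remains a constant plus a parameter times a predictor.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing squared error for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Closed-form least-squares solution for a single predictor
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Illustrative data generated exactly from y = 2 + 3x (no noise)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * xi for xi in x]
intercept, slope = fit_simple_ols(x, y)

# Still linear in the parameters: the predictor is x squared,
# but the model is a constant plus a parameter times that predictor.
x_squared = [xi ** 2 for xi in x]
y_curved = [1.0 + 0.5 * xi ** 2 for xi in x]
b0, b1 = fit_simple_ols(x_squared, y_curved)
```

By contrast, a form such as y = b0 * exp(b1 * x) is not linear under this definition, because b1 sits inside the exponential rather than multiplying a predictor.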

- Data is linear in form.
- Sample is random.
- Error terms have constant variance (homoscedasticity).
- Error terms have a mean of zero based on the observed data.
- Predictors are independent (no multicollinearity).
- Errors are normally distributed.
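Two of these assumptions can be probed roughly from the residuals of a fitted model. The sketch below is an informal check, not a formal test: the helper names (`fit_simple_ols`, `residuals`, `variance_ratio`) and the data are hypothetical, and the noise is deliberately made larger for large x so that the spread check has something to detect.

```python
from statistics import mean, pvariance


def fit_simple_ols(x, y):
    """Return (intercept, slope) for y = b0 + b1 * x via least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Observed value minus fitted value for each observation."""
    b0, b1 = fit_simple_ols(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def variance_ratio(x, resid):
    """Residual variance in the larger-x half divided by the smaller-x half.

    A ratio far from 1 hints at non-constant variance (heteroscedasticity).
    """
    ordered = [r for _, r in sorted(zip(x, resid))]
    half = len(ordered) // 2
    return pvariance(ordered[-half:]) / pvariance(ordered[:half])


# Made-up data: y = 1 + 2x plus noise that grows for larger x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, noise)]

resid = residuals(x, y)
mean_resid = mean(resid)          # close to zero for a fit with an intercept
ratio = variance_ratio(x, resid)  # well above 1 for this data
```

Note that least squares with an intercept forces the residuals to average zero, so `mean_resid` confirms the arithmetic rather than the assumption itself. The variance ratio and a plot of residuals against fitted values are usually more informative.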

What to look at when measuring the performance of a linear model:

- R-squared
- Adjusted R-squared
- Coefficients
- P-values
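The first two measures can be computed directly from observed and predicted values. The sketch below uses the standard definitions, with R-squared as 1 - SS_res / SS_tot and adjusted R-squared penalizing the number of predictors. The function names and the numbers are illustrative only. P-values are left out because they need a t-distribution, which the Python standard library does not provide.

```python
def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """R-squared adjusted for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Made-up observations and predictions from a one-predictor model
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n_obs=len(y_true), n_predictors=1)
```

Adjusted R-squared is never larger than R-squared. Unlike R-squared, it can fall when a predictor that adds little is included, which ties back to the preference for simpler models.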

xkcd

Email me teammate requests by October 14.

DataCamp's Supervised Learning with scikit-learn

- The course should appear as an assignment within your existing DataCamp account.
- The course takes ~4 hours; plan your time accordingly.