Week 5: Intro to Linear Models

March 11, 2019

- Types of Data
- Useful Statistical Distributions
- Important Summary Statistics
- Key Theorems

- Project Proposals: due March 18, details here
- Midterm: March 25
- Drawn primarily from course lectures
- Mix of multiple choice and short answer
- Similar to fall's exam, but longer in content and time
- Project Update: April 29
- Project Deadline & Presentation: **Wednesday**, May 15

- Modeling Theory
- Linear Regression
- Assumptions for Linear Models
- Measuring Performance for Linear Models

"Entities should not be multiplied unnecessarily."

*Simpler models are preferable to more complex ones.*

See also: Wikipedia, and a brief history.

*All models are wrong...*

*but some are useful.*

*A model is linear when each term is either a constant or the product of a parameter and a predictor variable.*
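
As a rough illustration of this definition, the sketch below computes a prediction as a constant plus a sum of parameter-times-predictor terms. The function name `linear_predict` and all numbers are hypothetical, chosen only for the example. Note that a squared input still fits the definition, because `x ** 2` is simply treated as another predictor; the model stays linear in its parameters.

```python
def linear_predict(intercept, coefficients, predictors):
    """Return intercept + sum(coefficient * predictor) for one observation."""
    if len(coefficients) != len(predictors):
        raise ValueError("need exactly one coefficient per predictor")
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))


# Hypothetical parameters: intercept 1.0, coefficients 2.0 and 0.5.
# The second predictor is x squared, which is still "a predictor variable".
x = 3.0
y_hat = linear_predict(1.0, [2.0, 0.5], [x, x ** 2])
print(y_hat)
```

By contrast, a term such as `b ** x` or `x / (1 + b)` would not qualify, since the parameter no longer enters as a simple multiplier.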

- Data is linear in form.
- Sample is random.
- Error terms have constant variance (homoscedasticity).
- Error terms have a mean of zero based on the observed data.
- Predictors are independent (no multicollinearity).
- Errors are normally distributed.
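
Two of the assumptions above can be eyeballed from the residuals of a fitted model. The following is a minimal sketch, assuming a single predictor and made-up data; the helper names `fit_simple_ols` and `residuals` are hypothetical. It fits a line with the closed-form least-squares estimates, confirms the residuals average to roughly zero, and compares residual spread in the lower and upper halves of `x` as a crude look at constant variance. It is not a formal test, and it does not check normality or multicollinearity.

```python
from statistics import mean, pvariance


def fit_simple_ols(xs, ys):
    """Closed-form least-squares intercept and slope for one predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Observed value minus fitted value for each observation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data for illustration only; xs is already sorted.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

intercept, slope = fit_simple_ols(xs, ys)
res = residuals(xs, ys, intercept, slope)

# With an intercept in the model, residuals average to zero up to rounding.
print("mean residual:", round(mean(res), 10))

# Crude homoscedasticity look: residual variance for low x vs. high x.
half = len(xs) // 2
print("variance (low x): ", pvariance(res[:half]))
print("variance (high x):", pvariance(res[half:]))
```

In practice a residuals-versus-fitted plot usually tells you more than these two numbers, but the idea is the same: look for structure in what the model failed to explain.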

in the case of:

- R-squared
- Adjusted R-squared
- Coefficients
- P-values
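
The first two measures can be computed by hand. The sketch below uses the standard definitions, R-squared as one minus the ratio of residual to total sum of squares, and adjusted R-squared as the same quantity penalized for the number of predictors. The function names and the data are hypothetical, and the predictions are invented rather than taken from a fitted model. P-values are left out because they require a t-distribution, which Python's standard library does not provide.

```python
def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot: share of variance in y explained by the model."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """R-squared penalized for the number of predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Invented observations and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.0, 2.0, 3.0, 4.0, 6.0]

r2 = r_squared(y_true, y_pred)
adj = adjusted_r_squared(r2, n_obs=len(y_true), n_predictors=1)
print("R-squared:", r2)
print("Adjusted R-squared:", adj)
```

Adjusted R-squared is never larger than R-squared, and the gap widens as predictors are added without a matching gain in fit, which ties back to the preference for simpler models.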

xkcd

Details on the project are now available here.