City College, Fall 2018

Intro to Data Science

Course Intro: What is Data Science and Why Does It Matter?

August 27, 2018

What is Data Science?
Is that helpful?
Who are Data Scientists?
About Me



About Tech in Residence

About this Course
grantmlong.com/teaching
Official Course Objectives
  1. Explain the key steps in a data science project.
  2. Apply Python to load, clean, and process data sets.
  3. Identify key elements of and patterns in a data set using computational analysis and statistical methods.
  4. Explain and visualize empirical findings using Python and other resources.
  5. Explain fundamental principles of machine learning.
  6. Apply predictive algorithms to a data set.
  7. Work effectively in a team dedicated to analyzing data.
Why Take this Course?
  1. Careers in data are abundant, lucrative, and rewarding.
  2. Learn how to detect BS.
  3. Be a more informed person.
Resources: Coding
Resources: Notebooks

Resources: Class Communications
Course Page
How to Get Help
itds.ccny@gmail.com
Classmates
Grading
Project 40%
Assignments & Quizzes 30%
Midterm Exam 20%
Class Participation 10%
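
As a rough illustration of how these weights combine, here is a minimal Python sketch. The function name `course_grade`, the category keys, and the example scores are hypothetical, and it assumes each category is first expressed as a percentage from 0 to 100; the syllabus does not spell out the exact calculation.

```python
# Category weights taken from the grading breakdown above (they sum to 1.0).
WEIGHTS = {
    "project": 0.40,
    "assignments_quizzes": 0.30,
    "midterm": 0.20,
    "participation": 0.10,
}


def course_grade(scores):
    """Return a weighted course grade on a 0-100 scale.

    `scores` maps each category name to a percentage (0-100).
    Assumption: a plain weighted sum, which the syllabus does not confirm.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student, for illustration only.
example = {
    "project": 90,
    "assignments_quizzes": 80,
    "midterm": 70,
    "participation": 100,
}
print(round(course_grade(example), 2))  # 0.4*90 + 0.3*80 + 0.2*70 + 0.1*100 = 84.0
```

Under this assumption, a weak midterm moves the final grade less than a weak project does, since the project carries twice the weight.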
Project

The largest component of the course grade is a group project due in December (exact date TBD). Students will be expected to work on the project during the second half of the class and will be required to present their progress throughout the semester. Grades will be assigned on the basis of overall project quality, demonstration of core principles taught in the class, and individual contributions to the group's effort. More details on the project will be discussed in the second week of class.

Assignments and Exams
  • Assignments. This class includes short, frequent assignments to check comprehension. All assignments and quizzes will be graded on a 5-point scale. All quizzes will be announced in advance of class.
    • No late assignments accepted. Assignments not turned in by the set deadline will be scored as 0/5. Exceptions will be granted only as mandated by CUNY policy.
    • The two lowest assignment scores will be dropped, including any missed assignments.
  • Exam. A short midterm exam will be held in October and will focus on broad concepts the course has surveyed thus far. The format will mimic the style of questions frequently asked in interviews for data-related roles.
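
The assignment policy above (5-point scale, late or missed work scored 0/5, two lowest scores dropped) can be sketched in Python as follows. The function name `assignment_percentage` and the sample marks are hypothetical, and averaging the remaining points is an assumption about how the scores are aggregated, not something the syllabus states.

```python
def assignment_percentage(scores, n_dropped=2, max_points=5):
    """Percentage earned on assignments after dropping the lowest scores.

    `scores` is a list of marks on the 0-5 scale; a missed or late
    assignment is entered as 0, so it can be one of the dropped scores.
    Assumption: the remaining marks are simply averaged.
    """
    if len(scores) <= n_dropped:
        raise ValueError("need more scores than the number dropped")
    kept = sorted(scores)[n_dropped:]  # discard the n_dropped lowest marks
    return 100 * sum(kept) / (max_points * len(kept))


# Hypothetical marks for seven assignments, two of them missed.
marks = [5, 4, 0, 3, 5, 0, 4]
print(assignment_percentage(marks))  # kept marks are [3, 4, 4, 5, 5] -> 84.0
```

In this example both missed assignments are absorbed by the drop rule; a third missed assignment would count as a 0 in the average.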
Texts and Materials
  • Required Text: Data Science from Scratch, Joel Grus. 1st Edition, April 2015 (O'Reilly). Available online.
  • Additional required readings and videos will be made available to students in advance of each week's assignments. All will be available online at no cost.
  • In addition to the required materials, students may find the following resources helpful in supplementing course materials:
    • Recommended Text: Python for Data Analysis, Wes McKinney. 2nd Edition, October 2017 (O'Reilly). Available online.
    • Recommended Text: The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2nd Edition, 2009 (Springer). Available free online.
Cheating

Academic dishonesty is prohibited in The City University of New York. Penalties for academic dishonesty include academic sanctions, such as failing or otherwise reduced grades, and/or disciplinary sanctions, including suspension or expulsion.


CUNY Policy on Academic Integrity
Data Science in Practice
Today's Data
H-1B Visa Data

Homework
  • No class next Monday, September 3.
  • First assignment (due Tuesday, September 4 at 11:59pm):
    • Email itds.ccny@gmail.com with:
      1. Two original insights from our H-1B data dive that we did not discuss in class.
      2. How you prefer to be addressed in class (name, pronouns).
      3. The email address you prefer to use for class correspondence.
      4. Your GitHub handle. (Sign up for an account if you do not already have one; a free account is fine.)
      5. The top three things you hope to get out of this class. (No wrong answers)