City College, Fall 2019

Intro to Data Science

Week 14: Life in Data

December 9, 2019

Today's Agenda
  1. Project Debrief
  2. What to Expect in an Interview
  3. What to Look for in a Data Job
  4. Ethical Considerations in Data
Projects: Results from Submission 2
Rank  Team Name            MSE        Points  Median Error  Share under 10%
1     GIN                  435,192    +12     $196          65.6%
2     Science Data         488,104    +12     $215          60.7%
3     EDS                  545,513    +8      $249          55.2%
4     The Data Scientists  548,583    +8      $223          59.9%
5     GodZillow            582,126    +4      $251          54.9%
6     The Divers           678,275    +4      $325          45.7%
7     Datalicious          699,689    +4      $201          62.8%
      Demo Model           1,669,750          $366          42.20%
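For reference, the three scoring columns could be computed along the lines of the sketch below. The slides do not define the metrics, so this assumes that "Median Error" is the median absolute error in dollars and that "Share under 10%" is the fraction of predictions whose absolute error is less than 10% of the actual value. The function name `evaluate` and the sample numbers are made up for illustration and are not the course's grading code.

```python
from statistics import mean, median


def evaluate(actual, predicted, threshold=0.10):
    """Return (mse, median_abs_error, share_within_threshold).

    Assumed definitions (not confirmed by the slides):
      - mse: mean of squared errors
      - median_abs_error: median of |predicted - actual|
      - share_within_threshold: fraction of rows where
        |predicted - actual| < threshold * |actual|
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    errors = [p - a for a, p in zip(actual, predicted)]
    mse = mean(e * e for e in errors)
    median_abs_error = median(abs(e) for e in errors)
    within = sum(1 for a, e in zip(actual, errors) if abs(e) < threshold * abs(a))
    return mse, median_abs_error, within / len(actual)


# Made-up example values, not taken from the project data.
actual = [2000, 3000, 2500, 4000]
predicted = [2100, 2900, 3000, 4100]
mse, med, share = evaluate(actual, predicted)
print(mse, med, share)
```

Note that a single large miss inflates the MSE much more than the median error, which may be why the MSE ranking and the median-error ranking in the table do not always agree (compare teams 4 and 7).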
Successful Strategies
  • Simple models can work well, but adding data helps: MSE for demo model = 1,669,750
  • Nonlinear models outperformed linear models
  • Building details were helpful: ~35 percent of buildings in test had units in train
  • Monolithic models rule
  • More data would have been helpful...
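The building-overlap figure above can be checked with a few lines of code. The sketch below is one reading of that statistic: it counts distinct buildings in the test set and asks what share of them also appear in the training set. The column name `building_id` and the sample rows are hypothetical; the project's actual schema is not shown in the slides.

```python
def share_of_test_buildings_in_train(train_rows, test_rows, key="building_id"):
    """Fraction of distinct test-set buildings that also occur in the training set.

    `key` is a hypothetical column name; substitute whatever field
    identifies a building in the real data.
    """
    train_buildings = {row[key] for row in train_rows}
    test_buildings = {row[key] for row in test_rows}
    if not test_buildings:
        raise ValueError("test_rows contains no buildings")
    return len(test_buildings & train_buildings) / len(test_buildings)


# Made-up rows: each dict is one unit (listing) in a building.
train_rows = [
    {"building_id": "A", "rent": 2000},
    {"building_id": "A", "rent": 2200},
    {"building_id": "B", "rent": 3100},
]
test_rows = [
    {"building_id": "A"},
    {"building_id": "C"},
    {"building_id": "D"},
    {"building_id": "D"},
]
overlap_share = share_of_test_buildings_in_train(train_rows, test_rows)
print(f"{overlap_share:.1%} of test buildings also appear in train")
```

When the overlap is sizable, features derived from other units in the same building (for example, that building's typical value in the training set) become available for a meaningful part of the test set.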
What to Expect in an Interview
My Interviews
  1. Recruiter screen
  2. Semi-technical phone screen
  3. Coding exercise
  4. Onsite with the team
Tips
  1. Know the basics
  2. Sweat the details
  3. Be resourceful
  4. Show your passion
What to Look for in a Data Job
How good is their data? How much of your time will be spent cleaning it?
Are they looking for a data [scientist/engineer/analyst] or a wizard?
Who is your boss? Is it someone you can learn from?
What kind of technical resources do you have? (Computer, installation rights, cloud resources, visualization software)
How do they handle code?
Do they use the latest and greatest in open source software?
How seriously do they take recruiting?
You can do amazing things with data!

Should you?
Major Ethical Issues in Data
  1. Storage and Handling of Sensitive Information
  2. Bias in Algorithms
  3. Consent to Share Data
  4. Using Technology to Circumvent Law
  5. Bullsh!t
You can do amazing things with data!

You should.