Please go to this link and get a Census API key:
api.census.gov/data/key_signup.html
HW Recap
Please follow instructions!
Best sources for data?
Interesting Stats
Two vehicles received over 1,000 parking tickets in 2017.
Iceland has the highest percentage of people with internet access, at 98.24%.
As of 2015, the richest 10% in each country own, on average, 34.24% of that country's wealth.
Tip: Write the tweet.
Week 1 Recap.
Data in the News This Week
The Situation in Venezuela
flightradar24.com
A Masterclass in Data
Where Does Data Come From?
Government Agencies
Private Firms
Individuals
Structured vs. Unstructured
Structured Data
Data that has a well-defined model and is clearly organized.
Structured data has a clear definition of what constitutes an observation, and is typically carefully collected and often well-documented.
Common Examples: stock prices, employee records, medical test results.
Unstructured Data
Data that lacks clear organization or does not follow a set model.
Unstructured data often requires significant effort to turn into a useful data set.
Common Examples: transcripts and other collections of text, code, or activity logs.
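To make the contrast concrete, here is a minimal Python sketch. The employee records and the call transcript are invented for illustration only. With structured records, each observation already has named fields, so a summary is a one-liner. With unstructured text, a parsing step (here a simple regular expression, one of many possible approaches) is needed before any analysis can start.

```python
import re

# Structured: every observation is one record with the same named fields.
# (Hypothetical values, for illustration only.)
employee_records = [
    {"name": "Ana", "department": "Sales", "salary": 50000},
    {"name": "Ben", "department": "Sales", "salary": 60000},
]
average_salary = sum(r["salary"] for r in employee_records) / len(employee_records)

# Unstructured: free text with no predefined fields.
# (Hypothetical transcript, for illustration only.)
transcript = "Caller said the bill was $120 too high. Agent refunded $45 after review."

# We have to decide what counts as an observation and extract it ourselves.
dollar_amounts = [int(m) for m in re.findall(r"\$(\d+)", transcript)]

print(average_salary)
print(dollar_amounts)
```

Note that the regular expression only captures whole-dollar amounts written with a dollar sign. Real transcripts would need far more careful handling, which is the "significant effort" mentioned above.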
Is it structured or unstructured?
Stock Market Data
Call Center Transcripts
Facebook Likes
Credit Card Statements
Database of New York Times Articles
The Selfies on Your Phone
Click Data
Website HTML Code
Why is this important?
Common Ways to Access Data
Databases
Flat files
APIs
Scraping
Tools to access data
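As a rough sketch of two of these access routes, the example below uses only Python's standard library: `csv` for a flat file and `sqlite3` for a database. The city names and ticket counts are made up, and an in-memory string and an in-memory SQLite database stand in for a real file and a real database server.

```python
import csv
import io
import sqlite3

# Flat file: a StringIO object stands in for a CSV file on disk.
# (Hypothetical data, for illustration only.)
flat_file = io.StringIO("city,tickets\nAlbany,12\nBuffalo,7\n")
rows = list(csv.DictReader(flat_file))  # one dict per line, keyed by the header

# Database: an in-memory SQLite database stands in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (city TEXT, tickets INTEGER)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [(r["city"], int(r["tickets"])) for r in rows],  # CSV values arrive as strings
)
total_tickets = conn.execute("SELECT SUM(tickets) FROM tickets").fetchone()[0]
conn.close()

print(rows)
print(total_tickets)
```

The flat file hands back every value as text, so the conversion to `int` is our job. The database enforces the column types for us and lets us ask questions in SQL.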
This Week's Data: Part 1
U.S. Census Bureau
Conducts a full count of the U.S. population every 10 years
Estimates and projects U.S. population between counts
Conducts economic surveys of manufacturing, retail, service, and other establishments and of domestic governments
American Community Survey (ACS)
Ongoing survey - conducted continuously
Includes ancestry, educational attainment, income, language proficiency, migration, disability, employment, and housing characteristics
Sent to approximately 295,000 addresses monthly (or 3.5 million per year) [source]
Let's Jump Into Some Code!
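As a starting point, here is a hedged sketch of how a Census API request might be assembled and how its response might be reshaped. It assumes the 2017 ACS 5-year endpoint and the total-population variable `B01003_001E`; the helper names `build_acs_url` and `rows_to_records` are made up for this example, and `YOUR_API_KEY` is a placeholder for the key you signed up for. The sample response uses placeholder values, not real Census figures, so the sketch runs without a network connection.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: 2017 American Community Survey, 5-year estimates.
BASE_URL = "https://api.census.gov/data/2017/acs/acs5"


def build_acs_url(variables, geography, api_key):
    """Assemble a request URL (hypothetical helper for this example)."""
    params = {"get": ",".join(variables), "for": geography, "key": api_key}
    # Keep commas, colons, and asterisks readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


def rows_to_records(payload):
    """Turn a list of rows, with the header as the first row, into dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


url = build_acs_url(["NAME", "B01003_001E"], "state:*", "YOUR_API_KEY")

# Placeholder response in the list-of-rows JSON shape (values are NOT real data).
sample_response = json.loads(
    '[["NAME", "B01003_001E", "state"], ["Example State", "12345", "00"]]'
)
records = rows_to_records(sample_response)

print(url)
print(records)
```

To fetch live data, the URL could be passed to `urllib.request.urlopen` or to a library such as `requests`. Values come back as strings, so numeric columns need converting before analysis.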
Assignment 2: Due Monday, February 11 by 6:30pm
DataCamp's Cleaning Data in Python
By tomorrow evening, everyone in the class should receive an invitation to join the course group at DataCamp, sent to the email address you gave me as your preferred email in Assignment 1. Please accept that invitation and complete all assignments in DataCamp through the account associated with that email. Assignments completed under other accounts will not be accepted. If you have not received an invitation to the course group at DataCamp, please email me as soon as possible.
For this part of the assignment, there is nothing to submit formally, as I will have reports on your progress from DataCamp.
The exercises in the course should be straightforward, but note that the course takes about 4 hours. Please plan your time accordingly.