Week 12: Unsupervised Learning
November 25, 2019

Rank | Team Name | MSE | Points |
---|---|---|---|
1 | GIN | 849,875 | +6 |
2 | EDS | 879,847 | +6 |
3 | The Data Scientists | 880,845 | +4 |
4 | RentAdvisor | 956,380 | +4 |
5 | Science Data | 1,039,054 | +2 |
6 | 100k Offer | 1,147,571 | +2 |
7 | Datalicious | 1,158,716 | +2 |
Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples.
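As a minimal sketch of "inferring a function from labeled training data", the snippet below fits a straight line to a few input-output pairs using ordinary least squares and scores it with mean squared error (MSE). The toy data and the helper names `fit_line` and `mse` are assumptions made for illustration; they do not come from the slides.

```python
# Labeled training data: every input x comes with a known output y.
# (Toy values, generated from y = 2x + 1 for illustration.)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(y_true, y_pred):
    """Mean squared error between true and predicted outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
train_mse = mse(ys, preds)
print(slope, intercept, train_mse)
```

The labels `ys` are what make this supervised: the fit is driven entirely by how far the predictions fall from the known outputs.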
Unsupervised learning is a branch of machine learning that learns from data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning identifies commonalities in the data and reacts based on the presence or absence of such commonalities in each new piece of data.
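One way to picture "identifying commonalities" without labels is clustering. The sketch below runs a very small one-dimensional k-means; the choice of k-means, the toy points, the starting centers, and the function name `kmeans_1d` are all assumptions for illustration rather than something the slides specify.

```python
def kmeans_1d(points, centers, iterations=10):
    """Tiny 1-D k-means (hypothetical helper): assign, then re-average."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Unlabeled data: no outputs are given, only the points themselves.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No feedback signal is used here; the two groups emerge only from how close the points are to one another.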
For more, check out these help notes from CS221 at Stanford.
These slides adapted from CS109 at Harvard.
See Wikipedia for more.
Topic modeling provides methods for automatically organizing, understanding, searching, and summarizing large electronic archives.
This slide adapted from Columbia's David Blei.
How many genes does an organism need to survive? Last week at the genome meeting here, two genome researchers with radically different approaches presented complementary views of the basic genes needed for life. One research team, using computer analyses to compare known genomes, concluded that today's organisms can be sustained with just 250 genes, and that the earliest life forms required a mere 128 genes. The other researcher mapped genes in a simple parasite and estimated that for this organism, 800 genes are plenty to do the job - but that anything short of 100 wouldn't be enough.
This slide adapted from Columbia's David Blei.
We'll use Gensim to build our topic model and pyLDAvis to visualize it.