This guide is a primer on the very basics of machine learning that are necessary to complete the assignments and motivate the final system. Machine learning is a rich and well-developed field with many different models, goals, and learning settings. There are many great texts that cover all the aspects of the area in detail. (I recommend this textbook.) This guide is not that. Our goal is to explain the minimal details of one dataset with one class of model. Specifically, this is an introduction to supervised binary classification with neural networks. The goal of this section is to learn how a basic neural network works to classify simple points.
Supervised learning problems begin with a labeled training dataset. We assume that we are given a set of labeled points. Each point has two coordinates \(x_1\) and \(x_2\), and has a label \(y\) corresponding to an O or X. For instance, here is one O labeled point:
And here is an X labeled point.
It is often convenient to plot all of the points together on one set of axes.
Here we can see that all the O points are in the top-right and all the X points are in the bottom-left. Not all datasets are this simple, and here is another dataset where points are split up a bit more.
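A labeled dataset like the ones described above can be held in a few lines of Python. The sketch below is only an illustration: the `Point` class, its field names, and the coordinate values are made up for this example and are not taken from the assignments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Point:
    """One labeled training example (hypothetical helper for illustration)."""

    x1: float  # first coordinate
    x2: float  # second coordinate
    y: str  # the label, either "O" or "X"


# Made-up coordinates: O points toward the top-right,
# X points toward the bottom-left.
dataset: List[Point] = [
    Point(0.8, 0.9, "O"),
    Point(0.7, 0.6, "O"),
    Point(0.2, 0.1, "X"),
    Point(0.3, 0.4, "X"),
]

# Split the dataset by label, as one would before plotting each group.
o_points = [p for p in dataset if p.y == "O"]
x_points = [p for p in dataset if p.y == "X"]
print(len(o_points), "O points and", len(x_points), "X points")
```

Keeping the coordinates and the label together in one record mirrors the definition above: each point is a pair \((x_1, x_2)\) plus a label \(y\).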
Later in the class, we will consider datasets of different forms, e.g. a dataset of handwritten numbers, where some are 8's and others are 2's:
Here is an example of what this dataset looks like.
In addition to a dataset, our ML system needs to specify a model type that we want to fit to the data. A model is a function that assigns labels to data points. In 2D, we can visualize a model by its decision boundary. For instance, consider the following (Model A).
For most of the data points, the model puts them in class X. Only for a little area on the top right would it decide to put those points in class O.
We can overlay the simple dataset described earlier over this model. This tells us roughly how well the model fits this dataset.
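One rough way to make "how well the model fits" concrete is to count how many points the model labels correctly. The sketch below rests on assumptions: `model_a` is a made-up stand-in for Model A (it predicts O only when \(x_1 + x_2\) exceeds 1.5, an arbitrary cutoff chosen so that only a small top-right corner is O), and the four data points are invented for illustration.

```python
from typing import Callable, List, Tuple

LabeledPoint = Tuple[float, float, str]  # (x1, x2, label)


def model_a(x1: float, x2: float) -> str:
    """Hypothetical stand-in for Model A: only a small top-right corner is O."""
    return "O" if x1 + x2 > 1.5 else "X"


def count_correct(
    model: Callable[[float, float], str], data: List[LabeledPoint]
) -> int:
    """Count the points whose predicted label matches the true label."""
    return sum(1 for x1, x2, y in data if model(x1, x2) == y)


# Made-up dataset: O points in the top-right, X points in the bottom-left.
data: List[LabeledPoint] = [
    (0.8, 0.9, "O"),
    (0.7, 0.6, "O"),
    (0.2, 0.1, "X"),
    (0.3, 0.4, "X"),
]

correct = count_correct(model_a, data)
print(f"{correct} of {len(data)} points classified correctly")
```

With these invented values the stand-in model gets every X point right but misses the O point at (0.7, 0.6), which sits outside its small O region.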
Models can take many different forms. Here is another model, which we will discuss more below, that splits the data points up based on three regions (Model B).
Models may also have strange shapes and even disconnected regions. Any blue/red split will do, for instance (Model C):
A model class specifies the general shape of models that you want to explore. Given that we as programmers don't know what the dataset looks like, we try to give a class of functions for our system to explore. Machine learning is the process of finding the best model from that class.
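As a toy illustration of finding the best model from a class, the sketch below uses a deliberately tiny, made-up model class: predict O when \(x_1\) exceeds a threshold `t`. It tries a handful of candidate thresholds and keeps the one that labels the most points correctly. The data, the candidate list, and the brute-force search are all assumptions for illustration; this is not how the models in this guide are actually trained.

```python
from typing import Callable, List, Tuple

LabeledPoint = Tuple[float, float, str]  # (x1, x2, label)
Model = Callable[[float, float], str]

# Made-up dataset: O points in the top-right, X points in the bottom-left.
data: List[LabeledPoint] = [
    (0.8, 0.9, "O"),
    (0.7, 0.6, "O"),
    (0.2, 0.1, "X"),
    (0.3, 0.4, "X"),
]


def make_model(t: float) -> Model:
    """Build one member of the (hypothetical) threshold model class."""

    def model(x1: float, x2: float) -> str:
        return "O" if x1 > t else "X"

    return model


def count_correct(model: Model, points: List[LabeledPoint]) -> int:
    """Count the points whose predicted label matches the true label."""
    return sum(1 for x1, x2, y in points if model(x1, x2) == y)


# Each candidate threshold picks out one model from the class.
candidates = [0.1, 0.5, 0.9]
best_t = max(candidates, key=lambda t: count_correct(make_model(t), data))
print("best threshold:", best_t)
```

The programmer fixes the shape of the model (a threshold on \(x_1\)), and the search fills in the specific value that best matches the data.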
The first model class we consider is linear models. Linear models separate the data space with only a single straight line. For instance, Model A is a linear model, but an intuitively "better" model looks like this:
Note that Model B also uses lines, but it is not a linear model: it uses multiple lines to split up the space.
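To make the distinction concrete, here is a hedged sketch. A single straight line in 2D can be written as one test of the form \(w_1 x_1 + w_2 x_2 + b > 0\); the weights used below are made-up values. The second function is a hypothetical stand-in for a Model B style classifier that consults two lines and so carves the plane into three regions. It is not the actual Model B from the figure.

```python
def linear_model(
    x1: float, x2: float, w1: float = 1.0, w2: float = 1.0, b: float = -1.0
) -> str:
    """One straight line: O on one side, X on the other (made-up weights)."""
    return "O" if w1 * x1 + w2 * x2 + b > 0 else "X"


def three_region_model(x1: float, x2: float) -> str:
    """Hypothetical two-line model: vertical lines at x1 = 0.3 and x1 = 0.7.

    The two outer strips are labeled X and the middle strip is labeled O.
    """
    if x1 < 0.3 or x1 > 0.7:
        return "X"
    return "O"


# Walk left to right along the horizontal line x2 = 0.6.
xs = [0.1, 0.5, 0.9]
linear_labels = [linear_model(x, 0.6) for x in xs]
region_labels = [three_region_model(x, 0.6) for x in xs]
print("linear model:      ", linear_labels)
print("three-region model:", region_labels)
```

Along any straight path, a linear model can switch labels at most once, because the path crosses its single line at most once. The three-region stand-in switches twice along the same path, which is why a model built from multiple lines falls outside the linear model class.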
Let's look at an example of some models. Here is some randomly generated data.