Introduction to Machine Learning
Machine learning is a set of tools that, broadly speaking, allow us to “teach” computers how to perform tasks by providing examples of how they should be done. For example, suppose we wish to write a program to distinguish between valid email messages and unwanted spam. We could try to write a set of simple rules, for example, flagging messages that contain certain features (such as the word “viagra” or obviously-fake headers). However, writing rules to accurately distinguish which text is valid can actually be quite difficult to do well, resulting either in many missed spam messages, or, worse, many lost emails. Moreover, the spammers will actively adjust the way they send spam in order to trick these strategies (e.g., writing “vi@gr@”). Writing effective rules, and keeping them up to date, quickly becomes an insurmountable task. Fortunately, machine learning has provided a solution. Modern spam filters are “learned” from examples: we provide the learning algorithm with example emails which we have manually labeled as “ham” (valid email) or “spam” (unwanted email), and the algorithms learn to distinguish between them automatically.
Machine learning is a diverse and exciting field, and there are multiple ways of defining it:
1. The Artificial Intelligence View. Learning is central to human knowledge and intelligence,
and, likewise, it is also essential for building intelligent machines. Years of effort in AI have shown that trying to build intelligent computers by programming all the rules cannot be done; automatic learning is crucial. For example, we humans are not born with the ability to understand language (we learn it), and it makes sense to try to have computers learn language instead of trying to program it all in.
2. The Software Engineering View. Machine learning allows us to program computers by
example, which can be easier than writing code the traditional way.
3. The Stats View. Machine learning is the marriage of computer science and statistics:
computational techniques are applied to statistical problems. Machine learning has been applied to a vast number of problems in many contexts, beyond the typical statistics problems. Machine learning is often designed with different considerations than statistics (e.g., speed is often more important than accuracy).
Often, machine learning methods are broken into two phases:
1. Training: A model is learned from a collection of training data.
2. Application: The model is used to make decisions about some new test data.
For example, in the spam filtering case, the training data consists of email messages labeled as ham or spam, and each new email message that we receive (and which we wish to classify) is test data. However, there are other ways in which machine learning is used as well.
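The two phases can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular spam filter: the word-count (naive Bayes) model, the function names `train` and `classify`, and the four example messages are all assumptions made up for this sketch.

```python
import math
from collections import Counter

def train(labeled_messages):
    """Training phase: count how often each word appears under each label."""
    word_counts = {"ham": Counter(), "spam": Counter()}
    label_counts = Counter()
    for text, label in labeled_messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(model, text):
    """Application phase: return the label with the higher log-probability score."""
    word_counts, label_counts = model
    vocabulary = set(word_counts["ham"]) | set(word_counts["spam"])
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("ham", "spam"):
        total_words = sum(word_counts[label].values())
        # Start from the fraction of training messages carrying this label.
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not give log(0).
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, hand-labeled training data (invented for this example).
training_data = [
    ("cheap viagra buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

model = train(training_data)                      # training
prediction = classify(model, "buy cheap viagra")  # application to new test data
```

Note that no rule mentioning any particular word is written by hand; the behavior comes entirely from the labeled examples, so relabeling or extending `training_data` changes the classifier without changing the code.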
Copyright © 2011 Aaron Hertzmann and David Fleet
Types of Machine Learning
Some of the main types of machine learning are:
1. Supervised Learning, in which the training data is labeled with the correct answers, e.g.,
“spam” or “ham.” The two most common types of supervised learning are classification (where the outputs are discrete labels, as in spam filtering) and regression (where the outputs are real-valued).
2. Unsupervised learning, in which we are given a collection of unlabeled data, which we wish
to analyze and discover patterns within. The two most important examples are dimension reduction and clustering.
3. Reinforcement learning, in which an agent (e.g., a robot or controller) seeks to learn the
optimal actions to take based on the outcomes of past actions.
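To make the unsupervised case concrete, here is a minimal sketch of clustering unlabeled 1D data. The text does not name a specific algorithm; k-means is used here as an assumption, and the function name `kmeans_1d`, the data points, and the initial centers are invented for illustration.

```python
def kmeans_1d(points, initial_centers, iterations=10):
    """Group 1D points around centers by alternating assignment and averaging."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its assigned points
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Unlabeled data: no "correct answers" are supplied, only the points themselves.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, clusters = kmeans_1d(points, initial_centers=[0.0, 6.0])
```

In contrast to the supervised setting, nothing here tells the algorithm which group a point belongs to; the grouping is discovered from the data alone.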
There are many other types of machine learning as well, for example:
1. Semi-supervised learning, in which only a subset of the training data is labeled.
2. Time-series forecasting, such as in financial markets.
3. Anomaly detection, such as that used for fault detection in factories and in surveillance.
4. Active learning, in which obtaining data is expensive, and so an algorithm must determine
A simple problem
Figure 1 shows a 1D regression problem. The goal is to fit a 1D curve to a few points. Which curve is best to fit these points? There are infinitely many curves that fit the data, and, because the data might be noisy, we might not even want to fit the data precisely. Hence, machine learning requires that we make certain choices:
1. How do we parameterize the model we fit? For the example in Figure 1, how do we
parameterize the curve; should we try to explain the data with a linear function, a quadratic, or a sinusoidal curve?
2. What criterion (e.g., objective function) do we use to judge the quality of the fit? For example,
when fitting a curve to noisy data, it is common to measure the quality of the fit in terms of the squared error between the data we are given and the fitted curve. When minimizing the squared error, the resulting fit is usually called a least-squares estimate.
3. Some types of models and some model parameters can be very expensive to optimize well.
How long are we willing to wait for a solution, or can we use approximations (or hand-tuning) instead?
4. Ideally we want to find a model that will provide useful predictions in future situations. That
is, although we might learn a model from training data, we ultimately care about how well it works on future test data. When a model fits training data well, but performs poorly on test data, we say that the model has overfit the training data; i.e., the model has fit properties of the input that are not particularly relevant to the task at hand (e.g., Figure 1, top row and bottom left). Such properties are referred to as noise. When this happens we say that the model does not generalize well to the test data. Rather, it produces predictions on the test data that are much less accurate than you might have hoped for given the fit to the training data.
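The trade-off between fitting the training data and generalizing can be illustrated with a small experiment. The sketch below is a toy built on stated assumptions: the data values are invented (points near the line y = 2x + 1 with hand-picked perturbations standing in for noise), the test targets are taken to be noise-free, and the two models, a least-squares line and a polynomial that passes through every training point, are chosen only to contrast a simple fit with an overfit one.

```python
def fit_line(xs, ys):
    """Least-squares estimate of y = a*x + b (closed-form solution)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_interpolant(xs, ys):
    """Lagrange polynomial that passes exactly through every training point."""
    def predict(x):
        total = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            basis = 1.0
            for m, xm in enumerate(xs):
                if m != j:
                    basis *= (x - xm) / (xj - xm)
            total += yj * basis
        return total
    return predict

def mean_squared_error(model, xs, ys):
    """Objective function: average squared error between predictions and data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Invented training data: y = 2x + 1 plus small hand-picked perturbations.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [1.3, 2.6, 5.5, 6.7, 9.4, 10.5]

# Held-out test inputs between the training points, with noise-free targets.
x_test = [0.5, 1.5, 2.5, 3.5, 4.5]
y_test = [2 * x + 1 for x in x_test]

line = fit_line(x_train, y_train)
interpolant = fit_interpolant(x_train, y_train)

for name, model in [("line", line), ("interpolant", interpolant)]:
    train_error = mean_squared_error(model, x_train, y_train)
    test_error = mean_squared_error(model, x_test, y_test)
    print(f"{name}: train MSE = {train_error:.4f}, test MSE = {test_error:.4f}")
```

In this toy setup the interpolating polynomial reproduces the training points exactly yet has the larger error on the held-out points, while the line has nonzero training error but predicts the held-out points more accurately. Judging a model by its training error alone would therefore pick the wrong one here.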
Machine learning provides a wide selection of options by which to answer these questions,
along with the vast experience of the community as to which methods tend to be successful on a particular class of data set. Some more advanced methods provide ways of automating some of these choices, such as automatically selecting between alternative models, and there is some beautiful theory that assists in gaining a deeper understanding of learning. In practice, there is no single “silver bullet” for all learning. Using machine learning in practice requires that you make use of your own prior knowledge and experimentation to solve problems. But with the tools of machine learning, you can do amazing things!
Figure 1: A simple regression problem. The blue circles are measurements (the training data), and the red curves are possible fits to the data. There is no one “right answer;” the solution we prefer depends on the problem. Ideally we want to find a model that provides good predictions for new inputs (i.e., locations on the x-axis for which we had no training data). We will often prefer simple, smooth models like that in the lower right.