Precision: Out of all the positive predictions, how many were correct?
Recall: Out of all the actual positives, how many were correctly identified?
Specificity: Out of all the actual negatives, how many were correctly identified?
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | TP | FN |
| Actual Negative | FP | TN |
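These three metrics follow directly from the confusion matrix counts. A minimal NumPy sketch (the function name `confusion_metrics` is my own, not from the notes):

```python
import numpy as np

def confusion_metrics(y_true, y_pred):
    """Compute precision, recall, and specificity from binary 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    precision = tp / (tp + fp)      # of predicted positives, fraction correct
    recall = tp / (tp + fn)         # of actual positives, fraction found
    specificity = tn / (tn + fp)    # of actual negatives, fraction found
    return precision, recall, specificity
```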
The receiver operating characteristic (ROC) curve is a plot of the true positive rate (recall, or sensitivity) vs. the false positive rate (1 - specificity) as the detection threshold varies.
The diagonal is the same as random guessing
A perfect classifier would hug the top left corner
Fun fact: the name comes from WWII radar operators, where true positives were airplanes and false positives were noise
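The threshold sweep can be sketched directly: for each candidate threshold, classify scores above it as positive and record the resulting TPR and FPR (the helper name `roc_points` is my own):

```python
import numpy as np

def roc_points(scores, labels):
    """Sweep the decision threshold over the scores and return
    arrays of (false positive rate, true positive rate) points."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Start above every score (predict nothing positive), then step down.
    thresholds = np.concatenate(([np.inf], np.sort(scores)[::-1]))
    n_pos = np.sum(labels == 1)
    n_neg = np.sum(labels == 0)
    fpr, tpr = [], []
    for t in thresholds:
        pred = scores >= t
        tpr.append(np.sum(pred & (labels == 1)) / n_pos)
        fpr.append(np.sum(pred & (labels == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)
```

A classifier that ranks all positives above all negatives traces through the top-left corner (FPR 0, TPR 1); random scores stay near the diagonal.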

For a predicted value $\hat{y}$ given an input $x$:

In 1D, the estimate is modeled as:

$$\hat{y} = wx + b$$

Vector form:

$$\hat{y} = \mathbf{w}^\top \mathbf{x} + b$$

The goal is to minimize the Mean Squared Error:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$
Most of the time we have more than one input feature.

N-D: each input is $\mathbf{x}_i \in \mathbb{R}^D$ with weights $\mathbf{w} \in \mathbb{R}^D$.

It is common to use a design matrix $X \in \mathbb{R}^{N \times D}$, where each row is an instance (sample) and each column is a feature (a bias term can be absorbed by appending a column of ones to $X$).

We can rewrite the estimate in matrix notation:

$$\hat{\mathbf{y}} = X\mathbf{w}$$

The MSE can be written as:

$$\mathrm{MSE} = \frac{1}{N} \left\| \mathbf{y} - X\mathbf{w} \right\|^2$$

This has a closed-form solution, $\mathbf{w} = (X^\top X)^{-1} X^\top \mathbf{y}$, but it is computationally expensive: inverting $X^\top X$ costs $O(D^3)$.
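A minimal sketch of the closed-form solution on synthetic data. Rather than forming the explicit inverse, it solves the normal equations with `np.linalg.solve`, which is the standard, more numerically stable route (the data and true weights here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 100, 3
X = rng.normal(size=(N, D))            # design matrix: N samples, D features
w_true = np.array([2.0, -1.0, 0.5])    # hypothetical true weights
y = X @ w_true + 0.01 * rng.normal(size=N)  # targets with small noise

# Normal equations: (X^T X) w = X^T y, solved without an explicit inverse.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With low noise, `w_hat` recovers `w_true` closely; for large D (or ill-conditioned X), iterative methods like gradient descent are preferred over the closed form.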

Figure 2-32. Comparison of predictions made by a linear model and predictions made by a regression tree on the RAM price data. Source: Introduction to Machine Learning with Python