A confusion matrix is a popular way to represent the performance of a classification model. The matrix (table) shows the number of correctly and incorrectly classified examples, compared with the actual outcomes (target values) in the test data. One advantage of using a confusion matrix as an evaluation tool is that it allows a more detailed analysis (for example, whether the model is confusing two classes) than the simple proportion of correctly classified examples (accuracy), which can be misleading when the dataset is unbalanced (i.e. when the number of examples differs greatly between classes).
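To illustrate why accuracy alone can mislead on unbalanced data, here is a minimal Python sketch. The dataset (95 negatives, 5 positives) and the "always predict the majority class" model are hypothetical, chosen only for illustration, and `confusion_counts` is a helper name introduced here.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count (actual, predicted) label pairs; each pair is one cell of the matrix."""
    return Counter(zip(actual, predicted))

# Hypothetical unbalanced test set: 95 negative and 5 positive examples.
actual = ["neg"] * 95 + ["pos"] * 5
# Hypothetical degenerate model that always predicts the majority class.
predicted = ["neg"] * 100

# Accuracy: proportion of examples whose prediction matches the actual label.
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
cells = confusion_counts(actual, predicted)

print(f"accuracy: {accuracy:.2f}")
print("positives found: ", cells[("pos", "pos")])
print("positives missed:", cells[("pos", "neg")])
```

The accuracy looks high, yet the per-cell counts show that the model never identifies a single positive example, which is exactly the kind of detail the confusion matrix exposes.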
The matrix is n by n, where n is the number of classes. The simplest classifiers, called binary classifiers, have only two classes: positive/negative, yes/no, male/female, and so on. The performance of a binary classifier is summarized in a confusion matrix that cross-tabulates predicted and observed examples into four outcomes: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).
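The cross-tabulation can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function name `confusion_matrix`, the eight example labels, and the layout (rows are actual classes, columns are predicted classes) are all assumptions made here. Other sources may transpose the layout.

```python
def confusion_matrix(actual, predicted, labels):
    """Build an n-by-n table: rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

# Hypothetical labels for eight test examples ("yes" is the positive class).
actual    = ["yes", "yes", "yes", "no", "no",  "no", "no", "yes"]
predicted = ["yes", "no",  "yes", "no", "yes", "no", "no", "yes"]

matrix = confusion_matrix(actual, predicted, labels=["yes", "no"])

# With the positive class listed first, the 2x2 table unpacks into the four cells:
# first row = actual positives, second row = actual negatives.
(tp, fn), (fp, tn) = matrix

for row in matrix:
    print(row)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
```

Because the function takes an explicit `labels` list, the same code produces an n-by-n table for any number of classes. The four named cells only apply to the binary case.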
How can we use these metrics, and what can we read from the confusion matrix? For instance, consider the classic problem of separating spam from non-spam email using a binary classification model. Our dataset consists of 50 emails that are Spam and 105 emails that are Not Spam. To evaluate the performance of our model, which labels each email as Spam or Not Spam, we can tabulate its predictions against the actual labels in a 2×2 contingency table, i.e. a confusion matrix: