Confusion Matrix


A Confusion Matrix is a popular representation of the performance of classification models. The matrix (table) shows the number of correctly and incorrectly classified examples, compared to the actual outcomes (target values) in the test data. One advantage of using a confusion matrix as an evaluation tool is that it allows more detailed analysis (such as whether the model is confusing two classes) than the simple proportion of correctly classified examples (accuracy), which can give misleading results if the dataset is unbalanced (i.e. when there are large differences in the number of examples between different classes).
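To illustrate why accuracy alone can mislead on unbalanced data, here is a minimal Python sketch. The data are made up for illustration (95 negative and 5 positive examples), and the "model" is a hypothetical one that always predicts the majority class.

```python
# Hypothetical, made-up data: 95 negative examples (0) and 5 positive ones (1).
actual = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the majority (negative) class.
predicted = [0] * len(actual)

# Accuracy: proportion of examples where the prediction matches the actual label.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Number of positive examples the model actually found.
positives_found = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))

print(f"accuracy = {accuracy:.2f}")
print(f"positives found = {positives_found} of {sum(actual)}")
```

The accuracy looks high even though the model never identifies a single positive example, which is exactly the kind of failure a confusion matrix makes visible.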

The matrix is n by n, where n is the number of classes. The simplest classifiers, called binary classifiers, have only two classes: positive/negative, yes/no, male/female… The performance of a binary classifier is summarized in a confusion matrix that cross-tabulates predicted and observed examples into four outcomes:

  • True Positive (TP): Correctly predicting the positive label (we predicted “yes”, and it’s “yes”),
  • True Negative (TN): Correctly predicting the negative label (we predicted “no”, and it’s “no”),
  • False Positive (FP): Falsely predicting the positive label (we predicted “yes”, but it’s “no”),
  • False Negative (FN): Missing the positive label (we predicted “no”, but it’s “yes”).
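The four outcomes can be counted directly from paired lists of actual and predicted labels. The following is a minimal sketch in Python; the function name `confusion_counts`, its `positive` parameter, and the six example labels are assumptions made for illustration, not part of any particular library.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count TP, TN, FP and FN for a binary classifier.

    `positive` names the label treated as the positive class;
    every other label is treated as negative.
    """
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted "yes", and it's "yes"
        elif p != positive and a != positive:
            tn += 1  # predicted "no", and it's "no"
        elif p == positive and a != positive:
            fp += 1  # predicted "yes", but it's "no"
        else:
            fn += 1  # predicted "no", but it's "yes"
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Made-up example labels, for illustration only.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```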



Evaluating Spam Classifier

How can we use those metrics, and what can we read from the confusion matrix? For instance, let's consider the classic problem of predicting spam and non-spam email using a binary classification model. Our dataset consists of 15 emails that are Spam and 85 emails that are Not Spam. To evaluate the performance of our model, which labels emails as Spam or Not Spam, we can use a confusion matrix, where the outcome is summarized in a 2×2 contingency table:

  • Altogether, the classifier made 100 predictions (100 emails were classified into the Spam or Non-Spam class).
  • Out of 100 emails, our model correctly classified 95 emails: 85 were correctly classified as Non-Spam, and 10 were correctly classified as Spam. This results in 95% accuracy.
  • Further, 5 out of 100 emails were classified falsely: 5 emails that were actually Spam were not predicted as Spam (False Negative). More importantly, no email was falsely predicted as Spam (False Positive), which is highly desirable in this case.
  • We can observe that our model is very conservative when it comes to predicting Spam. Therefore, the precision of this model is very high: TP / (TP + FP) = 10 / (10 + 0) = 1.0.
  • By computing additional measures (also called rates) from the confusion matrix, we can get additional insight into our model.
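As a rough sketch of how such measures follow from the four counts, the Python snippet below uses the numbers from the spam example above (10 true positives, 85 true negatives, 0 false positives, 5 false negatives). The helper name `safe_div` is an assumption introduced here to avoid division by zero; recall, specificity and F1 are standard measures that are not computed in the text above.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Counts taken from the spam example above.
tp, tn, fp, fn = 10, 85, 0, 5
total = tp + tn + fp + fn

accuracy = safe_div(tp + tn, total)     # share of all emails classified correctly
precision = safe_div(tp, tp + fp)       # share of predicted Spam that is Spam
recall = safe_div(tp, tp + fn)          # share of actual Spam that was caught
specificity = safe_div(tn, tn + fp)     # share of actual Non-Spam kept as Non-Spam
f1 = safe_div(2 * tp, 2 * tp + fp + fn) # harmonic mean of precision and recall

for name, value in [
    ("accuracy", accuracy),
    ("precision", precision),
    ("recall", recall),
    ("specificity", specificity),
    ("F1", f1),
]:
    print(f"{name:12s} {value:.4f}")
```

Here the recall of about 0.67 shows what the 95% accuracy hides: a third of the actual Spam emails still reach the inbox, which is the price of the model's conservative behavior.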