CONFUSION MATRIX

Ambrunimurali
3 min read · Jun 6, 2021

Cyber crime cases that involve the confusion matrix and its two types of error.

A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

Here we discuss cyber crime cases that involve the confusion matrix and its two types of error.

WHAT IS A CONFUSION MATRIX?

A confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. For a binary problem, it is a table with 4 different combinations of predicted and actual values; with N classes, it becomes an N × N table.

It is extremely useful for computing Recall, Precision, Specificity and Accuracy. It also underlies the AUC-ROC curve: the confusion matrix at each classification threshold contributes one point to the ROC curve.

Let’s understand TP, FP, FN and TN using a disease-screening example. These are the most basic terms, and they are whole numbers (counts), not rates:

  • true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
  • true negatives (TN): We predicted no, and they don’t have the disease.
  • false positives (FP): We predicted yes, but they don’t actually have the disease. (Also known as a “Type I error.”)
  • false negatives (FN): We predicted no, but they actually do have the disease. (Also known as a “Type II error.”)
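The four counts above can be tallied directly from a list of true labels and a list of predictions. The following is a minimal sketch in plain Python, not a definitive implementation: the function names `confusion_counts` and `basic_metrics`, the convention that 1 means "has the disease", and the sample labels are all assumptions made for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted yes, actually yes
        elif p != positive and a != positive:
            tn += 1          # predicted no, actually no
        elif p == positive and a != positive:
            fp += 1          # predicted yes, actually no (Type I error)
        else:
            fn += 1          # predicted no, actually yes (Type II error)
    return tp, tn, fp, fn


def _ratio(numerator, denominator):
    # Return 0.0 instead of failing when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def basic_metrics(tp, tn, fp, fn):
    """Derive the usual rates from the four whole-number counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
    }


# Hypothetical labels: 1 = has the disease, 0 = does not.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
metrics = basic_metrics(tp, tn, fp, fn)
print(tp, tn, fp, fn)   # 3 5 1 1
print(metrics)
```

Keeping the counts separate from the rates mirrors the distinction made above: the counts are whole numbers, and every rate is a ratio of them.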

WHAT IS CYBER CRIME?

Cyber crimes can include identity theft, breaching personal data, stealing bank details, and hacking for personal gain.

CYBER ATTACK DETECTION USING A CONFUSION MATRIX

Cyberattack detection is a classification problem, in which we distinguish the normal pattern of the system from the abnormal pattern (attack).

Detection Rate (DR) and false alarm rate are the two most widely used metrics. DR is computed as the ratio between the number of correctly detected attacks and the total number of attacks, while the false alarm (false positive) rate is computed as the ratio between the number of normal connections that are misclassified as attacks and the total number of normal connections.

  • True Positive (TP): The number of instances detected as attack that are actually attacks.
  • True Negative (TN): The number of instances detected as normal that are actually normal.
  • False Positive (FP): The number of instances detected as attack that are actually normal (false alarm).
  • False Negative (FN): The number of instances detected as normal that are actually attacks.
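In terms of these four counts, DR is TP / (TP + FN) and the false alarm rate is FP / (FP + TN). A short sketch of that arithmetic follows; the function names and the connection counts are hypothetical values chosen for illustration, not results from any real detector.

```python
def detection_rate(tp, fn):
    """Correctly detected attacks divided by all actual attacks."""
    attacks = tp + fn
    return tp / attacks if attacks else 0.0


def false_alarm_rate(fp, tn):
    """Normal connections flagged as attacks divided by all normal connections."""
    normal = fp + tn
    return fp / normal if normal else 0.0


# Hypothetical counts: 100 attack connections and 1000 normal connections.
tp, fn = 90, 10     # attacks detected / attacks missed
fp, tn = 60, 940    # false alarms / normal traffic correctly passed

dr = detection_rate(tp, fn)
far = false_alarm_rate(fp, tn)
print(f"DR = {dr:.2f}, false alarm rate = {far:.2f}")  # DR = 0.90, false alarm rate = 0.06
```

Note that the two rates use different denominators: DR depends only on the attack connections, while the false alarm rate depends only on the normal ones.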

A confusion matrix can also reveal links between categories of cyber crime. For example, ‘platform fraud’ was found to be linked to ‘identity theft’, as it appears that stolen identities are often used to commit platform fraud. The confusion matrix shows this: ‘identity theft’ cases are often predicted as ‘platform fraud’.
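With more than two classes, this kind of confusion shows up as a large off-diagonal cell. The sketch below builds a small multi-class confusion matrix and finds the most frequent misclassification. The case labels are invented purely for illustration (they are not data from any real study), and the third category, "hacking", is an assumption added to make the table 3 × 3.

```python
from collections import Counter

labels = ["identity theft", "platform fraud", "hacking"]

# Hypothetical actual and predicted categories for ten cases.
actual = (["identity theft"] * 4) + (["platform fraud"] * 3) + (["hacking"] * 3)
predicted = [
    "identity theft", "platform fraud", "platform fraud", "platform fraud",
    "platform fraud", "platform fraud", "platform fraud",
    "hacking", "hacking", "identity theft",
]

# Count every (actual, predicted) pair, then lay the counts out as a table:
# rows are actual categories, columns are predicted categories.
pairs = Counter(zip(actual, predicted))
matrix = [[pairs[(a, p)] for p in labels] for a in labels]

for label, row in zip(labels, matrix):
    print(f"{label:>15}: {row}")

# The largest off-diagonal cell is the most common confusion.
errors = {pair: n for pair, n in pairs.items() if pair[0] != pair[1]}
worst_pair, worst_count = max(errors.items(), key=lambda item: item[1])
print(worst_pair, worst_count)  # ('identity theft', 'platform fraud') 3
```

Reading along a row shows where the cases of one actual category ended up, which is how an overlap such as identity theft being predicted as platform fraud becomes visible.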
