[ML][performance metrics] Performance metrics, imbalanced dataset

Performance metrics

Confusion matrix
Accuracy
Precision, Recall
F1-score
TPR, FPR
ROC curve
imbalanced dataset

Confusion matrix
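
A confusion matrix tallies predictions against true labels in four cells: TP (true positive), FP (false positive), FN (false negative), and TN (true negative). A minimal sketch in plain Python, assuming binary labels where 1 is the positive class; the function name `confusion_counts` and the sample labels are illustrative only.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 = positive, 0 = negative."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1  # predicted positive, actually positive
        elif p == 1 and t == 0:
            fp += 1  # predicted positive, actually negative
        elif p == 0 and t == 1:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Made-up labels for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 1, 3)
```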

Accuracy

(TP+TN)/(TP+TN+FP+FN)
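
As a quick check of the formula above, a small sketch that computes accuracy from the four counts. The function name `accuracy` and the counts are made up for illustration; returning 0.0 for an empty input is an assumed convention.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    # Assumed convention: report 0.0 when there are no samples at all.
    return (tp + tn) / total if total else 0.0


print(accuracy(tp=3, tn=3, fp=1, fn=1))  # 0.75
```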

Precision, Recall

Precision: TP/(TP+FP)
Recall: (TP)/(TP+FN)

F1-score

2 * (precision*recall)/(precision+recall)
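
The Precision, Recall, and F1 formulas above can be sketched as follows. The function names and the example counts are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention rather than part of the definitions.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Recall = TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """F1 = 2 * (precision * recall) / (precision + recall)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts: 8 TP, 2 FP, 8 FN.
print(precision(8, 2))              # 0.8
print(recall(8, 8))                 # 0.5
print(round(f1_score(8, 2, 8), 3))  # 0.615
```

F1 is the harmonic mean of the two, so it sits closer to the lower value (0.5) than the arithmetic mean (0.65) would.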

TPR, FPR

TPR = (TP)/(TP+FN) = Recall
FPR = (FP)/(FP+TN)
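
A small sketch of the two rates, with made-up counts. The function names `tpr` and `fpr` are illustrative, and the zero-denominator handling is an assumption.

```python
def tpr(tp, fn):
    """True positive rate = TP / (TP + FN), identical to Recall."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def fpr(fp, tn):
    """False positive rate = FP / (FP + TN)."""
    return fp / (fp + tn) if (fp + tn) else 0.0


print(tpr(tp=8, fn=8))   # 0.5
print(fpr(fp=2, tn=98))  # 0.02
```

TPR only looks at the actual positives and FPR only at the actual negatives, which is why a large TN count can push FPR down on its own.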

ROC curve
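
An ROC curve plots TPR (y-axis) against FPR (x-axis) as the decision threshold on the model's score is swept, and AUC is the area under that curve. A minimal plain-Python sketch, assuming the data contains at least one positive and one negative sample; the function names `roc_points` and `auc` and the scores are made up for illustration.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the threshold over the scores."""
    pos = sum(y_true)          # number of actual positives
    neg = len(y_true) - pos    # number of actual negatives
    points = [(0.0, 0.0)]      # threshold above every score: nothing predicted positive
    # Predict 1 for every sample whose score >= threshold, highest threshold first.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
pts = roc_points(y_true, scores)
print(pts)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(pts))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.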

imbalanced dataset

1. negative class(0) < positive class(1): the positive class is the majority

Metrics that help

FPR: is high. Since the model predicts everything as 1, the number of FP is high, which signals that this is not a good classifier/model.
AUC score: is low (around 0.5, no better than chance, for a model that gives every sample the same score) and shows the true picture of the evaluation here.

Metrics that don't help

Accuracy: is very high, even when TN = 0. Because the data is imbalanced (a large number of positive samples), the numerator TP+TN is dominated by TP.
Precision: is very high. Because the data has a disproportionately high number of positive cases, the ratio TP/(TP+FP) becomes high.
Recall: is very high. Because the model predicts (almost) everything as 1, FN is close to 0 and the ratio TP/(TP+FN) becomes high.
F1-score: is very high. The high values of Precision and Recall make the F1-score misleading.
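
The case above can be reproduced with a few lines of plain Python. This is a sketch on hypothetical data (95 positive and 5 negative samples) with a degenerate model that predicts 1 for everything; none of the numbers come from the original post.

```python
# Hypothetical data: 95 positive and 5 negative samples (positive class is the majority).
y_true = [1] * 95 + [0] * 5
# A degenerate model that predicts 1 for every sample.
y_pred = [1] * len(y_true)

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
fpr = fp / (fp + tn)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f} fpr={fpr:.2f}")
# accuracy=0.95 precision=0.95 recall=1.00 f1=0.97 fpr=1.00
```

Accuracy, Precision, Recall, and F1 all look excellent, while FPR = 1.00 reveals that every negative sample was misclassified.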

2. negative class(0) > positive class(1): the negative class is the majority

Metrics that help

Precision: is very low because of the high number of FP. The ratio TP/(TP+FP) becomes low.
Recall: is very low. Because the data has a disproportionately high number of negative cases, the classifier may label a large number of actual positives as negative, so the ratio TP/(TP+FN) becomes low.
F1-score: is low. The low values of Precision and Recall make the F1-score a good indicator of performance here.

Metrics that don't help

Accuracy: is very high. Because the data is imbalanced (a large number of negative samples), the proportion of TN is very high, so the numerator TP+TN becomes high.
AUC score: is high, even when more than 50% of the actual positives are predicted as negative (FN), i.e. even when TPR at the chosen threshold is low.
FPR: is low. It is skewed by the large number of TN in the denominator FP+TN, so it stays low even when the classifier makes a lot of FP.
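
A sketch of this case with made-up counts: 1000 samples, of which 950 are negative and 50 positive, and a hypothetical classifier that finds only 10 of the 50 positives while raising 40 false alarms.

```python
# Hypothetical confusion-matrix counts (negative class is the majority).
tp, fn = 10, 40    # 50 actual positives, most of them missed
fp, tn = 40, 910   # 950 actual negatives, most of them correct

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
fpr = fp / (fp + tn)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f} fpr={fpr:.2f}")
# accuracy=0.92 precision=0.20 recall=0.20 f1=0.20 fpr=0.04
```

Accuracy (0.92) and FPR (0.04) suggest a good model because TN dominates both, while Precision, Recall, and F1 (all 0.20) expose how poorly the positive class is handled.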

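Summary

When negative class(0) < positive class(1) (positive is the majority) and the focus is on the negative class, use the AUC score.
When negative class(0) > positive class(1) (negative is the majority) and the focus is on the positive class, use Precision, Recall, and F1-score.
The Accuracy score is of little help on an imbalanced dataset.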

Example

http://ethen8181.github.io/machine-learning/model_selection/imbalanced/imbalanced_metrics.html


References:

https://medium.com/datasciencestory/performance-metrics-for-evaluating-a-model-on-an-imbalanced-data-set-1feeab6c36fe

http://ethen8181.github.io/machine-learning/model_selection/imbalanced/imbalanced_metrics.html