Precision

Goal

This post introduces one of the standard model evaluation metrics, the precision score. Precision measures what fraction of the samples predicted as positive are actually positive. The higher the precision score, the more likely a positive prediction is to be correct whenever one is made.

The precision score is defined by the following equation:

$$ \text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} = \frac{\text{True Positive}}{\text{total \# of samples predicted as positive}} $$
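As a quick worked example (numbers chosen to match the notebook below): if a model flags 6 samples as positive and 4 of them are truly positive, the remaining 2 are false positives, so

$$ \text{Precision} = \frac{4}{4 + 2} = \frac{2}{3} \approx 0.667 $$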

Libraries

In [2]:
from sklearn.metrics import precision_score
import pandas as pd

Create prediction and ground-truth data

In [21]:
df_prediction = pd.DataFrame([0, 1, 0, 1, 1, 1, 1, 1],
                             columns=['prediction'])
df_prediction
Out[21]:
prediction
0 0
1 1
2 0
3 1
4 1
5 1
6 1
7 1
In [22]:
df_groundtruth = pd.DataFrame([0, 0, 0, 0, 1, 1, 1, 1],
                              columns=['gt'])
df_groundtruth
Out[22]:
gt
0 0
1 0
2 0
3 0
4 1
5 1
6 1
7 1

Compute Precision Score

In [24]:
precision_score(y_true=df_groundtruth,
                y_pred=df_prediction, average='binary')
Out[24]:
0.6666666666666666
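As a side note, average='binary' reports precision for the positive class only, which by default is the class labeled 1 (pos_label can change this). A minimal sketch of the same call with plain Python lists, assuming the same data as above:

In [ ]:
# Sketch (assumed cell, not from the original run): same data as plain lists.
# pos_label selects which label counts as "positive"; 1 is the default.
precision_score(y_true=[0, 0, 0, 0, 1, 1, 1, 1],
                y_pred=[0, 1, 0, 1, 1, 1, 1, 1],
                pos_label=1, average='binary')
# expected: 4 true positives / 6 predicted positives = 0.666...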

Double-check by computing TP and FP manually

In [19]:
# TP: among the samples predicted as positive, count those that match the ground truth
TP = (df_prediction.loc[df_prediction['prediction']==1, 'prediction']
      == df_groundtruth.loc[df_prediction['prediction']==1, 'gt']).sum()
TP
Out[19]:
4
In [25]:
# FP: among the samples predicted as positive, count those that do not match the ground truth
FP = (df_prediction.loc[df_prediction['prediction']==1, 'prediction']
      != df_groundtruth.loc[df_prediction['prediction']==1, 'gt']).sum()
FP
Out[25]:
2
In [26]:
TP / (TP + FP)  # precision from its definition
Out[26]:
0.6666666666666666
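The same counts can also be read off scikit-learn's confusion matrix; a minimal sketch (assumed cell, not in the original notebook), reusing the DataFrames from above:

In [ ]:
# Sketch: for binary labels, confusion_matrix returns [[TN, FP], [FN, TP]],
# so ravel() yields the four counts in that order.
from sklearn.metrics import confusion_matrix
tn, fp, fn, tp = confusion_matrix(df_groundtruth['gt'],
                                  df_prediction['prediction']).ravel()
tp / (tp + fp)   # expected: 4 / (4 + 2) = 0.666...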