Precision

Goal

This post introduces one of the standard model evaluation metrics, the precision score. Precision measures what fraction of the samples predicted as positive are actually positive. The higher the precision score, the more likely a positive prediction is to be correct whenever one is made.

The precision score is defined by the following equation:

$$ \text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} = \frac{\text{True Positive}}{\text{total \# of samples predicted as positive}} $$
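As a quick worked example (numbers chosen to match the notebook below): if a model flags 6 samples as positive and 4 of them are truly positive, the remaining 2 are false positives, so

$$ \text{Precision} = \frac{4}{4 + 2} = \frac{2}{3} \approx 0.667 $$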

Libraries

In [2]:
from sklearn.metrics import precision_score
import pandas as pd

Create prediction and ground-truth data

In [21]:
df_prediction = pd.DataFrame([0, 1, 0, 1, 1, 1, 1, 1],
                             columns=['prediction'])
df_prediction
Out[21]:
prediction
0 0
1 1
2 0
3 1
4 1
5 1
6 1
7 1
In [22]:
df_groundtruth = pd.DataFrame([0, 0, 0, 0, 1, 1, 1, 1],
                              columns=['gt'])
df_groundtruth
Out[22]:
gt
0 0
1 0
2 0
3 0
4 1
5 1
6 1
7 1

Compute Precision Score

In [24]:
precision_score(y_true=df_groundtruth,
                y_pred=df_prediction, average='binary')
Out[24]:
0.6666666666666666
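As a side note, average='binary' reports precision for the positive class only, which by default is the class labeled 1 (pos_label can change this). A minimal sketch of the same call with plain Python lists, assuming the same data as above:

In [ ]:
# Sketch (assumed cell, not from the original run): same data as plain lists.
# pos_label selects which label counts as "positive"; 1 is the default.
precision_score(y_true=[0, 0, 0, 0, 1, 1, 1, 1],
                y_pred=[0, 1, 0, 1, 1, 1, 1, 1],
                pos_label=1, average='binary')
# expected: 4 true positives / 6 predicted positives = 0.666...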

Double-check by computing TP and FP manually

In [19]:
# TP: among the samples predicted as positive, count those that match the ground truth
TP = (df_prediction.loc[df_prediction['prediction']==1, 'prediction']
      == df_groundtruth.loc[df_prediction['prediction']==1, 'gt']).sum()
TP
Out[19]:
4
In [25]:
# FP: among the samples predicted as positive, count those that do not match the ground truth
FP = (df_prediction.loc[df_prediction['prediction']==1, 'prediction']
      != df_groundtruth.loc[df_prediction['prediction']==1, 'gt']).sum()
FP
Out[25]:
2
In [26]:
TP / (TP + FP)  # precision from its definition
Out[26]:
0.6666666666666666
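The same counts can also be read off scikit-learn's confusion matrix; a minimal sketch (assumed cell, not in the original notebook), reusing the DataFrames from above:

In [ ]:
# Sketch: for binary labels, confusion_matrix returns [[TN, FP], [FN, TP]],
# so ravel() yields the four counts in that order.
from sklearn.metrics import confusion_matrix
tn, fp, fn, tp = confusion_matrix(df_groundtruth['gt'],
                                  df_prediction['prediction']).ravel()
tp / (tp + fp)   # expected: 4 / (4 + 2) = 0.666...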