# Metrics

## compute_confusion_matrix(preds, labels, n_labels)

Function for computing a confusion matrix.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `preds` | `numpy.ndarray` | A NumPy array of predictions formatted with shape `(n_samples, n_labels)`. Provided by `NeuralNetwork`. | required |
| `labels` | `numpy.ndarray` | Classification list with one-hot encoding. Provided by `input_interface`. | required |
| `n_labels` | `int` | Number of classes. Provided by `input_interface`. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `rawcm` | `numpy.ndarray` | NumPy matrix with shape `(n_labels, n_labels)`. |

Source code in `aucmedi/evaluation/metrics.py`:

```python
def compute_confusion_matrix(preds, labels, n_labels):
    """ Function for computing a confusion matrix.

    Args:
        preds (numpy.ndarray):          A NumPy array of predictions formatted with shape (n_samples, n_labels). Provided by
                                        [NeuralNetwork][aucmedi.neural_network.model].
        labels (numpy.ndarray):         Classification list with One-Hot Encoding. Provided by
                                        [input_interface][aucmedi.data_processing.io_data.input_interface].
        n_labels (int):                 Number of classes. Provided by [input_interface][aucmedi.data_processing.io_data.input_interface].

    Returns:
        rawcm (numpy.ndarray):          NumPy matrix with shape (n_labels, n_labels).
    """
    preds_argmax = np.argmax(preds, axis=-1)
    labels_argmax = np.argmax(labels, axis=-1)
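    # Accumulate counts: rows correspond to ground truth, columns to predictions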
    rawcm = np.zeros((n_labels, n_labels))
    for i in range(0, labels.shape[0]):
        rawcm[labels_argmax[i]][preds_argmax[i]] += 1
    return rawcm
```
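
A minimal usage sketch with made-up data (the arrays below are illustrative, not from the AUCMEDI documentation):

```python
import numpy as np
from aucmedi.evaluation.metrics import compute_confusion_matrix

# Dummy one-hot ground truth and softmax-like predictions (3 samples, 2 classes)
labels = np.array([[1, 0], [0, 1], [0, 1]])
preds = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])

cm = compute_confusion_matrix(preds, labels, n_labels=2)
print(cm)
# Rows = ground truth, columns = predictions:
# [[1. 0.]
#  [1. 1.]]
```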

## compute_metrics(preds, labels, n_labels, threshold=None)

Function for computing various classification metrics.

Computed Metrics

F1, Accuracy, Sensitivity, Specificity, AUROC (AUC), Precision, FPR, FNR, FDR, TruePositives, TrueNegatives, FalsePositives, FalseNegatives
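For reference, the confusion-matrix-based scores follow their standard definitions; this short summary mirrors the `np.divide` computations in the source code below:

$$
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{Precision} = \frac{TP}{TP + FP}, \qquad
F_1 = \frac{2\,TP}{2\,TP + FP + FN}
$$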

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `preds` | `numpy.ndarray` | A NumPy array of predictions formatted with shape `(n_samples, n_labels)`. Provided by `NeuralNetwork`. | required |
| `labels` | `numpy.ndarray` | Classification list with one-hot encoding. Provided by `input_interface`. | required |
| `n_labels` | `int` | Number of classes. Provided by `input_interface`. | required |
| `threshold` | `float` | Only required for multi_label data. Threshold above which a prediction is considered positive. | `None` |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `metrics` | `pandas.DataFrame` | DataFrame containing all computed metrics (except ROC). |

Source code in `aucmedi/evaluation/metrics.py`:

```python
def compute_metrics(preds, labels, n_labels, threshold=None):
    """ Function for computing various classification metrics.

    !!! info "Computed Metrics"
        F1, Accuracy, Sensitivity, Specificity, AUROC (AUC), Precision, FPR, FNR,
        FDR, TruePositives, TrueNegatives, FalsePositives, FalseNegatives

    Args:
        preds (numpy.ndarray):          A NumPy array of predictions formatted with shape (n_samples, n_labels). Provided by
                                        [NeuralNetwork][aucmedi.neural_network.model].
        labels (numpy.ndarray):         Classification list with One-Hot Encoding. Provided by
                                        [input_interface][aucmedi.data_processing.io_data.input_interface].
        n_labels (int):                 Number of classes. Provided by [input_interface][aucmedi.data_processing.io_data.input_interface].
        threshold (float):              Only required for multi_label data. Threshold above which a prediction is considered positive.

    Returns:
        metrics (pandas.DataFrame):     Dataframe containing all computed metrics (except ROC).
    """
    df_list = []
    for c in range(0, n_labels):
        # Initialize variables
        data_dict = {}

        # Identify truth and prediction for class c
        truth = labels[:, c]
        if threshold is None:
            pred_argmax = np.argmax(preds, axis=-1)
            pred = (pred_argmax == c).astype(np.int8)
        else:
            pred = np.where(preds[:, c] >= threshold, 1, 0)
        # Obtain prediction confidence (probability)
        pred_prob = preds[:, c]

        # Compute the confusion matrix
        tp, tn, fp, fn = compute_CM(truth, pred)
        data_dict["TP"] = tp
        data_dict["TN"] = tn
        data_dict["FP"] = fp
        data_dict["FN"] = fn

        # Compute several metrics based on confusion matrix
        data_dict["Sensitivity"] = np.divide(tp, tp+fn)
        data_dict["Specificity"] = np.divide(tn, tn+fp)
        data_dict["Precision"] = np.divide(tp, tp+fp)
        data_dict["FPR"] = np.divide(fp, fp+tn)
        data_dict["FNR"] = np.divide(fn, fn+tp)
        data_dict["FDR"] = np.divide(fp, fp+tp)
        data_dict["Accuracy"] = np.divide(tp+tn, tp+tn+fp+fn)
        data_dict["F1"] = np.divide(2*tp, 2*tp+fp+fn)

        # Compute area under the ROC curve
        try:
            data_dict["AUC"] = roc_auc_score(truth, pred_prob)
        except ValueError:
            # AUROC is undefined if the ground truth contains only one class
            print("ROC AUC score is not defined.")

        # Parse metrics to dataframe
        df = pd.DataFrame.from_dict(data_dict, orient="index",
                                    columns=["score"])
        df = df.reset_index()
        df.rename(columns={"index": "metric"}, inplace=True)
        df["class"] = c

        # Append dataframe to list
        df_list.append(df)

    # Combine dataframes
    df_final = pd.concat(df_list, axis=0, ignore_index=True)
    # Return final dataframe
    return df_final
```
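
A minimal usage sketch with made-up data (illustrative only):

```python
import numpy as np
from aucmedi.evaluation.metrics import compute_metrics

# Dummy one-hot ground truth and predictions (4 samples, 2 classes)
labels = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
preds = np.array([[0.8, 0.2], [0.3, 0.7], [0.4, 0.6], [0.9, 0.1]])

# Single-label data: leave threshold=None so predictions are argmax-based
metrics = compute_metrics(preds, labels, n_labels=2)
print(metrics)  # long-format DataFrame with columns: metric, score, class

# Multi-label data: pass a threshold instead
metrics_ml = compute_metrics(preds, labels, n_labels=2, threshold=0.5)
```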

## compute_roc(preds, labels, n_labels)

Function for computing the data of a ROC curve (FPR and TPR).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `preds` | `numpy.ndarray` | A NumPy array of predictions formatted with shape `(n_samples, n_labels)`. Provided by `NeuralNetwork`. | required |
| `labels` | `numpy.ndarray` | Classification list with one-hot encoding. Provided by `input_interface`. | required |
| `n_labels` | `int` | Number of classes. Provided by `input_interface`. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `fpr_list` | `list of list` | List containing a list of false positive rate points for each class. Shape: `(n_labels, fpr_coords)`. |
| `tpr_list` | `list of list` | List containing a list of true positive rate points for each class. Shape: `(n_labels, tpr_coords)`. |

Source code in `aucmedi/evaluation/metrics.py`:

```python
def compute_roc(preds, labels, n_labels):
    """ Function for computing the data data of a ROC curve (FPR and TPR).

    Args:
        preds (numpy.ndarray):          A NumPy array of predictions formatted with shape (n_samples, n_labels). Provided by
                                        [NeuralNetwork][aucmedi.neural_network.model].
        labels (numpy.ndarray):         Classification list with One-Hot Encoding. Provided by
                                        [input_interface][aucmedi.data_processing.io_data.input_interface].
        n_labels (int):                 Number of classes. Provided by [input_interface][aucmedi.data_processing.io_data.input_interface].

    Returns:
        fpr_list (list of list):        List containing a list of false positive rate points for each class. Shape: (n_labels, fpr_coords).
        tpr_list (list of list):        List containing a list of true positive rate points for each class. Shape: (n_labels, tpr_coords).
    """
    fpr_list = []
    tpr_list = []
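    # Compute one-vs-rest ROC curve coordinates for each class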
    for i in range(0, n_labels):
        truth_class = labels[:, i].astype(int)
        pdprob_class = preds[:, i]
        fpr, tpr, _ = roc_curve(truth_class, pdprob_class)
        fpr_list.append(fpr)
        tpr_list.append(tpr)
    return fpr_list, tpr_list
```
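
A minimal usage sketch with made-up data (illustrative only), e.g. as input for plotting ROC curves:

```python
import numpy as np
from aucmedi.evaluation.metrics import compute_roc

# Dummy one-hot ground truth and predictions (4 samples, 2 classes)
labels = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
preds = np.array([[0.8, 0.2], [0.3, 0.7], [0.4, 0.6], [0.9, 0.1]])

fpr_list, tpr_list = compute_roc(preds, labels, n_labels=2)
for c, (fpr, tpr) in enumerate(zip(fpr_list, tpr_list)):
    print(f"class {c}: FPR={fpr}, TPR={tpr}")
```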