Best model

BestModel

Bases: Metalearner_Base

A Best Model based Metalearner.

This class should be passed to an ensemble function/class like Stacking for combining predictions.

This Metalearner computes the macro-averaged Area Under the Receiver Operating Characteristic Curve (ROC AUC) for each model on the training data and then uses only the predictions of the best-scoring model.

Info

Can be utilized for binary, multi-class and multi-label tasks.
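
The selection rule can be illustrated with a small, self-contained sketch. It is not the AUCMEDI implementation: the `pairwise_auc` and `macro_auc` helpers are hypothetical stand-ins for scikit-learn's `roc_auc_score(..., average="macro")` so that the snippet depends only on NumPy, and the two models and their predictions are made-up toy data. It assumes the stacked input layout used by this class, in which the class probabilities of all models are placed side by side in one array.

```python
import numpy as np


def pairwise_auc(y_true, y_score):
    """ROC AUC for one class: fraction of (positive, negative) pairs that
    are ranked correctly, with ties counted as half."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def macro_auc(y, pred):
    """Unweighted mean of the per-class AUC values (macro average)."""
    return float(np.mean([pairwise_auc(y[:, c], pred[:, c])
                          for c in range(y.shape[1])]))


# Toy data (assumption): 4 samples, 2 one-hot classes, 2 hypothetical models
y = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
model_a = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]])
model_b = np.array([[0.4, 0.6], [0.3, 0.7], [0.8, 0.2], [0.6, 0.4]])

# Stacked layout: shape (n_samples, n_models * n_classes)
x = np.hstack([model_a, model_b])
n_classes = y.shape[1]
n_models = x.shape[1] // n_classes
data = x.reshape(x.shape[0], n_models, n_classes)

# Score every model and keep only the predictions of the best one
scores = [macro_auc(y, data[:, m, :]) for m in range(n_models)]
best = int(np.argmax(scores))
best_pred = data[:, best, :]
```

Here `model_a` ranks every sample correctly while `model_b` misranks one pair per class, so the first model is selected and its predictions are returned unchanged.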

Source code in aucmedi/ensemble/metalearner/best_model.py
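# External libraries
import numpy as np
import pickle
from sklearn.metrics import roc_auc_score
# Internal libraries/scripts
from aucmedi.ensemble.metalearner.ml_base import Metalearner_Base

class BestModel(Metalearner_Base):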
    """ A Best Model based Metalearner.

    This class should be passed to an ensemble function/class like Stacking for combining predictions.

    This Metalearner computes the macro-averaged Area Under the Receiver Operating Characteristic
    Curve (ROC AUC) for each model, and uses only the predictions of the best-scoring model.

    !!! info
        Can be utilized for binary, multi-class and multi-label tasks.
    """
    #---------------------------------------------#
    #                Initialization               #
    #---------------------------------------------#
    def __init__(self):
        self.model = {}

    #---------------------------------------------#
    #                  Training                   #
    #---------------------------------------------#
    def train(self, x, y):
        # Identify number of models and classes
        n_classes = y.shape[1]
        n_models = x.shape[1] // n_classes
        # Preprocess data input
        data = np.reshape(x, (x.shape[0], n_models, n_classes))

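        # Reset the cache, so that a repeated train() call does not mix stale
        # entries (e.g. "best_model") into the best model lookup below
        self.model = {}
        # Compute AUC scores and store them to cache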
        for m in range(n_models):
            pred = data[:,m,:]
            score = roc_auc_score(y, pred, average="macro")
            self.model["nn_" + str(m)] = score

        # Identify best model and store results to cache
        best_model = max(self.model, key=self.model.get)
        self.model["best_model"] = best_model
        self.model["n_classes"] = n_classes
        self.model["n_models"] = n_models

    #---------------------------------------------#
    #                  Prediction                 #
    #---------------------------------------------#
    def predict(self, data):
        # Preprocess data input
        preds = np.reshape(data, (data.shape[0],
                                  self.model["n_models"],
                                  self.model["n_classes"]))

        # Obtain prediction probabilities of best model
        m = int(self.model["best_model"].split("_")[-1])
        pred_best = preds[:,m,:]
        # Return results
        return pred_best

    #---------------------------------------------#
    #              Dump Model to Disk             #
    #---------------------------------------------#
    def dump(self, path):
        # Dump model to disk via pickle
        with open(path, "wb") as pickle_writer:
            pickle.dump(self.model, pickle_writer)

    #---------------------------------------------#
    #             Load Model from Disk            #
    #---------------------------------------------#
    def load(self, path):
        # Load model from disk via pickle
        with open(path, "rb") as pickle_reader:
            self.model = pickle.load(pickle_reader)
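
The `dump` and `load` methods persist only the cache dictionary, not the class itself. The following sketch illustrates that round trip with the standard library alone. The `cache` values and the file name are made-up assumptions that merely mimic the shape of the dictionary built in `train`.

```python
import os
import pickle
import tempfile

# Hypothetical cache, shaped like the dictionary that train() builds
cache = {"nn_0": 0.81, "nn_1": 0.93, "best_model": "nn_1",
         "n_classes": 2, "n_models": 2}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "best_model.pickle")
    # Same pattern as dump(): write the dictionary via pickle
    with open(path, "wb") as pickle_writer:
        pickle.dump(cache, pickle_writer)
    # Same pattern as load(): read the dictionary back
    with open(path, "rb") as pickle_reader:
        restored = pickle.load(pickle_reader)

# predict() recovers the model index from the stored key name
best_index = int(restored["best_model"].split("_")[-1])
```

Because the file is a pickle, it should only be loaded from a trusted source.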