Split
sampling_split(samples, labels, metadata=None, sampling=[0.8, 0.2], stratified=True, iterative=False, seed=None)
Simple wrapper function for calling the percentage-split sampling functions.
Supports both stratified and iterative sampling algorithms.
Warning
Be aware that multi-label data does not support random stratified sampling.
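A quick way to check up front whether a label matrix is multi-label is to look for rows with more than one active class. The helper below is only a sketch: `is_multilabel` is a hypothetical name, not part of AUCMEDI, and it assumes `labels` is a binary matrix of shape (n_samples, n_classes).

```python
import numpy as np

def is_multilabel(labels):
    """Hypothetical helper (not an AUCMEDI function).

    Returns True if any sample (row) has more than one active class,
    assuming a binary matrix of shape (n_samples, n_classes).
    """
    labels = np.asarray(labels)
    return bool((labels.sum(axis=1) > 1).any())

# Single-label: exactly one class per row
single = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
# Multi-label: the first row marks two classes at once
multi = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]])

print(is_multilabel(single))  # False
print(is_multilabel(multi))   # True
```

A check like this could be run before choosing the `stratified` and `iterative` options for a given dataset.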
Percentage split ratios must be provided as a sampling list. Each value in the list defines the approximate relative size of one split. The split ratios must sum to 1!
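To make the ratio rule concrete, the following sketch validates a sampling list and estimates the resulting subset sizes. `expected_split_sizes` is a hypothetical helper, not an AUCMEDI function, and the rounding scheme is an assumption; the sizes actually produced are only approximate and may differ.

```python
import math

def expected_split_sizes(n_samples, sampling):
    """Hypothetical helper (not an AUCMEDI function).

    Checks that the ratios sum to 1 and estimates how many samples
    each split would receive under simple rounding.
    """
    total = sum(sampling)
    # Compare with a tolerance, since e.g. 0.7 + 0.25 + 0.05 may not be exactly 1.0
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"Split ratios must sum to 1, got {total}")
    # Round every split except the last one ...
    sizes = [round(n_samples * ratio) for ratio in sampling[:-1]]
    # ... and let the last split take the remainder so no sample is lost
    sizes.append(n_samples - sum(sizes))
    return sizes

print(expected_split_sizes(200, [0.7, 0.25, 0.05]))  # [140, 50, 10]
```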
Example
from aucmedi.sampling.split import sampling_split

# Three-way split without metadata
split_ratio = [0.7, 0.25, 0.05]
ds = sampling_split(samples, labels, sampling=split_ratio)
# Returns a list with the following elements as tuples:
print(ds[0]) # -> (samples_a, labels_a) with 70% of complete dataset
print(ds[1]) # -> (samples_b, labels_b) with 25% of complete dataset
print(ds[2]) # -> (samples_c, labels_c) with 5% of complete dataset

# Two-way split with metadata: each tuple gains a third element
ds = sampling_split(samples, labels, metadata, sampling=[0.8, 0.2])
# Returns a list with the following elements as tuples:
print(ds[0]) # -> (samples_a, labels_a, metadata_a) with 80% of complete dataset
print(ds[1]) # -> (samples_b, labels_b, metadata_b) with 20% of complete dataset
Parameters:
Name | Type | Description | Default |
---|---|---|---|
samples | list of str | List of samples/indices encoded as strings. | required |
labels | numpy.ndarray | NumPy matrix containing the one-hot encoded (OHE) classification. | required |
metadata | numpy.ndarray | NumPy matrix with additional metadata. Must have shape (n_samples, meta_variables). | None |
sampling | list of float | List of percentage values defining the split sizes. | [0.8, 0.2] |
stratified | bool | Whether to use stratified sampling based on the provided labels. | True |
iterative | bool | Whether to use the iterative sampling algorithm. | False |
seed | int | Seed to ensure reproducibility of random functions. | None |
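
The idea behind the `stratified` and `seed` options can be illustrated with a small standard-library sketch. This is not AUCMEDI's implementation: `stratified_split_sketch` is a hypothetical function, it takes plain class ids instead of a one-hot matrix for brevity, and the per-class rounding is an assumption.

```python
import random
from collections import defaultdict

def stratified_split_sketch(samples, class_ids, sampling, seed=None):
    """Illustrative only (not AUCMEDI code).

    Splits each class separately so that every subset keeps roughly
    the same class proportions as the full dataset.
    """
    rng = random.Random(seed)  # seeded generator: same seed, same split
    by_class = defaultdict(list)
    for sample, cls in zip(samples, class_ids):
        by_class[cls].append(sample)

    subsets = [[] for _ in sampling]
    for cls in sorted(by_class):
        members = by_class[cls]
        rng.shuffle(members)
        start = 0
        for i, ratio in enumerate(sampling):
            if i == len(sampling) - 1:
                end = len(members)  # last subset takes the remainder
            else:
                end = start + round(len(members) * ratio)
            subsets[i].extend(members[start:end])
            start = end
    return subsets

# 80 samples of class 0 and 20 samples of class 1
samples = [f"s{i}" for i in range(100)]
class_ids = [0] * 80 + [1] * 20
train, test = stratified_split_sketch(samples, class_ids, [0.8, 0.2], seed=42)
print(len(train), len(test))  # 80 20
```

Because each class is split on its own, the smaller subset here still contains 4 of the 20 class-1 samples, matching the 80/20 ratio within that class.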
Returns:
Name | Type | Description |
---|---|---|
results | list of tuple | List with |
Source code in aucmedi/sampling/split.py