Io csv
csv_loader(path_data, path_imagedir, allowed_image_formats, training=True, ohe=True, ohe_range=None, col_sample='SAMPLE', col_class='CLASS')
Data Input Interface for loading a dataset via a CSV and an image directory.

This internal function allows simple parsing of class annotations encoded in a CSV, and can be called via the `input_interface()` function by passing `"csv"` as the `interface` parameter.
Input Formats

CSV Format 1:
- Name Column: "SAMPLE" -> String Value
- Class Column: "CLASS" -> Sparse Categorical Classes (String/Integers)
- Optional Meta Columns possible

CSV Format 2:
- Name Column: "SAMPLE"
- One-Hot Encoded Class Columns:
    - If `ohe_range` provides a list of column names -> use these
    - Else try to use all other columns as OHE columns
- Optional Meta Columns only possible if `ohe_range` is provided
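The relation between the two formats can be illustrated with a minimal sketch that converts a Format 1 file (sparse classes) into one-hot rows. This is not the AUCMEDI implementation: the helper `sparse_to_one_hot`, the sample data, and the choice to order classes alphabetically are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical annotation file in CSV Format 1 (sparse categorical classes).
CSV_FORMAT_1 = """SAMPLE,CLASS
sample001,cat
sample002,dog
sample003,cat
"""


def sparse_to_one_hot(csv_text, col_sample="SAMPLE", col_class="CLASS"):
    """Illustrative helper: turn sparse class labels into one-hot rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    index_list = [row[col_sample] for row in rows]
    # Assumption: classes are ordered by sorting the distinct labels.
    class_names = sorted({row[col_class] for row in rows})
    one_hot = [
        [1 if row[col_class] == name else 0 for name in class_names]
        for row in rows
    ]
    return index_list, class_names, one_hot


index_list, class_names, one_hot = sparse_to_one_hot(CSV_FORMAT_1)
```

Each row of `one_hot` has one entry per class, which is the same layout a Format 2 file stores directly in its class columns.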
Expected structure:

    dataset/
        images_dir/                 # path_imagedir = "dataset/images_dir"
            sample001.png
            sample002.png
            ...
            sample350.png
        annotations.csv             # path_data = "dataset/annotations.csv"
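For experimenting with this layout, the following sketch creates a small dummy dataset in a temporary directory. The helper `build_example_dataset`, the class labels, and the empty placeholder files are hypothetical; it also assumes that sample names in the CSV are stored without the file extension.

```python
import csv
import tempfile
from pathlib import Path


def build_example_dataset(root, n_samples=3):
    """Create images_dir/ and annotations.csv below `root` (illustrative)."""
    root = Path(root)
    images_dir = root / "images_dir"
    images_dir.mkdir(parents=True)
    path_data = root / "annotations.csv"
    with open(path_data, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["SAMPLE", "CLASS"])
        for i in range(1, n_samples + 1):
            name = f"sample{i:03d}"
            # Empty placeholder files stand in for real images.
            (images_dir / f"{name}.png").touch()
            writer.writerow([name, "classA" if i % 2 else "classB"])
    return str(path_data), str(images_dir)


with tempfile.TemporaryDirectory() as tmp:
    path_data, path_imagedir = build_example_dataset(Path(tmp) / "dataset")
    image_files = sorted(p.name for p in Path(path_imagedir).iterdir())
```

The two returned strings correspond to the `path_data` and `path_imagedir` arguments described in the parameter list.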
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`path_data` | `str` | Path to the csv file. | required |
`path_imagedir` | `str` | Path to the directory containing the images. | required |
`allowed_image_formats` | `list of str` | List of allowed imaging formats. (provided by IO_Interface) | required |
`training` | `bool` | Boolean option whether annotation data is available. | `True` |
`ohe` | `bool` | Boolean option whether annotation data is sparse categorical or one-hot encoded. | `True` |
`ohe_range` | `list of str` | List of column name values if annotation encoded in OHE. Example: ["classA", "classB", "classC"] | `None` |
`col_sample` | `str` | Index column name for the sample name column. Default: 'SAMPLE' | `'SAMPLE'` |
`col_class` | `str` | Index column name for the sparse categorical classes column. Default: 'CLASS' | `'CLASS'` |
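The effect of `ohe_range` on a one-hot encoded CSV can be sketched as follows. This is a simplified stand-in and not the library's code: the helper `select_ohe_columns`, the `AGE` meta column, and the sample values are assumptions. It follows the rule stated under CSV Format 2: use the listed columns if `ohe_range` is given, otherwise treat every column except the sample column as a class column.

```python
import csv
import io

# Hypothetical annotation file in CSV Format 2 with one meta column (AGE).
CSV_FORMAT_2 = """SAMPLE,classA,classB,classC,AGE
sample001,1,0,0,54
sample002,0,0,1,61
"""


def select_ohe_columns(csv_text, ohe_range=None, col_sample="SAMPLE"):
    """Illustrative helper: pick the one-hot class columns of a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    if ohe_range is not None:
        # Explicit column list: meta columns are simply ignored.
        class_names = list(ohe_range)
    else:
        # Fallback: every column except the sample column counts as a class.
        class_names = [c for c in reader.fieldnames if c != col_sample]
    index_list = [row[col_sample] for row in rows]
    one_hot = [[int(row[c]) for c in class_names] for row in rows]
    return index_list, class_names, one_hot


with_range = select_ohe_columns(CSV_FORMAT_2, ["classA", "classB", "classC"])
without_range = select_ohe_columns(CSV_FORMAT_2)
```

In this sketch, omitting `ohe_range` pulls the `AGE` column into the class matrix, which shows why meta columns are only possible when the class columns are listed explicitly.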
Returns:

Name | Type | Description |
---|---|---|
`index_list` | `list of str` | List of sample/index encoded as Strings. Required in DataGenerator. |
`class_ohe` | `numpy.ndarray` | Classification list as One-Hot encoding. Required in DataGenerator. |
`class_n` | `int` | Number of classes. Required in NeuralNetwork for Architecture design. |
`class_names` | `list of str` | List of names for corresponding classes. Used for later prediction storage or evaluation. |
`image_format` | `str` | Image format to add at the end of the sample index for image loading. Required in DataGenerator. |
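How the returned values fit together can be sketched with two small helpers. Both `build_image_paths` and `check_consistency` are hypothetical and not part of AUCMEDI. The sketch assumes that `image_format` is given without a leading dot, that a value of `None` means the index already carries its extension, and it uses nested lists in place of a `numpy.ndarray`.

```python
import os


def build_image_paths(index_list, path_imagedir, image_format=None):
    """Illustrative helper: append the image format to each sample index."""
    paths = []
    for index in index_list:
        # Assumption: without an image_format the index is already a file name.
        filename = index if image_format is None else f"{index}.{image_format}"
        paths.append(os.path.join(path_imagedir, filename))
    return paths


def check_consistency(index_list, class_ohe, class_n, class_names):
    """Illustrative check: one one-hot row per sample, one column per class."""
    return (
        len(class_ohe) == len(index_list)
        and len(class_names) == class_n
        and all(len(row) == class_n for row in class_ohe)
    )


paths = build_image_paths(["sample001", "sample002"], "dataset/images_dir", "png")
consistent = check_consistency(
    ["sample001", "sample002"], [[1, 0], [0, 1]], 2, ["classA", "classB"]
)
```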
Source code in aucmedi/data_processing/io_interfaces/io_csv.py