sanitize_data
Function
sanitized_data: Annotations = sanitize_data(
data: Annotations | Sequence[th.AnnoItem],
deduplicate: "add" | "drop" = "drop",
)
Perform the full sanitization on the annotation data.
The sanitization will ensure that:
- The timestamp is set to the time at which this function is called.
- All items in data are dictionaries typed by AnnoItem.
- Any item whose comment is None is sanitized into an item without a comment.
- The position of each annotation mark always has a positive width and height. (A negative value means that the starting location is reversed.)
- Annotation items are deduplicated, i.e. the IDs are sanitized.
This function is useful when the annotations need to be saved to a file. It may take some time to run, so it may not be suitable for sanitizing annotations in real time.
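The rule on positive width and height can be pictured with a small sketch. It is not the library's implementation: the helper name normalize_mark is hypothetical, and the mark field names (x, y, width, height) are assumptions about how an annotation mark is laid out.

```python
def normalize_mark(mark: dict) -> dict:
    """Return a copy of `mark` with non-negative width and height.

    Assumption: a mark is a dict with "x", "y", "width" and "height" keys.
    A negative size means the box was drawn from the opposite corner, so the
    origin is shifted and the size is flipped to keep the same rectangle.
    """
    x, y = mark["x"], mark["y"]
    width, height = mark["width"], mark["height"]
    if width < 0:
        x, width = x + width, -width
    if height < 0:
        y, height = y + height, -height
    # Copy the other keys unchanged so the input is not modified.
    return {**mark, "x": x, "y": y, "width": width, "height": height}


reversed_mark = {"x": 100, "y": 50, "width": -30, "height": 20, "type": "rect"}
print(normalize_mark(reversed_mark))
```

The sketch returns a new dictionary instead of editing the argument, which mirrors the documented behavior that the input data is left unchanged.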
Aliases
This function can be accessed under either of the following names:
import dash_picture_annotation as dpa
dpa.sanitize_data
dpa.utilities.sanitize_data
Arguments
Requires
Argument | Type | Required | Description |
---|---|---|---|
data | Annotations or [AnnoItem] | Yes | The annotation data to be sanitized. Note that this function does not modify the input data. |
deduplicate | "add" or "drop" | No | The deduplication method for the annotation IDs. "add" preserves an item with a duplicated ID by adding a postfix to its ID. "drop" drops every annotation item whose ID duplicates that of an earlier item. Defaults to "drop". |
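The difference between the two deduplicate modes can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the helper dedup_items is hypothetical, items are assumed to carry an "id" key, and a numeric postfix is used here for reproducibility instead of whatever postfix the library generates.

```python
from typing import Literal


def dedup_items(items: list, mode: Literal["add", "drop"] = "drop") -> list:
    """Return a new list of items whose "id" values are unique."""
    seen = set()
    result = []
    for item in items:
        item_id = item["id"]
        if item_id not in seen:
            # First occurrence of this ID: always kept as-is.
            seen.add(item_id)
            result.append(dict(item))
            continue
        if mode == "drop":
            # "drop": discard every later item with a repeated ID.
            continue
        # "add": keep the item, but extend its ID until it is unique.
        counter = 1
        new_id = f"{item_id}-{counter}"
        while new_id in seen:
            counter += 1
            new_id = f"{item_id}-{counter}"
        seen.add(new_id)
        result.append({**item, "id": new_id})
    return result


items = [{"id": "a"}, {"id": "b"}, {"id": "a"}]
print([item["id"] for item in dedup_items(items, "drop")])  # ['a', 'b']
print([item["id"] for item in dedup_items(items, "add")])   # ['a', 'b', 'a-1']
```

With "drop" the output can be shorter than the input, while "add" always preserves the number of items.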
Returns
Argument | Type | Description |
---|---|---|
sanitized_data | Annotations | The sanitized copy of the data. |
Examples
Sanitize a collection of data (drop mode)
Input data (data-input.json): a data list of 10 annotation items.
The following code sanitizes the data and drops every item with a repeated ID. For any repeated ID, only the first occurring item is preserved.
sanitize_data_drop.py
import json
import dash_picture_annotation as dpa

# Load the raw annotation data.
with open("./data-input.json", "r") as fobj:
    data = json.load(fobj)

# Sanitize with the default "drop" mode and save the result.
with open("./sanitized-data.json", "w") as fobj:
    json.dump(dpa.sanitize_data(data), fobj, indent=2, ensure_ascii=False)
Output data (sanitized-data.json):
timestamp | 1730913816961 |
data | a list of 8 annotation items |
Sanitize a collection of data (add mode)
Input data (data-input.json): a data list of 10 annotation items.
The following code sanitizes the data and preserves all data items. For any item with a repeated ID, a random postfix is added to the ID to deduplicate it.
sanitize_data_add.py
import json
import dash_picture_annotation as dpa

# Load the raw annotation data.
with open("./data-input.json", "r") as fobj:
    data = json.load(fobj)

# Sanitize with the "add" mode and save the result.
with open("./sanitized-data.json", "w") as fobj:
    json.dump(dpa.sanitize_data(data, "add"), fobj, indent=2, ensure_ascii=False)
Output data (sanitized-data.json):
timestamp | 1730915309229 |
data | a list of 10 annotation items |
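One way to check a saved result from either example is to count repeated IDs in the annotation data. The sketch below is only an illustration: the helper find_repeated_ids is hypothetical, and it assumes that each entry of the data list carries an "id" key.

```python
from collections import Counter


def find_repeated_ids(annotations: dict) -> dict:
    """Return {id: count} for every ID that appears more than once."""
    counts = Counter(item["id"] for item in annotations["data"])
    return {item_id: n for item_id, n in counts.items() if n > 1}


# A small in-memory sample standing in for the content of a JSON file.
sample = {
    "timestamp": 0,
    "data": [{"id": "a"}, {"id": "b"}, {"id": "a"}],
}
print(find_repeated_ids(sample))  # {'a': 2}
```

For a sanitized file, loading it with json.load and passing the result to this helper would be expected to give an empty dictionary in both modes.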