Version: 0.2.1

sanitize_data

sanitized_data: Annotations = sanitize_data(
    data: Annotations | Sequence[th.AnnoItem],
    deduplicate: "add" | "drop" = "drop",
)

Perform the full sanitization on the annotation data.

The sanitization will ensure that:

The timestamp is set to the time at which this function is called.
All items in data are dictionaries typed by AnnoItem.
Any item whose comment is None is sanitized into an item without a comment.
The position of the annotation mark always has a positive width and height. (Negative values mean that the starting location is reversed.)
Annotation items are deduplicated, i.e. the IDs are sanitized.

This function is useful when the annotations need to be saved to a file. It may take time to run, so it may not be suitable for sanitizing annotations in real time.
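The width and height rule can be sketched in plain Python. This is only an illustration under the assumption that a reversed box is fixed by moving the origin to the opposite corner, which matches the example data on this page; `normalize_mark` is a hypothetical helper, not a function of `dash_picture_annotation`.

```python
from typing import Any, Dict


def normalize_mark(mark: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `mark` whose width and height are non-negative.

    Hypothetical helper for illustration; not part of dash_picture_annotation.
    """
    result = dict(mark)  # copy, so the input mark is left untouched
    if result["width"] < 0:
        # The box was drawn right-to-left: move x to the left edge.
        result["x"] = result["x"] + result["width"]
        result["width"] = -result["width"]
    if result["height"] < 0:
        # The box was drawn bottom-to-top: move y to the top edge.
        result["y"] = result["y"] + result["height"]
        result["height"] = -result["height"]
    return result


mark = {"x": 528.0, "y": 62.0, "width": -125.0, "height": -125.0, "type": "RECT"}
print(normalize_mark(mark))
# → {'x': 403.0, 'y': -63.0, 'width': 125.0, 'height': 125.0, 'type': 'RECT'}
```

The covered area stays the same; only the corner used as the origin changes.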

Aliases

This function is available under either of the following names:

import dash_picture_annotation as dpa


dpa.sanitize_data
dpa.utilities.sanitize_data

Arguments

Requires

Argument	Type	Required	Description
`data`	`Annotations \| [AnnoItem]`	Yes	The annotation data to be sanitized. Note that this function does not change the input data.
`deduplicate`	`"add" \| "drop"`	No	The deduplication method for the annotation IDs. `"add"` preserves every item with a duplicated ID by adding a postfix to the ID. `"drop"` drops every item with a duplicated ID after the first one found. Defaults to `"drop"`.
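The difference between the two `deduplicate` modes can be sketched as follows. This is a minimal illustration, not the library's implementation: `dedupe_ids` is a hypothetical helper, the `id` key name is an assumption, and the six-character random hex strings are modeled on the IDs seen in the example output on this page.

```python
import secrets
from typing import Any, Dict, List


def dedupe_ids(items: List[Dict[str, Any]], mode: str = "drop") -> List[Dict[str, Any]]:
    """Illustrate the "drop" and "add" strategies on a list of items.

    Hypothetical helper; the real sanitize_data may differ in details.
    """
    if mode not in ("add", "drop"):
        raise ValueError('mode must be "add" or "drop"')
    seen = set()
    result = []
    for item in items:
        new_item = dict(item)  # shallow copy: the input items are not modified
        item_id = new_item.get("id")
        if not item_id:
            # Missing or empty ID: assign a fresh random one (assumed format).
            item_id = secrets.token_hex(3)
            while item_id in seen:
                item_id = secrets.token_hex(3)
        elif item_id in seen:
            if mode == "drop":
                continue  # keep only the first item that used this ID
            # "add": append random postfixes until the ID is unique.
            base = item_id
            while item_id in seen:
                item_id = base + secrets.token_hex(3)
        seen.add(item_id)
        new_item["id"] = item_id
        result.append(new_item)
    return result


items = [{"id": "a"}, {"id": "a", "comment": "second"}, {"comment": "no id"}]
print([item["id"] for item in dedupe_ids(items, "drop")])  # two items remain
print([item["id"] for item in dedupe_ids(items, "add")])   # all three remain
```

With `"drop"` the second item disappears entirely, comment included, so `"add"` is the safer choice when no annotation may be lost.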

Returns

Argument	Type	Description
`sanitized_data`	`Annotations`	The sanitized copy of the data.

Examples

Sanitize a collection of data (`drop` mode)


data-input.json

data (10 items; `undefined` and (blank) cells are reproduced as the data viewer rendered them)

#	id	mark	comment
1	undefined	{…}	undefined
2	(blank)	{…}	type-2
3	ZxdA2p	{…}	undefined
4	hCC6Gi	{…}	undefined
5	yJwGxK	(expanded below)	undefined
6	yJwGxK	{…}	type-6
7	yJwGxK	{…}	type-7
8	undefined	{…}	type-7
9	EWCEpN	{…}	null
10	3Pb3rh	{…}	(blank)

mark of item 5

x	528.1117021276596
y	62.765957446808514
width	-125.531914893617
height	-125.53191489361701
type	RECT

The following code sanitizes the data and drops every item whose ID repeats an earlier one. For any repeated ID, only the first occurring item is preserved.

sanitize_data_drop.py
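import json

import dash_picture_annotation as dpa


# Load the raw annotation data.
with open("./data-input.json", "r", encoding="utf-8") as fobj:
    data = json.load(fobj)

# deduplicate defaults to "drop": items repeating an earlier ID are removed.
sanitized = dpa.sanitize_data(data)

# Write as UTF-8 because ensure_ascii=False keeps non-ASCII characters as-is.
with open("./sanitized-data.json", "w", encoding="utf-8") as fobj:
    json.dump(sanitized, fobj, indent=2, ensure_ascii=False)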

sanitized-data.json

timestamp	1730913816961

data (8 items; `undefined` and (blank) cells are reproduced as the data viewer rendered them)

#	id	mark	comment
1	67199d	{…}	undefined
2	a22c54	{…}	type-2
3	ZxdA2p	{…}	undefined
4	hCC6Gi	{…}	undefined
5	yJwGxK	(expanded below)	undefined
6	d345c5	{…}	type-7
7	EWCEpN	{…}	undefined
8	3Pb3rh	{…}	(blank)

mark of item 5

x	402.57978723404256
y	-62.7659574468085
width	125.531914893617
height	125.53191489361701
type	RECT

Sanitize a collection of data (`add` mode)


data-input.json

data (10 items; `undefined` and (blank) cells are reproduced as the data viewer rendered them)

#	id	mark	comment
1	undefined	{…}	undefined
2	(blank)	{…}	type-2
3	ZxdA2p	{…}	undefined
4	hCC6Gi	{…}	undefined
5	yJwGxK	(expanded below)	undefined
6	yJwGxK	{…}	type-6
7	yJwGxK	{…}	type-7
8	undefined	{…}	type-7
9	EWCEpN	{…}	null
10	3Pb3rh	{…}	(blank)

mark of item 5

x	528.1117021276596
y	62.765957446808514
width	-125.531914893617
height	-125.53191489361701
type	RECT

The following code sanitizes the data and preserves all data items. Any item with a repeated ID gets a random postfix appended to its ID to deduplicate it.

sanitize_data_add.py
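import json

import dash_picture_annotation as dpa


# Load the raw annotation data.
with open("./data-input.json", "r", encoding="utf-8") as fobj:
    data = json.load(fobj)

# "add" keeps every item and appends a postfix to each repeated ID.
sanitized = dpa.sanitize_data(data, deduplicate="add")

# Write as UTF-8 because ensure_ascii=False keeps non-ASCII characters as-is.
with open("./sanitized-data.json", "w", encoding="utf-8") as fobj:
    json.dump(sanitized, fobj, indent=2, ensure_ascii=False)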

sanitized-data.json

timestamp	1730915309229

data (10 items; `undefined` and (blank) cells are reproduced as the data viewer rendered them)

#	id	mark	comment
1	1e5a53	{…}	undefined
2	3fc004	{…}	type-2
3	ZxdA2p	{…}	undefined
4	hCC6Gi	{…}	undefined
5	yJwGxK	(expanded below)	undefined
6	yJwGxKafa456	{…}	type-6
7	yJwGxK60cfab	{…}	type-7
8	be34ff	{…}	type-7
9	EWCEpN	{…}	undefined
10	3Pb3rh	{…}	(blank)

mark of item 5

x	402.57978723404256
y	-62.7659574468085
width	125.531914893617
height	125.53191489361701
type	RECT
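After saving, it can be useful to confirm that a file satisfies the guarantees listed at the top of this page. The checker below is a rough sketch under stated assumptions: `check_sanitized` is a hypothetical helper, the key names `timestamp`, `data`, `mark` and `comment` follow the example data shown here, and the `id` key name is assumed.

```python
from typing import Any, Dict, List


def check_sanitized(annotations: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means all checks pass.

    Hypothetical helper; key names follow the example data on this page.
    """
    problems = []
    if not isinstance(annotations.get("timestamp"), int):
        problems.append("timestamp is missing or not an integer")
    seen = set()
    for index, item in enumerate(annotations.get("data", [])):
        item_id = item.get("id")  # "id" is an assumed key name
        if not item_id or item_id in seen:
            problems.append(f"item {index}: missing or duplicated id {item_id!r}")
        seen.add(item_id)
        if "comment" in item and item["comment"] is None:
            problems.append(f"item {index}: comment is None")
        mark = item.get("mark", {})
        if mark.get("width", 0) < 0 or mark.get("height", 0) < 0:
            problems.append(f"item {index}: negative width or height")
    return problems


good = {
    "timestamp": 1730913816961,
    "data": [
        {
            "id": "yJwGxK",
            "mark": {"x": 402.5, "y": -62.7, "width": 125.5, "height": 125.5, "type": "RECT"},
        }
    ],
}
print(check_sanitized(good))  # → []
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in a file at once.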
