sc2_datasets.validators.singleprocess_validator

Functions

validate_integrity_persist_sp(→ set[pathlib.Path])

Exposes the logic for validating replays using a single process.

validate_integrity_sp(→ tuple[set[pathlib.Path], ...)

Exposes logic for single process integrity validation of a replay.

Module Contents

validate_integrity_persist_sp(list_of_replays: list[pathlib.Path], validation_file_path: pathlib.Path) → set[pathlib.Path]

Exposes the logic for validating replays using a single process. This function uses a validation file that persists the files which were previously checked.

Parameters:
  • list_of_replays (list[Path]) – Specifies the list of replays that are supposed to be validated.

  • validation_file_path (Path) – Specifies the path to the validation file which will be read to obtain the

Returns:

A set of files that should be skipped in further processing.

Return type:

set[Path]

Examples

Persistent validators save the validation information to a specified filepath. Only the files that ought to be skipped are returned as a set from this function.

>>> from pathlib import Path
>>> replays_to_skip = validate_integrity_persist_sp(
...                         list_of_replays=[
...                               Path("test/test_files/single_replay/test_replay.json"),
...                               Path("test/test_files/single_replay/test_bit_flip_example.json"),
...                         ],
...                         validation_file_path=Path("validator_file.json"),
...                   )
>>> assert len(replays_to_skip) == 1
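
The persistence idea described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the function names validate_persist_sketch and is_readable_json, the JSON layout of the validation file, and the use of a plain JSON parse as the integrity check are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def is_readable_json(replay: Path) -> bool:
    """Hypothetical integrity check: the file can be read and parses as JSON."""
    try:
        json.loads(replay.read_text(encoding="utf-8"))
        return True
    except (OSError, ValueError):
        return False


def validate_persist_sketch(
    list_of_replays: list[Path], validation_file_path: Path
) -> set[Path]:
    """Check only replays not yet recorded, persist the results, return the skip set."""
    # Assumed file layout: {"validated": [...], "skip": [...]}, paths stored as strings.
    state = {"validated": [], "skip": []}
    if validation_file_path.exists():
        state = json.loads(validation_file_path.read_text(encoding="utf-8"))

    validated = set(state["validated"])
    skip = set(state["skip"])

    for replay in list_of_replays:
        key = str(replay)
        if key in validated or key in skip:
            continue  # Already checked in an earlier run.
        (validated if is_readable_json(replay) else skip).add(key)

    # Persist the updated state so the next call can reuse it.
    validation_file_path.write_text(
        json.dumps({"validated": sorted(validated), "skip": sorted(skip)}),
        encoding="utf-8",
    )
    return {Path(p) for p in skip}


# Demo on temporary files: one well-formed JSON file and one truncated file.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    good.write_text('{"header": {}}', encoding="utf-8")
    corrupted = Path(tmp) / "corrupted.json"
    corrupted.write_text('{"header": {', encoding="utf-8")
    validation_file = Path(tmp) / "validator_file.json"

    skipped_first = validate_persist_sketch([good, corrupted], validation_file)

    # Repair the file, then validate again: the sketch trusts the persisted result.
    corrupted.write_text('{"header": {}}', encoding="utf-8")
    skipped_second = validate_persist_sketch([good, corrupted], validation_file)
```

In this sketch the second call does not re-read files that are already recorded, so a repaired file stays in the skip set until the validation file is deleted. Whether the library behaves the same way is not stated here.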

validate_integrity_sp(list_of_replays: list[pathlib.Path]) → tuple[set[pathlib.Path], set[pathlib.Path]]

Exposes logic for single process integrity validation of a replay.

Parameters:
  • list_of_replays (list[Path]) – Specifies the paths of the replay files that will be validated.

Returns:

Returns a tuple that contains (validated replays, files to be skipped).

Return type:

tuple[set[Path], set[Path]]

Examples

Validators can be used to check whether a file is correct before loading it for a modeling task. The sample execution below is given one correct file and one incorrect file and returns a tuple of two sets. The first set contains the correctly validated files, whereas the second set contains the files that should be skipped in modeling tasks.

>>> from pathlib import Path
>>> validated_replays = validate_integrity_sp(
...                         list_of_replays=[
...                               Path("./test/test_files/single_replay/test_replay.json"),
...                               Path("./test/test_files/single_replay/test_bit_flip_example.json"),
...                         ],
...                   )
>>> assert len(validated_replays[0]) == 1
>>> assert len(validated_replays[1]) == 1
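
To show how a (validated, skipped) tuple of sets might be consumed, here is a small self-contained sketch. The function validate_sketch is a hypothetical stand-in that uses a plain JSON parse as its check. It does not reproduce the checks performed by validate_integrity_sp.

```python
import json
import tempfile
from pathlib import Path


def validate_sketch(list_of_replays: list[Path]) -> tuple[set[Path], set[Path]]:
    """Partition replays into (validated, skipped) using a JSON parse as a stand-in check."""
    validated: set[Path] = set()
    skipped: set[Path] = set()
    for replay in list_of_replays:
        try:
            json.loads(replay.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable or malformed files go to the skip set.
            skipped.add(replay)
        else:
            validated.add(replay)
    return validated, skipped


# Demo on temporary files: one well-formed JSON file and one truncated file.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    good.write_text('{"header": {}}', encoding="utf-8")
    corrupted = Path(tmp) / "corrupted.json"
    corrupted.write_text('{"header": {', encoding="utf-8")

    replays = [good, corrupted]

    # Unpack the tuple instead of indexing it with [0] and [1].
    validated, skipped = validate_sketch(replays)

    # Keep only the replays that are safe to load, preserving the input order.
    usable = [replay for replay in replays if replay not in skipped]
```

Unpacking the result into two named sets keeps later filtering code readable, and filtering the original list rather than iterating over the set preserves the input order.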