sc2_datasets.torch.datasets.sc2_dataset_single_json

Classes

Module Contents

class SC2DatasetSingleJSON(dataset_name: str | None = None, unpack_dir: pathlib.Path | str | None = None, json_path: pathlib.Path | str | None = None, download: bool = True, download_dir: pathlib.Path | str | None = None, dataset_url: str = '', transform: Callable | None = None, validator: Callable | None = None)

Bases: torch.utils.data.Dataset

transform = None
dataset_name: str | None = None
download_dir: pathlib.Path | None = None
maybe_downloaded_zip_path: pathlib.Path | None = None
was_downloaded: bool = False
unpack_dir: pathlib.Path | None = None
unpack_path: pathlib.Path | None = None
dataset_path: pathlib.Path | None = None
json_offsets
validator = None
skip_files: dict[str, set[str]]
len
file_handle = None
__check_download_args(dataset_name: str, download: bool, download_dir: pathlib.Path, dataset_url: str)

Checks the arguments for downloading the dataset.

Parameters:
  • dataset_name (str) – Name of the dataset to be downloaded.

  • download (bool) – Whether to download the dataset or not.

  • download_dir (Path) – Directory where the dataset will be downloaded.

  • dataset_url (str) – URL from which the dataset will be downloaded.

Raises:
  • Exception – If the dataset URL is empty when download is requested.

  • Exception – If the download directory is None when download is requested.

  • Exception – If the dataset name is empty when download is requested.
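
The three checks listed above can be sketched as follows. This is a hypothetical stand-in written from the documented behaviour, not the library's own code: the function name `check_download_args`, the early return when `download` is False, and the error messages are all assumptions.

```python
from pathlib import Path
from typing import Optional


def check_download_args(
    dataset_name: Optional[str],
    download: bool,
    download_dir: Optional[Path],
    dataset_url: str,
) -> None:
    """Hypothetical stand-in for the private check; not the library's code."""
    if not download:
        # Assumption: nothing needs validating when no download is requested.
        return
    if not dataset_url:
        raise Exception("dataset_url must not be empty when download=True")
    if download_dir is None:
        raise Exception("download_dir must not be None when download=True")
    if not dataset_name:
        raise Exception("dataset_name must not be empty when download=True")
```

Validating all arguments before any network access means a misconfigured call fails immediately instead of after a partial download.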

__download(dataset_name: str, download: bool, download_dir: pathlib.Path, dataset_url: str)
__check_unpack_args(dataset_name: str | None, unpack_dir: pathlib.Path | None)

Checks the arguments for unpacking the dataset.

Parameters:
  • dataset_name (str) – Name of the dataset to be unpacked.

  • unpack_dir (Path) – Directory where the dataset will be unpacked.

Raises:
  • Exception – If the dataset name is empty.

  • Exception – If the unpack directory is None.

__unpack(dataset_name: str, unpack_dir: pathlib.Path)

Unpacks the dataset if it was not unpacked previously.

Parameters:
  • dataset_name (str) – Name of the dataset; used to name the unpacked directory.

  • unpack_dir (Path) – Directory where the dataset will be unpacked.

Raises:
  • Exception – If the unpack directory is not a directory.
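
An "unpack only once" step of this kind might look like the sketch below. It is an illustration under assumptions, not the library's implementation: `unpack_once` is a hypothetical name, the archive is assumed to be a zip file, and creating a missing unpack directory is a choice made here rather than documented behaviour.

```python
import zipfile
from pathlib import Path


def unpack_once(zip_path: Path, unpack_dir: Path, dataset_name: str) -> Path:
    """Extract zip_path into unpack_dir/dataset_name unless already present."""
    if unpack_dir.exists() and not unpack_dir.is_dir():
        raise Exception(f"{unpack_dir} is not a directory")
    target = unpack_dir / dataset_name
    if target.is_dir():
        # Already unpacked on a previous run; skip the extraction.
        return target
    # Assumption: missing directories are created rather than rejected.
    target.mkdir(parents=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

Naming the target directory after the dataset is what lets a second call detect earlier work and return without touching the archive.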

download_and_unpack(dataset_name: str | None, download: bool, download_dir: pathlib.Path | str | None, unpack_dir: pathlib.Path | str | None, dataset_url: str | None)

Downloads and unpacks the dataset if needed.

Parameters:
  • dataset_name (str | None) – Name of the dataset; used to name the downloaded zip file and, by extension, the unpacked directory.

  • download (bool) – Whether to download the dataset or not.

  • download_dir (Path | str | None) – Directory where the dataset will be downloaded.

  • unpack_dir (Path | str | None) – Directory where the dataset will be unpacked.

  • dataset_url (str | None) – URL from which the dataset will be downloaded.

calculate_offsets(unpack_path: pathlib.Path, dataset_name: pathlib.Path) -> list[int]

Calculates JSON offsets for fast indexing of dataset objects.

Parameters:
  • unpack_path (Path) – Path where the dataset is unpacked.

  • dataset_name (Path) – Name of the dataset.

Returns:

List of offsets for each object in the JSON dataset.

Return type:

list[int]
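
The idea behind offset-based indexing can be sketched as below: record the byte position where each object starts, then seek straight to it later instead of parsing the whole file. The layout assumed here (one JSON object per line) and the helper names `line_offsets` and `read_at` are assumptions for illustration; the actual file structure expected by calculate_offsets is not described in this reference.

```python
import json
from pathlib import Path


def line_offsets(json_path: Path) -> list[int]:
    """Return the byte offset at which each non-empty line starts."""
    offsets = []
    with open(json_path, "rb") as handle:
        while True:
            position = handle.tell()
            line = handle.readline()
            if not line:
                break  # end of file
            if line.strip():
                offsets.append(position)
    return offsets


def read_at(json_path: Path, offset: int) -> dict:
    """Seek to a stored offset and parse only the object found there."""
    with open(json_path, "rb") as handle:
        handle.seek(offset)
        return json.loads(handle.readline())
```

With the offsets computed once, fetching item i costs one seek and one parse, regardless of how many objects precede it in the file.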

__len__() -> int
__getitem__(index: Any) -> tuple[Any, Any] | sc2_datasets.replay_data.sc2_replay_data.SC2ReplayData
__del__()
static from_json_path(json_path: pathlib.Path | str, validator: Callable | None = None, transform: Callable | None = None) -> SC2DatasetSingleJSON

Creates a SC2DatasetSingleJSON object directly from a JSON path, skipping download and unpack steps.

Parameters:
  • json_path (Path | str) – Path to the JSON file with the correct dataset structure.

  • validator (Callable | None, optional) – Callable used to validate objects within the dataset. Defaults to None.

  • transform (Callable | None, optional) – Transform applied to each SC2ReplayData object: a function that takes an SC2ReplayData and returns the transformed object. Defaults to None.

Returns:

A SC2DatasetSingleJSON object initialized with the provided JSON path.

Return type:

SC2DatasetSingleJSON
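
The alternate-constructor pattern described above (build the dataset from a local file and bypass the download step) can be illustrated with a small stand-in. `TinyJSONDataset` is hypothetical and is not SC2DatasetSingleJSON: it assumes one JSON object per line, omits the validator, and returns plain dictionaries rather than SC2ReplayData objects.

```python
import json
from pathlib import Path
from typing import Any, Callable, Optional, Union


class TinyJSONDataset:
    """Hypothetical stand-in, not SC2DatasetSingleJSON itself."""

    def __init__(
        self,
        json_path: Path,
        download: bool = True,
        transform: Optional[Callable[[Any], Any]] = None,
    ) -> None:
        if download:
            raise NotImplementedError("download step omitted in this sketch")
        self.transform = transform
        # Assumption: the file holds one JSON object per line.
        text = Path(json_path).read_text(encoding="utf-8")
        self.records = [json.loads(line) for line in text.splitlines() if line.strip()]

    @staticmethod
    def from_json_path(
        json_path: Union[Path, str],
        transform: Optional[Callable[[Any], Any]] = None,
    ) -> "TinyJSONDataset":
        # Alternate constructor: go straight to the local file, no download.
        return TinyJSONDataset(Path(json_path), download=False, transform=transform)

    def __len__(self) -> int:
        return len(self.records)

    def __getitem__(self, index: int) -> Any:
        record = self.records[index]
        # The transform takes one record and returns the transformed object.
        return self.transform(record) if self.transform else record
```

A call such as `TinyJSONDataset.from_json_path("replays.json", transform=my_transform)` mirrors the documented signature in shape only; the file name and `my_transform` are placeholders.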