sc2_datasets.torch.datasets.sc2_dataset_single_json
Classes
Module Contents
- class SC2DatasetSingleJSON(dataset_name: str | None = None, unpack_dir: pathlib.Path | str | None = None, json_path: pathlib.Path | str | None = None, download: bool = True, download_dir: pathlib.Path | str | None = None, dataset_url: str = '', transform: Callable | None = None, validator: Callable | None = None)
Bases: torch.utils.data.Dataset
- transform = None
- dataset_name: str | None = None
- download_dir: pathlib.Path | None = None
- maybe_downloaded_zip_path: pathlib.Path | None = None
- was_downloaded: bool = False
- unpack_dir: pathlib.Path | None = None
- unpack_path: pathlib.Path | None = None
- dataset_path: pathlib.Path | None = None
- json_offsets
- validator = None
- skip_files: dict[str, set[str]]
- len
- file_handle = None
- __check_download_args(dataset_name: str, download: bool, download_dir: pathlib.Path, dataset_url: str)
Checks the arguments for downloading the dataset.
- Parameters:
dataset_name (str) – Name of the dataset to be downloaded.
download (bool) – Whether to download the dataset or not.
download_dir (Path) – Directory where the dataset will be downloaded.
dataset_url (str) – URL from which the dataset will be downloaded.
- Raises:
Exception – If the dataset URL is empty when download is requested.
Exception – If the download directory is None when download is requested.
Exception – If the dataset name is empty when download is requested.
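The checks listed above can be sketched as a standalone function. This is a minimal illustration, not the library's implementation: the name `check_download_args`, the order of the checks, and the error messages are assumptions; only the three documented failure conditions come from the reference.

```python
from __future__ import annotations

from pathlib import Path


def check_download_args(
    dataset_name: str | None,
    download: bool,
    download_dir: Path | None,
    dataset_url: str | None,
) -> None:
    """Hypothetical stand-in for the documented download-argument checks."""
    if not download:
        # Nothing to validate when no download is requested.
        return
    if not dataset_url:
        raise Exception("Dataset URL must not be empty when download is requested.")
    if download_dir is None:
        raise Exception("Download directory must not be None when download is requested.")
    if not dataset_name:
        raise Exception("Dataset name must not be empty when download is requested.")
```

All three conditions apply only when `download` is true, so a caller that already has the data locally can pass `download=False` and leave the other arguments unset.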
- __download(dataset_name: str, download: bool, download_dir: pathlib.Path, dataset_url: str)
- __check_unpack_args(dataset_name: str | None, unpack_dir: pathlib.Path | None)
Checks the arguments for unpacking the dataset.
- Parameters:
dataset_name (str) – Name of the dataset to be unpacked.
unpack_dir (Path) – Directory where the dataset will be unpacked.
- Raises:
Exception – If the dataset name is empty.
Exception – If the unpack directory is None.
- __unpack(dataset_name: str, unpack_dir: pathlib.Path)
Unpacks the dataset if it was not unpacked previously.
- Parameters:
dataset_name (str) – Name of the dataset; used to name the unpacked directory.
unpack_dir (Path) – Directory where the dataset will be unpacked.
- Raises:
Exception – If the unpack directory is not a directory.
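An unpack-once step of this kind could look roughly like the sketch below. It is an assumption-laden illustration rather than the library's code: the function name `unpack_if_needed` and the explicit `zip_path` argument are hypothetical, and treating an existing target directory as "previously unpacked" is one possible way to detect earlier extraction.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def unpack_if_needed(zip_path: Path, dataset_name: str, unpack_dir: Path) -> Path:
    """Hypothetical sketch: extract zip_path into unpack_dir/dataset_name once."""
    if not unpack_dir.is_dir():
        raise Exception(f"Unpack directory is not a directory: {unpack_dir}")

    # The dataset name determines the name of the unpacked directory.
    unpack_path = unpack_dir / dataset_name
    if unpack_path.exists():
        # Assumption: an existing target means the archive was unpacked before.
        return unpack_path

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(unpack_path)
    return unpack_path
```

Calling the function a second time with the same arguments returns the same path without touching the archive again.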
- download_and_unpack(dataset_name: str | None, download: bool, download_dir: pathlib.Path | str | None, unpack_dir: pathlib.Path | str | None, dataset_url: str | None)
Downloads and unpacks the dataset if needed.
- Parameters:
dataset_name (str | None) – Name of the dataset; used to name the downloaded zip file and, by extension, the unpacked directory.
download (bool) – Whether to download the dataset or not.
download_dir (Path | str | None) – Directory where the dataset will be downloaded.
unpack_dir (Path | str | None) – Directory where the dataset will be unpacked.
dataset_url (str | None) – URL from which the dataset will be downloaded.
- calculate_offsets(unpack_path: pathlib.Path, dataset_name: pathlib.Path) → list[int]
Calculates JSON offsets for fast indexing of dataset objects.
- Parameters:
unpack_path (Path) – Path where the dataset is unpacked.
dataset_name (Path) – Name of the dataset.
- Returns:
List of offsets for each object in the JSON dataset.
- Return type:
list[int]
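The idea behind offset-based indexing can be illustrated with a small sketch. The on-disk layout is not described in this reference, so the example assumes, purely for illustration, a file with one JSON object per line; the helper names `calculate_line_offsets` and `read_object_at` are hypothetical and not part of the library.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any


def calculate_line_offsets(json_path: Path) -> list[int]:
    """Record the byte offset at which each non-empty line starts."""
    offsets: list[int] = []
    with json_path.open("rb") as handle:
        while True:
            position = handle.tell()
            line = handle.readline()
            if not line:
                # End of file reached.
                break
            if line.strip():
                offsets.append(position)
    return offsets


def read_object_at(json_path: Path, offset: int) -> Any:
    """Seek straight to a stored offset and parse only that one object."""
    with json_path.open("rb") as handle:
        handle.seek(offset)
        return json.loads(handle.readline())
```

With the offsets computed once, fetching item `i` costs a single seek and one parse instead of loading the whole file, which is what makes random access by index cheap.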
- __len__() → int
- __getitem__(index: Any) → tuple[Any, Any] | sc2_datasets.replay_data.sc2_replay_data.SC2ReplayData
- __del__()
- static from_json_path(json_path: pathlib.Path | str, validator: Callable | None = None, transform: Callable | None = None) → SC2DatasetSingleJSON
Creates an SC2DatasetSingleJSON object directly from a JSON path, skipping the download and unpack steps.
- Parameters:
json_path (Path | str) – Path to the JSON file with the correct dataset structure.
validator (Callable | None, optional) – Validator used when deciding about objects within the dataset. Defaults to None.
transform (Callable | None, optional) – Transform applied to each SC2ReplayData object: a function that takes an SC2ReplayData and returns the transformed object. Defaults to None.
- Returns:
An SC2DatasetSingleJSON object initialized with the provided JSON path.
- Return type: