sc2_datasets.utils.zip_utils

Classes

UnpackZipFileArguments

Helper class for passing arguments to a multi-threaded unpacking function.

Functions

unpack_chunk(→ None)

Helper function for unpacking a chunk of files from an archive.

unpack_zipfile(→ pathlib.Path)

Helper function that unpacks the content of a .zip archive.

Module Contents

class UnpackZipFileArguments(chunk_id: int, zip_path: pathlib.Path, filenames: list[str], path_to_extract: pathlib.Path)

Helper class for passing arguments to a multi-threaded unpacking function.

Parameters:
  • chunk_id (int) – Specifies the identifier of the chunk of files being unpacked.

  • zip_path (Path) – Specifies the path to the archive file that will be extracted.

  • filenames (list[str]) – Specifies a list of the filenames which are within the archive and will be extracted.

  • path_to_extract (Path) – Specifies the path to which the files will be extracted.

chunk_id
zip_path
filenames
path_to_extract
unpack_chunk(unpack_arguments: UnpackZipFileArguments) → None

Helper function for unpacking a chunk of files from an archive.

Parameters:

unpack_arguments (UnpackZipFileArguments) – Specifies the arguments required for unpacking a chunk of files.
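
As a rough illustration of what unpacking one chunk might involve, the sketch below uses only the standard library. ChunkArguments and extract_chunk are hypothetical stand-ins that mirror the fields of UnpackZipFileArguments; they are assumptions made for illustration, not the library's implementation.

```python
import tempfile
import zipfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ChunkArguments:
    """Hypothetical stand-in mirroring the fields of UnpackZipFileArguments."""

    chunk_id: int
    zip_path: Path
    filenames: list[str]
    path_to_extract: Path


def extract_chunk(args: ChunkArguments) -> None:
    """Extract only the members named in args.filenames (assumed behavior)."""
    with zipfile.ZipFile(args.zip_path, "r") as archive:
        for name in args.filenames:
            archive.extract(member=name, path=args.path_to_extract)


# Build a tiny two-member archive in a temporary directory,
# then extract a chunk containing just one of the members.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    demo_zip = tmp_dir / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("a.txt", "alpha")
        archive.writestr("b.txt", "beta")

    out_dir = tmp_dir / "out"
    extract_chunk(ChunkArguments(0, demo_zip, ["a.txt"], out_dir))
    extracted = sorted(p.name for p in out_dir.iterdir())
```

Bundling the arguments into a single object makes it easy to hand one item per chunk to a pool's map-style interface, which is presumably why a dedicated argument class exists here.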

unpack_zipfile(destination_dir: pathlib.Path, subdir: str, zip_path: pathlib.Path, n_workers: int) → pathlib.Path

Helper function that unpacks the content of a .zip archive.

Parameters:
  • destination_dir (Path) – Specifies the path where the .zip file will be extracted.

  • subdir (str) – Specifies the subdirectory where the content will be extracted.

  • zip_path (Path) – Specifies the path to the zip file that will be extracted.

  • n_workers (int) – Specifies the number of workers that will be used for unpacking the archive.

Returns:

Returns a path to the extracted content.

Return type:

Path

Raises:

Exception – Raises an exception if the number of workers is less than or equal to zero.
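
The page does not document how the archive's file list is divided among workers. The sketch below shows one possible approach under that caveat: split_into_chunks is a hypothetical helper, not part of sc2_datasets, that rejects a non-positive worker count and otherwise splits the filenames into contiguous chunks.

```python
import math


def split_into_chunks(filenames: list[str], n_workers: int) -> list[list[str]]:
    """Hypothetical helper: split filenames into at most n_workers contiguous chunks."""
    if n_workers <= 0:
        # Mirrors the documented failure mode for a non-positive worker count.
        raise Exception("Number of workers cannot be less than or equal to zero.")
    if not filenames:
        return []
    # Round up so that every file lands in some chunk.
    chunk_size = math.ceil(len(filenames) / n_workers)
    return [
        filenames[start:start + chunk_size]
        for start in range(0, len(filenames), chunk_size)
    ]


chunks = split_into_chunks(["a", "b", "c", "d", "e"], n_workers=2)
```

With five files and two workers, the first chunk holds three names and the second holds two.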

Examples

This function extracts a .zip archive, which may help when working with a dataset.

Set every parameter (destination_dir, subdir, zip_path and n_workers) as in the example below.

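>>> from pathlib import Path
>>> from sc2_datasets.utils.zip_utils import unpack_zipfile
>>> # Define the arguments first so they can be validated before the call.
>>> destination_dir = Path("./directory/destination_dir").resolve()
>>> subdir = "./directory/subdir"
>>> zip_path = Path("./directory/zip_path").resolve()
>>> n_workers = 1
>>> assert isinstance(destination_dir, Path)
>>> assert isinstance(subdir, str)
>>> assert isinstance(zip_path, Path)
>>> assert isinstance(n_workers, int)
>>> assert n_workers >= 1
>>> unpack_zipfile_object = unpack_zipfile(
...     destination_dir=destination_dir,
...     subdir=subdir,
...     zip_path=zip_path,
...     n_workers=n_workers)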