files
File utils module.
This module provides utility functions for other modules.
Hint
Use pip to install the necessary dependencies for this module:
pip install mltb2[files]
- class mltb2.files.FileBasedRestartableBatchDataProcessor(data: list[dict[str, Any]], batch_size: int, uuid_name: str, result_dir: str)
Bases: object
Batch data processor which supports restartability and is backed by files.
- Parameters:
- static load_data(result_dir: str, ignore_load_error: bool = False) → list[dict[str, Any]]
Load all data.
Once all data has been processed, this method can be used to load all of it. Because the FileBasedRestartableBatchDataProcessor may be run several times in parallel, some data records may exist more than once. This method removes those duplicates.
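The duplicate removal described above can be pictured with a short sketch. This is not mltb2's implementation: the helper name deduplicate_records is hypothetical, and it assumes each record is a dict carrying a unique identifier under the key given by uuid_name.

```python
from typing import Any


def deduplicate_records(
    records: list[dict[str, Any]], uuid_name: str
) -> list[dict[str, Any]]:
    """Keep the first record seen for each identifier (hypothetical helper)."""
    seen: set[Any] = set()
    unique: list[dict[str, Any]] = []
    for record in records:
        key = record[uuid_name]  # assumption: every record has this key
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Two parallel runs processed record "a", so it appears twice.
loaded = [
    {"uuid": "a", "result": 1},
    {"uuid": "b", "result": 2},
    {"uuid": "a", "result": 1},
]
unique_records = deduplicate_records(loaded, uuid_name="uuid")
```

Keeping the first occurrence preserves the original order of the records; whether the library keeps the first or the last duplicate is not stated in this reference.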
- mltb2.files.fetch_remote_file(dirname, filename, url: str, sha256_checksum: str) → str
Fetch a file from a remote URL.
- Parameters:
- Returns:
Full path of the created file.
- Raises:
IOError – if the SHA-256 checksum of the file does not match sha256_checksum
- Return type:
str
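The checksum check behind the IOError can be sketched with the standard library alone. This is only an illustration of SHA-256 verification, not the code inside fetch_remote_file: the helper names sha256_of_file and verify_sha256 are hypothetical, and the download step is left out.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_checksum: str) -> str:
    """Return path if the checksum matches, else raise IOError (hypothetical helper)."""
    actual = sha256_of_file(path)
    if actual != expected_checksum.lower():
        raise IOError(f"SHA-256 mismatch for {path}: got {actual}")
    return path


# Demo on a temporary file standing in for a downloaded file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    expected = hashlib.sha256(b"hello").hexdigest()
    checksum_ok = verify_sha256(demo_path, expected) == demo_path
    try:
        verify_sha256(demo_path, "0" * 64)
        mismatch_raised = False
    except IOError:
        mismatch_raised = True
```

Reading in chunks keeps memory use constant for large files.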
- mltb2.files.get_and_create_mltb2_data_dir(mltb2_base_data_dir: str | None = None) → str
Return and create a data dir for mltb2.
The exact directory is given by mltb2_base_data_dir as the base folder, with the folder mltb2 appended.
- Parameters:
mltb2_base_data_dir (str | None) – The base data directory. If None, the default user data directory is used. The default user data directory is determined by platformdirs.user_data_dir().
- Returns:
The directory path.
- Return type:
str