s3

S3 tools.

ml_cloud_tools.s3.copy_dir_to_s3_dir(local_dir_name, s3_dir_name, s3_bucket_name=None, s3_kwargs=None)

Copy a directory from the local file system to a directory on S3.

If you call this function with local_dir_name = "a/x" and s3_dir_name = "y", it copies the contents of a/x to the S3 location y/x. The local file a/x/file.txt would therefore be copied to S3 at y/x/file.txt.

Parameters:
  • local_dir_name (str) – Name of the local directory.

  • s3_dir_name (str) – Name of the S3 directory. This is the part after the s3_bucket_name. Example: /foo/bar

  • s3_bucket_name (Optional[str]) – S3 bucket name. Can also be provided by the DEFAULT_S3_BUCKET_NAME environment variable. One of the two must be specified. If both are specified this argument has priority.

  • s3_kwargs (Optional[Dict[str, Any]]) – Additional kwargs to be passed to the S3 client function S3.Bucket.upload_file().

Returns:

S3 directory where files are stored. In the example above, this would be y/x.

Return type:

str
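
The path mapping described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's implementation: the helper name sketch_s3_keys is hypothetical, and walking into subdirectories is an assumption that the text above does not confirm.

```python
import os
import posixpath
from typing import Dict


def sketch_s3_keys(local_dir_name: str, s3_dir_name: str) -> Dict[str, str]:
    """Hypothetical helper: map each local file to the S3 key it would get."""
    local_dir_name = os.path.normpath(local_dir_name)
    # The last component of the local directory ("x" for "a/x") is kept on S3.
    target_root = posixpath.join(s3_dir_name, os.path.basename(local_dir_name))

    mapping: Dict[str, str] = {}
    # Assumption: nested subdirectories are uploaded as well.
    for root, _dirs, files in os.walk(local_dir_name):
        rel_root = os.path.relpath(root, local_dir_name)
        for name in files:
            rel = name if rel_root == "." else os.path.join(rel_root, name)
            # S3 keys always use "/" as separator, whatever the local OS uses.
            key = posixpath.join(target_root, *rel.split(os.sep))
            mapping[os.path.join(root, name)] = key
    return mapping
```

With local_dir_name = "a/x" and s3_dir_name = "y", a file a/x/file.txt maps to the key y/x/file.txt, and y/x is the value the function is documented to return.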

ml_cloud_tools.s3.copy_file_to_s3_file(local_file_name, s3_file_name, s3_bucket_name=None, s3_kwargs=None)

Copy a file on the local file system to a file on S3.

Upload the local file local_file_name to the S3 file at s3_file_name in the S3 bucket s3_bucket_name.

Parameters:
  • local_file_name (str) – Local path to the file to upload. Example: /home/my_username/baz.txt

  • s3_file_name (str) – Name of the so-called key to upload to. This is the part after the s3_bucket_name. Example: /foo/bar/baz.txt

  • s3_bucket_name (Optional[str]) – S3 bucket name. Can also be provided by the DEFAULT_S3_BUCKET_NAME environment variable. One of the two must be specified. If both are specified this argument has priority.

  • s3_kwargs (Optional[Dict[str, Any]]) – Additional kwargs to be passed to the S3 client function S3.Bucket.upload_file().

Return type:

None
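
The bucket name rule that all functions in this module share (explicit argument first, then the DEFAULT_S3_BUCKET_NAME environment variable) can be sketched as follows. The helper name resolve_bucket_name and the ValueError are assumptions made for illustration; the documentation does not say which exception the library raises when neither source is set.

```python
import os
from typing import Optional


def resolve_bucket_name(s3_bucket_name: Optional[str] = None) -> str:
    """Hypothetical helper: pick the bucket name as the parameter docs describe."""
    # The explicit argument has priority over the environment variable.
    if s3_bucket_name is not None:
        return s3_bucket_name

    env_bucket_name = os.environ.get("DEFAULT_S3_BUCKET_NAME")
    if env_bucket_name:
        return env_bucket_name

    # Assumption: the real library may signal this case differently.
    raise ValueError(
        "Pass s3_bucket_name or set the DEFAULT_S3_BUCKET_NAME environment variable."
    )
```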

ml_cloud_tools.s3.copy_s3_dir_to_dir(s3_dir_name, local_dir_name, s3_bucket_name=None, overwrite=True, s3_kwargs=None)

Copy a directory from S3 to a directory on the local file system.

If you call this function with s3_dir_name = "a/x" and local_dir_name = "y", it creates a local directory y/x and copies the S3 content in a/x to that location. An S3 file at a/x/file.txt would therefore be copied to y/x/file.txt.

Parameters:
  • s3_dir_name (str) – Name of the S3 directory. This is the part after the s3_bucket_name. Example: /foo/bar

  • local_dir_name (str) – Name of the local directory.

  • s3_bucket_name (Optional[str]) – S3 bucket name. Can also be provided by the DEFAULT_S3_BUCKET_NAME environment variable. One of the two must be specified. If both are specified this argument has priority.

  • overwrite (bool) – Overwrite already existing files.

  • s3_kwargs (Optional[Dict[str, Any]]) – Additional kwargs to be passed to the S3 client function S3.Bucket.download_file().

Returns:

Local directory where files are stored. In the example above, this would be y/x.

Return type:

str
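
The download direction mirrors the upload mapping. The sketch below shows where each S3 key would land locally under the rule described above; sketch_local_paths is a hypothetical helper, not part of ml_cloud_tools, and skipping keys outside the prefix is an assumption.

```python
import os
import posixpath
from typing import Dict, Iterable


def sketch_local_paths(
    s3_keys: Iterable[str], s3_dir_name: str, local_dir_name: str
) -> Dict[str, str]:
    """Hypothetical helper: map S3 keys below s3_dir_name to local file paths."""
    s3_dir = s3_dir_name.strip("/")
    prefix = s3_dir + "/"
    # The last component of the S3 directory ("x" for "a/x") is kept locally.
    base = posixpath.basename(s3_dir)

    mapping: Dict[str, str] = {}
    for key in s3_keys:
        normalized = key.lstrip("/")
        if not normalized.startswith(prefix):
            continue  # Assumption: keys outside the S3 directory are ignored.
        rel = normalized[len(prefix):]
        mapping[key] = os.path.join(local_dir_name, base, *rel.split("/"))
    return mapping
```

For s3_dir_name = "a/x" and local_dir_name = "y", the key a/x/file.txt maps to y/x/file.txt (with the local path separator of the operating system).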

ml_cloud_tools.s3.copy_s3_file_to_file(s3_file_name, local_file_name, s3_bucket_name=None, overwrite=True, s3_kwargs=None)

Copy a file from S3 to a file on the local file system.

Download the S3 file at s3_file_name from the S3 bucket s3_bucket_name to the local file local_file_name.

Parameters:
  • s3_file_name (str) – Name of the so-called key to download from. This is the part after the s3_bucket_name. Example: /foo/bar/baz.txt

  • local_file_name (str) – Local path to the file to download to. Example: /home/my_username/baz.txt

  • s3_bucket_name (Optional[str]) – S3 bucket name. Can also be provided by the DEFAULT_S3_BUCKET_NAME environment variable. One of the two must be specified. If both are specified this argument has priority.

  • overwrite (bool) – Overwrite local file.

  • s3_kwargs (Optional[Dict[str, Any]]) – Additional kwargs to be passed to the S3 client function S3.Bucket.download_file().

Return type:

None
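
A possible round trip using the two file functions, written against the signatures documented here. The wrapper round_trip and its download_to parameter are hypothetical; running it needs the ml_cloud_tools package, valid AWS credentials and an existing bucket, none of which this sketch sets up.

```python
import filecmp
from typing import Optional


def round_trip(
    local_file_name: str,
    s3_file_name: str,
    download_to: str,
    s3_bucket_name: Optional[str] = None,
) -> bool:
    """Hypothetical wrapper: upload a file, download it again, compare both copies."""
    # Imported here so the sketch can be loaded without the package installed.
    from ml_cloud_tools.s3 import copy_file_to_s3_file, copy_s3_file_to_file

    copy_file_to_s3_file(local_file_name, s3_file_name, s3_bucket_name=s3_bucket_name)
    copy_s3_file_to_file(
        s3_file_name,
        download_to,
        s3_bucket_name=s3_bucket_name,
        overwrite=True,  # replace download_to if it already exists
    )
    # Byte-wise comparison of the original and the downloaded copy.
    return filecmp.cmp(local_file_name, download_to, shallow=False)
```

If s3_bucket_name is left as None, the functions fall back to the DEFAULT_S3_BUCKET_NAME environment variable, as described in the parameter list.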

ml_cloud_tools.s3.list_s3_files(s3_dir_name, s3_bucket_name=None, s3_kwargs=None)

List the files in an S3 directory.

Parameters:
  • s3_dir_name (str) – Name of the S3 directory. This is the part after the s3_bucket_name. Example: /foo/bar

  • s3_bucket_name (Optional[str]) – S3 bucket name. Can also be provided by the DEFAULT_S3_BUCKET_NAME environment variable. One of the two must be specified. If both are specified this argument has priority.

  • s3_kwargs (Optional[Dict[str, Any]]) – Additional kwargs to be passed to the S3 client function S3.Client.list_objects_v2().

Returns:

List of files in s3_dir_name.

Return type:

List[str]
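
Since the listing is based on S3.Client.list_objects_v2(), the sketch below shows how file names can be read from a response of that call. The helper keys_from_response is hypothetical; whether list_s3_files returns full keys or names relative to s3_dir_name, and whether it follows pagination, is not stated here.

```python
from typing import Any, Dict, List


def keys_from_response(response: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: extract the object keys from a list_objects_v2 response."""
    # "Contents" is absent when no object matches the requested prefix.
    return [obj["Key"] for obj in response.get("Contents", [])]
```

A single list_objects_v2 call returns at most 1000 keys. When the response has IsTruncated set to True, further pages have to be requested with the NextContinuationToken value.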