databricks_aws_utils.s3

S3Utils Objects

class S3Utils(DatabrickAWSUtils)

AWS S3 Utils.

spark.read.{csv,parquet,etc}("") throws an exception when the path does not exist in the S3 bucket. This module checks the prefix first and returns a boolean, so the Spark job can skip the read instead of failing.

Arguments:

  • aws_region str, optional - AWS region, defaults to us-east-1
  • iam_role str, optional - IAM role ARN; if specified, the role is assumed for the AWS API calls
  • aws_access_key_id str, optional - Temporary AWS Access Key Id
  • aws_secret_access_key str, optional - Temporary AWS Secret Access Key
  • aws_session_token str, optional - Temporary AWS Session Token
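In a Spark job, the boolean from check_path can guard the read so a missing prefix is handled gracefully. A minimal sketch; the helper name read_if_exists is illustrative and not part of the library, and the SparkSession and S3Utils instance are passed in rather than constructed here:

```python
def read_if_exists(spark, s3, uri: str, fmt: str = "parquet"):
    """Guard a Spark read with S3Utils.check_path.

    `spark` is a SparkSession and `s3` an S3Utils instance; both are
    injected so the helper stays easy to test. Name is illustrative.
    """
    if not s3.check_path(uri):
        return None  # prefix absent: skip the read instead of raising
    return spark.read.format(fmt).load(uri)
```

In a Databricks notebook this could be called as read_if_exists(spark, S3Utils(), "s3://bucket_name/folder").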

check_path

def check_path(uri: str) -> bool

Checks whether the given S3 URI exists in the S3 bucket.

Example:

from databricks_aws_utils.s3 import S3Utils

S3Utils().check_path("s3://bucket_name/folder")
  • Output - True

Arguments:

  • uri str - S3 URI, e.g. s3://bucket_name/folder

Returns:

  • bool - True if the URI exists in the S3 bucket, False otherwise
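Under the hood, an existence check like this typically lists at most one key under the prefix. A sketch of one plausible implementation against a boto3-style S3 client (list_objects_v2 and its KeyCount response field are real boto3 API surface; this is not necessarily the library's actual code):

```python
def prefix_exists(client, bucket: str, prefix: str) -> bool:
    """Return True when at least one object key starts with `prefix`.

    `client` is a boto3 S3 client (e.g. boto3.client("s3")); it is passed
    in so credentials and role assumption stay the caller's concern.
    """
    # Ask S3 for at most one key under the prefix; any hit means it exists.
    resp = client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return resp.get("KeyCount", 0) > 0
```

Requesting MaxKeys=1 keeps the call cheap: the answer only needs to distinguish "at least one key" from "none".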