databricks_aws_utils.delta_table

DeltaTableUtils Objects

class DeltaTableUtils(DatabrickAWSUtils)

Delta Table AWS Integration Utils.

This Delta Table integration only works when Databricks uses AWS Glue as the metastore.

Arguments:

  • spark SparkSession - spark session
  • name str - Delta table name; must include the database (e.g. <database>.<table>)
  • aws_region str, optional - AWS region, default us-east-1
  • iam_role str, optional - IAM Role ARN, if specified assumes the IAM role to perform the AWS API calls
  • aws_access_key_id str, optional - Temporary AWS Access Key Id
  • aws_secret_access_key str, optional - Temporary AWS Secret Access Key
  • aws_session_token str, optional - Temporary AWS Session Token

Features:

  • Convert a Databricks Delta table to the AWS Glue format using symlink_format_manifest, so that AWS Athena or Presto can query it externally

to_athena

def to_athena(target_database: str, target_table: str) -> None

Converts a Delta table to an external table that AWS Athena or Presto can read, using symlink_format_manifest

Presto Integration full documentation: https://docs.databricks.com/delta/presto-integration.html#limitations

Arguments:

  • delta_table str - delta table name
  • target_database str - external database name
  • target_table str - external table name
  • target_table_description str, optional - external table description
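
As a rough illustration of the symlink-manifest approach, the sketch below builds the Delta Lake `GENERATE symlink_format_manifest` statement and the manifest folder that an Athena or Presto external table would point at. The helper names (`manifest_sql`, `manifest_location`) and the example S3 path are hypothetical and are not part of `DeltaTableUtils`; how `to_athena` does this internally may differ.

```python
def manifest_sql(delta_location: str) -> str:
    """Build the Delta Lake SQL statement that writes the symlink manifest."""
    return f"GENERATE symlink_format_manifest FOR TABLE delta.`{delta_location}`"


def manifest_location(delta_location: str) -> str:
    """Return the folder under the table path where the manifest files are written."""
    return delta_location.rstrip("/") + "/_symlink_format_manifest/"


# Hypothetical table location, for illustration only.
location = "s3://example-bucket/warehouse/events"
print(manifest_sql(location))
print(manifest_location(location))
```

On a cluster, the generated statement would typically be passed to `spark.sql(...)`, and the external table's location would be set to the manifest folder rather than to the Delta table path itself.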

get_table_name

def get_table_name() -> str

Get delta table name without the database name

Returns:

  • str - table name

get_database_name

def get_database_name() -> str

Get database name from the delta table

Returns:

  • str - database name
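
The relationship between the qualified name and its two parts can be sketched as follows. This is a minimal illustration, assuming the name is split on the first dot; `split_qualified_name` is a hypothetical helper, not a function of this library.

```python
from typing import Tuple


def split_qualified_name(name: str) -> Tuple[str, str]:
    """Split '<database>.<table>' into (database, table)."""
    database, sep, table = name.partition(".")
    if not sep or not database or not table:
        # The name must include the database, as required by the constructor.
        raise ValueError(f"expected '<database>.<table>', got {name!r}")
    return database, table


database, table = split_qualified_name("sales.orders")
print(database)  # the part get_database_name would be expected to return
print(table)     # the part get_table_name would be expected to return
```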

get_location

def get_location() -> str

Get delta table location

Returns:

  • str - delta table location

schema_to_glue

def schema_to_glue() -> Tuple[List[dict], List[dict]]

Extracts the Delta table schema and returns it in the AWS Glue format

Returns:

  • Tuple[List[dict], List[dict]] - columns and partitions
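
To give a sense of the shape of such a result, the sketch below separates a list of fields into regular columns and partition columns. It assumes each entry follows the AWS Glue `Column` structure with `Name` and `Type` keys; the exact keys returned by `schema_to_glue` are not stated here, and `to_glue_columns` is a hypothetical helper, not the library's implementation.

```python
from typing import List, Tuple


def to_glue_columns(
    fields: List[Tuple[str, str]], partition_names: List[str]
) -> Tuple[List[dict], List[dict]]:
    """Split (name, type) pairs into Glue-style column and partition lists."""
    columns: List[dict] = []
    partitions: List[dict] = []
    for name, type_name in fields:
        entry = {"Name": name, "Type": type_name}
        # Partition columns are listed separately from data columns.
        if name in partition_names:
            partitions.append(entry)
        else:
            columns.append(entry)
    return columns, partitions


# Hypothetical schema, for illustration only.
fields = [("id", "bigint"), ("amount", "double"), ("event_date", "date")]
columns, partitions = to_glue_columns(fields, ["event_date"])
print(columns)
print(partitions)
```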