ConfiguredAssetAWSGlueDataCatalogDataConnector

class great_expectations.datasource.data_connector.ConfiguredAssetAWSGlueDataCatalogDataConnector(name: str, datasource_name: str, execution_engine: Optional[great_expectations.execution_engine.execution_engine.ExecutionEngine] = None, catalog_id: Optional[str] = None, partitions: Optional[List[str]] = None, assets: Optional[Dict[str, dict]] = None, boto3_options: Optional[dict] = None, batch_spec_passthrough: Optional[dict] = None, id: Optional[str] = None)

A Configured Asset Data Connector used to connect to data through an AWS Glue Data Catalog.

Being a Configured Asset Data Connector, it requires an explicit list of each Data Asset it can connect to. While this allows for fine-grained control over which Data Assets may be accessed, it requires more setup.

Parameters:
  • name – The name of the Data Connector.

  • datasource_name – The name of this Data Connector’s Datasource.

  • execution_engine – The Execution Engine object to be used by this Data Connector to read the data.

  • catalog_id – The catalog ID from which to retrieve data. If none is provided, the AWS account ID is used by default. Make sure you use the same catalog ID as the one configured in your Spark session.

  • partitions – A list of partition keys applied to all Data Assets. Partitions defined in a Data Asset's own config override the partitions defined at the connector level.

  • assets – A mapping of Data Asset names to their configuration.

  • boto3_options – Options passed to the boto3 library.

  • batch_spec_passthrough – Dictionary with keys that will be added directly to the batch spec.

  • id – The unique identifier for this Data Connector used when running in cloud mode.
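The override rule for partitions can be illustrated with a small, standalone sketch. It does not import Great Expectations; the dictionary only mirrors the parameters listed above. The per-asset keys database_name and table_name, the asset names, the placeholder catalog ID, and the helper effective_partitions are assumptions made for illustration, not part of the documented signature.

```python
from typing import List

# Hypothetical connector configuration mirroring the parameters above.
# "database_name" and "table_name" are assumed per-asset keys; all values
# are placeholders.
connector_config = {
    "class_name": "ConfiguredAssetAWSGlueDataCatalogDataConnector",
    "catalog_id": "123456789012",  # placeholder AWS account ID
    "partitions": ["year", "month"],  # connector-level default
    "assets": {
        "db_test.tb_events": {
            "database_name": "db_test",
            "table_name": "tb_events",
            # no "partitions" key: falls back to the connector level
        },
        "db_test.tb_users": {
            "database_name": "db_test",
            "table_name": "tb_users",
            "partitions": ["country"],  # overrides the connector level
        },
    },
}


def effective_partitions(config: dict, asset_name: str) -> List[str]:
    """Illustrative helper (not a library function): return the asset's own
    partitions when it defines them, otherwise the connector-level ones."""
    asset = config["assets"][asset_name]
    if asset.get("partitions") is not None:
        return list(asset["partitions"])
    return list(config.get("partitions") or [])


events_partitions = effective_partitions(connector_config, "db_test.tb_events")
users_partitions = effective_partitions(connector_config, "db_test.tb_users")
```

Here tb_events inherits the connector-level partitions, while tb_users uses only the keys from its own config. How the library treats an empty asset-level list is not stated above, so the sketch treats it as an explicit override.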

get_available_data_asset_names() → List[str]

Return the list of asset names known by this DataConnector.

Returns:

A list of available Data Asset names.
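Because a Configured Asset Data Connector only knows the Data Assets listed explicitly in assets, the returned names are expected to correspond to the keys of that mapping. The stand-in below is a sketch of that relationship under this assumption; available_asset_names is a hypothetical function, not the library method, and the ordering of the real return value is not specified here.

```python
from typing import Dict, List


def available_asset_names(assets: Dict[str, dict]) -> List[str]:
    """Hypothetical stand-in: the known asset names are the keys of the
    explicitly configured `assets` mapping."""
    return list(assets.keys())


# Placeholder assets mapping; names and inner keys are illustrative only.
example_assets = {
    "db_test.tb_events": {"database_name": "db_test", "table_name": "tb_events"},
    "db_test.tb_users": {"database_name": "db_test", "table_name": "tb_users"},
}

names = available_asset_names(example_assets)
```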