Deploying Great Expectations in a hosted environment without file system or CLI

If you follow the steps of the Getting Started tutorial, you create a standard deployment of Great Expectations. By default, this relies on two components:

  1. The Great Expectations CLI to initialize a Data Context, create Expectation Suites, add Datasources, etc.
  2. The great_expectations.yml file to configure your Data Context, for example to point at different Stores for Validation Results.

However, these components might not be available in hosted environments such as Databricks, AWS EMR, and Google Cloud Composer. This workflow guide outlines the main steps required to use Great Expectations in such an environment.

Step 1: Configure your Data Context

Instead of using the Great Expectations CLI, you can create a Data Context directly in code. Your Data Context also manages the following components described in this guide:

  • Datasources to connect to data
  • Stores to save Expectations and Validation Results
  • Data Docs hosting
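
As a rough illustration of creating a Data Context in code, the sketch below assumes a pre-1.0 release of Great Expectations that still exposes `BaseDataContext`, `DataContextConfig`, and `FilesystemStoreBackendDefaults`; newer releases changed this API. The function name `create_in_code_context` and the root directory are hypothetical. The library is imported inside the function so that the definition itself loads even where Great Expectations is not yet installed.

```python
def create_in_code_context(root_directory: str):
    """Build a Data Context from Python objects instead of great_expectations.yml.

    Assumes a pre-1.0 Great Expectations release (V3 API).
    """
    # Imported lazily so that defining this function does not require the library.
    from great_expectations.data_context import BaseDataContext
    from great_expectations.data_context.types.base import (
        DataContextConfig,
        FilesystemStoreBackendDefaults,
    )

    # Keep Expectations, Validation Results, and Data Docs under one
    # directory that the hosted environment is allowed to write to.
    project_config = DataContextConfig(
        store_backend_defaults=FilesystemStoreBackendDefaults(
            root_directory=root_directory
        ),
    )
    return BaseDataContext(project_config=project_config)


# Hypothetical usage; replace the path with a location you control:
# context = create_in_code_context("/path/to/great_expectations_root")
```

Using the store backend defaults keeps the example short; each Store can also be configured individually if the defaults do not fit your environment.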

The documentation includes a guide that gives an overview of creating an in-code Data Context, including defaults that help you set one up more quickly for common configurations.

It also includes guides with examples for each environment we have tested.

Step 2: Create Expectation Suites and add Expectations

If you want to create an Expectation Suite in your environment without using the CLI, you can follow this guide from step 5 onward to add a Datasource and an Expectation Suite: How to connect to a PostgreSQL database

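You can then add Expectations to your Suite one at a time, as in this example:

# `validator` is the Validator obtained by following the guide above.
validator.expect_column_values_to_not_be_null("my_column")
# Save the Suite, keeping Expectations that failed on the current batch.
validator.save_expectation_suite(discard_failed_expectations=False)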

To load the Suite at a later time, make sure that you have an Expectation Store configured.
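
One way to express such a Store is sketched below as a plain Python dictionary that mirrors the `stores` section of great_expectations.yml. It assumes the `ExpectationsStore` and `TupleS3StoreBackend` classes that ship with Great Expectations; the helper name, the store name, the bucket, and the prefix are placeholders rather than values from this guide.

```python
def expectations_store_config(bucket: str, prefix: str = "expectations") -> dict:
    """Return a `stores` entry that keeps Expectation Suites in an S3 bucket."""
    return {
        # Hypothetical store name; any unique key works.
        "expectations_S3_store": {
            "class_name": "ExpectationsStore",
            "store_backend": {
                "class_name": "TupleS3StoreBackend",
                "bucket": bucket,
                "prefix": prefix,
            },
        }
    }


# Placeholder bucket name, for illustration only.
stores = expectations_store_config(bucket="my-bucket")
```

In an in-code configuration, this dictionary would be supplied under `stores`, with `expectations_store_name` set to the matching key.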

Step 3: Run validation

In order to use an Expectation Suite you've created to validate data, follow this guide: How to validate data without a Checkpoint
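
For a sense of what validating without a Checkpoint can look like, the sketch below assumes you already hold a `validator` with a batch and an Expectation Suite loaded, and that `validator.validate()` returns a result object offering `to_json_dict()`, as in pre-1.0 releases. The helper `failed_expectation_types` is hypothetical and only inspects the dictionary form of the result; the sample dictionary is hand-written, not real output.

```python
def failed_expectation_types(result: dict) -> list:
    """List the expectation types that did not pass in a validation result dict."""
    return [
        item["expectation_config"]["expectation_type"]
        for item in result.get("results", [])
        if not item.get("success", False)
    ]


# Hypothetical usage with a real Validator:
# result = validator.validate().to_json_dict()
# if not result["success"]:
#     print(failed_expectation_types(result))

# Hand-written stand-in for a validation result, for illustration only.
sample_result = {
    "success": False,
    "results": [
        {
            "success": False,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_not_be_null",
                "kwargs": {"column": "my_column"},
            },
        },
        {
            "success": True,
            "expectation_config": {
                "expectation_type": "expect_column_to_exist",
                "kwargs": {"column": "my_column"},
            },
        },
    ],
}
failed = failed_expectation_types(sample_result)
```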

Step 4: Use Data Docs

Finally, if you would like to build and view Data Docs in your environment, please follow the guides for configuring Data Docs: Options for hosting Data Docs
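
A Data Docs site can also be described in code. The dictionary below mirrors a `data_docs_sites` entry as it appears in a default great_expectations.yml, assuming the `SiteBuilder`, `TupleFilesystemStoreBackend`, and `DefaultSiteIndexBuilder` class names; the helper name, the site name, and the directory are placeholders.

```python
def local_data_docs_site(base_directory: str) -> dict:
    """Return a `data_docs_sites` entry that renders Data Docs to a directory."""
    return {
        # Hypothetical site name; any unique key works.
        "local_site": {
            "class_name": "SiteBuilder",
            "store_backend": {
                "class_name": "TupleFilesystemStoreBackend",
                "base_directory": base_directory,
            },
            "site_index_builder": {"class_name": "DefaultSiteIndexBuilder"},
        }
    }


# Placeholder directory, for illustration only.
data_docs_sites = local_data_docs_site("/path/to/data_docs")

# With a Data Context built from a configuration that includes this entry,
# calling context.build_data_docs() renders the site.
```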

Additional notes

If you have successfully deployed Great Expectations in a hosted environment other than the ones listed above, we would love to hear from you. Please reach out to us on Slack.