Using the Glean CLI

The Glean CLI allows you to create Preview and Deploy Builds directly from your terminal or continuous integration system.

Quickstart

1. Create an Access Key

To use the CLI with your Glean project, you need an Access Key. Glean uses the Access Key to identify who you are and which resources you can access. Use a separate Access Key for each distinct user or service that uses the CLI.

Go to the Settings page using the link in the project dropdown.
Click Access Keys.
Click + New Access Key in the top right and follow the instructions. Your Access Key file will be downloaded automatically.
Move your Access Key to the default location where the CLI looks for it:

$ mkdir -p ~/.glean
$ mv ~/Downloads/glean_access_key.json ~/.glean/

⚠️ Once you navigate away from the page, you will not be able to re-download your Access Key. If you lose your Access Key, you will need to delete it and then create a new one.

2. Install Glean CLI

Confirm Python 3 is installed:
```
$ python3 --version
```

Install Glean CLI into a virtual environment:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install glean-cli

Confirm successful installation:

$ glean
Usage: glean [OPTIONS] COMMAND [ARGS]...
 
  A command-line interface for interacting with Glean.
 
Options:
...

Use the --help flag to see documentation about a specific command. For example:

$ glean preview --help
Usage: glean preview [OPTIONS] [FILEPATH]
 
  Validates resource configurations and generates a preview link.
 
Options:
...

Moving your Access Key

By default, the CLI expects your Access Key to be located at ~/.glean/glean_access_key.json. You can override this by:

Setting the GLEAN_CREDENTIALS_FILEPATH environment variable to a different filepath.
Using the --credentials-filepath command-line option to specify a different filepath.
Setting the GLEAN_PROJECT_ID, GLEAN_ACCESS_KEY_ID, and GLEAN_SECRET_ACCESS_KEY_TOKEN environment variables to the respective values stored in your Access Key file.
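
As a rough sketch, the first two options might look like this in a shell session. The path ~/secrets/glean_access_key.json is a hypothetical location, and placing --credentials-filepath after the preview subcommand is an assumption; glean --help shows where the option is actually accepted.

```shell
# Hypothetical alternative location for the Access Key file.
key_file="$HOME/secrets/glean_access_key.json"

# Option 1: set the environment variable for the rest of the shell session.
export GLEAN_CREDENTIALS_FILEPATH="$key_file"

# Option 2: pass the path on the command line instead.
# Option placement after the subcommand is an assumption, not confirmed usage.
if command -v glean >/dev/null 2>&1; then
  glean preview --credentials-filepath "$key_file"
fi
```

The third option, separate environment variables for the project ID, key ID, and secret token, is the one used in the continuous integration example later on this page.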

Using environment variables

You can use environment variables to dynamically populate Glean configuration files with different values at runtime.

When creating a Build using local files, the CLI will replace placeholders of the form ${ENV_VAR_NAME} with the corresponding environment variable. For example, if your model file contains:

glean: "1.0"
name: My Data Model
source:
  connectionName: ${DATABASE_CONNECTION_NAME}
  physicalName: test_table

...then you can preview a Build against your dev database by running:

$ DATABASE_CONNECTION_NAME=dev glean preview

Environment variable substitution is not yet supported for Builds that are triggered via a git revision.
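
Because a placeholder without a matching environment variable is easy to miss, it can help to check a file before previewing it. The helper below is a minimal sketch and not part of the Glean CLI: missing_placeholders is a hypothetical function name, and it assumes placeholders use only letters, digits, and underscores.

```shell
# Print the name of every ${VAR} placeholder in a file that has no value
# in the exported environment. Returns non-zero if any are missing.
missing_placeholders() {
  file="$1"
  missing=0
  # Extract placeholders such as ${DATABASE_CONNECTION_NAME}, strip the
  # "${" and "}" characters, and de-duplicate the resulting names.
  for name in $(grep -o '\${[A-Za-z_][A-Za-z0-9_]*}' "$file" | tr -d '${}' | sort -u); do
    # printenv only sees exported variables, which is what a child
    # process such as the CLI would see as well.
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, running missing_placeholders model.yml && glean preview would start the preview only when every placeholder has a value; model.yml is a hypothetical filename.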

Recommended workflows

The CLI allows you to integrate Glean into your existing development and deployment process. As an example, here is a typical workflow that we use when making changes to Glean resources or upstream data:

Store your Glean configuration files in a git repo alongside your data pipeline code.
Run your pipelines to populate test data into a separate database or schema.
Adjust your Glean configuration files as necessary to reflect your intended changes.
Run glean preview to create a Preview Build, adjusting the source section of your model(s) to point at your test dataset.
Make any necessary adjustments in the Preview and re-export your new configuration files.
Send a Pull Request for all the pending changes, including a link to your Preview Build.
Merge the Pull Request into your main git branch.
Run glean deploy --git-revision=main to deploy your new Glean resources to your project.
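
The preview and deploy steps of this workflow can be wrapped in small shell functions, sketched below. The function names are hypothetical, and the sketch assumes your model files use the DATABASE_CONNECTION_NAME placeholder from the earlier example to select the test dataset; only glean preview and glean deploy --git-revision=main come from the workflow itself.

```shell
# Preview step: build a Preview from local files against a test dataset.
# The first argument is the connection name to substitute; "dev" is an
# illustrative default.
preview_against_test_data() {
  DATABASE_CONNECTION_NAME="${1:-dev}" glean preview
}

# Deploy step: after the pull request is merged, deploy from the main branch.
deploy_main() {
  glean deploy --git-revision=main
}
```

Keeping the two steps separate mirrors the workflow: previews run from local files as often as needed, while deploys happen only from the merged main branch.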

Continuous integration

You can invoke the Glean CLI in a continuous integration system to automatically generate previews or deploy your project.

For example, if you use GitHub, the following GitHub Actions workflow will generate a Preview Build of the local glean directory whenever you open a pull request:

name: glean-preview
on: [pull_request]
jobs:
  create-glean-preview:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: pip install glean-cli
      - run: cd glean && glean preview
    env:
      GLEAN_PROJECT_ID: ${{ secrets.GLEAN_PROJECT_ID_PROD }}
      GLEAN_ACCESS_KEY_ID: ${{ secrets.GLEAN_ACCESS_KEY_ID_PROD }}
      GLEAN_SECRET_ACCESS_KEY_TOKEN: ${{ secrets.GLEAN_SECRET_ACCESS_KEY_TOKEN_PROD }}

Glean Pull

glean pull is a command-line utility for moving resources from the web UI into DataOps or pulling down changes. When you invoke glean pull with no arguments, it will fetch all resources from your Glean project and save their configurations to your working directory. If you already have local files that correspond to resources in Glean, they will be overwritten.

If you provide a GRN as an argument, glean pull will restrict itself to only the specified resource and all resources it depends on. For example, glean pull sv:the-sv-id will retrieve the resource configuration for the specified saved exploration, plus any model(s) it depends on.

⚠️ When developing locally, you are not required to specify GRNs in Glean resource configuration files. However, when running glean deploy, GRNs are assigned based on the relative path from the working directory to the resource configuration file. Subsequent uses of glean pull should therefore be run from the same working directory.

Note that glean pull is a potentially destructive operation that will change files in your working directory. Make sure your git status is clean (no uncommitted changes in the working directory) before pulling resources.

glean pull will not remove configuration files when the corresponding resource is deleted from the web UI.
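
Since glean pull can overwrite local files, one way to guard against losing work is to refuse to pull when git reports pending changes. The wrapper below is a minimal sketch; safe_glean_pull is a hypothetical name and not part of the Glean CLI.

```shell
# Run `glean pull` only when the git working tree has no uncommitted or
# untracked changes. Any arguments (for example a GRN) are passed through.
safe_glean_pull() {
  # `git status --porcelain` prints one line per changed or untracked file.
  # It fails outside a git repository, in which case the pull is skipped.
  changes="$(git status --porcelain)" || return 1
  if [ -n "$changes" ]; then
    echo "Working tree is not clean; commit or stash your changes first." >&2
    return 1
  fi
  glean pull "$@"
}
```

For example, safe_glean_pull sv:the-sv-id would pull the saved exploration from the earlier example only if the working directory is clean.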

dbt Utility

glean preview and glean deploy can optionally accept a flag, --dbt, which runs dbt parse and then uses the resulting manifest for the corresponding Glean build. See the dbt integration documentation for more information about the Glean dbt integration.
