Using the Glean CLI
The Glean CLI allows you to create Preview and Deploy Builds directly from your terminal or continuous integration system.
Quickstart
1. Create an Access Key
To use the CLI with your Glean project, you need an Access Key. An Access Key is used by Glean to identify who you are and what resources you have access to. You should use a separate Access Key for each distinct user or service using the CLI.
- Go to the Settings page using the link in the project dropdown.
- Click on Access Keys.
- Click + New Access Key in the top right and follow the instructions. Your Access Key file will be downloaded automatically.
- Move your Access Key to the default location where the CLI will look for it:
$ mkdir -p ~/.glean
$ mv ~/Downloads/glean_access_key.json ~/.glean/
Once you navigate away from the page, you will not be able to re-download your Access Key. If you lose your Access Key, you will need to delete it and then create a new one.
2. Install Glean CLI
- Confirm Python 3 is installed:
$ python3 --version
- Install Glean CLI into a virtual environment:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install glean-cli
- Confirm successful installation:
$ glean
Usage: glean [OPTIONS] COMMAND [ARGS]...
A command-line interface for interacting with Glean.
Options:
...
Use the --help flag to see documentation about a specific command. For example:
$ glean preview --help
Usage: glean preview [OPTIONS] [FILEPATH]
Validates resource configurations and generates a preview link.
Options:
...
Moving your Access Key
By default, the CLI expects your Access Key to be located at ~/.glean/glean_access_key.json. You can override this by:
- Setting the GLEAN_CREDENTIALS_FILEPATH environment variable to a different filepath
- Using the --credentials-filepath command-line option to use a different filepath
- Setting the GLEAN_PROJECT_ID, GLEAN_ACCESS_KEY_ID, and GLEAN_SECRET_ACCESS_KEY_TOKEN environment variables to the respective values stored in your Access Key file
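Before running the CLI, it can help to confirm that a key file actually exists where you expect it. The following Python sketch is not part of the Glean CLI; it only mirrors the default location and the GLEAN_CREDENTIALS_FILEPATH override described above. The helper name expected_key_path is hypothetical, and the sketch does not cover the --credentials-filepath option or the three individual environment variables.

```python
import os
from pathlib import Path

# Default location documented above.
DEFAULT_PATH = Path.home() / ".glean" / "glean_access_key.json"


def expected_key_path(env=None):
    """Return the key file path implied by the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    override = env.get("GLEAN_CREDENTIALS_FILEPATH")
    # Fall back to the default location when no override is set.
    return Path(override).expanduser() if override else DEFAULT_PATH


path = expected_key_path()
print(f"{path}: {'found' if path.is_file() else 'missing'}")
```

Running the script prints the path it checked and whether a file is present there.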
Using environment variables
You can use environment variables to dynamically populate Glean configuration files with different values at runtime.
When creating a Build using local files, the CLI will replace placeholders of the form ${ENV_VAR_NAME} with the corresponding environment variable. For example, if your model file contains:
glean: "1.0"
name: My Data Model
source:
  connectionName: ${DATABASE_CONNECTION_NAME}
  physicalName: test_table
...then you can preview a Build against your dev database by running:
$ DATABASE_CONNECTION_NAME=dev glean preview
Environment variable substitution is not yet supported for Builds that are triggered via a git revision.
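To make the substitution rule concrete, here is a minimal Python sketch of replacing ${ENV_VAR_NAME} placeholders in a configuration string. It illustrates the behavior described above and is not the CLI's own implementation. The function name substitute_env, the set of characters allowed in a variable name, and the choice to raise an error for an unset variable are all assumptions made for this example.

```python
import os
import re

# Matches ${NAME}; the allowed name characters are an assumption of this sketch.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(text, env=None):
    """Replace each ${NAME} in text with env[NAME] (hypothetical helper)."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            # Assumption: fail loudly rather than leave the placeholder in place.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return PLACEHOLDER.sub(replace, text)


model = (
    "source:\n"
    "  connectionName: ${DATABASE_CONNECTION_NAME}\n"
    "  physicalName: test_table\n"
)
print(substitute_env(model, {"DATABASE_CONNECTION_NAME": "dev"}))
```

With DATABASE_CONNECTION_NAME set to dev, the connectionName line in the printed output reads "connectionName: dev", matching the preview command shown above.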
Recommended workflows
The CLI allows you to integrate Glean into your existing development and deployment process. As an example, here is a typical workflow that we use when making changes to Glean resources or upstream data:
- Store your Glean configuration files in a git repo alongside your data pipeline code.
- Run your pipelines to populate test data into a separate database or schema.
- Adjust your Glean configuration files as necessary to reflect your intended changes.
- Run glean preview to create a Preview Build, adjusting the source section of your model(s) to point at your test dataset.
- Make any necessary adjustments in the Preview and re-export your new configuration files.
- Send a Pull Request for all the pending changes, including a link to your Preview Build.
- Merge the Pull Request into your main git branch.
- Run glean deploy --git-revision=main to deploy your new Glean resources to your project.
Continuous integration
You can invoke the Glean CLI in a continuous integration system to automatically generate previews or deploy your project.
For example, if you use GitHub, the following GitHub Actions workflow will generate a Preview Build of the local glean directory whenever you send a pull request:
name: glean-preview
on: [pull_request]
jobs:
  create-glean-preview:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: pip install glean-cli
      - run: cd glean && glean preview
        env:
          GLEAN_PROJECT_ID: ${{ secrets.GLEAN_PROJECT_ID_PROD }}
          GLEAN_ACCESS_KEY_ID: ${{ secrets.GLEAN_ACCESS_KEY_ID_PROD }}
          GLEAN_SECRET_ACCESS_KEY_TOKEN: ${{ secrets.GLEAN_SECRET_ACCESS_KEY_TOKEN_PROD }}
Glean Pull
glean pull is a command-line utility for moving resources from the web UI into DataOps or pulling down changes. When you invoke glean pull with no arguments, it will fetch all resources from your Glean project and save their configurations to your working directory. If you already have local files that correspond to resources in Glean, they will be overwritten.
If you provide a GRN as an argument, glean pull will restrict itself to only the specified resource and all resources it depends on. For example, glean pull sv:the-sv-id will retrieve the resource configuration for the specified saved exploration, plus any model(s) it depends on.
When developing locally, you are not required to specify GRNs in Glean resource configuration files. However, when running glean deploy, GRNs are assigned based on the relative path from the working directory to the resource configuration file. Subsequent uses of glean pull should therefore be run from the same working directory.
Note that glean pull is a potentially destructive operation that will change files in your working directory. Make sure your git status is clean (no uncommitted changes in the working directory) before pulling resources.
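The clean-status check recommended above can be automated. The following Python sketch wraps glean pull so that it only runs when git status --porcelain reports nothing. The function names is_clean and pull_if_clean are hypothetical, and treating untracked files as "not clean" is a choice made in this sketch rather than a Glean requirement.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """Return True when `git status --porcelain` printed nothing."""
    return porcelain_output.strip() == ""


def pull_if_clean() -> None:
    """Run `glean pull` only if the git working directory is clean."""
    # `git status --porcelain` prints one line per changed or untracked file.
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    if not is_clean(status.stdout):
        raise SystemExit(
            "Working directory has uncommitted changes; commit or stash them first."
        )
    # With no arguments, glean pull fetches every resource in the project.
    subprocess.run(["glean", "pull"], check=True)
```

Call pull_if_clean() from the same working directory you deploy from, so that resource paths line up with the GRNs assigned at deploy time.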
glean pull will not remove configuration files when the corresponding resource is deleted from the web UI.
dbt Utility
glean preview and glean deploy can optionally accept a flag, --dbt, which runs dbt parse and then uses the resulting manifest for the corresponding Glean build. See the dbt integration documentation for more information about the Glean dbt integration.