Skip to content

Data Model

Glean data model files are intended to map to the options available in the Data Model user interface (see Data Models Overview).

Examples

The following is an example of a Data Model my_model.yml which loads test_table and defines some simple attributes.

glean: "1.0"
name: My Data Model
type: model
source:
  connectionName: Production BigQuery
  physicalName: test_table
cols:
  - id: metric_sum
    type: metric
    physicalName: value1
    name: SUM VALUE
    aggregate: sum
  - id: label1
    type: attribute
    physicalName: label1
  - id: secondary_event_date
    type: datetime
    physicalName: secondary_event_date
    primaryDate: true

The following Data Model uses a sql statement instead of an underlying table. It also uses sql to build a custom metric, daily_active_users:

glean: "1.0"
name: My Data Model
type: model
source:
  connectionName: Production BigQuery
  sql: select * from test_table
cols:
  - id: metric_sum
    type: metric
    physicalName: value1
    name: SUM VALUE
    aggregate: sum
  - id: daily_active_users
    type: metric
    name: Daily Active Users
    sql: COUNT(DISTINCT user_id) / COUNT(DISTINCT login_date)

Properties

  • glean (string - required): The Glean file format version.
  • type (string - required): The type of this resource. For data models, this is always "model".
  • name (string - required): The user-facing name for this Data Model.
  • source (object - required): A Data Source object.
  • cols (array - required): A list of Columns defining the metrics and attributes on this data model.

Data Source

The data source to read from for this model. All data sources require one of the following properties:

  • connectionName (string): The name of a database connection, as displayed in the Glean UI.
  • connectionId (string): The connection ID of a database connection, as displayed in the Glean UI.

and then one of the following sets of properties:

Table Source

Fetch data from an existing table.

  • schema (string): The name of the schema to use in the specified database.
  • physicalName (string): The name of the table to use in the specified database.

SQL Source

Fetch data from the results of a custom SQL query.

  • sql (string): The SQL statement used to fetch data for this model.

Columns

The columns (metrics, attributes, dates) of the Data Model. Each item in the array is an object in one of the following formats:

Date Column

Specifies a date column.

  • id (string): The persistent identifier for the column.
  • type (string): The type of this column, which in this case should always be 'datetime'.
  • physicalName (string): The name of the column as it appears in the underlying data source.
  • primaryDate (boolean): Flag indicating whether the column is the primary date. Exactly one datetime column should have this set to true.

Attribute Column

Specifies an attribute column.

  • id (string): The persistent identifier for the column.
  • type (string): The type of this column, which in this case should always be 'attribute'.
  • physicalName (string): The name of the column as it appears in the underlying data source.

Row Count Column

Specifies the 'row count' metric column.

  • id (string): The persistent identifier for this column.
  • type (string): The type of this column, which in this case should always be 'metric'.
  • name (string): The user-visible name for this column.
  • aggregate (string): The type of aggregation for this column, which in this case should always be 'row_count'.

Metric Column

Specifies a metric column that fetches data from a column in the underlying data source. Cannot contain additional properties.

  • id (string): The persistent identifier for the column.
  • type (string): The type of this column, which in this case should always be 'metric'.
  • physicalName (string): The name of the column as it appears in the underlying data source.
  • name (string): The user-visible name for this column.
  • aggregate (string): Specifies how to aggregate data from this column. Must be one of: ['count', 'count_distinct', 'sum', 'min', 'max', 'avg'].

SQL Metric Column

Specifies a metric column that fetches data using a custom SQL aggregation. Cannot contain additional properties.

  • id (string): The persistent identifier for the column.
  • type (string): The type of this column, which in this case should always be 'metric'.
  • name (string): The user-visible name for this column.
  • sql (string): SQL expression defining the column.