Testing
nhic_dbt uses dbt’s built-in test framework to validate data quality at every layer. Tests run in CI on every PR and block merges when they fail.
Schema tests
Schema tests are declared in YAML files (typically schema.yml) alongside the models they test. dbt ships four built-in generic tests:
| Test | What it checks |
|---|---|
| not_null | Column contains no null values |
| unique | All values in the column are distinct |
| accepted_values | Column only contains values from a defined list |
| relationships | Foreign key integrity: every value in a column exists in a referenced column |
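Of the four, relationships is the only one that must point at another model. A minimal sketch of how it might be declared, assuming a hypothetical stg_cemr__encounters model whose patient_id column should always match a row in stg_cemr__patients (the encounters model name is an illustration, not taken from this project):

```yaml
# Hypothetical example: stg_cemr__encounters is an assumed model name.
models:
  - name: stg_cemr__encounters
    columns:
      - name: patient_id
        tests:
          - relationships:
              # Parent model and column that every patient_id must exist in
              to: ref('stg_cemr__patients')
              field: patient_id
```

The test fails for any child row whose patient_id has no match in the parent column; null values in the child column are not reported, so pair it with not_null if the key is mandatory.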
Example declaration:
# models/staging/cemr/schema.yml
models:
  - name: stg_cemr__patients
    columns:
      - name: patient_id
        tests:
          - not_null
          - unique
      - name: national_id
        tests:
          - not_null
      - name: gender
        tests:
          - accepted_values:
              values: ['M', 'F', 'U']

Custom generic tests
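Project-specific tests that are reused across multiple models live in the tests/ directory as generic tests (dbt discovers generic test definitions under tests/generic/ or macros/). Each one is defined as a {% test %} block in a SQL file, then referenced by name in schema.yml files using the same YAML declaration pattern as the built-in tests.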
Use custom tests for domain-specific assertions — for example, validating that a date falls within an expected range, or that a facility code exists in the seeds reference table.
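As an illustration of the date-range case, the sketch below applies a custom generic test with arguments. The test name date_in_range, its min_date and max_date arguments, and the birth_date column are all hypothetical; they assume a matching {% test date_in_range(model, column_name, min_date, max_date) %} definition exists in the project.

```yaml
# Hypothetical example: date_in_range and birth_date are assumed names.
models:
  - name: stg_cemr__patients
    columns:
      - name: birth_date
        tests:
          - date_in_range:
              # Keyword arguments are passed through to the test definition
              min_date: '1900-01-01'
              max_date: '2100-01-01'
```

dbt passes the model and column name to the test automatically, so only the extra arguments need to appear in the YAML.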
Running tests
# Run all tests
dbt test
# Test a specific layer
dbt test --select staging
dbt test --select intermediate
# Test models with a specific tag
dbt test --select tag:dwh
# Test a single model
dbt test --select stg_cemr__patients

CI test gate
Tests run automatically in .github/workflows/dbt.yaml on every PR targeting main. The workflow runs dbt build (which executes seeds, models, and tests in dependency order) against the CI environment.
A failing test blocks the PR merge. Fix the underlying data or model logic — do not skip tests to unblock a merge.
dbt build is the recommended CI command because it respects the DAG and runs seeds before models and models before tests. Running dbt test alone on a fresh environment may fail if models have not been materialised first.
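For orientation, a workflow of this shape could look roughly like the sketch below. It is not a copy of the project's actual .github/workflows/dbt.yaml: the job name, Python version, adapter package (dbt-postgres), and the ci target name are all assumptions, and the connection credentials a real run needs are omitted.

```yaml
# Hypothetical sketch only; the real workflow file may differ.
name: dbt
on:
  pull_request:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      # Adapter is an assumption; install whichever one the project uses.
      - run: pip install dbt-core dbt-postgres
      # Install any packages listed in packages.yml
      - run: dbt deps
      # Seeds, models, and tests in DAG order; `ci` target name is assumed
      - run: dbt build --target ci
```

Because dbt build exits with a non-zero status when a test fails, the job fails and the merge can be blocked through a required status check on main.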