Seeds & Reference Data

dbt seeds are static CSV files that dbt loads into the database as ordinary tables. nhic_dbt ships 40+ seed CSVs in the seeds/ directory covering facility mappings, administrative classifications, code lookups, and other reference data that changes infrequently.

What seeds are for

Seeds are the right choice when:

  • The data is static or rarely changes (every update requires a new commit and release)
  • The data is small enough to commit to git as a CSV
  • The data is authoritative reference data — code tables, classifications, approved facility lists

Seeds are the wrong choice when:

  • The data is sourced from an external system and changes frequently — use a staging model instead
  • The data is large (thousands of rows or more) — consider a staging model backed by a source table

Loading seeds

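    # Load all seeds
    dbt seed

    # Load a specific seed by name
    dbt seed --select facility_mapping

    # Full refresh (drop and recreate the seed tables, e.g. after a column change;
    # a plain `dbt seed` only truncates and reloads existing tables)
    dbt seed --full-refresh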

Seeds are loaded into the staging schema by default (configured in dbt_project.yml).
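
As a rough sketch of what that configuration might look like, the fragment below sets a custom schema for all seeds. The project key nhic_dbt and the value staging are assumptions inferred from the names used on this page; check the actual dbt_project.yml for the real values.

```yaml
# dbt_project.yml (illustrative fragment, not the project's actual config)
seeds:
  nhic_dbt:            # assumed project name; must match `name:` in dbt_project.yml
    +schema: staging   # custom schema applied to every seed in the project
```

With dbt's default schema naming, a custom schema is appended to the target schema (for example `<target_schema>_staging`) unless the project overrides the generate_schema_name macro.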

Using seeds in models

Reference a seed in a model using {{ ref() }} just like any other dbt model:

    -- models/intermediate/int_visits__enriched.sql
    select
        v.visit_id,
        v.facility_code,
        f.facility_name,
        f.district,
        f.province
    from {{ ref('stg_cemr__visits') }} v
    left join {{ ref('facility_mapping') }} f
        on v.facility_code = f.facility_code

dbt tracks this dependency in the DAG, so dbt build loads a seed before building any model that references it. Note that dbt run does not load seeds; run dbt seed (or dbt build) first so the seed tables exist.

Adding or updating a seed

  1. Add or edit the CSV file in seeds/.
  2. If adding a new seed, declare column types in dbt_project.yml under seeds: to prevent dbt from inferring incorrect types.
  3. Run dbt seed --select <seed_name> locally to verify it loads correctly.
  4. Open a PR — CI will validate the seed as part of dbt build.
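
Step 2 could look roughly like the fragment below. It is a sketch only: the project key nhic_dbt and the column types are assumptions, and valid type names depend on the warehouse. Declaring a code column as text avoids, for example, leading zeros being dropped when a value such as 00123 is inferred as a number.

```yaml
# dbt_project.yml (illustrative fragment, not the project's actual config)
seeds:
  nhic_dbt:                          # assumed project name
    facility_mapping:                # seed name = CSV file name without .csv
      +column_types:
        facility_code: varchar(16)   # assumed type; keeps codes as text
        district: varchar(64)        # assumed type
```

If a column type is changed for a seed that has already been loaded, the existing table usually needs dbt seed --full-refresh --select <seed_name> so it is recreated with the new types.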

Editing a seed CSV changes the data that every downstream model sees after the next dbt seed or dbt build. For seeds that are authoritative (e.g., the approved facility list), treat the CSV as the source of truth and coordinate with the data team before modifying it.
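
One way to guard an authoritative seed against accidental edits is to attach tests to it in a properties file, so that dbt build fails in CI if, say, a facility code is duplicated or blank. A minimal sketch, assuming a properties file named seeds/_seeds.yml (hypothetical) and that facility_code is meant to be unique:

```yaml
# seeds/_seeds.yml (hypothetical file name; any .yml file under seeds/ works)
version: 2

seeds:
  - name: facility_mapping
    columns:
      - name: facility_code
        tests:          # named `data_tests` in dbt 1.8+, where `tests` remains an alias
          - unique      # assumption: one row per facility code
          - not_null
```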
