5 dbt Test Patterns Every Analytics Engineer Should Know
Kenji Watanabe
·
October 24, 2023
·
8 min read
Beyond the basics
Every analytics engineer knows the drill: add a `unique` constraint to a primary key and a `not_null` check to required fields. But generic dbt tests often leave dangerous gaps. They validate *structure*, not *business logic*. They test *history*, but they don't warn you about *future* drift.
To build truly robust data pipelines, you need to move beyond the default configuration. Here are five patterns—ranging from seed-table management to continuous monitoring—that will make your dbt tests smarter and your production data safer.
Patterns that catch the edge cases
1. Use `accepted_values` with dynamic seeds
Hardcoding lists of accepted values in your tests is a maintenance nightmare. If you need to add a new state to your US address table, you have to find and update every single test file. The solution? Use a seed file.
Create a `seeds/us_states.yml` and populate it with your static reference data. Then, use `dbt_utils.get_column_values` to reference that seed dynamically.
-- tests/stg_addresses__state_in_us.yml
version: 2
tests:
- dbt_utils.expression_is_true:
name: stg_addresses_state_in_us
model: stg_addresses
expression: "state in ({{ dbt_utils.get_column_values('seed', 'us_states', column='name') }})"
-- tests/fct_orders__related_stg_orders.yml
version: 2
tests:
- dbt_utils.relationships_where:
name: related_stg_orders
from_table:
name: stg_orders
schema: "{{ target.schema }}"
field: user_id
to_table:
name: stg_orders
schema: "{{ target.schema }}"
to_field: user_id
relation_condition: "created_at > '2023-01-01'"
2. Test relationships across schema boundaries
It is common practice to separate staging and mart layers. However, if a staging model accidentally refers to the wrong schema, your tests will pass even though the data is broken. Use `ref()` with schema qualification to ensure cross-schema integrity.
This pattern ensures that your `stg_users` are actually linking to the `marts_core_users` table, not a `dev_users` table that was mistakenly created.
3. Custom singular tests for business logic
Generic tests can't capture complex business rules. For example, a "duplicate invoice" rule requires checking specific columns. You can solve this by creating a custom test file and using `dbt test --select`.
This test checks that for any given `invoice_id`, there is only one record with a status of 'void'.
-- tests/fct_invoices__no_duplicate_void_invoices.sql
select *
from {{ ref('fct_invoices') }}
group by 1, 2, 3, 4, 5, 6
having count(*) > 1 and status = 'void'
4. Test freshness at the model level
`dbt_source freshness` tells you when your raw data is late, but it doesn't tell you if your *derived* models are late. If your staging models are delayed, your daily dashboards will be outdated, even if the source data is technically "fresh".
Implement model-level freshness checks by querying the `information_schema` or leveraging custom dbt macros to compare model run times against a threshold.
Quick check:
If `stg_users` ran at 8:15 AM but the source table was updated at 8:00 AM, your `fct_orders` model is already stale before it even starts. Catch this upstream.
5. Combine dbt tests with Valido
dbt tests are a great "gatekeeper" — they tell you if a specific run failed. But they don't provide continuous coverage. If you run `dbt test` at 9:00 AM and an anomaly occurs at 10:00 AM, you won't know until the next morning.
Valido picks up where dbt leaves off. It runs continuous validation on your data pipelines, catching drift between runs. Connect your Valido account to your dbt project to see how your custom tests perform in a live monitoring environment.
Continuous Coverage
Valido executes your dbt tests and custom SQL rules on every run, providing a health score for your entire data warehouse.
See these patterns in action
Copy the code snippets and examples from our GitHub repository.
Connect your dbt project to Valido for free
Turn your static dbt tests into a live, continuous quality monitoring system.