Supported Data Sources
Valido connects directly to your warehouse. Ensure you have access to Snowflake, BigQuery, Databricks, Redshift, or DuckDB.
Follow these steps to connect your warehouse, set your first rule, and start monitoring data health in real time.
You need read-only access to the schemas and tables you wish to monitor. Valido uses this to run profile queries and validation checks.
For the best experience, use the latest version of Chrome, Firefox, Safari, or Edge. Mobile browsers are supported for viewing but not for configuration.
Log in to Valido and you'll be prompted to create a new workspace. This is your isolated environment for data quality checks.
Tip: Name your workspace after the project or team it serves (e.g., "Marketing Analytics" or "Finance Q3").
Once created, invite your teammates by email. They will receive an invitation to join the workspace and view quality reports.
Navigate to the Connections tab in your workspace and click "Add Connection". Select your warehouse provider from the list.
You will be asked to enter your connection string or credentials. Valido uses OAuth for secure authentication, so you won't need to manage raw passwords.
Once connected, select the specific database, schema, and tables you want to monitor. You can always add more later.
Figure 1: Selecting a data source in the connection wizard.
Before setting rules, run a Data Profile on your tables. This scans your data to capture its current state (distributions, null counts, and data types) and creates the baseline that later runs are compared against.
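To make the idea of a baseline concrete, here is a minimal sketch of the kind of summary a profile might hold. It is not Valido's implementation: the function name `profile_rows`, the output fields, and the sample data are all assumptions made for illustration, and a real profile would run as queries inside the warehouse rather than over rows in memory.

```python
from collections import Counter
from typing import Any


def profile_rows(rows: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Summarize each column: row count, null count, observed types, top values.

    Hypothetical helper for illustration only; assumes column values are hashable.
    """
    columns = {name for row in rows for name in row}
    profile: dict[str, dict[str, Any]] = {}
    for name in sorted(columns):
        values = [row.get(name) for row in rows]
        non_null = [v for v in values if v is not None]
        profile[name] = {
            "row_count": len(values),
            "null_count": len(values) - len(non_null),
            "types": sorted({type(v).__name__ for v in non_null}),
            "top_values": Counter(non_null).most_common(3),
        }
    return profile


# Made-up sample rows standing in for a monitored table.
sample = [
    {"order_id": 1, "country": "DE", "amount": 19.99},
    {"order_id": 2, "country": "DE", "amount": None},
    {"order_id": 3, "country": "FR", "amount": 5.00},
]
baseline = profile_rows(sample)
```

Later runs can be profiled the same way and compared field by field against `baseline`, for example to notice that a column's null count has jumped.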
Go to the Rules tab and choose a rule from the library. For example, select "Row Count Check" to ensure your daily table has the expected number of rows. You can customize the threshold (e.g., allow 5% variance).
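As a rough illustration of what a row count check with a 5% variance threshold means, the sketch below compares today's count with an expected count. The function name and parameters are hypothetical, and Valido's own rule may define variance differently.

```python
def row_count_within_variance(
    current: int, expected: int, max_variance: float = 0.05
) -> bool:
    """Return True when current deviates from expected by at most max_variance.

    Illustrative only: variance is taken as the relative difference from expected.
    """
    if expected <= 0:
        raise ValueError("expected must be a positive row count")
    deviation = abs(current - expected) / expected
    return deviation <= max_variance


# Made-up counts: 10,300 rows is 3% above the expected 10,000, so it passes;
# 9,400 rows is 6% below, so it fails.
passes = row_count_within_variance(10_300, 10_000)
fails = row_count_within_variance(9_400, 10_000)
```

Treating the threshold as a relative difference keeps the same rule usable across tables of different sizes.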
Finally, set up an alert to notify your team. Connect your Slack workspace or email address. When a rule fails, Valido will post a message in the designated channel with the details.
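Valido sends these alerts for you once Slack is connected. For a sense of what such a notification might carry, the sketch below builds (but does not send) a message for a Slack incoming webhook, which accepts a JSON body with a `text` field. The webhook URL is a placeholder, and the function name and message wording are assumptions rather than Valido's actual format.

```python
import json
import urllib.request


def build_slack_alert(
    webhook_url: str, rule: str, table: str, detail: str
) -> urllib.request.Request:
    """Build a POST request for a Slack incoming webhook (hypothetical helper)."""
    payload = {"text": f"Rule '{rule}' failed on {table}: {detail}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_slack_alert(
    "https://hooks.slack.com/services/EXAMPLE",  # placeholder, not a real webhook
    rule="Row Count Check",
    table="analytics.daily_orders",  # made-up table name
    detail="row count 9,400 is 6% below the expected 10,000",
)
# Passing `request` to urllib.request.urlopen would send it; that call is
# left out so the sketch has no side effects.
```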
Once configured, Valido runs its checks automatically each time your pipeline runs, comparing the current data against your baselines and rules.
If everything looks good, you'll see a green health score. If an anomaly is detected, Valido will surface the root cause and notify you immediately, allowing you to fix issues before they impact downstream dashboards.
Your data quality is now monitored continuously, which reduces the need for manual checks and the risk of surprise incidents.
If the connection fails, check your network firewall settings. Ensure outbound traffic to the Valido API endpoints is allowed.
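One quick way to test whether a firewall is blocking outbound traffic is to attempt a TCP connection from the affected machine. The sketch below assumes the endpoint listens on port 443; the helper name is hypothetical, and the actual Valido API hostnames should be taken from the product documentation.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_reach("api.valido.example")` (a placeholder hostname) returning False would suggest that DNS, routing, or a firewall rule is preventing the connection. A True result only shows that the port is reachable, not that authentication will succeed.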
Ensure the service account or user used for connection has SELECT privileges on the specific tables. You may need to grant access to the `INFORMATION_SCHEMA` as well.
Check if the rule is enabled. Also, ensure the rule's threshold is realistic for your data volume. Small tables might trigger false positives if the threshold is too strict.
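The effect of table size on a percentage threshold is easy to see with a little arithmetic. The sketch below (the function name and table sizes are made up) shows how many rows of drift a 5% threshold tolerates at different volumes.

```python
def allowed_row_drift(expected_rows: int, max_variance: float) -> float:
    """Number of rows a table may gain or lose before a relative threshold trips."""
    return expected_rows * max_variance


# Compare the tolerance of the same 5% threshold across table sizes.
for expected in (20, 1_000, 1_000_000):
    drift = allowed_row_drift(expected, 0.05)
    print(f"{expected} expected rows -> about {drift:.0f} rows of allowed drift")
```

A 20-row table tolerates a drift of only one row, so ordinary day-to-day variation can trip the rule. A looser threshold is usually more realistic for tables that small.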
Need more help? Check out our full documentation or join our community Slack.