sodadata/soda-core — reverse-engineered prompt
Build me a Python-based data quality tool that lets me define data contracts in simple YAML and then verify them against real datasets. I want it to work from the command line first, with a Python API too, so I can use it locally during development or plug it into data pipelines later.
It should let someone create a data source config file, test that connection, write a contract for a table or view, and run checks for things like schema, row count, missing values, and valid values. Please support the main warehouse and query engines mentioned in the project, like Postgres, Snowflake, BigQuery, Databricks, DuckDB, Redshift, Trino, Spark, SQL Server, Synapse, Athena, and Fabric, with the right package layout for each connector.
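To make the four check types concrete, here is a minimal sketch of what I mean, assuming a contract that has already been parsed from YAML into a plain dict. The key names (`dataset`, `checks`, `columns`, `valid_values`), the `verify_contract` helper, and the `customers` table are all hypothetical and only illustrate the idea; they are not the project's real contract syntax or API. It uses an in-memory SQLite table as a stand-in for a real data source.

```python
import sqlite3

# Hypothetical contract, shown as the dict a YAML file might parse into.
# Key names are illustrative assumptions, not the project's actual syntax.
CONTRACT = {
    "dataset": "customers",
    "checks": [{"type": "schema"}, {"type": "row_count", "min": 1}],
    "columns": [
        {"name": "id", "data_type": "INTEGER", "checks": [{"type": "missing"}]},
        {
            "name": "status",
            "data_type": "TEXT",
            "checks": [{"type": "invalid", "valid_values": ["active", "inactive"]}],
        },
    ],
}


def verify_contract(conn, contract):
    """Run every check in the contract and return (check_name, passed) pairs.

    Table and column names are interpolated into SQL without escaping,
    so this sketch is only safe for contracts from a trusted source.
    """
    table = contract["dataset"]
    results = []

    # Dataset-level checks: schema and row count.
    for check in contract.get("checks", []):
        if check["type"] == "schema":
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
            rows = conn.execute(f"PRAGMA table_info({table})")
            actual = {row[1]: row[2] for row in rows}
            expected = {c["name"]: c["data_type"] for c in contract["columns"]}
            results.append(("schema", actual == expected))
        elif check["type"] == "row_count":
            (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
            results.append(("row_count", count >= check.get("min", 1)))

    # Column-level checks: missing values and values outside the valid set.
    for column in contract["columns"]:
        name = column["name"]
        for check in column.get("checks", []):
            if check["type"] == "missing":
                sql = f"SELECT COUNT(*) FROM {table} WHERE {name} IS NULL"
                params = []
            elif check["type"] == "invalid":
                params = list(check["valid_values"])
                marks = ", ".join("?" for _ in params)
                sql = (
                    f"SELECT COUNT(*) FROM {table} "
                    f"WHERE {name} IS NOT NULL AND {name} NOT IN ({marks})"
                )
            else:
                continue  # Unknown check types are skipped in this sketch.
            (bad_rows,) = conn.execute(sql, params).fetchone()
            results.append((f"{name}.{check['type']}", bad_rows == 0))
    return results


def build_demo_results():
    """Create a small demo table with one NULL id and one invalid status."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, status TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "active"), (2, "inactive"), (None, "unknown")],
    )
    return verify_contract(conn, CONTRACT)


if __name__ == "__main__":
    for check_name, passed in build_demo_results():
        print(f"{check_name}: {'PASS' if passed else 'FAIL'}")
```

On the demo table the schema and row count checks pass, while the missing-value check on `id` and the valid-value check on `status` fail. In the real tool I would expect each connector to translate the same checks into its own SQL dialect, and the CLI to report failures with a non-zero exit code, but treat both of those as assumptions to confirm against the docs.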
Make the YAML readable for non-experts, include sensible examples, and make the CLI commands feel straightforward. Add tests and basic developer setup so the repo is runnable out of the box. If anything is unclear, look up the current docs online and match the behavior closely.