Compare two SQL tables at different levels of granularity:
- Row count diff
- Table-level aggregate diff of NULL percentage and approximate distinct count for each column
- Dimension-level diff: split the rows into groups by dimension columns and report the diff for each non-dimension column within each group
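
The three levels above can be illustrated with a minimal sketch. It is not this package's API: the helper names (`profile`, `profile_by_dimension`, `diff_profiles`, `diff_by_dimension`) and the sample tables are hypothetical, SQLite from the standard library stands in for the real engine, and an exact `COUNT(DISTINCT ...)` replaces the approximate distinct count.

```python
import sqlite3


def profile(conn, query, columns):
    """Return the row count plus NULL percent and distinct count per column."""
    # COUNT(col) skips NULLs, so total - COUNT(col) is the NULL count.
    parts = ["COUNT(*)"]
    for col in columns:
        parts.append(f"COUNT({col})")
        parts.append(f"COUNT(DISTINCT {col})")
    row = conn.execute(f"SELECT {', '.join(parts)} FROM ({query}) AS t").fetchone()
    total = row[0]
    stats = {"row_count": total}
    for i, col in enumerate(columns):
        non_null, distinct = row[1 + 2 * i], row[2 + 2 * i]
        null_pct = 100.0 * (total - non_null) / total if total else 0.0
        stats[col] = {"null_pct": null_pct, "distinct": distinct}
    return stats


def profile_by_dimension(conn, query, dimension, column):
    """Return row count and NULL count of `column` for each dimension value."""
    sql = (
        f"SELECT {dimension}, COUNT(*), COUNT({column}) "
        f"FROM ({query}) AS t GROUP BY {dimension}"
    )
    return {
        dim: {"row_count": n, "null_count": n - non_null}
        for dim, n, non_null in conn.execute(sql)
    }


def diff_profiles(left, right):
    """Subtract left stats from right stats, metric by metric."""
    out = {"row_count": right["row_count"] - left["row_count"]}
    for col, metrics in left.items():
        if col == "row_count":
            continue
        out[col] = {k: right[col][k] - metrics[k] for k in metrics}
    return out


def diff_by_dimension(left, right):
    """Diff per dimension value; a value missing on one side counts as zero."""
    zero = {"row_count": 0, "null_count": 0}
    return {
        dim: {k: right.get(dim, zero)[k] - left.get(dim, zero)[k] for k in zero}
        for dim in sorted(set(left) | set(right))
    }


# Hypothetical sample data: two versions of the same table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders_old (country TEXT, amount REAL);
CREATE TABLE orders_new (country TEXT, amount REAL);
INSERT INTO orders_old VALUES ('US', 10.0), ('US', NULL), ('DE', 5.0), ('DE', 7.0);
INSERT INTO orders_new VALUES ('US', 10.0), ('US', 12.0), ('DE', 5.0), ('DE', 7.0), ('FR', NULL);
""")

left_query = "SELECT * FROM orders_old"
right_query = "SELECT * FROM orders_new"

table_diff = diff_profiles(
    profile(conn, left_query, ["country", "amount"]),
    profile(conn, right_query, ["country", "amount"]),
)
dimension_diff = diff_by_dimension(
    profile_by_dimension(conn, left_query, "country", "amount"),
    profile_by_dimension(conn, right_query, "country", "amount"),
)
print(table_diff)
print(dimension_diff)
```

Each side is profiled independently and only the aggregated statistics are compared, which is why no join condition between the two queries is required.
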
# Basic install
pip install -e .
# With Trino support
pip install -e ".[trino]"
# With DuckDB support
pip install -e ".[duckdb]"
# With Spark support
pip install -e ".[spark]"
# With dev tools
pip install -e ".[dev]"
# Using uv (recommended)
uv pip install -e ".[trino]"

- Provide left and right SQL queries without any need for a join condition
- Configure `SQL_ENGINE` in `variables.py`
- Implement the connection to `SQL_ENGINE` in `variables.py` via `get_fetch_raw_query_function`
- For S3 storage of diff results: set `S3_ENABLED=True` and configure S3 details in `variables.py`
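
As a rough illustration of the `get_fetch_raw_query_function` hook, the sketch below shows one shape a `variables.py` could take. The signature the project expects is not shown here, so everything about it is an assumption: that the function takes no arguments, that it returns a callable accepting a SQL string, and that rows come back as a list of dicts. The `"sqlite"` engine value is a hypothetical stand-in chosen so the example runs with the standard library only; a real setup would connect to Trino, DuckDB, or Spark instead.

```python
import sqlite3

# Hypothetical settings; the values accepted by the project may differ.
SQL_ENGINE = "sqlite"
S3_ENABLED = False


def get_fetch_raw_query_function():
    """Return a callable that runs a SQL string and returns rows as dicts.

    Assumed contract, not confirmed by the project: fetch(sql) -> list[dict].
    """
    if SQL_ENGINE != "sqlite":
        raise ValueError(f"No connection implemented for engine: {SQL_ENGINE}")

    # One shared in-memory connection, reused by every call to fetch_raw_query.
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row

    def fetch_raw_query(sql):
        cursor = conn.execute(sql)
        return [dict(row) for row in cursor.fetchall()]

    return fetch_raw_query


fetch = get_fetch_raw_query_function()
rows = fetch("SELECT 1 AS n, 'a' AS label")
print(rows)
```

Returning a closure keeps the connection details in one place, so the comparison logic only ever sees a function from SQL text to rows.
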
