You can easily launch this example in just 5 minutes.

Installation

macOS and Linux

Install Datachecks using the pip package manager. Below we are installing the package with the postgres extra, which is required for this example.

pip install 'dcs-core[postgres]' -U

Quick Setup of Database & Test Data

Skip this step if you already have a PostgreSQL setup.
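
If you need test data, the sketch below generates SQL for a small products table. It is an illustration rather than the project's official setup: only the table name products and the columns price and country_code come from the configuration in this guide, while the id and name columns, the sample rows, and the helper build_seed_sql are assumptions made for the example.

```python
# Hypothetical seed data: only `price` and `country_code` are referenced
# by the example checks; `name` and the values themselves are made up.
SAMPLE_PRODUCTS = [
    ("Notebook", 120.0, "IN"),
    ("Backpack", 185.5, "IN"),
    ("Headphones", 240.0, "US"),
]


def build_seed_sql(rows):
    """Return SQL text that creates a `products` table and inserts `rows`."""
    statements = [
        "CREATE TABLE IF NOT EXISTS products ("
        "id SERIAL PRIMARY KEY, name TEXT NOT NULL, "
        "price NUMERIC NOT NULL, country_code CHAR(2) NOT NULL);"
    ]
    for name, price, country_code in rows:
        safe_name = name.replace("'", "''")  # escape single quotes for SQL literals
        statements.append(
            "INSERT INTO products (name, price, country_code) "
            f"VALUES ('{safe_name}', {price}, '{country_code}');"
        )
    return "\n".join(statements)


if __name__ == "__main__":
    # Print the statements so they can be piped into a SQL client.
    print(build_seed_sql(SAMPLE_PRODUCTS))
```

Assuming the script is saved as seed.py (a hypothetical name), the output could be piped into psql using the connection values from the configuration file, for example: python seed.py | psql -h 127.0.0.1 -p 5431 -U dbuser dcs_demo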

Datachecks Configuration File

Create a configuration file dcs_config.yaml with the following contents:

dcs_config.yaml
data_sources:
  - name: product_db
    type: postgres
    connection:
      host: 127.0.0.1
      port: 5431
      username: dbuser
      password: dbpass
      database: dcs_demo
validations for product_db.products:
  - count_of_products:
      on: count_rows
      threshold: "> 0 & < 1000"
  - max_product_price_in_india:
      on: max(price)
      where: "country_code = 'IN'"
      threshold: "< 190"
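
To make the threshold strings above concrete, here is a minimal sketch of how an expression such as "> 0 & < 1000" can be read: each clause is a comparison, and & is assumed to mean that every clause must hold. This is not Datachecks' own parser; the function threshold_passes and the set of supported operators are assumptions for illustration only.

```python
import operator
import re

# Comparison symbols this sketch understands (an assumption, not the
# library's full grammar).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}
# Two-character symbols come first in the alternation so ">=" is not read as ">".
_CLAUSE = re.compile(r"^\s*(>=|<=|>|<|=)\s*(-?\d+(?:\.\d+)?)\s*$")


def threshold_passes(value: float, threshold: str) -> bool:
    """Return True when `value` satisfies every '&'-joined clause in `threshold`."""
    for clause in threshold.split("&"):
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"Unsupported threshold clause: {clause!r}")
        symbol, bound = match.groups()
        if not _OPS[symbol](value, float(bound)):
            return False
    return True


if __name__ == "__main__":
    print(threshold_passes(250, "> 0 & < 1000"))   # True
    print(threshold_passes(1000, "> 0 & < 1000"))  # False
    print(threshold_passes(189.5, "< 190"))        # True
```

Under this reading, count_of_products passes for any row count from 1 to 999, and max_product_price_in_india passes as long as the highest price among rows with country_code = 'IN' stays below 190.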

Run Datachecks

Datachecks can be run in two ways: using the CLI or the Python API.

Run Datachecks in CLI

dcs-core inspect --config-path ./dcs_config.yaml

After running the above command, the validation results are printed to the terminal.

Generate Metrics Validation Report

You can generate a data quality report containing all the metrics with a single command. The HTML report can be shared with your team.

dcs-core inspect --config-path ./dcs_config.yaml --html-report

Run Datachecks in Python

getting_started.py
from dcs_core.core import Inspect


if __name__ == "__main__":
    inspect = Inspect()
    inspect.add_configuration_yaml_file("dcs_config.yaml")
    inspect_output = inspect.run()
    print(inspect_output.metrics)
    # Use the metrics here, for example by sending or storing them somewhere
    # They can be sent to ELK or any time-series database
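
As one possible way to ship the metrics onward, the sketch below turns a mapping of metrics into JSON lines, a format most log shippers and time-series ingestion tools accept. It assumes the metrics can be treated as a dict keyed by metric name, which this guide does not confirm; the helper metrics_to_json_lines and the record fields metric, collected_at, and payload are hypothetical.

```python
import json
from datetime import datetime, timezone


def metrics_to_json_lines(metrics: dict) -> str:
    """Serialize each metric as one JSON object per line.

    Assumption: `metrics` maps a metric name to some value or object.
    Values that json cannot encode natively are converted with str().
    """
    collected_at = datetime.now(timezone.utc).isoformat()
    lines = []
    for name, metric in metrics.items():
        record = {"metric": name, "collected_at": collected_at, "payload": metric}
        lines.append(json.dumps(record, default=str))
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in data shaped like the checks in dcs_config.yaml (values are made up).
    sample = {
        "count_of_products": {"value": 3},
        "max_product_price_in_india": {"value": 185.5},
    }
    print(metrics_to_json_lines(sample))
```

In getting_started.py, a call such as metrics_to_json_lines(inspect_output.metrics) would take the place of the print statement, provided the metrics object really behaves like a dict.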