Uniqueness
Ensure each record stands out.
Uniqueness Validations play a pivotal role in upholding data quality standards. Just as reliability Validations ensure timely data updates, uniqueness Validations focus on the distinctiveness of data entries within a dataset.
By consistently tracking these Validations, you gain valuable insights into data duplication, redundancy, and accuracy. This knowledge empowers data professionals to make well-informed decisions about data cleansing and optimization strategies. Uniqueness Validations also serve as a radar for potential data quality issues, enabling proactive intervention to prevent major problems down the line.
Distinct Count
A distinct count Validation in data quality measures the number of unique values within a dataset, ensuring accuracy and completeness.
Example
validations for product_db.products:
- distinct_count_of_product_categories:
on: count_distinct(product_category)
Duplicate Count
Duplicate count is a data quality Validation that measures the number of identical or highly similar records in a dataset, highlighting potential data redundancy or errors.
Example
validations for product_db.products:
- distinct count of product categories:
on: count_duplicate(product_category)
validations for product_db.products:
- distinct count of product categories:
on: count_duplicate(product_category)
Updated 8 days ago