About Windows
Datachecks supports two types of windows:
Global
Global windows calculate all validator metrics for the entire source, rather than for a sequence of windows. This window will perform a full load of data every polling cycle--which is fine for small tables, but will lead to slow performance and high resource usage for bigger tables.
Creating a Global window simply means grouping your chosen validations under a configuration that treats the data globally. There is no need to specify a time column or any additional parameters. Once saved, the Global window ensures your validations always consider the complete dataset as-is.
Tumbling
This window runs linked validations with additional time range constraints. For example, if you want to monitor data from the past five days, you can define the relevant timeframe
To create a window, click Add Window and a configuration pop-up will appear where you can enter the Window Name and select a window type as Tumbling.
Once you select Tumbling enter the following details:
Date-Time Field
This is the timestamp column used to split the dataset into time-based segments. For instance, if your records include a field like created_at
or event_time
, that field is used to assign each row to a specific time window.
Window Size & Unit
Window Size defines the length of each time segment, and the Unit determines the time scale—such as hours, days, or weeks. Together, these values control how your dataset is divided. For instance, if you have ten days of data and choose a window size of 2 with the unit set to "days," the system will divide your data into five distinct two-day windows. Each window is treated independently, and validations are executed separately on each one.
This allows you to identify issues that may occur within specific time periods, such as spikes in missing data or anomalies in a particular batch.
Look Back Period & Unit
The Look Back Period specifies how far back in time the window should retrieve data for analysis. This ensures that you're only validating the most relevant and recent information. For example, if you set the Look Back Period to 5 and select "days" as the unit, the system will consider only the data from the last five days when creating the time windows.
This is particularly useful for continuous monitoring scenarios, where you want to focus on recent data changes rather than the entire dataset.
Updated 12 days ago