Andrei Contan
One of the most important assets of any organization is its information. This asset is almost always kept by an organization in two forms: the operational systems of record and the data warehouse. Crudely speaking, the operational systems are where the data is put in, and the data warehouse is where we get the data out.
The users of an operational system turn the wheels of the organization. They take orders, sign up new customers, and log complaints. Users of an operational system almost always deal with one record at a time. They repeatedly perform the same operational tasks over and over. The users of a data warehouse, on the other hand, watch the wheels of the organization turn. They count the new orders and compare them with last week’s orders and ask why the new customers signed up and what the customers complained about.
Users of a data warehouse almost never deal with one row at a time. Rather, their questions often require that hundreds or thousands of rows be searched and compressed into an answer set. To further complicate matters, users of a data warehouse continuously change the kinds of questions they ask.
Testing strategies must include the following aspects of a DW environment:
- The data warehouse must be adaptive and resilient to change.
- The data warehouse must present the organization’s information consistently.
- The data warehouse must make an organization’s information easily accessible.
- The data warehouse must be a secure bastion that protects our information assets.
- The data warehouse must serve as the foundation for improved decision making.
We intend to prove that High Volume Automated Testing (HVAT) in a Data Warehouse environment is the most suitable approach, given the high volume of data to be processed.
The HVAT techniques will demonstrate that the following objectives, which we consider crucial to the future of the business, are met:
1. Count Validation - As an initial check, verify record counts between source and target with DWH backend/reporting queries.
2. Source Isolation - Validate results after isolating the driving sources.
3. Dimensional Analysis - Verify data integrity across the various source tables and their relationships.
4. Statistical Analysis - Validate the various calculations.
5. Data Quality Validation - Check for missing data, negative values, and consistency. Field-by-field verification can be performed to confirm that source and target data agree.
6. Granularity - Validate at the lowest granular level possible (lowest in the hierarchy, e.g. Country-City-Street: start with test cases at the street level).
7. Other validations - Graphs, slice/dice, meaningfulness, accuracy.
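To illustrate, the count and data-quality checks above can be sketched as plain SQL validation queries. The table and column names (STG_CUSTOMERS, DWH_DIM_CUSTOMER, CUSTOMER_ID, CREDIT_LIMIT) are hypothetical placeholders, not part of the actual schema:

```sql
-- Sketch of validation queries against assumed staging/target tables.

-- 1. Count Validation: rows present in the source but missing from the target.
--    A non-zero result indicates an incomplete load.
SELECT COUNT(*) AS missing_rows
FROM   stg_customers s
WHERE  NOT EXISTS (
         SELECT 1
         FROM   dwh_dim_customer d
         WHERE  d.customer_id = s.customer_id
       );

-- 5. Data Quality Validation: missing or negative values in the target.
SELECT COUNT(*) AS bad_rows
FROM   dwh_dim_customer
WHERE  customer_name IS NULL
   OR  credit_limit < 0;
```

In practice each check would be parameterized per table and run automatically after every load, which is what makes the approach viable at high volume.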
As the DW is SQL and PL/SQL based, the entire testing process will be defined and developed using the utPLSQL testing framework. This framework keeps our HVAT techniques within the open-source testing ecosystem.
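As a minimal sketch of how such a check looks in utPLSQL (v3 annotation syntax), the following test package wraps the count validation from objective 1. The table names STG_ORDERS and DWH_FACT_ORDERS are hypothetical examples, not the real schema:

```sql
CREATE OR REPLACE PACKAGE test_dwh_load AS
  --%suite(DWH load validations)

  --%test(Record counts match between source and target)
  PROCEDURE count_validation;
END test_dwh_load;
/

CREATE OR REPLACE PACKAGE BODY test_dwh_load AS
  PROCEDURE count_validation IS
    l_src_count NUMBER;
    l_tgt_count NUMBER;
  BEGIN
    -- Compare row counts of an assumed source and its target fact table.
    SELECT COUNT(*) INTO l_src_count FROM stg_orders;
    SELECT COUNT(*) INTO l_tgt_count FROM dwh_fact_orders;

    -- utPLSQL expectation: the test fails if the counts differ.
    ut.expect(l_tgt_count).to_equal(l_src_count);
  END count_validation;
END test_dwh_load;
/
```

The suite would then be executed with `EXEC ut.run('test_dwh_load');`, so the same validations can be re-run unattended after every load cycle.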