Sunday, January 5, 2014

ETL Test Scenarios and Test Cases

Basic Validation in ETL testing... 


Mapping doc validation: 

  • Verify that the mapping doc provides the corresponding ETL information. A change log should be maintained in every mapping doc.
  • Define a default test strategy for cases where the mapping doc omits optional information (e.g., data type lengths).
Structure validation:

  1. Validate the source and target table structures against the corresponding mapping doc.
  2. The source data type and target data type should be the same.
  3. The lengths of the data types in both source and target should be equal.
  4. Verify that data field types and formats are specified.
  5. The source data type length should not be less than the target data type length.
  6. Validate the column names in each table against the mapping doc.
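
Where both tables are reachable from one database, much of this list can be scripted against the catalog. A minimal sketch, assuming INFORMATION_SCHEMA is available and using hypothetical table names src_customer and tgt_customer:

```sql
-- Flag columns whose data type or length differs between source and target.
-- src_customer / tgt_customer are hypothetical table names.
SELECT s.column_name,
       s.data_type                AS src_type,
       t.data_type                AS tgt_type,
       s.character_maximum_length AS src_length,
       t.character_maximum_length AS tgt_length
FROM   information_schema.columns s
JOIN   information_schema.columns t
       ON t.column_name = s.column_name
WHERE  s.table_name = 'src_customer'
  AND  t.table_name = 'tgt_customer'
  AND  ( s.data_type <> t.data_type
         -- COALESCE handles numeric columns, where length is NULL
      OR COALESCE(s.character_maximum_length, -1)
         <> COALESCE(t.character_maximum_length, -1) );
```

A FULL OUTER JOIN on column_name would additionally flag columns that exist on only one side.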
Constraint Validation:

Ensure that the constraints are defined on each table as expected.
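
One quick way to check this, again assuming the database exposes INFORMATION_SCHEMA and using the hypothetical table name tgt_customer, is to list what is actually defined and compare it against the mapping doc:

```sql
-- List the constraints actually defined on the target table
-- (PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK) for comparison
-- against the mapping doc. 'tgt_customer' is hypothetical.
SELECT constraint_name, constraint_type
FROM   information_schema.table_constraints
WHERE  table_name = 'tgt_customer'
ORDER  BY constraint_type, constraint_name;
```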

Data Completeness Issues:

Ensure that all expected data is loaded into the target table.

1. Compare record counts between source and target, and check for any rejected records (see the count-comparison sketch after this list).

2. Check that data is not truncated in the columns of the target table.

3. Apply boundary value analysis (e.g., only data from year >= 2008 should be loaded into the target).
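
A minimal sketch of checks 1 and 3, assuming hypothetical tables src_orders and tgt_orders with an order_date column:

```sql
-- Check 1: record counts should reconcile (source = target + rejects).
SELECT (SELECT COUNT(*) FROM src_orders) AS src_count,
       (SELECT COUNT(*) FROM tgt_orders) AS tgt_count;

-- Check 3: boundary rule, i.e. no pre-2008 data should have loaded.
SELECT COUNT(*) AS out_of_boundary
FROM   tgt_orders
WHERE  EXTRACT(YEAR FROM order_date) < 2008;   -- expect 0
```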

Data Correctness Issues:

1. Data that is misspelled or inaccurately recorded.

2. Null, non-unique, or out-of-range data may be stored when the integrity constraints are disabled.
Example: the primary key constraint is disabled during an import, and rows with null unique identifiers are inserted alongside the existing data.
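
The disabled-constraint scenario above can be detected after the load. A sketch, assuming a hypothetical table tgt_customer whose customer_id is meant to be the unique identifier:

```sql
-- Identifiers that should never be NULL.
SELECT COUNT(*) AS null_ids
FROM   tgt_customer
WHERE  customer_id IS NULL;                          -- expect 0

-- Excess duplicate identifiers (COUNT(col) ignores NULLs,
-- so this isolates true duplicates among non-null ids).
SELECT COUNT(customer_id) - COUNT(DISTINCT customer_id) AS dup_ids
FROM   tgt_customer;                                 -- expect 0
```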

Data Transformation
1. Create a spreadsheet of input-data scenarios and expected results, and validate these with the business customer. This is an excellent requirements-elicitation step during design and can also be used as part of testing.

2. Create test data that covers all scenarios. Have an ETL developer automate populating the data sets from the scenario spreadsheet, which keeps the process flexible, since scenarios are likely to change.

3. Use data profiling results to compare the range and distribution of values in each field between the source and target data (a sample profiling query is sketched below).
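
A minimal profiling comparison for one field, assuming hypothetical tables src_orders and tgt_orders with a numeric amount column:

```sql
-- Profile the same field on both sides; the two rows should match.
SELECT 'source' AS side,
       MIN(amount)            AS min_val,
       MAX(amount)            AS max_val,
       COUNT(DISTINCT amount) AS distinct_vals,
       COUNT(*)               AS row_count
FROM   src_orders
UNION ALL
SELECT 'target',
       MIN(amount), MAX(amount),
       COUNT(DISTINCT amount), COUNT(*)
FROM   tgt_orders;
```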

Data Quality

1. Number check: if values in the source carry a prefix (e.g., xx_30) but the target should store only 30, validate that the prefix (xx_) is dropped during the load.

2. Date check: dates must follow the agreed date format, and the format should be the same across all records (e.g., the standard yyyy-mm-dd).

3. Precision check: precision values should display as expected in the target table.

4. Data check: based on business logic, records that do not meet certain criteria should be filtered out.

5. Null check: certain columns should contain "Null" based on the business requirements.

Duplicate check

1. Validate that the unique key, the primary key, and any other columns that must be unique per the business requirements contain no duplicate rows.

2. Check for duplicate values in any column that is derived by extracting from multiple source columns and combining them into one.

3. Sometimes, per client requirements, we need to ensure that there are no duplicates across a combination of multiple columns within the target (see the sketch after this list).
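
The standard pattern for all three checks is GROUP BY with HAVING. A sketch using the hypothetical table tgt_orders:

```sql
-- Single-column check: a key that must be unique.
SELECT order_id, COUNT(*) AS occurrences
FROM   tgt_orders
GROUP  BY order_id
HAVING COUNT(*) > 1;                              -- expect no rows

-- Multi-column check: a business-key combination that must be unique.
SELECT customer_id, order_date, COUNT(*) AS occurrences
FROM   tgt_orders
GROUP  BY customer_id, order_date
HAVING COUNT(*) > 1;                              -- expect no rows
```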
Complete Data Validation

1. To validate the complete data set between the source and target tables, a minus query is the best solution.

2. We need to run both source-minus-target and target-minus-source.

3. If the minus query returns any rows, those should be treated as mismatched rows.
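
A sketch with the hypothetical tables src_orders and tgt_orders. MINUS is the Oracle keyword; SQL Server and PostgreSQL use EXCEPT instead. Both directions must return zero rows, and both tables need compatible column lists for SELECT * to work:

```sql
-- Rows present in the source but missing from the target.
SELECT * FROM src_orders
MINUS
SELECT * FROM tgt_orders;

-- Rows present in the target with no matching source row.
SELECT * FROM tgt_orders
MINUS
SELECT * FROM src_orders;
```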