{"id":22,"date":"2006-09-06T17:32:01","date_gmt":"2006-09-06T17:32:01","guid":{"rendered":"http:\/\/etlguru.com\/blog\/2006\/09\/06\/etl-startegy-to-store-data-validation-rules\/"},"modified":"2006-09-06T17:32:01","modified_gmt":"2006-09-06T17:32:01","slug":"etl-startegy-to-store-data-validation-rules","status":"publish","type":"post","link":"https:\/\/etlguru.com\/?p=22","title":{"rendered":"ETL Startegy to store data validation rules"},"content":{"rendered":"<p>Every time there is movement of data the results have to be tested against the expected results. For every ETL process, test conditions for testing data are defined before\/during design and development phase itself.\u00c2\u00a0 Some that are missed can be added later on.<\/p>\n<p>Various test conditions are used to validate data\u00c2\u00a0when the\u00c2\u00a0ETL process is migrated\u00c2\u00a0from DEV-to->QA-to->PRD. These test conditions are can exists in the developer\u00e2\u20ac\u2122s\/tester\u00e2\u20ac\u2122s mind \/documented in word or excel. With time the test conditions either lost ignored or scattered all around to be really useful.<\/p>\n<p>In production if the ETL process runs successfully without error is a good thing. But it does not really mean anything. You still need rules to validate\u00c2\u00a0data processed by\u00c2\u00a0ETL. At this point you need\u00c2\u00a0data validation\u00c2\u00a0rules again!<\/p>\n<p>A better ETL strategy is to store the ETL business rules in a RULES table by target table, source system. These rules can be in SQL text. This will create a repository of all the rules in a single location which can be called by any ETL process\/ auditor at any phase of the project life cycle.<\/p>\n<p>There is also no need to re-write \/rethink rules. Any or all of these rules can be made optional, tolerances can be defined, called immediately after the process is run or data can be audited at leisure.<\/p>\n<p><strong>This Data validation \/auditing system will basically contain<br \/>\n<\/strong>A table that contains the rules,<br \/>\nA process to call is dynamically and<br \/>\nA table to store the results from the execution of the rules<\/p>\n<p><strong>Benefits: <\/strong><\/p>\n<p>Rules can be added dynamically with no cange to code.<\/p>\n<p>Rules are stored permanantly.<\/p>\n<p>Tolerance level can be changed with ever changing the code<\/p>\n<p>Biz rules can be added or validated by business experts without worring about the ETL code.<\/p>\n<p>NOTE:<em> This post is applicable to all etl tools or databases like Informatica, DataStage, Syncsort DMExpress, Sunopsis\u00c2\u00a0 or Oracle, Sybase, SQL Server Integration Services (SSIS)\/DTS, Ab Initio,\u00c2\u00a0\u00c2\u00a0 MS SQL Server, RDB, etc.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Every time there is movement of data the results have to be tested against the expected results. For every ETL process, test conditions for testing data are defined before\/during design and development phase itself.\u00c2\u00a0 Some that are missed can be added later on. Various test conditions are used to validate data\u00c2\u00a0when the\u00c2\u00a0ETL process is migrated\u00c2\u00a0from [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[23,27,1],"tags":[],"class_list":["post-22","post","type-post","status-publish","format-standard","hentry","category-etl-exception-error-handling","category-etl-testing","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/etlguru.com\/index.php?rest_route=\/wp\/v2\/posts\/22","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/etlguru.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/etlguru.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/etlguru.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/etlguru.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=22"}],"version-history":[{"count":0,"href":"https:\/\/etlguru.com\/index.php?rest_route=\/wp\/v2\/posts\/22\/revisions"}],"wp:attachment":[{"href":"https:\/\/etlguru.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=22"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/etlguru.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=22"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/etlguru.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=22"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}