ETL Interface Architecture (ETL-IA) ™

An enterprise data warehouse (EDW) by definition has more than one data source. In simple terms, an employee table in the EDW will have multiple sources, such as the HR department, the Sales department, and so on. And if the company is a group of companies, the same department can repeat across the various companies.

Let’s say the DIM_EMPLOYEE table was designed by the data modeler, who then asks the ETL team to develop an ETL process to populate it. Very easy: write an ETL process to read from the source and load it into the target table.
Now, a few days later, the ETL team is told that another source has been identified. Should the team start all over again and repeat the whole process? Is this efficient? Or is there a better way?

Well, welcome to my world of “ETL Interface Architecture ™ (ETL-IA)”.

With ETL-IA, the addition of 3X sources does not result in 3X work. Instead, the only additional work is handling what is unique to each source, which saves time and money.

The question is how?
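As a rough illustration only, here is one way the idea could look in code. This is a minimal Python sketch and not the ETL-IA design itself, which this post does not spell out. It assumes that every source is first mapped into one common record layout (the "interface"), so the load into DIM_EMPLOYEE is written once and each new source contributes only its own mapping function. All names here (EmployeeRecord, map_hr_row, map_sales_row, SOURCE_MAPPERS, run_source) and the source column names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class EmployeeRecord:
    """Common layout every source is mapped into (hypothetical)."""
    source_system: str
    source_employee_id: str
    full_name: str


# --- Source-specific part: one small mapper per source -------------------

def map_hr_row(row: dict) -> EmployeeRecord:
    # Assumed HR layout: emp_no, first_name, last_name
    return EmployeeRecord(
        "HR", str(row["emp_no"]), f'{row["first_name"]} {row["last_name"]}'
    )


def map_sales_row(row: dict) -> EmployeeRecord:
    # Assumed Sales layout: rep_id, rep_name (may carry stray whitespace)
    return EmployeeRecord("SALES", str(row["rep_id"]), row["rep_name"].strip())


SOURCE_MAPPERS: Dict[str, Callable[[dict], EmployeeRecord]] = {
    "HR": map_hr_row,
    "SALES": map_sales_row,
}

# --- Shared part: written once, reused for every source ------------------

DimEmployee = Dict[Tuple[str, str], EmployeeRecord]


def load_dim_employee(records: Iterable[EmployeeRecord], target: DimEmployee) -> None:
    # Upsert keyed on (source system, source id); a dict stands in for the table.
    for rec in records:
        target[(rec.source_system, rec.source_employee_id)] = rec


def run_source(source_name: str, rows: Iterable[dict], target: DimEmployee) -> None:
    mapper = SOURCE_MAPPERS[source_name]
    load_dim_employee((mapper(row) for row in rows), target)
```

In this sketch, adding a third source would mean writing one more mapper and registering it in SOURCE_MAPPERS; load_dim_employee and run_source would stay untouched.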

I will be discussing this in detail on the website. Detailed information on implementing the process is the core of the book ETLGuru.com: ETL Strategies and Solutions for Data Warehouse.

NOTE: This post is applicable to all ETL tools or databases, such as Informatica, DataStage, Syncsort DMExpress, Sunopsis, Oracle, Sybase, SQL Server Integration Services (SSIS)/DTS, Ab Initio, MS SQL Server, RDB, Teradata, etc.
