ETL Alone Is Not Always the Most Effective Way to Load a Warehouse
Extract, Transform, and Load (ETL) middleware is the tool of choice for loading data warehouses. However, in some cases ETL tools alone are not the most effective approach, for example where:
- ETL tools lack interfaces to easily access source data, for example data from packaged applications such as SAP or new technologies such as Web services
- Readily available, existing virtual views or data services can be reused rather than building new ETL scripts from scratch
- Tight batch windows require access, abstraction, and federation activities to be pre-processed and virtually staged in advance of ETL processes
Use Data Virtualization to Preprocess Data for ETL
You can use Composite data virtualization to complement your ETL tools to gain greater flexibility when loading your data warehouse.
- Use virtual views and data services as inputs to your ETL batch processes, where they appear just like any other data source
- Integrate data source types that your ETL tool cannot easily access
- Reuse existing views and services, saving time and costs
Further, these data abstractions do not require your ETL developers to understand the structure of, or interact directly with, your actual data sources. This significantly simplifies their work and reduces time to solution.
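The idea that an ETL job reads a virtual view exactly as it would read any other table can be illustrated with a minimal sketch. This is not Composite's API: it uses Python's built-in sqlite3 module and an ordinary SQL view as a stand-in for a virtual view, and every table, view, and function name (sap_fi_documents, ws_cost_centers, v_financials, dw_region_totals, run_demo) is hypothetical. The two source tables stand in for a packaged application extract and a Web service feed.

```python
import sqlite3


def build_sources(conn):
    # Hypothetical stand-ins for two physical sources: a packaged
    # application table and reference data from a Web service.
    conn.executescript("""
        CREATE TABLE sap_fi_documents (doc_id INTEGER, cost_center TEXT, amount REAL);
        CREATE TABLE ws_cost_centers (cost_center TEXT, region TEXT);
        INSERT INTO sap_fi_documents VALUES
            (1, 'CC10', 100.0), (2, 'CC10', 50.0), (3, 'CC20', 75.0);
        INSERT INTO ws_cost_centers VALUES ('CC10', 'EMEA'), ('CC20', 'APAC');
    """)


def create_virtual_view(conn):
    # The view federates both sources and hides their structure.
    # No data is copied; the join runs when the view is queried.
    conn.execute("""
        CREATE VIEW v_financials AS
        SELECT d.doc_id, d.cost_center, c.region, d.amount
        FROM sap_fi_documents AS d
        JOIN ws_cost_centers AS c ON c.cost_center = d.cost_center
    """)


def etl_load(conn):
    # The ETL step only knows about v_financials, not the sources behind it.
    conn.execute(
        "CREATE TABLE dw_region_totals (region TEXT PRIMARY KEY, total_amount REAL)"
    )
    conn.execute("""
        INSERT INTO dw_region_totals
        SELECT region, SUM(amount) FROM v_financials GROUP BY region
    """)
    return conn.execute(
        "SELECT region, total_amount FROM dw_region_totals ORDER BY region"
    ).fetchall()


def run_demo():
    conn = sqlite3.connect(":memory:")
    try:
        build_sources(conn)
        create_virtual_view(conn)
        return etl_load(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_demo())  # [('APAC', 75.0), ('EMEA', 150.0)]
```

In this sketch the ETL step would keep working unchanged if the view definition were later pointed at different underlying sources, which is the reuse and abstraction benefit described above.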
In the integration pattern shown below, the Composite Data Virtualization Platform complements ETL by providing access, abstraction, and federation of packaged application and Web service data sources.
Using Virtual Views to Simplify Packaged Application and Web Service Access, Abstraction, and Federation
- Preprocess SAP Data – To provide the SAP financial data required for its financial data warehouse, this energy company uses Composite data virtualization to access and abstract SAP R/3 FICO data. Composite replaced an error-prone flat-file extraction process that depended heavily on SAP data experts and would not scale across the company's complex SAP landscape. The results include more complete and timely data in the financial data warehouse, enabling better performance management.