What is a Logical Data Warehouse?
According to Gartner's Hype Cycle for Information Infrastructure, 2012, “the Logical Data Warehouse (LDW) is a new data management architecture for analytics which combines the strengths of traditional repository warehouses with alternative data management and access strategy. The LDW will form a new best practices by the end of 2015.”
David Besemer, CTO, Cisco, presents on How Data Virtualization Supports Gartner’s Logical Data Warehouse Vision.
Why is a Logical Data Warehouse Needed?
Business users are dissatisfied with traditional data warehousing. New analytic requirements have driven the adoption of new analytic appliances such as IBM Netezza, EMC Greenplum and ParAccel. Big data analytics have driven the adoption of Hadoop and other specialized databases such as graph and key-value stores.
This expanding diversity, along with data virtualization’s ability to easily access and federate data, has ended the reign of the enterprise data warehouse as the single best practice for large-scale information management.
What are the Elements that Form a Logical Data Warehouse?
In Understanding the Logical Data Warehouse: the Emerging Practice, Gartner identifies seven major components:
- Repository Management
- Data Virtualization
- Distributed Processes
- Auditing Statistics and Performance Evaluation Services
- SLA Management
- Taxonomy / Ontology Resolution
- Metadata Management
How Does Data Virtualization Enable the Logical Data Warehouse?
“The use of data federation/virtualization helps enable virtual views of an organization’s data for supporting a logical data warehouse. While not a new technique, it presents pragmatic opportunities for use in LDWs with the ability to create abstracted interfaces to data,” according to Gartner’s The Logical Data Warehouse Will Be a Key Scenario for Using Data Federation.
A data virtualization platform such as the Cisco Data Virtualization Platform enables every Logical Data Warehouse component:
- Repository Management – Data virtualization supports a broad range of data warehouse extensions.
- Data Virtualization – Data virtualization virtually integrates data within the enterprise and beyond.
- Distributed Processes – Data virtualization integrates big data sources such as Hadoop and enables integration with distributed processes performed in the cloud.
- Auditing Statistics and Performance Evaluation Services – Data virtualization provides the data governance, auditability and lineage required.
- SLA Management – Data virtualization’s scalable query optimizers and caching deliver the flexibility needed to ensure SLA performance.
- Taxonomy / Ontology Resolution – Data virtualization provides an abstracted, semantic layer view of enterprise data across repository-based, virtualized and distributed sources.
- Metadata Management – Data virtualization leverages metadata from data sources as well as internal metadata needed to automate and control key logical data warehouse functions.
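To make the idea of a virtual view concrete, here is a minimal sketch in Python. It is an illustration only, not the Cisco Data Virtualization Platform's API: the `customers` table, the `clickstream` dictionary and the `customer_activity` function are all hypothetical names chosen for this example. It assumes one relational source (an in-memory SQLite database) and one key-value source (a plain dictionary), and joins them at query time without copying either into a new repository.

```python
import sqlite3

# Hypothetical source 1: a relational warehouse table (in-memory SQLite).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
warehouse.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# Hypothetical source 2: a key-value store, modeled here as a plain dict
# mapping customer id -> page views.
clickstream = {1: 42, 2: 7}


def customer_activity():
    """Virtual view: combine both sources on demand, persisting nothing."""
    rows = warehouse.execute("SELECT id, name FROM customers ORDER BY id")
    for customer_id, name in rows:
        yield {
            "id": customer_id,
            "name": name,
            # Look up the matching value in the second source at query time.
            "page_views": clickstream.get(customer_id, 0),
        }


result = list(customer_activity())
```

A consumer calls `customer_activity()` as if it were a single table. The underlying sources stay where they are, which is the abstraction the federation quote above describes. A real platform would add query optimization, caching and governance on top of this basic pattern.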