Claudia Imhoff, PhD, President and Founder, Intelligent Solutions, Inc.
Colin White, Founder, BI Research
Robert Eve, VP Marketing, Composite Software
Gary Damiano
Hello, Ladies and Gentlemen. I’d like to welcome you to today’s webcast, It’s Time to Think Outside the Box: Extend BI with Data Virtualization, brought to you by Composite Software as part of our Thought Leadership Series of webcasts. Thank you for joining us. We have an impressive list of experts with us today. We’d like to welcome two distinguished analysts from the BI industry: Dr. Claudia Imhoff, President and Founder of Intelligent Solutions, and Colin White, Founder of BI Research. Robert Eve, Executive Vice President of Marketing at Composite Software, will be our third speaker. Today’s topic is It’s Time to Think Outside the Box: Extend BI with Data Virtualization.

You’ll be learning about a framework for extending the scope and reach of BI, seven popular uses of data virtualization that support this framework, some real-world case studies, the business and IT value that you’ll derive from data virtualization, and a number of key considerations and next steps to follow to apply this thought leadership in your specific scenarios and situations.

First, let me introduce our panel of speakers. Let’s start with Dr. Claudia Imhoff. A thought leader, visionary and practitioner in the rapidly growing field of Business Intelligence, Dr. Imhoff is a popular and dynamic speaker on Business Intelligence and the infrastructure to support these initiatives. She is the co-creator of the Corporate Information Factory, also known as CIF, and its architecture. Dr. Imhoff has authored five highly regarded books on these subjects and writes monthly columns for technical and business magazines. Hello, Claudia. It’s great to have you here.
Claudia Imhoff
Thanks so much, Gary. I appreciate the introduction.
Gary Damiano
Next I’d like to introduce Colin White. Colin White is the Founder of BI Research. As an analyst, educator and writer, he is well-known for his in-depth knowledge of Business Intelligence, data management and information integration technologies, and how they can be used for supporting smart and agile decision-making. With 40 years of IT experience, he has consulted for dozens of companies throughout the world and is a frequent speaker at leading IT events. Thanks for joining us today, Colin.
Colin White
Thank you for inviting me. It’s my pleasure to be here.
Gary Damiano
And a few words about our third speaker, Robert Eve, Executive Vice President of Marketing at Composite Software. Bob Eve’s experience includes executive-level marketing and business development roles at leading enterprise software companies such as Informatica, Mercury Interactive, PeopleSoft and Oracle. Bob is a prolific writer who has authored dozens of white papers and blogs on BI, data integration and virtualization. Welcome, Bob.
Robert Eve
Good to be here.

Gary Damiano
I’m Gary Damiano, Vice President of Field Marketing for Composite Software, and I will be the moderator for today’s webcast. Our agenda and discussions today are based on a series of articles written by Claudia and Colin on extending the scope and reach of Business Intelligence. We will talk about the reasons for extending BI, the role of data virtualization, use cases, case studies and, of course, the benefits of adopting this framework. We’ll provide you with information at the end of this webcast to help you find these articles and additional resources that you’ll find very useful. Let’s start our discussion with Claudia.

Claudia Imhoff
Alright, well, thanks again, Gary, very much and my thanks to Composite as well for giving me and Colin this forum to speak today. One of the things that we’ve learned over the years is Business Intelligence is not just meant for a select few. What we’re finding is that it’s becoming far more pervasive, if you will, throughout the organization. Everyone makes a decision. Everyone in an organization makes a decision at some point during the day. Now what that means is that we’re going to have to work on our BI environment a little bit. We’re going to have to move it beyond the traditional data warehousing with data marts and so forth. And what we’re going to be talking about today is how we go about doing that.
So let me talk about the four ways that we can expand or extend the BI environment. The first one is by supporting a wider set of sources, of course. Obviously we have to include the traditional structured data. That’s what we know and love and have been using for years. But now we also need to bring in the unstructured data and the contextual types of information. Unstructured data – these are the comments, or the terms and conditions of a contract, or maybe it’s even some link to a blog or something. A lot of contextual information that gives us the color commentary, if you will, about why we’re getting the results. It’s one thing to get the analytic and to look at a number, it’s another thing to actually understand what that number means in terms of the context of the number. The second way is with these vastly new ways of deploying Business Intelligence. In the last five or ten years, we’ve opened up the deployment options tremendously, even in the last year or so. Software as a service for Business Intelligence is becoming a big deal now, being able to get your BI on demand. Using cloud technology certainly does reduce the cost. Then, of course, there’s open source Business Intelligence, another inexpensive way to get Business Intelligence up and running. And then there are the appliances, software only, hardware and software appliances, analytic databases. We’ve had a tremendous flurry of innovation in the appliance area as well.

Claudia Imhoff
On the next slide, let’s talk about the last two then. The next one is by expanding the actual uses of Business Intelligence, if you will. Most of us started out with tactical BI, doing simple reports or perhaps comparisons: week over week, month over month, quarter over quarter. And that worked out pretty well, but then we started to expand it to include longer-term or more information so that we could begin to look at it from a strategic standpoint. How have we done over seasonalities, for example? Or can we now begin to look at some kinds of predictive analytics based on years of information? If we continue down this path, here’s where we will be in six months, and so forth. We are using our Business Intelligence now to set strategies: what kinds of customers seem to be the best for us, what types of products seem to work better than others, where should we be placing a new store? All of that is strategic Business Intelligence. But we’re also now really seeing BI come into its own in operational BI, and that means expanding the business use into operations.
No longer are we only looking backward, at things that were historical in nature, or looking forward in terms of the predictive capabilities, but we’re now trying to analyze what’s going on right now. How do I understand what’s happening with my business internally and externally? How can I determine what action I should take right now, to either stop a bad trend or enhance a good trend? So, certainly, that opens it up to not only more information, but certainly more users as well, expanding that business use now into the business operations themselves. And that’s the fourth bullet there. We are reaching a much wider audience now. Colin and I have written a number of articles on information workers, and what we found is that information workers actually split themselves into three different groups. The first one has been a traditional audience for Business Intelligence: the information producers, the power users, if you will. Certainly these are the people that we, at first, built the BI environment for. They produce the analytics, they produce all kinds of models and reports and so forth, that are used by and large by the second group, the information consumers. Now here’s where we have to make BI much easier. They are not going to be creating the analytics or the models. They’re going to be using them. That means they have to have access to the analytic results themselves. And then that third group, the information collaborators, is probably the newest group in the information worker world. They’re the people that I just described, the color commentators, if you will. These are the people that explain why a result is happening, why an analytic is trending the way it is. They are the experts that understand both the internal and external events that are causing these changes in our analytics.

Claudia Imhoff
So these are the four ways that we can extend the reach of Business Intelligence. I’m going to turn it over now to Colin, and the next slide will start his piece on how we actually go about doing this. So, Colin, to you.
Colin White
Thanks, Claudia. As Claudia’s mentioned, there are four ways to extend BI. The four ways are shown on this slide. If we start over on the right-hand side, our main objective here is to extend the reach of BI to a wider user audience. Today we’re really just addressing a very small subset of the information workers Claudia was talking about. And so, among the things we can do there, we need to make technology easier to use, we need to make information easier to consume and, as this chart shows, collaboration plays a key role here. The other three approaches that Claudia was talking about really enable us to build new applications, which inevitably also lead to us being able to support a broader audience. So, for example, if we look at analyzing information at the top of the screen, we’re seeing new analytical processes come out, such as predictive analytics, content analytics and event analytics. These are enabling us to build a broader set of BI applications that address a wider set of business needs, which in turn, of course, addresses a wider user audience. Claudia mentioned extending the scope of deployment options. She talked about appliances and cloud computing and things like that. Again, that enables us to build BI applications faster and cheaper. They’re faster to deploy and, again, they’re enabling us to build BI applications that simply weren’t possible before.

Colin White
The last area, on the left-hand side, Claudia briefly touched on. This is very much the focus of today’s presentation: the ability to extend the reach of BI to a broader set of data sources. More information enables us to make more informed decisions, and I think we have some examples. Claudia mentioned text data, but there are also things like web data and event data from sensors. The technologies associated with this area are data syndication and data virtualization, and we’re going to focus on data virtualization today and talk about how virtualization enables us to extend BI. But the key point is that, as we extend our reach to these broader data sources, that in turn feeds downstream. It enables us to build new applications and to exploit new deployment options, which in turn enables us to reach a wider user audience.

Colin White
And so, if you go to the next slide and talk about why data virtualization, what we’re going to talk about in this session are seven use cases. But those seven use cases really break into two areas. The first area is complementing the existing data integration environment. Many people now use data integration tools that support data integration in the area of consolidation and propagation, and we use technologies like ETL or ELT in that environment. Data virtualization extends the existing data integration environment. The other thing you’re going to see with these use cases is that we are going to be able to support new applications and new business needs. Virtualization gives us access to live operational data. It enables us to actually combine analytical data sources in data warehouses with data in operational systems. It gives us access to new data types and new data services. And the whole objective of data virtualization is to build a unified data access layer that sits in front of these heterogeneous sources.

Colin White
So, as I said, if you go to the next chart, these are the seven case studies we’re going to address in this seminar. We’re going to go through them fairly quickly. I will cover some, Claudia will cover some and Bob will actually pop up at odd occasions to hopefully talk about some really large user case studies. And that’s the whole point. When we wrote these articles about data virtualization and came out with these seven use cases, these were based on our experience. They’re based on actually seeing live applications running using the data virtualization technologies.

Colin White
So let me start off then. Let’s look at the next slide, which is our first use case. If we think of the enterprise data warehouse today, as Claudia mentioned, it’s very much focused on strategic and tactical decision making, but increasingly the big growth area in BI is operational BI, where we’re using BI to make daily and intra-day decisions. One advantage of data virtualization here is that we can not only look at the historical information in a data warehouse, but we can also look at what’s happening now. The virtual environment gives us access to up-to-the-minute operational data that is external to the data warehouse, and it enables us to look at detailed data. So you find call centers where, when they’re dealing with customers, they can not only look at historical information in the data warehouse, but data virtualization also enables them to look at current data, and it presents the combined current and historical information through a single view to business users and business applications. And I think, talking to Bob now, I believe, Bob, you have a case study here about an energy company that’s doing exactly that.
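The single view over historical and current data described here can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular data virtualization product works: two SQLite databases stand in for the data warehouse and the operational system, and the table names (`order_history`, `open_orders`), columns and sample rows are all hypothetical.

```python
import sqlite3

def open_sources():
    """Open two independent stores: 'main' plays the warehouse, 'ops' the live system."""
    conn = sqlite3.connect(":memory:")                   # stand-in for the data warehouse
    conn.execute("ATTACH DATABASE ':memory:' AS ops")    # stand-in for the operational system
    conn.execute("CREATE TABLE main.order_history (customer TEXT, order_day TEXT, amount REAL)")
    conn.execute("CREATE TABLE ops.open_orders (customer TEXT, order_day TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO main.order_history VALUES (?, ?, ?)",
        [("acme", "2010-05-01", 120.0), ("acme", "2010-05-14", 80.0)],
    )
    conn.executemany(
        "INSERT INTO ops.open_orders VALUES (?, ?, ?)",
        [("acme", "2010-06-02", 45.0)],
    )
    return conn

def customer_view(conn, customer):
    """Return history and live rows for one customer as a single, date-ordered result."""
    sql = """
        SELECT 'history' AS origin, order_day, amount
          FROM main.order_history WHERE customer = ?
        UNION ALL
        SELECT 'live' AS origin, order_day, amount
          FROM ops.open_orders WHERE customer = ?
        ORDER BY order_day
    """
    return conn.execute(sql, (customer, customer)).fetchall()

if __name__ == "__main__":
    for row in customer_view(open_sources(), "acme"):
        print(row)
```

The caller of `customer_view` sees one result set and does not need to know that the rows come from two different stores, which is the point being made about call center users.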

Bob Eve
Yes, thank you, Colin. I think Aera Energy, which is the California operations of Exxon and Shell, if you haven’t heard of Aera, provides a great example of combining historical and up-to-the-minute data using data virtualization. Their particular use case is around maintenance management of wells, kind of a timely topic. In their situation they have a large Oracle database, and they run about 1,600 ETLs a night capturing surface, sub-surface and other business information about the activities of their 10,000-plus wells. But if something happens during the middle of the day and they need to deploy maintenance crews, equipment and that sort of thing to that site to get that well up and running again, they really need a real-time snapshot of where that maintenance equipment is and where those crews are currently working, so they can figure out from an operational BI point of view where to deploy those resources.
So, they’ve used the data virtualization technology from Composite to do that and they’ve achieved pretty significant business benefits. Primarily, they’re able to keep their wells up and running more, which keeps the revenue flowing. So I think that’s a good case. There are a lot of other good cases in and around data warehouses, and now we’ll kind of move back towards Claudia. As Gary mentioned, Claudia is the co-creator of the Corporate Information Factory, which is pretty much a seminal work in our industry. It recognizes environments where there’s a mix of multiple data warehouses, and I think that creates an opportunity to federate different consolidated sources using data virtualization. So, with that, I’ll send it back to you, Claudia.

Claudia Imhoff
Alright, well thanks so much. You know, the Corporate Information Factory is certainly a well-known and proven architecture for Business Intelligence. It deals with creating a data warehouse and then, from that, spawning data marts. Now what happens, though, is that a lot of companies have somewhat, let’s say, chaotic implementation paths, or they many times buy an existing company that has an existing Corporate Information Factory or bus architecture, whatever, and it has a data warehouse in place as well. So in this situation we have two or more data warehouses in an organization. Now, there are two ways that we can organize these data warehouses. Obviously, the first one is we can tear down one of them and add its data to the other data warehouse. It seems like you’re throwing out the baby with the bathwater there a little bit. There was a lot of work that went into creating that data warehouse, its design and so forth, and it seems a shame if we have to destroy it. Another alternative would be to use data federation technology with the two existing data warehouses and simply federate them together to create a combined view of the information. So now Line of Business A and Line of Business B can certainly federate their warehouses together, and we can have a combined set of data without going through the pain and anguish of tearing one down, or both down, and rebuilding a single one. So, with that, let me turn it back over to you, Bob, because I think you have an example of this very situation.
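As a rough sketch of what federating two existing warehouses means in practice: each warehouse keeps its own structure, and a thin mapping step presents both in one common shape at query time. Everything below is an assumption made for illustration: the two row lists stand in for results returned by each warehouse, and the field names and units (`revenue_usd`, `revenue_cents`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows as each line of business's warehouse might return them.
WAREHOUSE_A = [
    {"cust_id": "C1", "revenue_usd": 1000.0},
    {"cust_id": "C2", "revenue_usd": 250.0},
]
WAREHOUSE_B = [
    {"customer": "C1", "revenue_cents": 50000},
    {"customer": "C3", "revenue_cents": 12500},
]

def from_a(row):
    """Map a Warehouse A row to the common shape (already in dollars)."""
    return {"customer_id": row["cust_id"], "revenue": row["revenue_usd"], "source": "A"}

def from_b(row):
    """Map a Warehouse B row to the common shape, converting cents to dollars."""
    return {"customer_id": row["customer"], "revenue": row["revenue_cents"] / 100, "source": "B"}

def federated_rows():
    """Yield rows from both warehouses in the common shape; nothing is copied or rebuilt."""
    for row in WAREHOUSE_A:
        yield from_a(row)
    for row in WAREHOUSE_B:
        yield from_b(row)

def revenue_by_customer():
    """Combined view: total revenue per customer across both warehouses."""
    totals = defaultdict(float)
    for row in federated_rows():
        totals[row["customer_id"]] += row["revenue"]
    return dict(totals)

if __name__ == "__main__":
    print(revenue_by_customer())
```

Neither source is torn down or reloaded; the only new artifact is the pair of mapping functions, which is where naming and unit differences between the two warehouses get reconciled.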

Bob Eve
Yeah, absolutely, and I think mergers and acquisitions is something everyone can relate to. In this particular case, this is the recent acquisition and combination of Pfizer and Wyeth. They both are large pharmaceutical companies with significant R&D efforts. A lot of the information about their R&D efforts, the compounds, the drugs in discovery, the research teams, is held in different data warehouses on each side. What they were able to do upon merger was use our product and the data virtualization approach to combine the data virtually across those two data warehouses, which enabled them to rationalize the engineering and R&D efforts in both companies very quickly. And by very quickly, I mean less than a month. That’s pretty incredible when you consider 130 or so drugs in process, or more, in each of those companies, and large research teams distributed around the world. So, pretty successful, and that’s about nine months faster than they did when they had an earlier acquisition with Pharmacia, so quite a proven approach that worked very well in their situation. But, besides combining data warehouses, there’s also an element in Claudia’s Information Factory about dependent data marts, and I think data virtualization and its use in working with dependent data marts is also a key point that we should bring up. So, Claudia, let me turn it back to you so you can address that for the audience.

Claudia Imhoff
Well, indeed it is. As I mentioned with the Corporate Information Factory, or, in fact, anybody’s architecture, what we like to create are these analytic data marts, these specialized areas where we do specific types of analytics. Now, in the Corporate Information Factory, a lot of people today mistakenly assume that these data marts must be physical data marts. Not so. In fact, one of the beauties of having a data warehouse in place, as you see in this slide, is that, yeah, if we have to, we can spin off a physical data mart, but we don’t have to. We can actually create a virtual data mart, a view, if you will, into the enterprise data warehouse and be able to create this analytic environment without actually moving the data anywhere. As you can imagine, these data marts are somewhat fluid. When we create a data mart and we start to have our business users using it, the first thing that’s going to happen is they’re going to request some changes to that data mart.
Now, it’s one thing to have a physical data mart and to tear it down and rebuild it every time someone says, gee, I want something a little bit different, or I’m expanding the use of this mart, I need new data and new whatever, new reports from it, and so forth. It’s another thing if we’ve got some kind of virtual data mart and all we’re tearing down is that virtual connection and redoing it. Obviously, that’s a lot less stress, it’s certainly cheaper and it works pretty darn well. Now, it does mean, and you’ll see this later on, it does mean that our data warehouse now is going to have some pretty good horsepower behind it if I’m going to have a lot of virtual marts that are actually tapping into the enterprise data warehouse rather than being physically separated from the data warehouse. Now with that, Bob, I’d like to turn it back over to you. You have a customer example on this one. In this case, it’s Putnam.
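To make the idea of a virtual data mart as "a view into the enterprise data warehouse" concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in for the warehouse. The `sales` table, the `sales_mart` view name and the sample rows are hypothetical; a real warehouse would use its own database's view mechanism.

```python
import sqlite3

def make_warehouse():
    """Build a tiny stand-in for the enterprise data warehouse (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("east", "widget", 10.0), ("east", "gadget", 5.0), ("west", "widget", 7.0)],
    )
    conn.commit()
    return conn

def define_mart(conn, select_sql):
    """(Re)define the virtual mart: only the view definition changes, no data is copied."""
    conn.execute("DROP VIEW IF EXISTS sales_mart")
    conn.execute("CREATE VIEW sales_mart AS " + select_sql)

def read_mart(conn):
    """Query the mart exactly as a reporting tool would query a physical table."""
    return conn.execute("SELECT * FROM sales_mart ORDER BY 1").fetchall()

if __name__ == "__main__":
    wh = make_warehouse()
    # First version of the mart: totals by region.
    define_mart(wh, "SELECT region, SUM(amount) AS total FROM sales GROUP BY region")
    print(read_mart(wh))
    # A change request arrives: the users now want totals by product instead.
    define_mart(wh, "SELECT product, SUM(amount) AS total FROM sales GROUP BY product")
    print(read_mart(wh))
```

Responding to the change request is a matter of replacing one view definition. The trade-off mentioned in the talk also shows up here: every read of `sales_mart` runs against the warehouse table itself, so the warehouse carries the query load.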

Bob Eve
Yes, absolutely, and they created a really excellent financial data warehouse with a lot of the information that the mutual fund managers and the financial analysts use in order to do their portfolio management and make stock purchase decisions. And in their case, they have 100 or so analysts and a number of analytic data marts that they need to create and modify as the market changes, etc. What they found was, by using a virtual approach, they were able to more quickly append and change and add new virtual data marts to react to market opportunities and, not only did that help them get their data faster and do their analysis faster and make their financial decisions faster and better, they were also able to save a significant amount of money because, you’ve got to remember that every time you make a physical data mart, you now have the operating costs to maintain it, the actual hardware, etc., and so they were able to save both IT infrastructure costs, as well as provide better data. So, it was a good combination.
Gary Damiano
Claudia, why don’t we have you talk a little bit about the operational data store and how that fits in.

Claudia Imhoff
You bet. You know, a much maligned operational data store, not a lot of people understood it when it first started. So let me see if I can explain what it is first off. What we found is that in organizations that have multiple operational systems, there was still a need to have, sort of more current information integrated together. Now, at first, we built a physical operational data store. We used our ETL processing and so forth. And we tried to make it run as fast as we possibly could, trickle feeding in information, whatever we could do to create this operational data store, this environment, not a data warehouse, but an operational data store, data that is more current, that is updated, as opposed to being appended, as the data warehouse is. It doesn’t have the history that the data warehouse has, of course. So it is called an operational data store.
Now, what we found is that even doing a physical one, as fast as we possibly can, there was still a fair amount of latency in the data. We had to move it again from the operational environment through our ETL processing and pop it into our database that we were using for the operational data store. Now, along comes data federation and what we discovered is that if our sources are fairly well behaved, that is, they are fairly well integrated and the quality is fairly good on them, we can actually create a virtual operational data store. We can bring the data together through our data federation capabilities, rather than physically moving it from the operational environment into its own database. We can create this virtual operational data store that makes it look as though it were in its own database, but, in fact, it’s not. So what we’re seeing today are a lot of companies actually creating virtual operational data stores. And I like that. I think that’s a good move as well. Now let me talk about the last one and that’s on the next slide.

Claudia Imhoff
And that’s constructing a data warehouse prototype. The prototype is something that is actually quite useful. In data warehousing, as we all know, we have to understand what the data is, we’ve got to go back to the business users, we’ve got to understand their business needs, and that’s not always easy. However, if we can build a virtual one, a prototype, if you will, of what we think we want, and show it to our business users, then they can say, yes, no, add this, take this away, I want to be able to do this, and so forth. That prototype really does enhance our ability to understand the business requirements before we build the physical thing. Now, please, do not get me wrong. The data warehouse that I’m talking about, this prototype, is just a prototype. We throw it away at the end. It is not going to become a virtual data warehouse. There’s no such thing. That prototype, though, helps us understand our business requirements much better, and helps us even understand the quality of the data that we’re bringing together, because we can see it. We can see the problem. We can see what needs to be done in our ETL processing, and then, in phase 2, we build the actual physical data warehouse with all that intelligence that we’ve garnered from our prototype. So, again, let me stress, the prototype goes away, and the physical entity becomes the data warehouse itself. Now, with that, let me turn it back over to Colin for another couple of case studies.

Colin White
Thanks, Claudia. I think prototyping is a pretty interesting use case for virtualization and I think, as we move through these seven use cases, we’ve started to move away from the more traditional warehousing environment and to expand beyond the enterprise data warehouse. The important thing to talk about here is that data virtualization can be used to combine warehousing data and operational data. It can be used to bring data together from various data warehouses. It seems in life there’s never one of anything if there can be more than one. And again, it can bring together multiple operational sources. I think the other very good use of virtualization is to combine structured data, which has been very much the focus of our core business systems and of analyzing those core business systems, with new types of data. Now, only about 20% of data today is available for analytical purposes, and I think there is increasing interest in bringing other types of data into the mix. Obviously, I think web data and social media data are good examples.
There’s a lot of useful textual information there that can give us valuable information about how people view our products, for example, the quality of those products and so on. But in addition to web data, I think we are now starting to see event stores being built of call data records and things like that. They again can be brought into the virtualization environment, as can XML feeds, data syndication feeds, and so on. So I think we want to start supplementing our decision-making systems with this non-structured or semi-structured data. And in the same way that we create views of relational data in our operational systems and our warehousing systems, I think we also want to create these virtual views and bring these other data sources in as well. The big advantage of that is that it eases development, because the developer has a single view of this disparate data. It helps the end user as well when they’re using query tools to access this data. The other advantage is that, if you’ve got this layer in front of all these data sources, we can actually move data between the data sources, and even applications, without affecting these front-end applications.
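The last point, that a layer in front of the data sources lets data move without affecting front-end applications, can be illustrated with a small sketch. The `DataAccessLayer` class, the provider functions and the sample customers are all hypothetical and exist only to show the indirection; they do not represent any vendor's API.

```python
class DataAccessLayer:
    """Hypothetical unified access layer: callers ask for a dataset by logical name."""

    def __init__(self):
        self._providers = {}

    def register(self, name, provider):
        """Bind a logical dataset name to a zero-argument callable returning rows (dicts)."""
        self._providers[name] = provider

    def query(self, name, **filters):
        """Return rows of the named dataset whose fields equal every given filter value."""
        rows = self._providers[name]()
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]

def legacy_crm_customers():
    """Stand-in for the original back end that holds customer data."""
    return [
        {"id": 1, "name": "Acme", "tier": "gold"},
        {"id": 2, "name": "Globex", "tier": "silver"},
    ]

def relocated_customers():
    """Stand-in for a different back end after the data has been moved."""
    return [
        {"id": 1, "name": "Acme", "tier": "gold"},
        {"id": 2, "name": "Globex", "tier": "silver"},
        {"id": 3, "name": "Initech", "tier": "gold"},
    ]

def gold_customer_names(layer):
    """Front-end code: knows only the logical name 'customers', not where the data lives."""
    return sorted(r["name"] for r in layer.query("customers", tier="gold"))

if __name__ == "__main__":
    layer = DataAccessLayer()
    layer.register("customers", legacy_crm_customers)
    print(gold_customer_names(layer))
    # The data moves to another source; only the registration changes.
    layer.register("customers", relocated_customers)
    print(gold_customer_names(layer))
```

`gold_customer_names` is never edited when the source changes, which is the decoupling being described. A production layer would also have to handle query pushdown, security and caching, none of which this sketch attempts.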

And that really brings me on to use case seven, which is that not only are we using virtualization to access data, I think now we’re bringing in the capability to access operational applications as well. With the move toward service-oriented architectures and more simple web-oriented architectures, again, these provide information and data that we can use for decision-making, and secondly, virtualization provides the ability, even in the operational world, to access this variety of data sources. I think we’re heading towards producing a unified data access services layer that provides access to a variety of back ends and provides access to a variety of applications on the front end. Another thing about working on this webinar with Claudia and Bob is that Claudia and I can talk about these seven use cases based on our experience, but Bob has some very good use cases as well. And I believe, Bob, Wells Fargo really is a company that shows this seventh use case at work and the value it can bring.

Bob Eve
That’s right, Colin, and it’s a pretty innovative approach that they’ve adopted there in trying to really extend this idea of a unified access layer and we’re seeing that more and more in our user base. And what they’ve done, really, is leverage a variety of sources that they can take advantage of and serve that up to a wider variety of users and the numbers are pretty impressive. It’s over 200 sources now that have been brought into this layer and that’s currently enabling over 25 different applications and it’s quite extensive adoption. And that’s provided excellent benefits to them because now that they have this sort of common approach, whenever there’s a new need, a new requirement, a new application, a new opportunity in the market, they’re able to quickly pull together the data that’s required, by 50% faster than they would have in the past. And, in addition, they’re also able to gain some reuse, so as they start to build certain objects that virtualize and make the data more available to maybe application 1 or application 2, all of a sudden when application 26 comes along, they need a very similar view of the same data and they’re able to really support that quite rapidly. Really, some excellent success in this model.

With that, why don’t we shift gears a little bit now that we’ve looked at the seven different models, and do a polling question where we will see what the adoption is amongst the people on the call today. Gary, do you want to go with that?
Gary Damiano
That’s a great idea. We’ve covered a lot of material and this is the perfect point to sort of get our audience to weigh in on some of the things we’ve been talking about with a polling question.

Okay, so it seems actually like a pretty broad reach of benefits that our audience feels that they can realize from these different use cases that are out there. Claudia, Colin and Bob, what do you think of these results?
Claudia Imhoff
Well, you know, it doesn’t surprise me: extending the reach, (a), (c) and (e), I guess, were all pretty much the same amount and pretty high, 39, 39, 40%. It doesn’t surprise me that those selections were the ones that got the most votes. They want to reach more business users, they want to reach a wider set of sources and they certainly want greater agility, and that doesn’t surprise me too much at all. Not listed above only got 3%, so I guess that’s a good thing.
Colin White
It looks like we got most of the seven, right? It looks like our case studies addressed most of theirs, which is good.
Gary Damiano
Okay, great. Let’s start to talk about some key take-aways.

Claudia Imhoff
Yep, I’d like to talk about these. We have a number of questions and comments and, hopefully, we’ll have time for those. It looks like we’re going to have a fair bit of time. But let me talk about the first one: the problem that you’re trying to solve, picking the right data integration tool for the job. A number of the questions were, wow, if we want to federate our data, what about data quality, what about integration and all that kind of stuff. And the answer is, you bet. When you’re combining different sets of data, whether it’s from multiple warehouses or operational sources or even bringing in unstructured data, I doubt seriously that these sources were built using an overarching standard model. People buy systems off the shelf, they create their own, it’s kind of a chaotic mess. There are mostly inconsistent formats and names and data quality processes in place, calculations even. So, to make the virtual connections, if that’s the route that you’re going, the disparities have to be easily transformed during the federation process. If they’re not, then obviously you haven’t picked the right integration tool. You may have to fall back to ETL and invoke all of the physical data quality processing that you have. The second one is enhancing your existing data model. You can use discovery tools, you can build a data abstraction layer and so forth.
So what you’re looking at there is being able to see something to begin with, enhancing your existing data model. Another consideration is optimizing performance, again choosing the most capable vendor. There were a number of comments about if I create a virtual data mart, for example, doesn’t that mean that my performance is going to be kind of bad? Well, it depends. If you choose a data warehouse database structure that is a real workhorse, and there are those out there now that can handle that kind of work, then you’ve got no problem. If you choose a kind of wimpy database then, yeah, you’re going to have some performance issues. So pick the most capable vendor. Make sure that you match data federation with a database vendor that can handle that kind of horsepower. And then the last one is the data security itself. We are opening up the world to a lot of new users, new uses, new data sources, and we’ve got to make sure that the data is as secure as it possibly can be. And it’s not just security. Keep in mind that it’s also the privacy of the data itself. If you’re opening up your customer data, for example, then you better be darn sure that everybody understands what the privacy issues are surrounding those customers and what the policies are of the organization, and so forth. Now let me go to the last slide here on key considerations.

Claudia Imhoff
These are the IT benefits and, Colin, I think this is one that maybe you wanted to talk about a little bit?
Colin White
Sure, I can talk about that. I think I’m going to look at really the IT benefits and then the business benefits. I think I mentioned before that a lot of these new technologies give us greater agility. When I get on to the business benefits, and when we look at some of the questions that are being asked, the point is that we’re not trying to replace enterprise data warehousing here. I think when data virtualization came out people often talked about virtual enterprise data warehousing. That’s not what we’re talking about here. What we’re talking about is extending the existing data warehouse with new solutions and I think we’re going to see a whole set of solutions being built around the enterprise data warehouse that extend it. Those solutions using virtualization enable us to actually build applications faster. We don’t necessarily always have to bring the data into the warehouse before we can analyze it. And I think it also enables us to be agile, in that we can adapt very quickly to changes in our source systems. Virtualization, I think, also lowers cost in the sense that, as I said, we don’t have to design everything and bring it to the warehouse first. So we need fewer staff and less hardware and I think, of course, that reduces risk because we’re not replicating all the data in the analytical environment. And that means we can actually access the data where it exists, and that improves governance and simply reduces the effort from an IT viewpoint to make data available to business users.

Colin White
And that really brings me to the next slide which shows the business benefits of data virtualization. If we go back to the four areas that we covered, the four approaches in my diagram, if we look at those, data virtualization enables us to reach a wider set of sources. Today only 20% of data is available for analytical processing. If we can make more data available and it’s good data, we can make better business decisions. Then it also enables us to build new types of applications. We’re saying this management system, we’re starting to look at applications that analyze how customers feel about our products and the quality of our products using social media data. And so I think we’re just scraping the surface at the moment for Business Intelligence in terms of the number of applications that we can address, and I think that really is going to extend the use of BI throughout the organization and give these applications much more scope and reach. I think there’s also a number of deployment options. As I think I mentioned as we went through this, there’s never one of anything. So we talk about an enterprise data warehouse, but in reality, most companies have multiple data warehouses.
So we’re not talking about replacing data warehouses. They cover our core business systems. What we’re talking about though is extending the data warehouse to add additional applications. And in reaching a wider user audience, it means we don’t always have to bring data into the enterprise warehouse before we can analyze it. Sometimes it’s not cost effective to do that. Sometimes we can’t do that in a timely manner. And, therefore, virtualization enables us to access very quickly source data without having to integrate it. And those applications, those analytical applications, built on that data can be extended and used in conjunction with the warehouse again using virtualization technology. So there are many advantages here and these pieces fit together, but the key thing is, and that is the question, are we talking about virtual data warehousing? No, we’re not. We’re talking about extending the existing traditional data warehouse environment, with new solutions, and that brings significant business benefit. So, I’ve finished what I have to say and I’m going to hand it over to Bob, who’s going to talk a little bit about next steps and then I think we should get into the Q&A session. So, over to you, Bob.

Bob Eve
Great. Thank you. We’re going to go to the next slide and then I think we’re going to try to resurrect that poll as well. So, thank you, Colin and Claudia. That’s really some excellent insights into different ways we can extend BI with data virtualization. And it’s a real opportunity, I think, out there today and it’ll drive significant business and IT benefits as we discussed and as we covered in the first poll. In a moment we’re going to have a second poll to talk about some of the specific areas and drill into those areas again kind of going back to the seven patterns that we adopted, and I’ll turn that over to Gary to do, but the slide here is really like what would be some of the next steps, what are some of the resources that are available to the listeners today so that you can move along in this journey?
There are some articles and we have a URL to the series of articles that Colin and Claudia wrote in BI network around extending BI. Also, if you go to our website, there’s some resources there, white papers and the like. You can also reach out to the analysts directly. The URL there is on our website. That’s some of the analysts in our space, including Colin and Claudia, and what they say about this area, and you can reach out to us directly. Finally, we also provide a tool, which Claudia actually helped us build, on when you virtualize and when you consolidate the data physically. What are some of the criteria that you can apply on a project-by-project basis? A lot of our customers have found this to be quite useful to them in making some of those decisions from a technical point of view and then combining it with the different use cases and kind of matching up to these models that we’ve covered today. And between the two they are able to make a really good decision on when to virtualize and when not. So, Gary, you want to bring up that poll and talk about some of the adoption? And then we can close with some Q&A.

Gary Damiano
Okay, great. Thanks, Bob. We talked about benefits in the previous poll, but let’s go ahead and go back a little bit and talk about adoption. Our speakers talked about a number of different use cases and in this poll what we’d like you to do again is select all that apply to your situation. With the first one, complementing an EDW with operational data, as Bob talked about with the energy company example with their well information. Combining data from multiple data warehouses. That was with the pharmaceutical company, Pfizer-Wyeth. Creating a virtual data mart from a data warehouse as in Putnam’s case in the investment community. Implementing a virtual operational data store. Constructing a data warehouse prototype as Claudia mentioned, to help optimize the design process and really understand business requirements. Combining structured data with new types of data, immediate data, web data, XML feeds, event stores. Creating unified data access services, either SOA or even web-oriented architecture, as in the case of Wells Fargo. And we didn’t mention it, but is your need getting into the area of BI and cloud computing? Or the perennial none of the above. So go ahead and vote and let’s see what your feedback is.
Colin White
I think we already have some results, right? Fantastic.

Colin White
It’s an even spread there. Again, I like the fact that only 2% of the people said none. Of course, we added the BI cloud computing which is interesting because I find significant interest in cloud. But a pretty good even spread across the use cases, I think. So, do you have any comment?
Claudia Imhoff
No, I think you’ve covered it pretty well. I am actually surprised that it is as evenly spread across the various ones. Interesting that the creating a unified data access area got as high as implementing a virtual operational data store. I thought that was kind of an interesting thing. I would not have expected that, but that’s good to see.
Colin White
Well, I think that comes back to our point that we’re now seeing a whole bunch of different federated information technologies that complement each other, and what people are looking for are unified layers using these technologies to have a common view of data. It just makes development that much easier, provides a lot of flexibility, I think. Bob, any comments?
Bob Eve
No, this is Bob Eve. I’m actually quite pleased to see such a wide adoption and it’s pretty consistent, I think, with what our customers are doing. They oftentimes start in one area and then they adopt and extend and go to other areas. So it shows there’s a lot of places to start and a lot of places to go.

Gary Damiano
Okay, great. That’s great feedback from our audience and from our speakers. I know many of our audience must have questions that they’d like to ask at this time, so we’d like to start our Q&A. So why don’t we start off with our first question. What is the performance impact of data virtualization software on source systems?
Bob Eve
I’ll take that. This is Bob. I think it all depends on the types of queries. Are they wide shallow queries, do you have good resources available at the source systems? If so, you’re going to be fine. It will work quite well and quite fast. Now if the system is really an impacted system to begin with, you really probably don’t want to be adding extra query loads. But in that case you have a couple of options. One, if it’s a highly impacted system, that’s one of the major uses of operational data stores, which offload a lot of the transactional information onto a data store that can be queried. So you certainly have that option. And as well, there’s a number of our customers who will use a caching approach to frequently pull, in a more controlled fashion, a subset of the data, and then the data virtualization queries the cache as opposed to querying the original source. So, some real flexibility there. Most of the time it’s not a problem, but if it is an issue, there are several really good work-arounds.
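To illustrate the caching approach described here, the following is a minimal Python sketch, not Composite Software’s implementation. The class name `CachedView`, the `fetch_subset` callback and the time-to-live refresh policy are all assumptions made for illustration. The idea it shows is simply that queries are answered from a locally held subset, and the source system is touched at most once per refresh interval.

```python
import time
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


class CachedView:
    """Hypothetical cache in front of a busy source system.

    Queries are served from a local copy of a source subset; the source
    is re-read only when the copy is older than ttl_seconds.
    """

    def __init__(
        self,
        fetch_subset: Callable[[], List[Row]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_subset = fetch_subset  # the only code that touches the source
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows: List[Row] = []
        self._loaded_at: Optional[float] = None
        self.source_hits = 0  # how many times the source was actually queried

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._rows = self._fetch_subset()
            self._loaded_at = now
            self.source_hits += 1

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        """Answer a query from the cache, refreshing it first if it is stale."""
        self._refresh_if_stale()
        return [row for row in self._rows if predicate(row)]
```

In this sketch, any number of `query` calls within one interval cost the source system a single read. The trade-off is freshness: results can be up to `ttl_seconds` old.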
Claudia Imhoff
Let me add just a little bit to that as well. I think one of the things that is necessary is the ability to monitor the performance of these operational systems. It may be difficult to understand who’s being impacted or what’s being impacted without some kind of a fairly sophisticated monitoring capability that allows you to see that, yeah, we got a problem here. We need to do something about it.
Colin White
Yeah, and I think the other key thing here is the actual quality of the virtualization technology. We’re moving a lot of data around the network, and how much load we put on the source systems is dependent on how sophisticated the algorithms are in the data virtualization layer. Some virtualization products are very crude in their approach; others are very sophisticated. So I think the sophistication of the technology used in these virtualization products is very key in this area as well.
Gary Damiano
We have another question. The concept of a virtual data store, whether it replaces a data warehouse or a data mart, is intriguing. Do you envision physical data stores to go away at some point in the future if a company uses a virtualized data tool?
Claudia Imhoff
Well, I saw that one earlier and I think what we have to understand is the data warehouse is a series of historical snapshots. Most of our operational data stores don’t maintain that kind of history. We’re talking anywhere from two years to 20 years of historical information that we may store in our data warehouse or business intelligence area. Will it ever go away? You never say never, right, because who knew that we could combine our operational BI with strategic and tactical BI? A few years ago I would have said that was not possible. So the technology is constantly making changes. But I think in the foreseeable future I don’t see the operational systems maintaining massive amounts of historical data for the sole purpose of creating a virtual data warehouse.
Bob Eve
I’d like to add to that too. I think what we’re talking about in this case is that before, it was almost a given that if I had complex data that I needed to combine, I’d have to do that physically, and building marts and stores just became the standard approach. And I think what Claudia’s saying is, yes, if you’re doing historical consolidations and roll-ups and you’re going to re-dimension the data for analysis purposes, certainly the physical approach is the only way to go. But in the cases where you’ve just sort of gone to a physical dependent data mart just because that’s what you always did, I don’t think that has to be the answer anymore. You examine the virtual opportunity, and that’s why we provide that decision tool, to help you kind of clarify some of the factors, and then make that decision based on what’s best for that particular project, not just based on, hey, we’ve always done it that way.
Claudia Imhoff
Exactly. I think that’s an excellent point, Bob. We sort of get stuck in our ways and don’t realize that there are new techniques and new technologies that can help us do things that we couldn’t do 10 years ago.
Colin White
Well, I think the other thing is these virtualization layers can act as filters in the sense that they’re not necessarily alternatives, they can actually sit in front of systems and act to filter the data before it’s actually loaded to the data warehouse. There was a question I noticed about data sources and can they handle that? I think we can also use this as filtering mechanisms.
Gary Damiano
Okay. Let’s move to our next question. Does Wells Fargo have a data governance organization in place to assist with the data virtualization layer?
Bob Eve
Yeah, I’ll take that. Absolutely. I think this really implies something more than just data governance. It’s really about some of the people and process activities that occur in and around the warehouse. Certainly data governance is one aspect, and especially if we’re going to share data across multiple applications, some common modeling techniques and some common definitions need to be brought into place so the data is consistent across different uses. And also they apply some of what would be called data quality techniques: some validation, some standardization, some parsing, some enrichment activities occur on the fly in their model. And that needs to be managed in a governed approach, along with the data quality strategy of the company.
Gary Damiano
Alright, let’s take another question. Data federation would eventually lead to federated MDM. Do current MDM tools support federated MDM or federated governance?
Bob Eve
I’ll start on that one. I think the idea here is that you certainly want to do some work to get control of your master data, particularly around your customers, your products and that sort of thing. And there’s a number of MDM techniques to really get that master data together and improve the quality of that data, whether you store that in a hub or in some sort of master model using various techniques. I think data virtualization complements that. We’re often used to help source the data for the MDM hub. We’re often used to extend the hub, to get more of a 360 view around the data that’s not in the hub. So, for example, you may have some master data about a customer, and you may know the lifetime-to-date purchases from that customer, but you might virtualize in the specific orders that the customer has, along with a specific complaint from the help desk, etc., because you may not be maintaining that in the master data hub.
Colin White
Also, people sometimes start off with virtualized MDM and then start to make it physically consistent. But the other thing is that where there’s shared master data, there might also be some master data that is unique to a particular store or application. It’s virtualization that enables you to connect that unique master data to shared master data. So, again, I think what you said is true, Bob: we’re not replacing consolidated systems, we’re supplementing them.
Claudia Imhoff
Yeah, and I think that’s… I agree with you. I was going to say that to me the master data store is somewhat similar to a data warehouse store. We can bring all kinds of data together. In fact, some companies are actually combining their master data store with the results out of their data warehouses. So, for example, there’s customer segmentation or customer lifetime value or next best product or something like that that they can federate with their master data itself, not physically move it into the hub, but actually just federate the two together.
Colin White
But I think the other key message here is that, from an application user viewpoint, you shouldn’t have to worry if the data is consolidated or distributed. The whole point is to provide a single view of the data you need to support the business application. Whether you consolidate it or virtualize it is purely a factor of performance, it’s a factor of data quality, it’s a factor of timeliness, it’s a factor of flexibility, it’s a factor of whether you’ve got time to consolidate it. The advantage of these common layers is you can move from consolidation to virtual layers, backwards and forwards, to suit your business needs without having to affect your application. So virtualization should be transparent.
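The transparency point can be sketched in a few lines of Python. This is an illustrative assumption, not any vendor’s API: `CustomerView`, `ConsolidatedCustomerView`, `VirtualCustomerView` and `describe` are hypothetical names, and plain dictionaries stand in for a master data hub, an order system and a physically consolidated store. The application function depends only on the common interface, so the data behind it can be consolidated or combined at query time without the application changing.

```python
from typing import Dict, List, Protocol

Customer = Dict[str, object]


class CustomerView(Protocol):
    """The common layer: all the application is allowed to see."""

    def get_customer(self, customer_id: int) -> Customer: ...


class ConsolidatedCustomerView:
    """Reads from a physically consolidated copy (warehouse-style)."""

    def __init__(self, consolidated: Dict[int, Customer]) -> None:
        self._consolidated = consolidated

    def get_customer(self, customer_id: int) -> Customer:
        return dict(self._consolidated[customer_id])


class VirtualCustomerView:
    """Combines master data and order data at query time (federation-style)."""

    def __init__(
        self, master: Dict[int, Customer], orders: Dict[int, List[str]]
    ) -> None:
        self._master = master
        self._orders = orders

    def get_customer(self, customer_id: int) -> Customer:
        record = dict(self._master[customer_id])
        record["orders"] = list(self._orders.get(customer_id, []))
        return record


def describe(view: CustomerView, customer_id: int) -> str:
    """Application code: unaware of where or how the data is stored."""
    customer = view.get_customer(customer_id)
    return f"{customer['name']}: {len(customer['orders'])} orders"
```

Swapping one view for the other is a deployment decision; `describe` is untouched either way, which is the sense in which the layer is transparent to the application.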
Gary Damiano
Colin, that’s an excellent summary. I’m looking at the time and we’re actually running a few minutes over, so I think that’s all the time that we have for our webcast today. I’d like to thank all of our attendees for the questions. You’ve asked us a lot more than we were able to address, and we will go through them and respond to them. If you have additional questions and would like to speak to any of our speakers after the webcast, please feel free to contact them using the information that we’ll provide in the post-event follow-up email that we’ll be sending out. We have more Thought Leadership Webcasts coming your way. There’s one in early August, another in September. Please go to www.compositesw.com to register for future events. You can also follow us on Twitter. Our handle is CompositeSW. This concludes today’s webcast, Time to Think Outside the Box: Extend BI with Data Virtualization. I’d like to extend a thank you to our presenters, Claudia, Colin and Bob, for spending their time with us today. And a special thank you to you, our audience, for attending our webcast. Please take a few moments to fill out the brief survey questions and add any additional comments regarding today’s webcast. Thank you for your participation and have a great day.