
Live from DAMA – A Reference Architecture for Integrating an Active Data Warehouse into the Real-Time Enterprise


Stephen Brobst of Teradata was next with A Reference Architecture for Integrating an Active Data Warehouse into the Real-Time Enterprise. He started with a great quote from a Gartner analyst:

No such thing as a business surprise – there is always a warning in advance

but were you listening – did you collect data about it, analyze it and deliver decisions and actions in a timely manner?

He prefers calling this “Active Data Warehouse (ADW)” rather than “real time” because it focuses on the “right” time, not on “real-time”. A traditional data warehouse has the concept of a batch window where everything shuts down and gets updated. This kind of data warehouse is also “passive”. The key thing in moving to an ADW is to get away from this close-down-and-update mentality. An ADW moves from being a back-office system to being an operational one, focused on decisioning services. As Dave McCoy said:

real-time means progressively removing delays from business processes

A traditional data warehouse is an integrated and locally consistent data store, fed with batch feeds and used to produce lots of standard reports. It also supports ad-hoc analysis for strategic decisions, but the questions are often asked too late to take any real action. An ADW is focused on providing analytics, and thus actions, in time to act on them.

He gave an example of Continental Airlines, who went from worst to first in the JD Power customer service studies. They realized that the causes of their worst ranking in customer service were decisions – bad overbooking decisions, bad re-routing decisions, bad seating assignments etc. People paid a premium to avoid them. Now they have used customer centricity and an ADW to be #1 and are aiming at becoming a “favorite” for their customers. Their flight management dashboard tracks near real-time events (every minute) and stores them in the data warehouse, which then drives a dashboard for a hub showing “red zone” flights, those with more than a 15 minute delay. For each flight it showed a map of flight arrivals, how long the connections had and how many of the people trying to make each flight were profitable customers versus less profitable ones. This was delivered to the director of operations so they could do their best to fix things, e.g. by providing a cart to make the connection work. Timely use of information is critical, but they also had to make people/organizational changes to ACT on the data.
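
To make the “red zone” idea concrete, here is a minimal sketch of the kind of filter such a dashboard might apply. Only the 15 minute threshold comes from the talk. The `Flight` fields, the flight numbers and the choice to sort by high-value connecting customers are my own assumptions for illustration, not Continental's actual logic.

```python
from dataclasses import dataclass

RED_ZONE_MINUTES = 15  # from the talk: "red zone" means more than 15 minutes late


@dataclass
class Flight:
    flight_no: str              # made-up identifiers, for illustration only
    delay_minutes: int
    connecting_profitable: int  # high-value customers with a connection at risk
    connecting_other: int       # other customers with a connection at risk


def red_zone(flights):
    """Return flights delayed by more than the threshold.

    Sorting by the number of high-value connecting customers is an
    assumption; the talk only says both counts were shown per flight.
    """
    late = [f for f in flights if f.delay_minutes > RED_ZONE_MINUTES]
    return sorted(late, key=lambda f: f.connecting_profitable, reverse=True)


flights = [
    Flight("CO101", 22, 4, 10),
    Flight("CO202", 15, 9, 3),   # exactly 15 minutes: not in the red zone
    Flight("CO303", 40, 7, 1),
]

for f in red_zone(flights):
    print(f.flight_no, f.delay_minutes, f.connecting_profitable)
```

In a real ADW the list of flights would be refreshed from the warehouse every minute or so rather than hard-coded.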

Teradata shows 5 stages on the move from traditional data warehousing to active.

  1. What happened?
  2. Why did it happen?
  3. What might happen next?
  4. What should I do right now?
    Driving limited automation or making actionable information available to someone
  5. Automation

Automation is critical for a real-time enterprise. People are the bottleneck: any time you need them to do something it is hard to deliver “real-time”. Removing people and using events and business rules management systems to automate decisions is what it takes to go real-time. Not every decision can be automated, just those that stay inside the design parameters, but that covers many of them.
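
A rough sketch of what “automate only inside the design parameters” could look like in code follows. The rebooking scenario, the thresholds and the names `Decision` and `decide_rebooking` are all hypothetical, chosen to show the shape of the idea rather than any system described in the talk.

```python
from dataclasses import dataclass

# Hypothetical design parameters: the range of cases the rules were built for.
MIN_DELAY_FOR_ACTION = 15    # minutes; below this nothing needs to happen
MAX_AUTOMATED_DELAY = 120    # minutes; beyond this a person should decide


@dataclass
class Decision:
    action: str
    automated: bool


def decide_rebooking(delay_minutes: int, seats_on_next_flight: int) -> Decision:
    """Automate the routine cases and refer everything else to a person."""
    if delay_minutes <= MIN_DELAY_FOR_ACTION:
        return Decision("no_action", automated=True)
    if delay_minutes > MAX_AUTOMATED_DELAY or seats_on_next_flight == 0:
        # Outside the design parameters: do not guess, hand it to a human.
        return Decision("refer_to_agent", automated=False)
    return Decision("rebook_on_next_flight", automated=True)


print(decide_rebooking(30, 5))
print(decide_rebooking(300, 5))
```

The point is the explicit fallback branch: the rules handle the bulk of cases without a person in the loop, and only the exceptions wait for one.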

An ADW requires a real mixed workload to be executed: complex strategic queries, continuous updates, batch updates, events, tactical queries and so on. Current data needs an historical context, so an ADW needs both. This led into Stephen’s discussion of the reference architecture or blueprint for ADWs: the services required, the components to deliver those services and the interoperability needed.

Key thoughts as he built the reference architecture:

  • Data warehouses do not create data, data comes from the “book keeping” systems.
  • Book keeping systems use data access middleware services to deliver data to databases and have users accessing them through various 2-tier, 3-tier or N-tier services.
  • Data Warehouse services provide decision support or automation, predictive analytics based on the data warehouse.
  • Decision making user base is also very varied with everything from thick clients to APIs to thin clients.
  • Data Acquisition services move data from the databases to the data warehouse. Traditionally focused on batch delivery but moving increasingly to streaming data acquisition.
  • EAI and ETL boundary is getting blurred as event-based data transformation merges them.
  • Event notification services, both push and pull, are needed to update or cause activity on the data warehouse. In a push model you want to trigger action-taking systems as data changes. In a pull model a regular event is triggered to go check the data warehouse.
  • Application integration is important for decision making applications as they interact with transactional systems. So a transactional system might identify a customer and then access a decision service that uses the data warehouse to deliver a sub-second response.
  • Using open standards is critical as you have lots of pieces that must work together.
  • The data warehouse does not end up in the center – message bus and broker-based middleware is in the middle as it allows the interoperability between the operational systems on one hand and the analytical systems on the other.
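
The push and pull notification models in the list above can be sketched in a few lines. This is a toy in-memory stand-in, not a Teradata API: the `Warehouse` class, `poll_once` and the `inventory:widget` key are all invented for illustration.

```python
class Warehouse:
    """Toy in-memory stand-in for a data warehouse table."""

    def __init__(self):
        self._rows = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def update(self, key, value):
        self._rows[key] = value
        # Push model: action-taking systems are triggered as the data changes.
        for callback in self._subscribers:
            callback(key, value)

    def read(self, key):
        return self._rows.get(key)


def poll_once(warehouse, key, last_seen, on_change):
    """Pull model: a regular (e.g. scheduled) event goes and checks the data."""
    current = warehouse.read(key)
    if current != last_seen:
        on_change(key, current)
    return current


log = []
wh = Warehouse()
wh.subscribe(lambda k, v: log.append(("push", k, v)))
wh.update("inventory:widget", 3)

record_pull = lambda k, v: log.append(("pull", k, v))
last = poll_once(wh, "inventory:widget", None, record_pull)  # sees the change
last = poll_once(wh, "inventory:widget", last, record_pull)  # nothing new
print(log)
```

Push reacts immediately but couples the warehouse to its listeners. Pull is simpler to operate but is only as fresh as the polling interval, which ties back to the “right time” rather than “real time” framing.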

Steve mapped the 5 stages to this architecture. The first 3 stages have only a limited linkage between the operational systems and the data warehouse, with data moving from the operational systems to the data warehouse. Stage 4 starts to close the loop, linking services so that they can support each other. At this stage you also become focused on mission-critical issues such as disaster recovery. Stage 5 pulls everything together, creating a matrix, not just a loop.

Steve gave some case studies:

  • A Brazilian telco uses analytic CRM by hooking call center representatives to decisioning services driven by the data warehouse – not to the data warehouse using traditional BI tools, notice, but to decision services. It generates personalized offers based on customers’ usage, where they are in their contract, how price sensitive they are thought to be etc. This is best next offer using “active access” to the data, although the data is yesterday’s data, which is sufficient.
  • Harrah’s Casino uses an active data warehouse as part of their customer loyalty program. They use history to calculate the level at which you lose so much money that you leave. Based on your activity (collected through events) they track when you are getting close to your limit and send someone to get you to stop gambling, by suggesting entertainment for instance. This is a classic Enterprise Decision Management or EDM application, using a rules engine and predictive and descriptive analytics based on the data warehouse.
  • Travelocity is another customer using an active data warehouse to manage everything from lifetime value calculations to latest rates etc. It used to be optional but now has 24×7 availability as it is used to drive the website by using every piece of data known (IP, customer etc) and applying analytics (what inventory at what prices might be compelling, especially as bundles) to drive content for deals, cross-sells and more. It delivers sub-second response times – 200ms – for an EDM-like application aimed at making the site targeted – not a “home page” but a “my page” – extreme personalization.
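
As an illustration of the casino case, here is a small sketch of combining history (to estimate a walk-away level) with live events (to spot a customer approaching it). The talk did not describe the actual model, so the median-based estimate, the 80% alert fraction and the names `walk_away_estimate` and `LossMonitor` are purely my assumptions.

```python
import statistics


def walk_away_estimate(past_session_losses):
    """Assumed estimate: the median loss at which past sessions ended."""
    return statistics.median(past_session_losses)


class LossMonitor:
    """Tracks one customer's net loss from a stream of play events."""

    def __init__(self, walk_away_loss, alert_fraction=0.8):
        self.alert_at = alert_fraction * walk_away_loss
        self.net_loss = 0.0
        self.alerted = False

    def on_event(self, amount_won):
        """Record one event (negative means a loss).

        Returns True only the first time the alert level is reached.
        """
        self.net_loss -= amount_won
        if not self.alerted and self.net_loss >= self.alert_at:
            self.alerted = True
            return True
        return False


limit = walk_away_estimate([400, 500, 600])   # made-up history
monitor = LossMonitor(limit)
for amount in (-100, -250, 50, -120):          # made-up event stream
    if monitor.on_event(amount):
        print("send a host: net loss", monitor.net_loss)
```

The historical estimate is the slow, analytic side of the warehouse and the event handler is the fast, operational side. An ADW is what lets both run against the same data.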

Key issues:

  • Operational
  • Happening NOW!
  • ADW must be open
  • ADW must be able to interact with front-line business services
  • Need a reference architecture that covers both sides – analytic and operational

The idea of an Active Data Warehouse is clearly well aligned with the key tenets of EDM.

