3rd December 2008

Intelligent OLAP: Data Mining and OLAP

Posted by James Taylor
Categories: BI, Data Mining

Marty Gubar presented on Deliver Deeper Insight by Combining Data Mining and OLAP. He covered Oracle’s analytic spectrum and how OLAP and data mining fit into it and can be combined. OLAP and data mining are embedded in the Oracle database and share security, partitioning, ETL etc. All can be accessed using PL/SQL, so cubes (OLAP) can be joined with data mining, spatial etc. This supports what Marty calls the Analysis Continuum:

  • Who are my high value customers?
    Easy to answer: a simple query to select customers based on total purchases, say.
  • Describe a high value customer, which customers will respond to a promotion or what product should be offered next to a customer
    Harder and more interesting. Uses data mining to analyze detailed, customer-level data. In other words, it turns uncertainty into probability. It finds hidden patterns to detect anomalies, make predictions and make associations.
  • What is the contribution of my high value customers, what are the trends across segments and how will the new promotion affect the bottom line
    OLAP is used to assess the broader impact on my company. It supports ad-hoc data exploration, simplifies complex queries and is faster than accessing a database directly. Of course, this is analysis of the results of the actions I took as a result of the decisions I made, so it is analysis at one step removed, as it must be.
    Data mining can enrich OLAP with new metrics that can be analyzed in aggregate, and can focus the analysis on the important information (as determined by data mining) rather than the noise.
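
To make the first, "easy" question on the continuum concrete, here is a minimal sketch of that kind of query. It uses Python's built-in sqlite3 rather than the Oracle database from the talk, and the purchases table, its columns, the sample rows and the threshold are all hypothetical.

```python
import sqlite3


def high_value_customers(rows, threshold):
    """Return (customer, total) pairs whose total purchases meet the threshold."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per purchase.
    conn.execute("CREATE TABLE purchases (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT customer, SUM(amount) AS total "
        "FROM purchases "
        "GROUP BY customer "
        "HAVING SUM(amount) >= ? "
        "ORDER BY total DESC, customer",
        (threshold,),
    ).fetchall()
    conn.close()
    return result


# Made-up sample data for illustration only.
purchases = [("acme", 500.0), ("acme", 700.0), ("bolt", 90.0), ("core", 1500.0)]
top = high_value_customers(purchases, 1000.0)
print(top)
```

The later questions on the continuum cannot be answered with a GROUP BY alone, which is where data mining and OLAP come in.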

He then got into demo mode and walked through how a company might use OLAP and data mining to determine if there is potential fraud in expense reports. If expenses are rising, for instance, is this legitimate or is there fraud? If there is, what’s the impact? The process is basically:

  1. Start with expense reports in the database
    Objects are: Expense_Normal (known OK expenses), Expense_View (all of the expenses), Potential_Fraud_Results, Expense_Fraud_View that joins the potential fraud table with the raw expense information to show potential fraud cost, Expense_Analysis
  2. Use data mining to do anomaly detection on each item
    He makes a couple of selections based on what works and does not work in data mining, removing some attributes and reducing the outlier rate for instance. The data mining wizard for anomaly detection runs and builds the model.
  3. Calculate a potential fraud score for each item
    The model is then applied to the main expense view. This creates a table with additional columns for the potential fraud prediction and a probability measure for it. The new table and columns are accessible everywhere using SQL.
  4. Load potential fraud score and expense information into an OLAP cube
    A 3 dimensional model (time, organization and expense category) is created against the data. Different dimensions have hierarchies (calendar or fiscal year hierarchy, expense category hierarchy). The dimensions are then mapped to a source, a table, and then the cube is created.
  5. Use the cube to see what this tells you about potential fraud over time, over divisions, over categories etc.
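
The steps above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions, not the Oracle wizard or the algorithm used in the demo: the anomaly "model" here is a simple per-category z-score built from the known-OK expenses, and the "cube" is a dictionary of pre-aggregated totals with an "ALL" roll-up on each dimension. The `normal` list stands in for Expense_Normal and `expenses` for Expense_View; the field names, sample rows and cutoff are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def score_expenses(normal, expenses, z_cutoff=3.0):
    """Flag expenses that sit far from the known-OK amounts for their category.

    Stand-in for the data mining model: a per-category z-score.
    Assumes every category in `expenses` also appears in `normal`.
    """
    by_cat = defaultdict(list)
    for e in normal:
        by_cat[e["category"]].append(e["amount"])
    stats = {cat: (mean(v), pstdev(v)) for cat, v in by_cat.items()}

    scored = []
    for e in expenses:
        mu, sigma = stats[e["category"]]
        z = abs(e["amount"] - mu) / sigma if sigma > 0 else 0.0
        # Add the score and the flag as extra "columns" on each row.
        scored.append({**e, "z": z, "potential_fraud": z > z_cutoff})
    return scored


def build_cube(scored):
    """Sum flagged amounts per (year, org, category) cell, plus "ALL" roll-ups."""
    cube = defaultdict(float)
    for e in scored:
        if not e["potential_fraud"]:
            continue
        for year in (e["year"], "ALL"):
            for org in (e["org"], "ALL"):
                for cat in (e["category"], "ALL"):
                    cube[(year, org, cat)] += e["amount"]
    return dict(cube)


# Made-up data for illustration only.
normal = (
    [{"category": "travel", "amount": a} for a in (100.0, 110.0, 90.0, 105.0, 95.0)]
    + [{"category": "meals", "amount": a} for a in (20.0, 25.0, 30.0, 22.0, 28.0)]
)
expenses = [
    {"year": 2008, "org": "sales", "category": "travel", "amount": 102.0},
    {"year": 2008, "org": "sales", "category": "travel", "amount": 400.0},
    {"year": 2008, "org": "eng", "category": "meals", "amount": 27.0},
    {"year": 2007, "org": "eng", "category": "meals", "amount": 90.0},
]

scored = score_expenses(normal, expenses)
cube = build_cube(scored)
print(cube[("ALL", "ALL", "ALL")])  # total potential fraud cost
print(cube[(2008, "sales", "ALL")])  # one division, one year, all categories
```

Slicing the dictionary by any mix of specific values and "ALL" mimics step 5: looking at potential fraud over time, over divisions and over categories. A real cube would also carry the hierarchies within each dimension, which this sketch leaves out.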

Note that this scenario just looks for anomalies; there is no closed loop to actually mark those that were, in fact, fraudulent (not just anomalous) and use that to drive the fraud engine. The ability to have a set of rules that use the probability of fraud to drive an auto-approve decision was also not discussed, though clearly it would be easy to use the same data mining results.
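
As a rough idea of what such rules might look like, here is a minimal sketch that routes an expense item based on its fraud probability. Nothing like this was shown in the session: the function name, the thresholds, the amount limit and the three outcomes are all hypothetical choices for illustration.

```python
def route_expense(fraud_probability, amount,
                  approve_below=0.2, refer_above=0.8, amount_limit=1000.0):
    """Decide what to do with one expense item.

    All thresholds are hypothetical; a real rule set would be tuned by the business.
    """
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    # Very likely anomalous: send straight to an investigator.
    if fraud_probability >= refer_above:
        return "investigate"
    # Low probability and a modest amount: no human needs to look at it.
    if fraud_probability < approve_below and amount <= amount_limit:
        return "auto-approve"
    # Everything else goes to the normal manual approval queue.
    return "manual-review"


print(route_expense(0.05, 120.0))
print(route_expense(0.05, 5000.0))
print(route_expense(0.90, 50.0))
```

Recording the eventual outcome of each "investigate" case would also provide the labeled fraud data that the closed loop mentioned above needs.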

This entry was posted on Wednesday, December 3rd, 2008 at 9:53 am and written by James Taylor. It is filed under BI, Data Mining.

One Response to “Intelligent OLAP: Data Mining and OLAP”

  1. Diamonds says:

    Nice, data mining with olap is a good marketing tool, marketing gets smarter and smarter…
