Inductive Business-Rule Discovery in Text Mining #pawcon

February 16, 2010

in Analytics, Business Rules, Data Mining, Decision Management


Dean Abbott of Abbott Analytics presented on the induction of business rules. He said he had become reacquainted with, and convinced of, the value of business rules alongside data mining (something, of course, I would strongly support). The particular problem involved a repair call center help desk for computer hardware devices and components, where text mining was used to find the rules that were needed. The company wanted to become more efficient at solving problems by understanding what expertise was needed and which part would solve the problem.

Text mining was essential because the call center agent who types the notes on a problem has no ability to solve it, or even to code or classify it. They started by using SQL to explore the data, followed by simple text mining based on domain know-how and a fairly simple extraction of rules from the text in the database. The unstructured data was analyzed to enhance and enrich the structured data stored about the calls.

Key text mining concepts:

  • Tokenization – converting streams of characters into words
  • Stemming or lemmatization – the standardization of forms of words
  • Dictionaries and lexicons – business domain-driven sets of definitions of critical concepts, acronyms, spelling errors
  • Text understanding (linguistic knowledge) was not used for information extraction – instead a "bag of words" was identified mathematically
  • Regular expressions can be used to find specific known concepts like phone numbers
  • Keyword extraction or key phrase extraction turns the words in a record into structured information like counts or included/not included flags
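
To make these concepts concrete, here is a minimal sketch in Python using only the standard library. It is not the code from the project: the lexicon entries, the phone number pattern, the crude suffix stripping (a stand-in for real stemming) and the sample note are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical domain lexicon: maps jargon, acronyms and misspellings
# to one canonical term (illustrative entries only).
LEXICON = {
    "hdd": "harddrive",
    "harddisk": "harddrive",
    "mobo": "motherboard",
    "moniter": "monitor",
}

# Assumed pattern for one simple phone number layout (e.g. 555-123-4567).
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Tokenization: turn a stream of characters into lowercase words."""
    return TOKEN_RE.findall(text.lower())


def normalize(token):
    """Apply the lexicon, then strip a few suffixes as a crude stemmer."""
    token = LEXICON.get(token, token)
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(text):
    """Count normalized tokens, ignoring word order and phone numbers."""
    without_phones = PHONE_RE.sub(" ", text)
    return Counter(normalize(t) for t in tokenize(without_phones))


note = "Cust called 555-123-4567, HDD clicking and moniter flickers"
phones = PHONE_RE.findall(note)   # regular expression for a known concept
bag = bag_of_words(note)          # structured counts from free text
```

The counts in `bag` are the kind of structured information that can be joined back to the call record, either as counts or as included/not included flags.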

For this project the company had a vast number of parts, which had to be prioritized. There was ambiguity, with the same part described in different ways by different vendors, plus the usual synonym and jargon problems. Of course, not all calls were captured, and different countries described things differently.

The first pass was a pretty straightforward automation of word counts and the like to replace the manual process, but this did not add enough because the domain expertise was so critical.

Next they went to rule induction with one record per call that combined the text and structured data. They used SQL and regular expressions to find hundreds of key words and phrases and flag each record as having or not having them. They built decision trees against this data but could not get the confidence they wanted: they were aiming for 90% confidence that a part was needed, and the data was too sparse to reach it. The keywords don’t relate to the parts-used flag in an obvious way. What was missing was any experience with previous problems, something that would have been critical for a human technician trying to solve the problem.
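
A rough sketch of what one record per call might look like, with keyword flags sitting next to the structured fields. The field names, the three keyword patterns and the sample call are hypothetical; the real project used hundreds of key words and phrases.

```python
import re

# Hypothetical keyword patterns; each becomes a 0/1 flag on the record.
KEYWORD_PATTERNS = {
    "kw_no_power": re.compile(r"\b(no power|won'?t (turn|power) on|dead)\b", re.I),
    "kw_fan_noise": re.compile(r"\bfan\b.*\b(noise|loud|grinding)\b", re.I),
    "kw_overheat": re.compile(r"\b(overheat\w*|too hot)\b", re.I),
}


def build_record(call):
    """Combine structured fields with has/has-not flags from the notes."""
    record = {
        "customer_code": call["customer_code"],
        "failure_type": call["failure_type"],
    }
    for name, pattern in KEYWORD_PATTERNS.items():
        record[name] = int(bool(pattern.search(call["notes"])))
    return record


call = {
    "customer_code": "a",
    "failure_type": "41",
    "notes": "Unit won't turn on; fan makes grinding noise",
}
record = build_record(call)
```

With hundreds of flags like these, most of them are zero for any given call, which is one way to picture the sparsity problem described above.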

The third pass used historical data to find the most commonly used parts, and the most common solutions, for a given set of keywords. For each keyword they derived the top codes associated with it in the historical data, found how often the keyword matched each code, and counted the number of times the keyword matched. This got them trees with more nodes ("bushier") that were more accurate, getting up to 80% or so.
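
One way to picture these history-derived features is the sketch below. It is my reading of the description, not the project's code: the shape of the history data, the part codes and the output field names are all assumptions.

```python
from collections import Counter, defaultdict


def keyword_history(history):
    """Summarize past calls per keyword.

    history: iterable of (keywords, part_code) pairs from resolved calls.
    Returns, for each keyword, its most common part code, the share of
    that keyword's calls resolved with that code, and the call count.
    """
    counts = defaultdict(Counter)
    for keywords, part_code in history:
        for kw in set(keywords):  # count each keyword once per call
            counts[kw][part_code] += 1

    summary = {}
    for kw, parts in counts.items():
        top_code, hits = parts.most_common(1)[0]
        total = sum(parts.values())
        summary[kw] = {
            "top_code": top_code,
            "match_rate": hits / total,
            "n_calls": total,
        }
    return summary


# Made-up history for illustration only.
history = [
    (["no_power", "fan_noise"], "PSU"),
    (["no_power"], "PSU"),
    (["no_power"], "MOTHERBOARD"),
    (["fan_noise"], "FAN"),
    (["fan_noise"], "FAN"),
]
summary = keyword_history(history)
```

Values like `match_rate` can then be attached to each new call as numeric inputs, which gives a tree something closer to the "experience with previous problems" that a technician would bring.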

Rather than trying to build a single decision tree, they decided to build many of them. The goal was to find trees with terminal nodes that had a very high correlation to a part being needed, or not needed. They built hundreds of trees and extracted the rules for the hotspots ("nuggets") in these trees. Some of the rules contain 5 or 6 conditions, so using trees was more efficient than just trying to find association rules. The end result was a set of analytically derived rules that were deployed using a business rules management system.
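
The nugget idea can be sketched as a filter over the terminal nodes of many trees. This assumes the trees have already been built and each leaf summarized; the `Leaf` structure, the 90% threshold (borrowed from the stated confidence target) and the minimum record count are illustrative assumptions, not details from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """Hypothetical summary of one terminal node of a decision tree."""
    conditions: tuple    # tests on the path from the root to this leaf
    n_records: int       # training records that reached the leaf
    n_part_needed: int   # of those, how many needed the part

    @property
    def rate(self):
        return self.n_part_needed / self.n_records


def extract_nuggets(trees, min_rate=0.9, min_records=30):
    """Keep only leaves that strongly predict part needed or not needed.

    trees: iterable of trees, each given as a list of Leaf objects.
    The rest of each tree is simply discarded.
    """
    nuggets = []
    for leaves in trees:
        for leaf in leaves:
            if leaf.n_records < min_records:
                continue  # too little support to trust the leaf
            if leaf.rate >= min_rate:
                nuggets.append((leaf.conditions, "part_needed", leaf.rate))
            elif 1 - leaf.rate >= min_rate:
                nuggets.append((leaf.conditions, "part_not_needed", 1 - leaf.rate))
    return nuggets


# Made-up leaves from two trees.
trees = [
    [Leaf(("kw_no_power == 1", "failure_type == '41'"), 100, 95),
     Leaf(("kw_no_power == 0",), 100, 50)],
    [Leaf(("kw_overheat == 1",), 10, 10),
     Leaf(("kw_fan_noise == 0", "customer_code == 'b'"), 40, 2)],
]
nuggets = extract_nuggets(trees)
```

Each surviving nugget is a conjunction of conditions plus an outcome, which is already the shape of a business rule.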

E.g. "If customercode = "a" or "b" and keywordsResultinPart1 > 7.5% and < 50% and FailureType = "40" or "41" or "42" then do X…"
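
Written as code, a rule like this might look like the sketch below. The grouping is my assumption (each "or" list read as a set of allowed values, with the three tests combined by "and"), the percentage is treated as a fraction between 0 and 1, and the action is left out because the example does not say what X is.

```python
def rule_fires(customer_code, keywords_result_in_part1, failure_type):
    """Return True when all three conditions of the example rule hold.

    keywords_result_in_part1 is assumed to be a fraction (0.075 == 7.5%).
    """
    return (
        customer_code in {"a", "b"}
        and 0.075 < keywords_result_in_part1 < 0.50
        and failure_type in {"40", "41", "42"}
    )


fires = rule_fires("a", 0.20, "41")          # all three conditions hold
wrong_customer = rule_fires("c", 0.20, "41")
rate_too_high = rule_fires("b", 0.60, "40")
```

In a business rules management system the same logic would be expressed in the rule language of that product rather than in Python.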

The project has been trying to get to 90% prediction accuracy and is getting really close. The cost savings are tangible and the project has been expanded from a pilot to a strategic initiative.

Lessons learned:

  • Text mining had to be used along with domain knowledge and predictive analytics to work
  • Predictive analytics was essential for dealing with the combinatorics of the problem
  • Don’t feel you have to use the whole model – they found useful nuggets in decision trees and only deployed those
  • (and one from me) Think about business rules as a deployment mechanism for your analytic insight