<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>JT on EDM &#187; Data Mining</title>
	<atom:link href="http://jtonedm.com/category/data-mining/feed/" rel="self" type="application/rss+xml" />
	<link>http://jtonedm.com</link>
	<description>James Taylor on Everything Decision Management</description>
	<lastBuildDate>Thu, 18 Mar 2010 17:41:38 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Oracle Data Mining on the Amazon compute cloud</title>
		<link>http://jtonedm.com/2010/03/03/oracle-data-mining-on-the-amazon-compute-cloud/</link>
		<comments>http://jtonedm.com/2010/03/03/oracle-data-mining-on-the-amazon-compute-cloud/#comments</comments>
		<pubDate>Thu, 04 Mar 2010 06:18:06 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Product News]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[in-database analytics]]></category>
		<category><![CDATA[odm]]></category>
		<category><![CDATA[Oracle]]></category>

		<guid isPermaLink="false">http://jtonedm.com/?p=3041</guid>
		<description><![CDATA[Copyright © 2010 http://jtonedm.com James Taylor. I just heard from a colleague that you can check out Oracle&#8217;s Data Mining tools on the amazon.com compute cloud.  The Oracle Data Mining development team has set up an instance for prospective customers who want to try the in-database data mining algorithms via SQL/Java APIs or the Oracle Data [...]]]></description>
			<content:encoded><![CDATA[<p></p>Copyright © 2010 http://jtonedm.com James Taylor<br><br /><p>I just heard from a colleague that you can check out Oracle&#8217;s Data Mining tools on the <a href="http://amazon.com" title="http://amazon.com" class="autohyperlink" target="_blank">amazon.com</a> compute cloud.  The Oracle Data Mining development team has set up an instance for prospective customers who want to try the in-database data mining algorithms via SQL/Java APIs or the Oracle Data Miner user interface. You can launch an Oracle Data Mining Amazon Machine Image (AMI) directly through Amazon Web Services (AWS) and your only cost is the standard Amazon EC2 charges.</p>
<p>To get started go to <a title="Started on the Amazon Cloud with Oracle Data Mining" href="http://www.oracle.com/technology/products/bi/odm/odm_on_the_cloud_detail.html">the Amazon Cloud with Oracle Data Mining</a> or click here <a title="Click here for a step-by-step visual guide" href="http://www.oracle.com/wocportal/page/wocprod/ver-DRAFT/ocom/technology/products/bi/odm/pdf/gettingstarted-odm%20on%20the%20cloud.pdf">for a step-by-step visual guide</a>. There&#8217;s more on the Oracle Data Mining blog &#8211; <a href="http://blogs.oracle.com/datamining/">http://blogs.oracle.com/datamining/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/03/03/oracle-data-mining-on-the-amazon-compute-cloud/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>In-database analytics &#8211; a white paper</title>
		<link>http://jtonedm.com/2010/02/25/in-database-analytics-a-white-paper/</link>
		<comments>http://jtonedm.com/2010/02/25/in-database-analytics-a-white-paper/#comments</comments>
		<pubDate>Thu, 25 Feb 2010 15:12:53 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[beyenetwork]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[in-database analytics]]></category>
		<category><![CDATA[Neil Raden]]></category>
		<category><![CDATA[predictive model]]></category>
		<category><![CDATA[Smart (Enough) Systems]]></category>

		<guid isPermaLink="false">http://jtonedm.com/?p=3020</guid>
		<description><![CDATA[Syndicated from BeyeNetwork
My co-author on Smart (Enough) Systems, Neil Raden, has written a great white paper on in-database analytics that is available from Sybase &#8211; Analytics from the start. This paper introduces the key concepts, discusses some of the key issues (our book contains more tips in this area) and [...]]]></description>
			<content:encoded><![CDATA[<p><em>Syndicated from <a href="http://www.b-eye-network.com/blogs/taylor/archives/2010/02/in-database_analytics_-_a_white_paper.php">BeyeNetwork</a></em></p>
<p>My co-author on <a href="http://www.smartenoughsystems.com/">Smart (Enough) Systems</a>, Neil Raden, has written a great white paper on in-database analytics that is available from Sybase &#8211; <a href="http://www.sybase.com/files/White_Papers/Analytics-from-the-Start-WP.pdf">Analytics from the start</a>. This paper introduces the key concepts, discusses some of the key issues (our book contains more tips in this area) and describes some strong case examples. Well worth a read. As Neil says:</p>
<blockquote><p>Advanced analytics will be adopted by most organizations and attain the status of “must have.” While the majority of people in organizations will not become quantitative experts and modelers, the affect of predictive models will be felt across the organization and beyond. They already are. It would be wise to take steps now, and a good first step is to begin evaluating technology solutions that will be suitable for the development and implementation of analytics. From a technology perspective, one clear requirement is an analytic engine embedded in your analytical database technology.</p></blockquote>
<p>The approach Neil describes is one we see more and more as in-database and in-warehouse analytics become more common. This particular paper talks about the Fuzzy Logix libraries embedded in Sybase IQ. Fuzzy Logix is one of the sponsors (with SAS, Oracle, Adaptive and Aha!) of the operational analytics research I am doing for BeyeNetwork. Look for it on the BeyeResearch site in a couple of months and, meanwhile, participate by taking the <a href="http://www.zoomerang.com/Survey/?p=WEB22A3HRGXRBS">survey</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/02/25/in-database-analytics-a-white-paper/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>BI 2010 &#8211; Optimizing revenue collection</title>
		<link>http://jtonedm.com/2010/02/24/bi-2010-optimizing-revenue-collection-2/</link>
		<comments>http://jtonedm.com/2010/02/24/bi-2010-optimizing-revenue-collection-2/#comments</comments>
		<pubDate>Wed, 24 Feb 2010 09:04:19 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Decision Management]]></category>
		<category><![CDATA[associations]]></category>
		<category><![CDATA[audit]]></category>
		<category><![CDATA[BI 2010]]></category>
		<category><![CDATA[business analytics]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Compliance]]></category>
		<category><![CDATA[Government]]></category>
		<category><![CDATA[neural network]]></category>
		<category><![CDATA[tax]]></category>

		<guid isPermaLink="false">http://jtonedm.com/2010/02/24/bi-2010-optimizing-revenue-collection-2/</guid>
		<description><![CDATA[Eugene from SARS, the South African Revenue Service, presented next on how SARS is using BI in revenue collection. He began by pointing out that there is a difference in how public sector organizations use BI &#8211; a focus on service delivery not profits, on taxpayers not customers, enforcement campaigns [...]]]></description>
			<content:encoded><![CDATA[<p>Eugene from SARS, the South African Revenue Service, presented next on how SARS is using BI in revenue collection. He began by pointing out that there is a difference in how public sector organizations use BI &#8211; a focus on service delivery not profits, on taxpayers not customers, enforcement campaigns not marketing campaigns and so on. Of course public sector organizations still want an ROI, operational efficiency and use KPIs for performance management.</p>
<p>SARS has a wide range of core systems as well as a set of external data sources. Initially the IT department just dumped data from the source systems to their business users. This was replaced with a more formal information management department that responded to requirements defined by analysis teams but still hit the source systems. Capacity constraints led to an enterprise data warehouse (Teradata) but the Information Management department could not meet the demand for new reports etc while the business users wanted more control. Their current state is that of having their information management department acting as an enabler for business departments to manage their own BI capabilities. The technical architecture behind this has a primary staging layer for moving data into a production warehouse Operational Data Store and a secondary staging area supporting BI and data mining warehouses. This two stage approach allows them to present historical data through the lens of constantly changing business rules. A metadata repository underpins this and a presentation layer gives users access to reports, cubes etc.</p>
<p>SARS presents strategic summaries, aligned with the KPIs, as dashboards for the executive level who are typically considered measurement users. Tactical reports and dashboards are delivered to regional offices. These users tend to be exploratory users. Finally operational intelligence is delivered to execution users at the operational, branch level. The different levels consume different kinds of analytics.</p>
<p>SARS has learnt not to pursue big bang projects, to mix business and IT people, to plan for poor data quality and for peak season volumes and to manage change. From a business perspective they focus on changing how business people request data/reports, on showing ROI and on embracing user empowerment and self-service.</p>
<p>They use standard reporting on things like on-time filing, with an ability to drill down into zones, industries and more as well as self-service for reporting on metrics against various dimensions, slicing and dicing etc. More interestingly they use various advanced analytics to catch fraud etc. For instance, a company might under-report its corporate income tax and over-report the VAT it paid so that it continually gets refunds. However, this is a challenge because:</p>
<ul>
<li>Some critical fields are not mandatory </li>
<li>It can be hard to correlate these two kinds of tax return </li>
<li>Suspicious activity may have been reported but it is purely unstructured text. </li>
<li>At the end of the day the intent is to find those organizations who are truly suspicious so data on registration, status, payment rates/timeliness must also be considered. </li>
<li>Not everyone can be pursued, so SARS must decide who to call and who to audit. </li>
<li>Finally, there may be linked entities that need to be closed down when a fraudster is found. </li>
</ul>
<p>Advanced analytics are used in various ways:</p>
<ul>
<li>Neural nets predict values, or at least buckets of values, for missing values </li>
<li>Statistically infer outliers </li>
<li>Text mine the unstructured text reports to see if there are patterns of reporting that will allow early investigation </li>
<li>All of this feeds into a risk engine that predicts the risk of fraud </li>
<li>They then predict who is likely to be reached by the call center to prioritize calls to these taxpayers </li>
<li>Next they predict the likelihood of a successful audit so that the auditors can prioritize their work </li>
<li>They use association and geospatial data to find clusters of suspicious organizations, linking directors, audit companies etc. </li>
<li>3rd party information is brought in on things like houses and assets, travel etc to find suspicious mismatches between tax returns and lifestyle. </li>
</ul>
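<p>The talk did not say which statistical method SARS uses to infer outliers. As a minimal sketch of one common choice, the modified z-score based on the median absolute deviation, here is how an unusually high refund ratio could be flagged. The data, the <code>vat_refund_ratio</code> name and the 3.5 threshold are assumptions made for illustration only.</p>

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indexes whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that resists extreme values.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # No spread at all, so nothing can be called an outlier.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical VAT refund as a share of turnover, one value per company.
vat_refund_ratio = [0.02, 0.03, 0.01, 0.04, 0.02, 0.03, 0.65]
suspicious = mad_outliers(vat_refund_ratio)
print(suspicious)
```

<p>A median-based measure is used here because a single extreme filer would inflate an ordinary mean and standard deviation and could hide itself.</p>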
<p>Great example of advanced analytics to detect fraud and catch tax evaders.</p>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/02/24/bi-2010-optimizing-revenue-collection-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Inductive Business-Rule Discovery in Text Mining #pawcon</title>
		<link>http://jtonedm.com/2010/02/16/inductive-business-rule-discovery-in-text-mining-pawcon/</link>
		<comments>http://jtonedm.com/2010/02/16/inductive-business-rule-discovery-in-text-mining-pawcon/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 23:27:32 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Decision Management]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[business analytics]]></category>
		<category><![CDATA[business rules management system]]></category>
		<category><![CDATA[decision tree]]></category>
		<category><![CDATA[Predictive Analytics World]]></category>
		<category><![CDATA[predictve analytics]]></category>
		<category><![CDATA[Text Analytics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://jtonedm.com/2010/02/16/inductive-business-rule-discovery-in-text-mining-pawcon/</guid>
		<description><![CDATA[Dean Abbott of Abbott Analytics presented on the induction of business rules having become, he said, reacquainted with and convinced of the value of business rules alongside data mining (something, of course, I would strongly support). The particular problem was a call center help desk in which text mining was [...]]]></description>
			<content:encoded><![CDATA[<p>Dean Abbott of Abbott Analytics presented on the induction of business rules having become, he said, reacquainted with and convinced of the value of business rules alongside data mining (something, of course, I would strongly support). The particular problem was a call center help desk in which text mining was used to find the rules that were needed. The call center was a repair call center for computer hardware devices and components. They wanted to become more efficient in solving a problem &#8211; understanding what expertise was needed and which part would solve the problem.</p>
<p>Text mining was essential as the call center person who types the notes on the problem has no ability to solve or even to code/classify a problem. Initially they started with using SQL to find out about the data followed by simple text mining based on domain know-how and a fairly simple extraction of rules from text in the database. The unstructured data was analyzed to enhance and enrich the structured data stored about the calls.</p>
<p>Key text mining concepts:</p>
<ul>
<li>Tokenization &#8211; converting streams of characters into words</li>
<li>Stemming or lemmatization &#8211; the standardization of forms of words</li>
<li>Dictionaries and lexicons &#8211; business domain-driven sets of definitions of critical concepts, acronyms, spelling errors</li>
<li>Text understanding (linguistic knowledge) was not used for information extraction &#8211; instead a &quot;bag of words&quot; was identified mathematically</li>
<li><a href="http://en.wikipedia.org/wiki/Regular_expression">Regular expressions</a> can be used to find specific known concepts like phone numbers</li>
<li>Keyword extraction or key phrase extraction turns the words in a record into structured information like counts or included/not included flags</li>
</ul>
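<p>To make these concepts concrete, here is a minimal sketch of how they fit together. It is not the code from the project Dean described: the lexicon entries, the crude suffix-stripping stemmer, the phone number pattern and the sample note are all assumptions made for illustration.</p>

```python
import re
from collections import Counter

# Hypothetical lexicon: maps jargon, acronyms and misspellings to one canonical term.
LEXICON = {"hdd": "harddrive", "harddisk": "harddrive", "mobo": "motherboard"}

# Illustrative pattern for a phone number written like 555-123-4567.
PHONE_RE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")


def tokenize(text):
    """Split a stream of characters into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Very crude suffix stripping; a real project would use a proper stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def normalize(text):
    """Tokenize, map each token through the lexicon, then stem it."""
    return [stem(LEXICON.get(tok, tok)) for tok in tokenize(text)]


def keyword_features(text, keywords):
    """Turn free text into structured counts and included/not-included flags."""
    counts = Counter(normalize(text))
    return {kw: {"count": counts[kw], "present": counts[kw] > 0} for kw in keywords}


note = "Customer called 555-123-4567, HDD failing and fans failed twice"
features = keyword_features(note, ["harddrive", "fail", "fan"])
phones = PHONE_RE.findall(note)
print(features)
print(phones)
```

<p>The output of <code>keyword_features</code> is the kind of structured record that can sit alongside the existing call data and feed a decision tree.</p>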
<p>For this project the company had a vast number of parts, which had to be prioritized. There was ambiguity, with the same part described in different ways by different vendors, plus the usual synonym and jargon problems. Of course not all calls were captured and different countries described things differently.</p>
<p>First pass was pretty straightforward use of automation of word counts etc to replace the manual process but this did not add enough to the manual process because the domain expertise was so critical. </p>
<p>Next they went to rule induction with one record per call that combined the text and structured data. They used SQL and regular expressions to find hundreds of key words and phrases and flag records as having or not having them. Built decision trees against this data but could not get the confidence they wanted &#8211; aiming for 90% confidence that a part was needed. The data was too sparse to get this level of confidence so this was not good enough. The keywords don&#8217;t relate to the parts-used flag in an obvious way. What was missing was any experience with previous problems &#8211; something that would have been critical for a human technician trying to solve the problem.</p>
<p>The third pass used historical data to find the most commonly used parts, and the most common solutions, for a given set of keywords. They derived the top codes associated with a keyword in the historical data, found how often the keyword matched this code and counted the number of times the keyword matched. This got them trees with more nodes (&quot;bushier&quot;) that were more accurate &#8211; getting up to 80% or so.</p>
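<p>A rough sketch of that third pass, assuming each historical call can be reduced to a set of flagged keywords and the part code that was actually used. The records, part codes and the <code>top_code_stats</code> function are hypothetical.</p>

```python
from collections import Counter, defaultdict

# Hypothetical history: keywords flagged in the call notes, and the part code used.
history = [
    ({"fan", "noise"}, "P-FAN"),
    ({"fan", "overheat"}, "P-FAN"),
    ({"fan", "boot"}, "P-PSU"),
    ({"boot", "beep"}, "P-RAM"),
]


def top_code_stats(calls):
    """For each keyword, find its most common part code and how often it matched."""
    codes_by_keyword = defaultdict(Counter)
    for keywords, code in calls:
        for kw in keywords:
            codes_by_keyword[kw][code] += 1
    stats = {}
    for kw, codes in codes_by_keyword.items():
        top_code, matches = codes.most_common(1)[0]
        total = sum(codes.values())
        stats[kw] = {"top_code": top_code, "matches": matches, "rate": matches / total}
    return stats


stats = top_code_stats(history)
print(stats["fan"])
```

<p>Numbers like <code>matches</code> and <code>rate</code> can then be added to each call record as extra inputs, giving the trees some of the experience with previous problems that a human technician would bring.</p>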
<p>Rather than trying to build a single decision tree, they decided to build many of them, looking for trees whose terminal nodes had a very high correlation to a part being needed, or not needed. They built hundreds of trees and extracted the rules for the hotspots in these trees (&quot;nuggets&quot;). Some of them contain 5 or 6 conditions, so the use of trees was more efficient than just trying to find association rules. The end result was a bunch of analytically derived rules that were deployed using a business rules management system.</p>
<p>E.g. &quot;If the customercode = &quot;a&quot; or &quot;b&quot; and keywordsResultinPart1 &gt; 7.5% and &lt; 50% and FailureType = &quot;40&quot; or &quot;41&quot; or &quot;42&quot; then do X…&quot;</p>
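<p>Written as code, a rule like this is just a predicate over a call record. The sketch below assumes one reading of the rule (each "or" groups values for a single field), stores the percentage as a fraction, and leaves out the action X, which the talk did not spell out. In the project the rules were deployed in a business rules management system rather than hand-coded.</p>

```python
def rule_fires(call):
    """One induced 'nugget' expressed as a plain predicate over a call record."""
    return (
        call["customercode"] in {"a", "b"}
        # Assumption: 7.5% and 50% are stored as the fractions 0.075 and 0.50.
        and 0.075 < call["keywordsResultinPart1"] < 0.50
        and call["FailureType"] in {"40", "41", "42"}
    )


# Hypothetical call records used only to exercise the rule.
matching_call = {"customercode": "a", "keywordsResultinPart1": 0.20, "FailureType": "41"}
other_call = {"customercode": "c", "keywordsResultinPart1": 0.20, "FailureType": "41"}
print(rule_fires(matching_call), rule_fires(other_call))
```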
<p>The project has been trying to get to a 90% accuracy of prediction and is getting really close. The cost savings are tangible and the project has been expanded from a pilot to a strategic initiative. </p>
<p>Lessons learned:</p>
<ul>
<li>Text mining had to be used along with domain knowledge and predictive analytics to work</li>
<li>Predictive analytics was essential for dealing with the combinatorics of the problem </li>
<li>Don&#8217;t feel you have to use the whole model &#8211; they found useful nuggets in decision trees and only deployed those</li>
<li>(and one from me) Think about business rules as a deployment mechanism for your analytic insight</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/02/16/inductive-business-rule-discovery-in-text-mining-pawcon/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Text Mining examples with John Elder #pawcon</title>
		<link>http://jtonedm.com/2010/02/16/text-mining-examples-with-john-elder-pawcon/</link>
		<comments>http://jtonedm.com/2010/02/16/text-mining-examples-with-john-elder-pawcon/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 20:22:58 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[business analytics]]></category>
		<category><![CDATA[Government]]></category>
		<category><![CDATA[john elder]]></category>
		<category><![CDATA[Text Analytics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://jtonedm.com/2010/02/16/text-mining-examples-with-john-elder-pawcon/</guid>
		<description><![CDATA[John Elder, one of my favorite presenters, introduced a series of customer stories around text mining/text analytics. He calls this &#34;the wild west&#34; of analytics with lots of startups and innovation. He points out that these kinds of analytics must be designed to complement human capabilities, not least because the [...]]]></description>
			<content:encoded><![CDATA[<p>John Elder, one of my favorite presenters, introduced a series of customer stories around text mining/text analytics. He calls this &quot;the wild west&quot; of analytics with lots of startups and innovation. He points out that these kinds of analytics must be designed to complement human capabilities, not least because the human brain is good at text. Examples:</p>
<ul>
<li>US Customs and Border Protection      <br />Using text analytics to detect unusual patterns of activity in border crossings and to mine comments about container shipments to see if they match the codes used on the shipment and to enable them to be compared with the physical results from x-ray machines etc. </li>
<li>NSA      <br />Discover and even prevent leaks &#8211; unauthorized disclosure of information &#8211; using past patterns of disclosure and by detecting unusual information movement. </li>
<li>Social Security Administration      <br />The process of applying for disability is long and complex: only 1/3 of applications are approved and half of those declined are eventually approved. The text around these applications is typed by staff based on what applicants say is wrong with them. This is a rich, focused set of information. 20% of those who would eventually be approved could be identified and approved automatically. Processing it is complex thanks to misspellings, multi-word phrases like learning disability, synonyms and stemming (learn and learning for instance). </li>
<li>National Center for Medical Intelligence      <br />Looking for infectious animal diseases by monitoring the web. For instance, news reports of spontaneous abortions or bleeding in sheep might show that an outbreak of rift valley fever is happening. Find the words around the key phrases and find documents using those words. The process involves a review step by an expert where they can be presented a document that looks interesting to the engine then say yes/no. The engine then uses this review to re-prioritize the remaining documents. </li>
</ul>
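<p>The review loop in the last example can be sketched as simple relevance feedback. John did not describe the engine's actual algorithm, so everything here is an assumption for illustration: documents are reduced to sets of terms, a document's score is the sum of its term weights, and an expert's yes/no nudges those weights up or down before the remaining documents are re-ranked.</p>

```python
from collections import defaultdict


def score(doc_terms, weights):
    """Score a document as the sum of the weights of its terms."""
    return sum(weights[t] for t in doc_terms)


def review(doc_terms, relevant, weights, step=1.0):
    """Apply an expert's yes/no by nudging the weight of every term in the document."""
    delta = step if relevant else -step
    for t in doc_terms:
        weights[t] += delta


def prioritize(docs, weights):
    """Return document names ordered from most to least interesting."""
    return sorted(docs, key=lambda name: score(docs[name], weights), reverse=True)


# Hypothetical starting key phrases and documents.
weights = defaultdict(float, {"sheep": 1.5, "bleeding": 1.0})
docs = {
    "farm_report": {"sheep", "bleeding", "abortions", "fever"},
    "wool_prices": {"sheep", "market", "prices"},
    "vet_bulletin": {"abortions", "fever", "bleeding"},
}

before = prioritize(docs, weights)
review(docs["farm_report"], True, weights)  # The expert says yes to the top document.
remaining = {name: terms for name, terms in docs.items() if name != "farm_report"}
after = prioritize(remaining, weights)
print(before, after)
```

<p>After the expert confirms the farm report, the words around its key phrases carry more weight, so the veterinary bulletin moves ahead of the wool price story in the queue.</p>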
<p>John wrapped up with some practical advice on text mining:</p>
<ul>
<li>Know the gain expected in terms of something low-hanging or otherwise leverageable </li>
<li>Have an interdisciplinary team </li>
<li>Be vigilant about data, capturing and maintaining the information you use </li>
<li>Allow for multiple learning cycles &#8211; time to learn </li>
<li>Have a business champion &#8211; someone willing to take the risk </li>
</ul>
<p>And some steps to follow:</p>
<ul>
<li>Assess data assets &#8211; lots of data warehouses are data mausoleums and you can find serious problems. Get the data owner on board </li>
<li>Identify pain points on the front line &#8211; the decisions that would make a difference, to use my terms </li>
<li>Brainstorm a process, allowing for it to be VERY different than what is done today </li>
<li>Conduct a pilot project &#8211; aiming for a quick win and the potential for a big win </li>
<li>Have key people work with analytic experts </li>
<li>Prove the ROI </li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/02/16/text-mining-examples-with-john-elder-pawcon/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>New article on smarter systems</title>
		<link>http://jtonedm.com/2010/02/03/new-article-on-smarter-systems/</link>
		<comments>http://jtonedm.com/2010/02/03/new-article-on-smarter-systems/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 15:04:16 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Decision Management]]></category>
		<category><![CDATA[action support]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[Business]]></category>
		<category><![CDATA[Business Agility]]></category>
		<category><![CDATA[Community]]></category>
		<category><![CDATA[predictve analytics]]></category>
		<category><![CDATA[smarter systems]]></category>

		<guid isPermaLink="false">http://jtonedm.com/?p=2960</guid>
		<description><![CDATA[My latest column on BR Community has been published &#8211; &#8220;Smarter Systems:  Action-oriented, Flexible, Predictive, Learning,&#8221;
Business Rules Journal, Vol. 11, No. 2 (Feb. 2010), URL:  http://www.BRCommunity.com/a2010/b524.html
Enjoy
]]></description>
			<content:encoded><![CDATA[<p>My latest column on BR Community has been published &#8211; &#8220;Smarter Systems:  Action-oriented, Flexible, Predictive, Learning,&#8221;</p>
<p><em>Business Rules Journal</em>, Vol. 11, No. 2 (Feb. 2010), URL:  <a href="http://www.brcommunity.com/a2010/b524.html" target="eXternal">http://www.BRCommunity.com/a2010/b524.html</a></p>
<p>Enjoy</p>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/02/03/new-article-on-smarter-systems/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Operational decision making as a corporate asset</title>
		<link>http://jtonedm.com/2010/01/27/operational-decision-making-as-a-corporate-asset/</link>
		<comments>http://jtonedm.com/2010/01/27/operational-decision-making-as-a-corporate-asset/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 15:30:08 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[Business Rules]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Decision Management]]></category>
		<category><![CDATA[Strategy]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[asset]]></category>
		<category><![CDATA[Book]]></category>
		<category><![CDATA[decision analysis]]></category>
		<category><![CDATA[decision making]]></category>
		<category><![CDATA[experiment]]></category>
		<category><![CDATA[micro decision]]></category>
		<category><![CDATA[Neil Raden]]></category>
		<category><![CDATA[operational decision]]></category>
		<category><![CDATA[pricing]]></category>
		<category><![CDATA[Smart (Enough) Systems]]></category>
		<category><![CDATA[Smart Data Collective]]></category>
		<category><![CDATA[tom davenport]]></category>

		<guid isPermaLink="false">http://jtonedm.com/?p=1792</guid>
		<description><![CDATA[Syndicated from Smart Data Collective
I often tell companies and other organizations that they should treat decisions and decision making as assets. In Smart (Enough) Systems, the book I wrote with Neil Raden, we said
Operational Decision Making as a Corporate Asset
If operational decisions must be made well for your organization to [...]]]></description>
			<content:encoded><![CDATA[<p><em>Syndicated from <a href="http://smartdatacollective.com/Home/24482">Smart Data Collective</a></em></p>
<p>I often tell companies and other organizations that they should treat decisions and decision making as assets. In <em><a href="http://www.smartenoughsystems.com">Smart (Enough) Systems</a></em>, the book I wrote with Neil Raden, we said</p>
<blockquote><p><em><strong>Operational Decision Making as a Corporate Asset</strong></p>
<p>If operational decisions must be made well for your organization to deliver on its strategy, they can’t be made randomly. They have to be made systematically. You have to turn operational decision making into a corporate asset you can measure, control, and improve. After all, when [customers] interact with you, they consider every decision you make to be a “corporate” one—that is, a deliberate one.</em></p></blockquote>
<p>But is it reasonable to consider decisions, and decision making, as an asset? After all, an <a href="http://en.wikipedia.org/wiki/Assets">asset</a> provides future value to an organization &#8211; tangible or intangible (goodwill and trademarks, for example, add intangible value while a factory adds more tangible value). Fundamentally, an asset &#8220;contributes to future cash flow&#8221;. How does this work for operational decision making?</p>
<p>Operational decisions, those taken in a transactional context, include decisions like next best offer, pricing or discounts, product eligibility, claims approval, credit or fraud risk. Clearly each such decision has an impact on cash flow and profitability &#8211; good decisions having a more positive impact, bad ones a more negative one. The thing about operational decisions, though, is how often very similar decisions are made.</p>
<p>Consider claims &#8211; even a relatively small insurer (like <a href="http://jtonedm.com/2009/09/15/putting-predictive-analytics-to-work-at-infinity-insurance/">Infinity Insurance discussed here</a>) might receive 10,000 claims or more a month. Each claim must be considered and approved or rejected and making the right decision in each case adds to the bottom line. As a result the insurer needs a defined decision making process for claims &#8211; each one cannot be considered as a special case if 500 or more are to be handled efficiently every day. If the insurer has a good decision making process then each decision will be more likely to be a good one. If they don&#8217;t, less likely.</p>
<p>If we apply our definition then an effective operational decision making process <strong>is </strong>a form of asset &#8211; it contributes to future value by ensuring that better operational decisions are made. If we define the business rules, the analytics that make up this decision making process then we are investing in an asset. If we embed those rules, those analytics, into our operational systems and processes then we can ensure this asset is fully exploited.</p>
<p>In the book we went on to identify some characteristics typical of other corporate assets. Each of these can be applied to decisions and decision making:</p>
<ul>
<li>They are strategic<br />
Planning exercises and budgets should consider decisions &#8211; ensuring that plans that rely on changes to decision making, for instance, include the definitions of the changes needed. Executives don&#8217;t care about individual decisions but they should care about the decision process.</li>
<li>They are managed<br />
The company invests in decision management and governance so that the quality of decision making doesn&#8217;t degrade over time.</li>
<li>They are visible<br />
The cumulative value of an operational decision should be known (multiply the difference between a good and a bad decision by the number of times such a decision is made) and the investment made in improving decision making should show up on balance sheets and be visible to management.</li>
<li>They are reusable<br />
Companies, well run ones at least, don&#8217;t duplicate assets or leave them idle. So with decision making.</li>
<li>They are improved constantly<br />
Companies should invest in analytics and experimentation to constantly improve decision making &#8211; this is the equivalent of preventative maintenance and upgrades for machine tools.</li>
</ul>
<p>High volume operational decisions drive your business every day, playing a role in every transaction. Investing in operational decision making will ensure that these decisions add, rather than destroy, value.</p>
<p>For more on decision making check out <a href="http://www.smartdatacollective.com/home/24466">Thinking different with decision analysis</a> by Ted Cuzzillo, Tom Davenport&#8217;s article <a href="http://hbr.harvardbusiness.org/2009/11/make-better-decisions/ar/1" target="_blank">Make Better Decisions</a>, this piece on <a href="http://jtonedm.com/2009/03/05/decision-management-focuses-on-microdecisions-for-macro-impact/">Micro decisions for macro impact</a> (which references another Tom Davenport article), <a href="http://www.teradata.com/tdmo/Article.aspx?id=12653">Prepare for Impact</a> (Teradata magazine) and of course <a href="http://www.smartenoughsystems.com">Smart (Enough) Systems</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/01/27/operational-decision-making-as-a-corporate-asset/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Business Analytics in Operations</title>
		<link>http://jtonedm.com/2010/01/26/business-analytics-in-operations/</link>
		<comments>http://jtonedm.com/2010/01/26/business-analytics-in-operations/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 15:53:17 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[b-eye network]]></category>
		<category><![CDATA[beyenetwork]]></category>
		<category><![CDATA[business analytics]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[Decision Management]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[SAS]]></category>

		<guid isPermaLink="false">http://jtonedm.com/?p=2936</guid>
		<description><![CDATA[Syndicated from BeyeNetwork
I am working with the folks at B-eye Network and sponsors Oracle, SAS, Aha!, Adaptive and Fuzzy Logix on some research &#8211; Business Analytics: Putting Analytics To Work. There is growing interest in the power of analytics, especially predictive analytics, to improve business operations. The use of data mining [...]]]></description>
			<content:encoded><![CDATA[<p><em>Syndicated from <a href="http://www.b-eye-network.com/blogs/taylor/archives/2010/01/business_analytics_in_operations.php">BeyeNetwork</a></em></p>
<p>I am working with the folks at <a href="http://www.b-eye-network.com/index.php">B-eye Network</a> and sponsors Oracle, SAS, Aha!, Adaptive and Fuzzy Logix on some research &#8211; Business Analytics: Putting Analytics To Work. There is growing interest in the power of analytics, especially predictive analytics, to improve business operations. The use of data mining and analytic techniques in operational systems is moving beyond its early adopter base in financial services and into the mainstream. As companies adopt business analytic techniques they struggle with the balance between using these techniques to improve reporting and dashboards (“Predictive Reporting” as it is sometimes called) and using them to improve systems and thus every individual transaction (“Business Analytics” or “Decision Management”). A clear understanding of what business analytics are, how to use them, and the compelling business value of doing so is called for. Hence the research.</p>
<p>The study will describe business analytics and what you should expect from a business analytics vendor. It will discuss the motivation for adopting business analytics and how you should approach the evaluation of business analytics as well as how business analytics fit within an enterprise and business architecture. It will discuss risks and issues and describe the benefits and challenges based on real customer experience. Finally it will discuss the kinds of decisions that will show a positive return on business analytics and how business analytics can change businesses fundamentally.</p>
<p>All in all it should be a lot of fun to write and I am looking forward to completing it. In the meantime you can help by taking the survey &#8211; <a href="http://www.zoomerang.com/Survey/?p=WEB22A3HRGXRBS">http://www.zoomerang.com/Survey/?p=WEB22A3HRGXRBS</a>.</p>
<p>Look for the report in a couple of months on <a href="http://beyeresearch.com/">BeyeResearch</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/01/26/business-analytics-in-operations/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Book Review &#8211; Analytics at Work</title>
		<link>http://jtonedm.com/2010/01/26/book-review-analytics-at-work/</link>
		<comments>http://jtonedm.com/2010/01/26/book-review-analytics-at-work/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 12:36:37 +0000</pubDate>
		<dc:creator>James Taylor</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Book Reviews]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Strategy]]></category>
		<category><![CDATA[analyst]]></category>
		<category><![CDATA[analytic competitor]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[Book]]></category>
		<category><![CDATA[business process]]></category>
		<category><![CDATA[competing on analytics]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[decision making]]></category>
		<category><![CDATA[Decision Management]]></category>
		<category><![CDATA[forecasting]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[jeanne harris]]></category>
		<category><![CDATA[operational decision]]></category>
		<category><![CDATA[predictions]]></category>
		<category><![CDATA[segment]]></category>
		<category><![CDATA[segmentation]]></category>
		<category><![CDATA[tom davenport]]></category>

		<guid isPermaLink="false">http://jtonedm.com/?p=2944</guid>
		<description><![CDATA[I received a pre-release copy of Tom Davenport’s new book Analytics at Work: Smarter Decisions, Better Results. The book is a follow-on to Competing on Analytics (reviewed here) and is a shorter, pithier book than its predecessor. Once again Tom collaborates with Jeanne Harris and this time Robert Morison of [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.amazon.com/gp/product/1422177696?ie=UTF8&amp;tag=enterpdecisim-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1422177696"><img class="alignright size-full wp-image-2946" style="margin: 2px;" title="AnalyticsAtWork" src="http://jtonedm.com/wp/wp-content/uploads/AnalyticsAtWork.jpg" alt="Analytics at Work" width="106" height="160" /></a>I received a pre-release copy of Tom Davenport’s new book <a href="http://www.amazon.com/gp/product/1422177696?ie=UTF8&amp;tag=enterpdecisim-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1422177696">Analytics at Work: Smarter Decisions, Better Results</a><img style="border: none !important; margin: 0px !important;" src="http://www.assoc-amazon.com/e/ir?t=enterpdecisim-20&amp;l=as2&amp;o=1&amp;a=1422177696" border="0" alt="" width="1" height="1" />. The book is a follow-on to Competing on Analytics (reviewed here) and is a shorter, pithier book than its predecessor. Once again Tom collaborates with Jeanne Harris and, this time, Robert Morison of the Concours group. Where the previous book focused on so-called analytic competitors, this is about “analytics for the rest of us”. It is a very readable book with some good practical advice that does not require the remaking of your company in a new image. It is also a quick read, at only 180 pages or so, which should help get more people to read it.</p>
<p>And I hope people do read it. As Tom says, “The unexamined decision isn’t worth making”, and too many companies and organizations are making unexamined decisions, failing to apply data they have about what works and what does not, making the same mistakes over and over, and making dumb decisions. Like Tom I think it is time for this to stop and this book will tell you how.</p>
<p>In the initial chapter, the book outlines the difference between areas with a history of analytic decision making and those where it is new – performance metrics may be progress in the latter but something like customer segmentation and treatment requires more advanced analytics to score and segment customers. It’s important to remember this, to find the right degree of analytic sophistication to make a difference. The book’s focus is broad, covering how analytics can address key questions of information and insight in each of the past, present and future &#8211; reporting, alerts and forecasting give information about the past, present and future respectively, while modeling, recommendations and predictions/optimization do the same for insight.</p>
<p>For me the most useful part of the book is part one &#8211; a set of chapters describing The Analytic DELTA – Data, Enterprise, Leadership, Targets and Analysts – what Tom regards as the 5 critical elements of successful analytic adoption:</p>
<ul>
<li>D – accessible, high quality <strong>d</strong>ata<br />
I particularly like the focus on uniqueness as a criterion and on using the business need (decision) to drive quality and integration needs – beginning with the decision in mind. I also like the idea of focusing BI/analytics people on the quality of the decisions they enable, not on the data they manage, like Humana’s “advocate of all matters quantitative” who relentlessly improves “corporate decision making efforts”.</li>
<li>E – <strong>e</strong>nterprise orientation<br />
The point here is not to focus on fractured analytic projects but on coherent ones across the enterprise. Enterprise-serving projects not self-serving ones. The authors make the great point that getting value from your enterprise applications means anticipating how to use the information they provide to improve performance.</li>
<li>L – analytical <strong>l</strong>eadership<br />
An organization’s leaders must care about analytical decision making, especially where it is multiplicative and delivers leverage (in highly repeatable operational decisions, for instance, where the improvement in decisions is multiplied across all your transactions).</li>
<li>T – strategic <strong>t</strong>argets<br />
A crucial element, that of focusing on using analytics to develop distinctive capabilities. This chapter has a great list of processes that lend themselves to analytics because they are data rich, asset or labor intensive, dependent on speed or consistency and more. The focus on decisions that are complex or can be optimized, where consistency is required, and those done poorly today is spot on. The “ladder of analytic applications” is a great tool for seeing how to develop from simple to more complex analytic solutions, working from getting your data in order to segmentation and differentiation, becoming predictive, institutionalizing and finally optimizing. Interestingly this sequence matches exactly the pattern I have seen in research I have been doing for IBM on analytic journeys.</li>
<li>A – <strong>a</strong>nalysts<br />
A nice chapter with good thoughts on how to manage analysts as a strategic resource.</li>
</ul>
<p>Part two addresses how to stay analytical through embedding analytics in business processes, building an analytic culture, reviewing your business comprehensively and embarking on an analytical journey towards “more analytical decisions and better results”. I really like the focus on embedding analytics in business processes – this is a topic close to my heart – and I agree with the authors that the use of analytics is especially valuable in workhorse or operational processes. The authors do a nice job of explaining why organizations need to adopt a test and learn mindset, to be always unsatisfied and mindful of change and to focus on an “industrial” analytic process.</p>
<p>While Tom and I disagree over the extent to which analytics can be used to drive fully or mostly automated decisions, we are in sync on his definition of nirvana – an organization that knows its decision points, relies on analytics, integrates them into its operations and monitors performance to close the loop. And one that MAKES DECISIONS AND TAKES ACTIONS using analytics – one that realizes it is not enough to just analyze its data.</p>
<p>The authors end by pointing out that becoming analytic is not a one-time activity but must be ongoing – it is a journey which organizations must begin, where they must build momentum and where they must go from thinking of analytics to thinking about decisions and decision making, from analytic management to decision management.</p>
<p>It’s a great book and you should buy it.</p>
]]></content:encoded>
			<wfw:commentRss>http://jtonedm.com/2010/01/26/book-review-analytics-at-work/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>