22nd September 2008

Scorecard Development Efficiencies with Xeno

Posted by James Taylor
Categories: Analytics

Sue Gonella presented on some efficiencies in building predictive scorecards. In particular, she covered using sampled data versus using all records in a model development exercise.

Rather than using all records, she advocated stratified random sampling, where a sample of each group of interest is used to build and validate the models. This works better because turnaround times are shorter and experimentation is easier. She demonstrated that predictive power is comparable if you use 10,000 records or so per performance group, so there is no loss of accuracy if this is done right.
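To make the sampling idea concrete, here is a minimal sketch in plain Python. It is not how Xeno does it; the record layout, the `outcome` field, the function name `stratified_sample` and the synthetic data are all assumptions made for illustration. Only the figure of roughly 10,000 records per performance group comes from the presentation.

```python
import random
from collections import defaultdict


def stratified_sample(records, group_of, per_group=10_000, seed=0):
    """Draw up to `per_group` records at random from each performance group.

    `group_of` is a hypothetical callable that returns the performance
    group (for example "good" or "bad") of a record.
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same sample

    # Bucket every record by its performance group.
    by_group = defaultdict(list)
    for rec in records:
        by_group[group_of(rec)].append(rec)

    # Sample each group independently, without replacement.
    sample = {}
    for group, members in by_group.items():
        k = min(per_group, len(members))  # small groups are kept whole
        sample[group] = rng.sample(members, k)
    return sample


# Synthetic data (an assumption): 200,000 records, one in ten is "bad".
records = [
    {"id": i, "outcome": "bad" if i % 10 == 0 else "good"}
    for i in range(200_000)
]

sample = stratified_sample(records, group_of=lambda r: r["outcome"])
sizes = {group: len(rows) for group, rows in sample.items()}
```

Here the development set shrinks from 200,000 records to 20,000, with each performance group equally represented. A further random split of each group into build and validation portions would follow the same pattern.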

She walked through an example showing that, for the same model performance, she could save more than 99% of the time involved. This enabled a lot more experimentation, as most changes to the model made little or no difference to the time taken when only 10,000 sample records are being manipulated (whereas the same changes would have made the full dataset run even more slowly). Similarly, demoting predictors that make zero contribution, so that they don't affect subsequent iterations, makes for even better performance with little or no impact on predictive power.
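The post does not say how contribution was measured, so the following is only a sketch of the demotion step under a stated assumption: contribution is approximated by information value (IV), a measure commonly used in scorecard work, and a predictor is demoted when its IV falls below a small threshold. The function names, the `min_iv` threshold, the smoothing constant and the toy data are all hypothetical.

```python
import math
from collections import Counter


def information_value(values, outcomes, smoothing=0.5):
    """Information value of one pre-binned predictor against good/bad outcomes.

    `smoothing` is added to every bin count so that empty bins do not
    produce a division by zero or log(0).
    """
    good = Counter(v for v, o in zip(values, outcomes) if o == "good")
    bad = Counter(v for v, o in zip(values, outcomes) if o == "bad")
    bins = set(good) | set(bad)
    n_good = sum(good.values()) + smoothing * len(bins)
    n_bad = sum(bad.values()) + smoothing * len(bins)

    iv = 0.0
    for b in bins:
        p_good = (good[b] + smoothing) / n_good
        p_bad = (bad[b] + smoothing) / n_bad
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv


def split_predictors(columns, outcomes, min_iv=0.001):
    """Return (kept, demoted) predictor names using an assumed IV threshold."""
    kept, demoted = [], []
    for name, values in columns.items():
        target = kept if information_value(values, outcomes) >= min_iv else demoted
        target.append(name)
    return kept, demoted


# Toy data (an assumption): 1,000 records alternating good and bad.
n = 1_000
outcomes = ["good" if i % 2 == 0 else "bad" for i in range(n)]
columns = {
    # Separates the groups: every good is "high", most bads are "low".
    "useful": ["high" if (i % 2 == 0 or i % 10 == 1) else "low" for i in range(n)],
    # Same distribution in both groups, so it contributes nothing.
    "noise": ["A" if (i // 2) % 2 == 0 else "B" for i in range(n)],
    # A single value for every record, so it contributes nothing.
    "constant": ["X"] * n,
}

kept, demoted = split_predictors(columns, outcomes)
```

Only the predictors in `kept` would be carried into later iterations. The demoted ones remain available if a later change to the model gives a reason to revisit them.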

Clearly, taking these steps (stratified random sampling and the elimination of zero-contribution predictors) makes for MUCH faster iteration in model development and thus better models. She also pointed out that, even if you are required to use all records in the final model, you can do a lot of the development work with the sample data to speed up each run and so allow many more iterations.

This entry was posted on Monday, September 22nd, 2008 at 3:00 pm and written by James Taylor. It is filed under Analytics.
