Next up was Stuart Crawford, part of Fair Isaac's extensive research staff, presenting new approaches to the creation, visualization and comparison of decision trees – or, as Fair Isaac calls them, Strategies. Stuart has worked at Fair Isaac for many years and has a deep background in analytics. This work is about applying various analytic techniques to managing decision logic.
The most common traditional approach to managing decision logic is the decision tree, where each layer in the tree segments transactions and then asks further questions depending on which segment a transaction falls into. Trees work well for fairly simple problems but break down for complex ones, where they become extremely wide or where the depth of nodes varies dramatically between branches. Zooming and panning on a large tree (some have 23,000 nodes; 1,700 is pretty common) gives you local information but loses context. One solution is to wallpaper your office with the tree, but something more useful is called for.
One of the main problems with trees is "subtree replication" – the same nodes show up, in the same patterns, in multiple places in the tree. Visualization tricks don't really help, as the problem is fundamental to trees. A graph, however, allows this logic to be shared.
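To make the replication point concrete, here is a toy sketch (all names and data are my own invention, not Fair Isaac's representation): a tree must copy a "review" subtree under every branch that needs it, while a graph can point both branches at one shared node.

```python
# A decision tree that duplicates the same "review" subtree under both
# branches, versus a graph that shares it. Names and data are invented.

review = {"ask": "income > 50k?", "yes": "approve", "no": "refer"}

tree = {"ask": "score > 700?",
        "yes": dict(review),   # copy #1 of the subtree
        "no":  dict(review)}   # copy #2 -- subtree replication

graph = {"ask": "score > 700?",
         "yes": review,        # one shared node instead of two copies
         "no":  review}

def count_tree_nodes(node):
    """Tree semantics: every copy is counted separately."""
    if not isinstance(node, dict):
        return 1  # leaf action
    return 1 + count_tree_nodes(node["yes"]) + count_tree_nodes(node["no"])

def count_graph_nodes(node, seen=None):
    """Graph semantics: each shared node is counted only once."""
    seen = set() if seen is None else seen
    if id(node) in seen:
        return 0
    seen.add(id(node))
    if not isinstance(node, dict):
        return 1
    return 1 + count_graph_nodes(node["yes"], seen) + count_graph_nodes(node["no"], seen)

print(count_tree_nodes(tree))    # 7
print(count_graph_nodes(graph))  # 4
```

Even in this tiny example the shared representation is smaller; at 23,000 nodes the effect is dramatic.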
This led to a project to use graphs to view strategies. Graphs reuse sub-nodes, preventing the replication of subtrees, and make the whole thing more compact, balancing depth and width more usefully. This is not simply a question of different visualization, because trees are often leveled – a designer has carefully chosen the order of the levels so that all the nodes at the same level use the same decision keys. This matters when building a graph, as the ordering makes a big difference to the complexity of the resulting graph. To build the graphs, Fair Isaac has developed an algorithm to find an optimal ordering.
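Fair Isaac's ordering algorithm wasn't described in detail, but the sensitivity to ordering is the same phenomenon as variable ordering in binary decision diagrams. As a rough, invented proxy, the sketch below counts the distinct non-constant subfunctions produced by splitting on one decision key at a time – a stand-in for graph size – and shows two orderings of the same logic producing different counts.

```python
from itertools import product

def subfunction_count(outputs):
    """Count distinct non-constant subfunctions met while splitting the
    output vector one variable at a time -- a rough proxy for graph size."""
    total, level = 0, {tuple(outputs)}
    while level and len(next(iter(level))) > 1:
        nonconst = {v for v in level if len(set(v)) > 1}
        total += len(nonconst)
        nxt = set()
        for v in nonconst:
            half = len(v) // 2
            nxt.update((v[:half], v[half:]))
        level = nxt
    return total

def outputs_for_order(fn, order):
    """Tabulate fn over all 0/1 assignments, taking variables in 'order'."""
    return [fn(dict(zip(order, bits)))
            for bits in product([0, 1], repeat=len(order))]

# Same logic, two orderings: f = (x1 AND x2) OR (x3 AND x4)
f = lambda v: (v["x1"] and v["x2"]) or (v["x3"] and v["x4"])

good = subfunction_count(outputs_for_order(f, ["x1", "x2", "x3", "x4"]))
bad  = subfunction_count(outputs_for_order(f, ["x1", "x3", "x2", "x4"]))
print(good, bad)  # the interleaved ordering needs more nodes
```

The decision logic is identical either way; only the ordering changes, and with it the size of the resulting graph – which is why an ordering algorithm is worth building.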
Once the graph is built (and I realize this is hard to follow with just words, not pictures), it can be simplified further with an exception graph – the most complicated logic, which handles the exceptions, is excluded because it becomes the default to execute if no other path is found. For instance, a 492-node tree becomes an 89-node EDAG (exception graph), and the 23,000-node tree became 300. But this is still complex, so another technique is used.
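The core idea – let the most common outcome become the implicit default and keep explicit paths only for the exceptions – can be sketched in a few lines (this is my own minimal illustration, not Fair Isaac's EDAG algorithm; the rules are invented):

```python
from collections import Counter

# Flat decision logic: (score band, debt band) -> outcome, one explicit
# rule per combination. Data and names are invented for illustration.
full_rules = {
    ("low",  "low"):  "decline",
    ("low",  "high"): "decline",
    ("mid",  "low"):  "refer",
    ("mid",  "high"): "decline",
    ("high", "low"):  "approve",
    ("high", "high"): "refer",
}

def to_exception_form(rules):
    """Make the most common outcome the default; keep explicit rules
    only for the exceptions."""
    default = Counter(rules.values()).most_common(1)[0][0]
    exceptions = {keys: out for keys, out in rules.items() if out != default}
    return exceptions, default

def decide(keys, exceptions, default):
    """Fall through to the default when no exception path matches."""
    return exceptions.get(keys, default)

exceptions, default = to_exception_form(full_rules)
print(len(full_rules), "explicit rules before")  # 6
print(len(exceptions), "exception rules after")  # 3, plus one default
```

Half the explicit logic disappears into the default, which is roughly what happens when a 492-node tree collapses to an 89-node EDAG.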
An action graph is formed from the logic that leads to a single action or outcome, and each typically contains very few nodes. For example, the 23,000-node tree yields 150 or so action graphs, each with 10-15 nodes. Paths through an action graph are ANDs; branches are ORs. Action graphs can be created independently using any order of variables – you focus purely on the logic for that action or outcome. The problem is broken down into lots of smaller problems, and the "else" is generated automatically by assessing the gaps in the various graphs.

Having built these action graphs you need to stitch them together, using something Fair Isaac calls Action Graph Stitching. This combines the action graphs, reorders the levels to be optimal, and generates a tree or graph, detecting along the way any overlaps between action graphs (same logic, different outcome) and gaps (no logic for a specific circumstance). The nice thing about this set of software and techniques is that it complements traditional decision tree building and that the representations are interchangeable. You can work with a tree, a directed acyclic graph (DAG), an EDAG (exception-oriented DAG) or action graphs and swap back and forth between representations, always ending up with the same decision logic. Very cool.
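The overlap and gap detection in stitching can be sketched with brute force over a small case universe (this is my own simplified illustration, not Fair Isaac's stitching algorithm; predicates and data are invented):

```python
from itertools import product

# One small piece of logic per action (predicates invented for illustration):
action_logic = {
    "approve": lambda t: t["score"] == "high",
    "decline": lambda t: t["bankrupt"],
    "refer":   lambda t: t["score"] == "mid" and not t["bankrupt"],
}

def stitch_check(action_logic, universe):
    """Scan the case universe for overlaps (two action graphs claim the
    same case with different outcomes) and gaps (no action graph claims
    the case -- these are what the generated 'else' must cover)."""
    overlaps, gaps = [], []
    for case in universe:
        matches = [action for action, pred in action_logic.items() if pred(case)]
        if len(matches) > 1:
            overlaps.append((case, matches))
        elif not matches:
            gaps.append(case)
    return overlaps, gaps

universe = [{"score": s, "bankrupt": b}
            for s, b in product(["low", "mid", "high"], [False, True])]
overlaps, gaps = stitch_check(action_logic, universe)
print("overlaps:", overlaps)  # bankrupt high scorers match two actions
print("gaps:", gaps)          # non-bankrupt low scorers match none
```

The real system presumably works symbolically on the graph structures rather than enumerating cases, but the outputs are the same in kind: conflicts to resolve and an automatically generated "else".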
Layout matters with these things – a poor layout can obscure the decision logic. The algorithm for drawing the nodes must "understand" the structure so that lines don't cross, lines don't pass behind nodes (obscuring whether they lead to the node or not) and so on. Dynamic layout can also be very revealing – seeing how data flows through the decision logic. You can find places where most of the data flows, and others where little or no data flows. The software allows you to push data through and then emphasizes the paths with the most data, focusing on those areas by bringing them to the center. Very cool.
Finally, comparison of decision trees/strategies is really important. Comparing a Champion and a Challenger, for instance, is a key task, and comparison can be important both for understanding your own work and for showing it to others. Historically this has involved either swapset analysis (which transactions get different results from the two strategies) or describing the structural differences. The new approach is to focus on the logical differences between two sets of logic. You don't want structural differences to obscure this – changing the decision key order, for instance, need not make any difference to the outcomes but would make a structural comparison invalid. The new approach extracts action graphs that represent the difference between two strategies/decision trees. Again, very cool – much easier to see what the changes or differences are. I can see this being particularly useful in compliance – when a regulator asks "what's different?", for instance.
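For reference, the traditional swapset analysis mentioned above is easy to sketch (strategies and portfolio data here are invented for illustration): run the same transactions through both strategies and keep those whose outcome changes.

```python
# Toy swapset analysis: which transactions get different results from
# the champion and the challenger? All logic and data are invented.

def champion(t):
    return "approve" if t["score"] >= 700 else "decline"

def challenger(t):
    # Same logic, except marginal scores with low debt now get approved.
    if t["score"] >= 700 or (t["score"] >= 660 and t["debt_ratio"] < 0.3):
        return "approve"
    return "decline"

portfolio = [
    {"id": 1, "score": 720, "debt_ratio": 0.5},
    {"id": 2, "score": 680, "debt_ratio": 0.2},
    {"id": 3, "score": 680, "debt_ratio": 0.4},
    {"id": 4, "score": 600, "debt_ratio": 0.1},
]

swapset = [(t["id"], champion(t), challenger(t))
           for t in portfolio if champion(t) != challenger(t)]
print(swapset)  # only transaction 2 changes outcome
```

This tells you *which* transactions moved, but not *why* – which is exactly the gap the new logical-difference action graphs are meant to fill.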
- Trees can be a poor way to represent complex logic
- A graph can be more compact and representative
- An exception-oriented graph can be even simpler
- Breaking down a tree into action graphs, focused on outcomes, can be an effective way to build and understand logic
- These representations can be used interchangeably as the logic can be transformed to the different representations
- Data can be used to make the layout even clearer by focusing on the high impact elements
- Action graphs make it possible to see the logical, not structural, differences between logic
All in all this should make it easier to build and understand decision logic, modify it more rapidly and accurately, make fewer errors, and focus your thinking where it makes a difference.
Nice to see some innovation in the representation of decision logic.