I got a briefing from Conductrics recently. Conductrics is a startup delivering an agent-based decision optimization solution focused on next best action or next best message. Designed for a website, mobile application or even a point of sale, it uses machine learning techniques to optimize the behavior of the site/application to meet some overarching objectives – goals and their values – and to find the options that best serve those objectives. The mPath agents handle A/B testing, multivariate testing and machine learning to discover and continually refine the most effective options.
The first step is to identify the decision points within your application. For instance, when you go to a particular page on a website there might be a decision point with three decisions – “which header”, “which offer” and “which image”. For each of these there are various allowed options from which mPath will try to select the “best” one. You also need to define your goal points – places where you can tell that your user/visitor has done something of value. For instance, all the goal points might be defined on the sales page, with registering for offers and actually buying something being the two goal points.
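To make that concrete, here is a minimal sketch of how the decision points and goal points in that example might be written down. The structure and every name in it (agentSetup, the page, option and goal names, the reward sizes) are my own illustration, not Conductrics' actual configuration format.

    // Illustrative sketch only – not Conductrics' real configuration schema.
    var agentSetup = {
      decisionPoints: {
        "product-page": {
          decisions: {
            header: ["value-focused", "feature-focused", "urgency"],
            offer:  ["free-shipping", "10-percent-off", "no-offer"],
            image:  ["lifestyle-shot", "product-shot"]
          }
        }
      },
      goalPoints: {
        // Both goal points sit on the sales page in this example.
        "offer-registration": { reward: 5 },
        "purchase":           { reward: 50 }
      }
    };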
Generally, tools try to optimize or run tests on a single decision point at a time and lack any awareness of the structure of the application/site. Conductrics links the decision points together with the goal points, which allows it to drive optimization over the whole site or application. How someone navigates (where they might go next, for instance) is considered along with the goals themselves, and decisions are optimized to drive a successful conclusion (as defined by the goals). If all the goal points are on a specific page, for instance, then messages and content that drive someone towards that page matter a lot to achieving the goal.
To make this work within mPath you set up goals (or an objective function), decision points (along with the decisions made at each decision point and the available decision options), user attributes and configuration information. The product itself has an online UI for administration and reporting, an mPath server housing the agent specification, controller, learner and data transformation layers, plus a RESTful API supporting jQuery, Flash and JavaScript.
When operating, the server gets requests for decisions from the application or site (each identifying itself as a request related to a particular decision point) and sends back a set of decisions. In addition, when a visitor or customer hits one of the goal points, the server gets a “reward”. Rewards of different sizes can be defined to allow the balancing of competing objectives – a large reward for a direct lead versus a smaller but more immediate reward from a referral to a partner, for instance. Similarly, signing up for a newsletter or registering for content might be worth something, while a big reward comes from buying something.
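As a rough sketch of what those two interactions might look like from a page using the JavaScript/jQuery side of the API: the endpoint URLs, parameter names and visitorSessionId below are made up for illustration and are not the actual mPath calls.

    var visitorSessionId = "abc123"; // placeholder for however the visitor is identified

    // Ask for decisions at a decision point (illustrative endpoint and fields).
    $.getJSON("https://mpath.example.com/decisions", {
      agent: "retail-site",
      point: "product-page",
      session: visitorSessionId
    }, function (decision) {
      // Apply whichever options the agent selected, e.g. decision.offer === "free-shipping".
      document.getElementById("header").textContent = decision.header;
      document.getElementById("offer").textContent  = decision.offer;
      document.getElementById("image").src          = "/img/" + decision.image + ".jpg";
    });

    // Later, when the visitor hits a goal point, report a reward sized to its value.
    $.post("https://mpath.example.com/goals", {
      agent: "retail-site",
      goal: "purchase",
      reward: 50,                     // a newsletter signup might send 5 instead
      session: visitorSessionId
    });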
Any kind of content can request a decision from mPath and/or tell mPath when it achieves something of value. Most sites or applications will have many decision points, and any given customer interaction will involve many decisions over time and may, eventually, reach one of several potential rewards. When a reward comes in, the agent assigns this value not only to the decision that immediately led to the reward but also to the previous decisions in the chain that led to the successful outcome. It keeps the series of decisions that resulted in either a reward or no reward and increases or decreases the value of the specific options used at those decision points accordingly. It learns what seems to work, and what does not, from the ultimate rewards received.
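The credit-assignment idea can be caricatured in a few lines. This is a deliberately simplified sketch of that kind of backward update, assuming a simple learning rate and discount factor; it is not Conductrics' actual learning algorithm.

    // Spread a reward back along the chain of decisions that preceded it.
    // Simplified illustration only – not the product's real algorithm.
    function creditDecisionChain(optionValues, decisionChain, reward, learningRate, discount) {
      var credit = reward;
      // Walk backwards from the most recent decision to the earliest one.
      for (var i = decisionChain.length - 1; i >= 0; i--) {
        var key = decisionChain[i];              // e.g. "product-page/offer=free-shipping"
        var oldValue = optionValues[key] || 0;
        // Nudge the option's estimated value toward the credited reward.
        optionValues[key] = oldValue + learningRate * (credit - oldValue);
        credit = credit * discount;              // earlier decisions get less credit
      }
    }

Calling this with a reward of zero when a visitor leaves without converting pulls the values of the options along that path back down, which is the increase-or-decrease behaviour described above.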
To do this, mPath applies automated A/B testing and machine learning techniques. The control environment allows users to set the explore rate (what percentage of customers are experimented on to find what works) as well as the size of the control group (the group that always gets the default option for each decision). This allows a lot of testing early on, while the models are trained, and then only a small amount during normal operation to keep things tuned – and perhaps none at all during times of peak demand. The product has some nice reporting and an easy-to-use call builder, including redirect URLs, to make it easy to integrate.
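Here is one way to picture how the explore rate and control group interact when an option is served; again this is my own sketch of the concepts (the function and argument names are invented), not the product's implementation.

    // Illustrative only: choose an option for one visitor at one decision point.
    function chooseOption(options, defaultOption, bestKnownOption, exploreRate, isControlVisitor) {
      if (isControlVisitor) {
        return defaultOption;       // the control group always gets the default
      }
      if (Math.random() < exploreRate) {
        // Explore: try a random option so the agent keeps learning what works.
        return options[Math.floor(Math.random() * options.length)];
      }
      return bestKnownOption;       // exploit what currently looks best
    }

Setting the explore rate high early on, low once the models are trained, and to zero at peak demand gives exactly the behaviour described above.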
I was impressed by the thought Conductrics have put into the product (especially the use of a linked set of decisions to drive a desired outcome) and their use of machine learning/automated decision analysis to learn what works. I look forward to hearing more about them and seeing their product develop.
Comments on this entry are closed.
One thing I’ve noticed is that a lot of decisioning applications like this focus on what are essentially zero-cost actions, where the ‘next best action’ is invariably the one with the highest expected return. It would be interesting to see how they can adapt to actions with a cost associated with them, such as discounts or offers. In this space the ‘next best action’ may not be the one with the highest expected return but the one with the largest increase in expected return, taking into account the cost of the action and the expected return of all the other actions, including the default ‘do nothing’.
For example, there are a lot of customers who I wouldn’t want to offer free shipping to, even though it may seem to be the best way of increasing sales volume.
From what you’ve seen of Conductrics, do you think their solution has the extensibility to perform this type of decisioning?
Modelling for causal effects like this also has the advantage of reducing the requirements for a randomised control group, which would have the additional benefit of allowing users to create and deploy models more quickly…
Hi Matthew, I work at Conductrics. That is a great question. In many situations that shouldn’t be too much of a problem, since mPath is able to receive goal rewards that are dynamically generated by the application. In your example, the decision is whether or not to offer free shipping. One way to handle this is to pass mPath the goal value(s) after factoring in the discount or cost. So if the user makes a purchase with the free shipping, then the passed goal reward should be the purchase value less the cost of shipping (direct or opportunity). This same approach would work for discounts as well. For added flexibility, one may also assign negative values to goals, which may be used as cost signals – useful if you had a situation where the decision is to provide a good for free. I hope that helps.
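To make that a bit more concrete, the application could compute and report the net-of-cost reward along these lines; the endpoint, field names and visitorSessionId below are placeholders for illustration, not the actual mPath API.

    // Report the purchase, crediting the agent only with the value net of the
    // cost of the action it took (illustrative call, not the real mPath API).
    function reportPurchase(purchaseValue, shippingCostAbsorbed) {
      var netReward = purchaseValue - shippingCostAbsorbed;   // e.g. 80 - 12 = 68
      $.post("https://mpath.example.com/goals", {
        agent: "retail-site",
        goal: "purchase",
        reward: netReward,
        session: visitorSessionId   // placeholder for the visitor identifier
      });
    }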
James,
I like the concept of linking decisions to desired outcomes, and it’s an interesting idea to put machine learning and goal orientation together for interactive marketing, but I’m not convinced that using the word “optimization” is appropriate. What you described was still one transaction at a time and incremental. I suppose I could define goals that rewarded desirable behavior against a portfolio, but then how do I get it to stop once I’ve reached a threshold? If I think about, say, a mortgage portfolio where I want to reward actions that lead to a mortgage origination but then track my portfolio and “stop” when I start to get too many loans in California, I don’t see how this works. Furthermore, I might want to start changing how I price my portfolio of mortgages relative to the portfolio risk picture as I go along.
If I have multiple products to offer a client, at different costs and benefits (to me), always taking the biggest profit might not be the best offer, because another product might create a more “sticky” long-term and eventually more profitable relationship. Perhaps the benefits of recommending a particular product are different for each person who comes in the door. How then do I decide on the rewards? If I’m balancing multiple constraints/objectives, how do I set the rewards appropriately to get an optimized result? Isn’t that an optimization problem itself?
Mark
Hi Mark,
Great questions! Not sure I will be able to answer all of them here, but let me take a stab at it. mPath optimizes in the sense that it seeks to maximize the expected value of a sum of discounted scalar rewards received by the agent. It does this by searching for and selecting the decision policy that leads to this maximum. It is also true that there are other optimization problems that one would want to solve using other approaches. For example, the current version of mPath does not explicitly address constrained optimization problems in the Lagrange dual sense, which may be what you are looking for (a great source for algorithmic convex optimization is Stephen Boyd’s site http://www.stanford.edu/~boyd/).
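In standard reinforcement-learning notation, that objective is to find the decision policy \pi that maximizes the expected discounted return:

    \max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \right], \qquad 0 \le \gamma < 1

where r_t is the scalar reward (the goal value) received at step t and \gamma is the discount factor.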
There are a few different ways that one can structure the agent to account for different customer sub-populations. If there are defined and knowable user segments, mPath can be configured to solve/find separate decision policies for each user segment. Additionally, mPath has function approximation capabilities, so it is able to estimate the decision values based on functions of user features.
I am not sure I really answered your questions, but please feel free to contact me directly, and I can try to answer any additional questions you might have.
– Matt