In any CRM system, it is hard to estimate the probability of winning an opportunity when it is first created, because the initial data is sparse and prospecting has been limited. Prediction accuracy improves as the deal progresses through the stages of the sales process. Estimating how long a deal will take to close is also challenging. Experienced sales personnel, however, have an intuition for some characteristics of a deal and how it is likely to behave. By studying how deals behaved in the past, we may be able to develop a similar intuition for the underlying characteristics of a successful deal and the time it takes to close.
It would be a holy grail if machines could learn from historical data and accomplish the following:
- Predict win/loss early in the sales cycle
- Predict the value of the deal
- Predict the time it takes to close the deal
We may infer that delays in the “close date” are largest when sales reps assign a lower probability to an opportunity. For example, it is common knowledge that the probability of closing a deal is hard to guess while the deal is in its initial stages. Predictions made at that point are about as good as assigning a random number to the probability. As opportunities mature to higher probabilities, “close date” estimates become more reliable and the delay in the deal “close date” converges. Additionally, the gap between closed-won and closed-lost opportunities in terms of the delay in “close date” is slim at high probabilities. Overall, it appears that there is more value in your pipeline than is initially calculated, but it may take more time than estimated to realize this value.
There is a way to forecast sales better through prediction. Using machine learning, we can develop models that predict the probability that an opportunity will “close” in the current quarter.
Let’s discuss two models.
The first is a time-prediction model. Using a technique known as Poisson regression, we assume that an opportunity in its current state has some fixed probability of closing each day. Integrating the corresponding exponential distribution gives the probability of closing within a given time horizon. One optimization we make here is to train only on won opportunities, so the output probabilities are conditional on the deal being won. This significantly improves the quality of the prediction.
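As a minimal sketch of the integration step, assume the regression yields a constant daily closing rate λ for an opportunity. Integrating the exponential density λ·exp(-λs) from 0 to t then gives 1 - exp(-λt). The function names, feature weights, and bias below are hypothetical and chosen only for illustration; they are not taken from any particular product.

```python
import math

def daily_rate(features, weights, bias):
    """Log-linear rate as used in Poisson regression: lambda = exp(bias + w . x).

    The weights and bias are hypothetical; in practice they would be fitted
    on historical (won) opportunities.
    """
    return math.exp(bias + sum(w * x for w, x in zip(weights, features)))

def prob_close_within(rate_per_day, horizon_days):
    """Probability of closing within `horizon_days` at a constant daily rate.

    This is the integral of rate * exp(-rate * s) for s in [0, horizon_days],
    i.e. 1 - exp(-rate * horizon_days).
    """
    return -math.expm1(-rate_per_day * horizon_days)

# Illustrative usage with made-up numbers.
rate = daily_rate(features=[1.0, 0.5], weights=[0.2, -0.4], bias=math.log(0.02))
p_30_days = prob_close_within(rate, horizon_days=30)
```

Because the model is trained only on won opportunities, the value returned by `prob_close_within` should be read as conditional on the deal eventually being won.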
The second model is a win/loss model, which attempts to directly predict the probability of a win, P(win), for an opportunity, independent of the time horizon. This is computed by feeding the data into a calibrated random forest classifier, which provides state-of-the-art accuracy.
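One possible way to build a calibrated random forest is with scikit-learn, sketched below. This is an illustration only: the source does not say which library is used, the data here is synthetic, and the hyperparameters (50 trees, isotonic calibration, 3 folds) are arbitrary assumptions.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for historical opportunities (assumption: real features
# would come from the CRM, such as stage, amount, or age of the deal).
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Wrap the forest so its scores are mapped to calibrated probabilities.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
calibrated = CalibratedClassifierCV(forest, method="isotonic", cv=3)
calibrated.fit(X_train, y_train)

# Column 1 holds the estimated probability of the positive class ("won").
p_win = calibrated.predict_proba(X_test)[:, 1]
```

Calibration matters here because the win probability is later multiplied by another probability, so it needs to behave like a true probability rather than an arbitrary score.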
By combining the conditional probabilities from the time-prediction model with the output of the win/loss model, we can compute the probability that a given opportunity will be closed-won in the current quarter. Many candidate models are built with varying parameters, and each is automatically cross-validated against historical data to determine the best model for the customer’s data and sales process.
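The combination can be sketched as a simple product: P(closed-won this quarter) = P(win) × P(closes before quarter end | win). The example below assumes calendar quarters and a constant daily closing rate. Both are simplifying assumptions, and the function names are hypothetical.

```python
import math
from datetime import date, timedelta

def quarter_end(today):
    """Last day of the calendar quarter containing `today`.

    Assumes calendar quarters; a fiscal calendar would need its own logic.
    """
    end_month = 3 * ((today.month - 1) // 3 + 1)
    if end_month == 12:
        return date(today.year, 12, 31)
    return date(today.year, end_month + 1, 1) - timedelta(days=1)

def prob_won_this_quarter(p_win, rate_per_day, today):
    """P(win) times P(close within the days left in the quarter | win)."""
    days_left = (quarter_end(today) - today).days
    p_in_time_given_win = -math.expm1(-rate_per_day * days_left)
    return p_win * p_in_time_given_win

# Illustrative usage with made-up inputs.
p = prob_won_this_quarter(p_win=0.6, rate_per_day=0.03, today=date(2024, 2, 10))
```

The product is valid only because the time model's output is conditional on a win. Multiplying two unconditional probabilities would double-count the chance of losing the deal.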