Identifying and understanding your high spenders and frequent shoppers is vital to maintaining and improving your revenue stream, but if your customer base is large (on the order of several thousand customers), manually sifting through customer data to recognize patterns can be costly and time-consuming. Machine learning can be a viable solution to this problem, not only by removing the effort of combing through customer records, but also by exposing previously unseen relationships in your data to drive decision making. In this blog, we will cover the benefits of using machine learning for customer analytics and two use cases that are ripe for it: Customer Segmentation and Customer Lifetime Value.


To start, let’s define what your “best” customer looks like. The specifics will vary by business, but all customers have one thing in common: they buy things. In general, a good customer is one who buys often, spends a lot, and is likely to keep doing so over time. As long as we are recording who buys what and when, we have enough information to begin separating the best customers from the rest. Once we know who these customers are, we can look at what they have in common and devise ways to target and acquire more customers like them.


There are many different machine learning algorithms that can help us automate the analysis. Which one to use depends on the data available and on what matters most to your business, but most work in the same general way: they iterate, adjusting model parameters to improve accuracy. There are many benefits of using machine learning models:

  • They can recognize patterns that you may never have noticed.
  • They are never “done.”  They can constantly get smarter as more data is fed into them and can make subtle adjustments to correct for changes in the business.
  • They can use far more data than a human could, and in a fraction of the time.
  • The output of the models can be used to automate decisions that would normally take weeks or months. For example, suppose your organization needs to decide, “Is it actually worth sending coupons to this subset of our customers?” Automating this decision could entail setting cutoff points on each customer’s propensity to engage and price sensitivity, so that the couponing program delivers the most value.
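
As a rough illustration of the coupon example, a cutoff rule might look like the sketch below. The scores `engage_propensity` and `price_sensitivity` and both threshold values are hypothetical; in practice the scores would come from a fitted model and the cutoffs from the economics of your own program.

```python
from dataclasses import dataclass


@dataclass
class CustomerScore:
    customer_id: str
    engage_propensity: float  # hypothetical model output between 0 and 1
    price_sensitivity: float  # hypothetical model output between 0 and 1


def should_send_coupon(score, min_propensity=0.4, min_sensitivity=0.5):
    """Send a coupon only if the customer is likely to engage AND is
    discount-driven. Both cutoffs are illustrative assumptions."""
    return (score.engage_propensity >= min_propensity
            and score.price_sensitivity >= min_sensitivity)


scores = [
    CustomerScore("A", 0.9, 0.2),  # engaged, but would likely pay full price
    CustomerScore("B", 0.6, 0.8),  # engaged and discount-driven
    CustomerScore("C", 0.1, 0.9),  # unlikely to engage at all
]
coupon_list = [s.customer_id for s in scores if should_send_coupon(s)]
print(coupon_list)  # ['B']
```

Once the cutoffs are agreed on, the same rule can be rerun every time the scores are refreshed, with no manual review of individual customers.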

Examples of Machine Learning Algorithms

Two commonly used types of model can help answer this business question. One that generates a lot of interest is a customer segmentation model. A segmentation model takes the data you have and separates customers into segments with similar characteristics. Because there is no specific value we are trying to predict, the analyst and business users work together to identify which segmentations are both statistically sound and actionable for the business.

Customer Segmentation

The K-means clustering algorithm is a commonly used segmentation model. It starts by randomly defining groups and then updates them step by step so that similar customers end up grouped together. The result is a handful of groupings that can be investigated further to identify each segment’s defining characteristics. Here is an example in which we segment customers based on their average purchase size and income.

Figure: K-means example

In this case we used K=3. The K-means algorithm randomly defines “K” centers and assigns each customer to its nearest center. It then iterates, moving each center to the average of its assigned customers and reassigning customers to their nearest center, until the assignments stop changing.
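
The assign-and-update loop described here can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the customer values are made up, and the function name `kmeans` and its parameters are chosen for this example. Because K-means relies on distances, features on very different scales are usually rescaled first; the illustrative values below are already on comparable scales.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of (x, y) tuples; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick K random customers as starting centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each customer goes to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the average of its assigned customers.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return centers, labels


# Made-up customers: (average purchase size in $, income in $1000s).
customers = [(20, 30), (25, 35), (22, 28),
             (60, 70), (65, 75), (58, 72),
             (120, 150), (130, 140), (125, 155)]
centers, labels = kmeans(customers, k=3)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

Because the starting centers are random, different seeds can produce different segments, which is one reason the resulting groupings still need to be reviewed by people who know the business.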

Customer Lifetime Value

Another commonly used model is a “Customer Lifetime Value” (CLTV) model. CLTV models predict the expected value that will come from each customer. At their simplest, they use data on purchasing behavior (last purchase date, number of purchases, and money spent) to predict what each customer will spend over a specified time period. At their most advanced, they can also account for the value of referrals and the cost of maintaining the customer, in addition to the customer’s own spending. Once you know how much a customer is worth to your business, you know how much you may be willing to spend on marketing and other costs to win or keep that customer. When paired with a clustering model, the projected value can be used to see which segments and characteristics are most or least valuable to your business.
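
To make the simplest case concrete, here is a deliberately naive baseline that projects spend from the three purchasing inputs. It is a sketch under stated assumptions, not a fitted CLTV model: the function name `naive_cltv`, the 12-month horizon, and the rule that a customer inactive for more than `lapse_after_months` is treated as lapsed are all illustrative choices.

```python
def naive_cltv(total_spent, n_purchases, months_observed,
               months_since_last_purchase,
               horizon_months=12, lapse_after_months=12):
    """Project spend over `horizon_months` by extrapolating past behavior.

    Assumes the customer keeps buying at the same pace, which real CLTV
    models do not take for granted.
    """
    if n_purchases == 0 or months_observed <= 0:
        return 0.0
    if months_since_last_purchase > lapse_after_months:
        return 0.0  # assumption: long-inactive customers are treated as lapsed
    avg_order_value = total_spent / n_purchases
    purchases_per_month = n_purchases / months_observed
    return avg_order_value * purchases_per_month * horizon_months


# Made-up customer: $600 over 12 purchases in 24 months, last purchase 1 month ago.
projected = naive_cltv(total_spent=600.0, n_purchases=12, months_observed=24,
                       months_since_last_purchase=1)
# Same history, but no purchase in the last 18 months.
lapsed = naive_cltv(total_spent=600.0, n_purchases=12, months_observed=24,
                    months_since_last_purchase=18)
print(projected, lapsed)  # 300.0 0.0
```

A projection like this is projected revenue, not profit, so it is only a starting point for deciding how much to spend on a given customer.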

Image by Nicky Cane from ‘Why Should You Care About Customer Lifetime Value?’

How to Get Started

Machine learning can seem overwhelming, and if you haven’t started implementing it, you are certainly not alone. Many companies haven’t taken their first steps simply because they don’t know where to begin. Others see it as something to tackle only once they have “made it” with their data and analytics strategy. These concerns are understandable, but getting value from machine learning may be much easier than you think.

You probably already have data on transactions: customer, product, location, time, and price.  As long as you have the basics, you have enough for a simple customer analysis, and that analysis can evolve over time as you collect more data.
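
As a possible first step, basic transaction records can be rolled up into one row per customer. The sketch below uses made-up records and hypothetical field names; only the customer, date, and price fields are needed for this kind of summary.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (customer, purchase date, price paid).
transactions = [
    ("alice", date(2023, 1, 5), 40.0),
    ("alice", date(2023, 3, 9), 60.0),
    ("bob", date(2023, 2, 1), 15.0),
    ("alice", date(2023, 6, 20), 50.0),
    ("bob", date(2022, 11, 12), 25.0),
]


def summarize(transactions, as_of):
    """Roll transactions up to one row per customer:
    days since last purchase, number of purchases, and total spent."""
    by_customer = defaultdict(list)
    for customer, day, price in transactions:
        by_customer[customer].append((day, price))
    summary = {}
    for customer, rows in by_customer.items():
        last_purchase = max(day for day, _ in rows)
        summary[customer] = {
            "days_since_last_purchase": (as_of - last_purchase).days,
            "n_purchases": len(rows),
            "total_spent": sum(price for _, price in rows),
        }
    return summary


summary = summarize(transactions, as_of=date(2023, 7, 1))
# Rank customers from highest to lowest total spend.
ranked = sorted(summary, key=lambda c: summary[c]["total_spent"], reverse=True)
print(ranked)  # ['alice', 'bob']
```

A table like this is enough to start ranking customers, and more columns can be added as you collect more data.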