RIP NPS and Surveys thanks to AI
- nickplaters
- Oct 27
- 8 min read
AI offers a great opportunity to replace and improve customer experience measures.
One of the most exciting things about the explosion in available data, and the AI to make sense of it, is that we can measure more things, measure them in greater depth, and measure them more accurately. This is a great chance to get rid of measures and ways of measuring that, if we are being honest, never worked very well: dubious accuracy, high customer effort, or too little information to really drive improvement.

Unfortunately, some businesses are so “rusted on” to conventional measures that they may not be willing to admit those measures aren’t working well. They may get a few surprises when they start to use the new tools. For example, many businesses will realise that customers weren’t as happy as their limited or biased Net Promoter Score (NPS) or Customer Satisfaction (CSAT) samples indicated. Once they get data on 100% of interactions rather than a limited sample, the bias and loading of historic samples will become clear. Other organisations will see that quality measurement in their business provided a limited view of what their front-line staff were really doing. Businesses may come to realise that they didn’t really understand what was driving customers to make contact, or how poorly they resolved customer problems.
This paper explores these broken metrics and their replacements, and what that will mean for insights, reduced customer effort and new ways of managing. The four measures we suggest replacing are Net Promoter Score, quality measured through limited samples, customer-measured contact resolution, and staff-captured demand reasons. We will explore each in turn.
Measures Out the Door and Their Replacements
Net Promoter Score (NPS)
What is broken?
Fred Reichheld's Net Promoter Score is, at best, a crude indicator of customer “disposition”, mostly captured through limited surveying and samples. It has become a very common metric and has spawned a sub-industry devoted to working out what needs to change to move the score. The irony here is that NPS isn’t the end goal; what organisations really want are customer outcomes such as greater loyalty, increased share of wallet, fewer contacts and fewer complaints. Reichheld’s original book claimed that companies with higher NPS gained over competitors in the long term by steadily increasing revenue and loyalty. The advantage of the metric was that no one needed to understand this detail: executives only needed to remember a single number, and to know that it needed to be better than their close competitor’s. Easy.

Organisations have had to work hard to map the correlation between NPS scores and these other outcomes. A former Telstra CEO proudly stated that they could “show what every 1% change in NPS was worth”. The question for us is: why did they have to do that analysis? Why didn’t they put more effort into measuring what really mattered and why they were getting those outcomes? Moving NPS had become the goal, rather than the measures the business really wanted to change, such as products per customer and rate of customer churn.
Organisations focused on NPS must do further analysis (for example, by trawling through hundreds of customer verbatims) to understand why they get their NPS scores, so they can change some of the underlying causes. We think this NPS industry can be retired, because AI enables organisations to understand customer sentiment and reaction on every interaction and then correlate this with insights about causes and outcomes. Organisations can get more insight, continuously, if they give up the need to use NPS as a proxy for the outcomes they really want to change.
Solutions:
If organisations need to listen to customers, they can replace NPS with more “open” feedback that AI can then mine. In addition to the sentiment analysis they can obtain on every interaction (which tells them how customers are reacting), they can survey 100% of customers with a single request: “Please give us your feedback (on that flight/product/interaction)”, and then put AI to work making sense of those open-ended responses. Rather than asking customers for a narrow, predetermined set of information, companies will be able to listen to what customers want to say. Listening better and continuously is a scary prospect, but it will transform what can be learnt and provide insights to drive change.
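To make the idea concrete, here is a minimal, illustrative sketch of mining open-ended feedback for themes and negative sentiment. The theme keywords, lexicon and feedback strings are all invented stand-ins for a trained AI model, not a real product:

```python
# Toy sketch: mining open-ended feedback for themes and crude sentiment.
# The theme keywords and negative-word lexicon below are illustrative
# stand-ins for a trained language model.
from collections import Counter

THEMES = {
    "delay": ["late", "delayed", "waiting"],
    "food": ["meal", "food", "menu"],
    "staff": ["crew", "agent", "staff"],
}
NEGATIVE = {"late", "delayed", "rude", "ran", "cold", "waiting"}

def mine_feedback(responses):
    """Tag each free-text response with themes and a crude negative flag."""
    theme_counts = Counter()
    negatives = 0
    for text in responses:
        words = set(text.lower().split())
        for theme, keywords in THEMES.items():
            if words & set(keywords):
                theme_counts[theme] += 1
        if words & NEGATIVE:
            negatives += 1
    return theme_counts, negatives

responses = [
    "Flight was late and we sat waiting for an hour",
    "Crew were lovely but the meal ran out",
    "Smooth trip, no complaints",
]
counts, negatives = mine_feedback(responses)
print(counts.most_common(), negatives)
```

A real deployment would replace the keyword rules with a language model, but the shape of the output, theme counts and reaction signals across 100% of responses, is the same.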
OK, we hear you say: if we are an airline (for example), how will we know what customers thought of our flight without surveys and NPS-type scoring? If we are a telco, how will we know about customer reactions to our new broadband plan or a new AI chatbot? Our answer is that organisations already have all the operational data to understand what customers went through: whether the flight was late, by how much and why, which food choices ran out, what interactions occurred about a new product, and so on. Unhappy customers will still complain, and you can use AI to mine all that information against the data you have on products, services and interactions. There is plenty of data to show you what is going on in the business and how customers are behaving. Sample-based NPS wasn’t telling you what was going on for most customers anyway.
This also means that as and when the business changes, it can get specific information on those changes. For example: what did customers think of a new menu option? Did they rebook after experiencing the new seat configuration? How are bookings changing because of the new pricing algorithm? Organisations won’t get a nice neat NPS score, but they will get information they can use to drive change more directly. As the goal is improvement, organisations don’t need NPS’s approximate scorekeeping.
Result: We hope NPS goes in the dustbin of business fads.
Who is doing this: Not many organisations have been brave enough to get off the drug of NPS, but some are heading that way.
The end of quality as we know it
What is broken?

Quality samples were a necessary evil because most organisations couldn’t afford to sample more than a few interactions or work items per month. Sampling 4-5 (or at best 10) contacts a month was never statistically valid, but for years that was ignored, and staff have lived in fear that, of their 1,000-2,000 interactions a month, one bad one will show up in the sample of four or five. Many samples measured subjective adherence to a narrow standard that drove poor behaviours. At one client, every agent had learnt to ask fake rapport questions such as “how’s your day going?” or to “recognise” the crying baby in the background, because that ticked a box on the quality survey.
Solutions:
AI can now review every recorded interaction (not a sample) in near real time and assess hundreds of attributes of an interaction, rather than a few. It can also correlate these assessments to the customers’ reactions and business outcomes. AI can even warn supervisors in near real time that an agent did something wrong or that a customer is reacting badly. It can measure if an interaction spawns other work or correlates with a loss of revenue, loss of customer, or complaint. Where quality samples steered away from long interactions to build sample size, AI can dig deep into long and problematic contacts. The investment that quality teams put into measurement can now be re-invested in improvement through training, coaching or process change.
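As a rough illustration of assessing every interaction against many attributes rather than sampling a few, here is a toy sketch. The rule functions, phrases and transcripts are hypothetical stand-ins for AI models scoring real recordings:

```python
# Toy sketch: scoring every transcript against multiple quality attributes
# instead of manually sampling a handful per agent per month.
# The phrase rules below stand in for trained AI assessment models.
def assess(transcript):
    text = transcript.lower()
    checks = {
        "greeted_customer": "thanks for calling" in text,
        "confirmed_resolution": "anything else" in text,
        "negative_customer_reaction": any(
            phrase in text
            for phrase in ("this is ridiculous", "speak to a manager")
        ),
    }
    # Flag for near-real-time supervisor review if the customer reacted badly.
    checks["needs_supervisor"] = checks["negative_customer_reaction"]
    return checks

transcripts = [
    "Thanks for calling. ... Is there anything else I can help with?",
    "Hello. ... I want to speak to a manager, this is ridiculous.",
]
results = [assess(t) for t in transcripts]
flagged = sum(r["needs_supervisor"] for r in results)
print(flagged, "of", len(results), "interactions flagged")
```

Because this runs over 100% of interactions, the scores can then be correlated with outcomes like repeat contact, complaint or churn, which a sample of four or five contacts never could.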
Result: Manual quality sampling should be dead and buried.
Who is doing this: About 10-15% of businesses have some form of automated assessment, and many more have started the journey.
Customer measured resolution
What is broken?
Many organisations must ask the customer, through some form of survey, whether an interaction was resolved. This never works very well, as often customers don’t know. For example, they may have been promised a resolution without knowing whether it will happen. They may think the information they were given is correct, only to find later that it was wrong. Other organisations approximate resolution by estimating repeat contacts as the “flip side” of resolution, but they use crude measures such as “two contacts from the same customer in a week”. These are rough measures: they can count both valid and invalid repeats, while repeats outside the time frame don’t get measured at all. Customers may not realise something wasn’t fixed until they get their next bill or use the service they booked weeks later, so the timing of “poor resolution” repeats varies. Very few organisations have accurate resolution measures.
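A small sketch shows why the fixed-window repeat measure misleads. The customers and dates below are invented; the point is that a repeat surfacing on the next bill, outside the window, is simply invisible to the metric:

```python
# Toy sketch of the crude "two contacts within a week" repeat measure
# and why it misses repeats that fall outside the window.
from datetime import date

def window_repeats(contacts, window_days=7):
    """Count contacts following another from the same customer within the window."""
    repeats = 0
    last_seen = {}
    for customer, day in sorted(contacts, key=lambda c: c[1]):
        last = last_seen.get(customer)
        if last is not None and (day - last).days <= window_days:
            repeats += 1
        last_seen[customer] = day
    return repeats

contacts = [
    ("alice", date(2024, 3, 1)),   # first contact
    ("alice", date(2024, 3, 4)),   # counted: within 7 days
    ("bob",   date(2024, 3, 1)),   # billing issue raised...
    ("bob",   date(2024, 3, 20)),  # ...resurfaces on the next bill: missed
]
print(window_repeats(contacts))
```

The measure reports one repeat, even though bob's unresolved billing issue came back nineteen days later, and it cannot tell whether alice's second call was even about the same problem.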
Solutions:
AI can be trained to measure resolution from “both ends”. It can measure the outcomes of an interaction and classify which of those are resolved. It can also spot customers’ language at the start of an interaction that shows something is a repeat contact, for example, “I’m calling again about…” or “I’m calling because last time you…”. AI can also be trained to tell which interactions are related and which aren’t, so that repeat contacts can be measured accurately. In interaction analysis, AI can measure indicators of resolution such as agents completing work or supplying the information that customers asked for.
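In its simplest form, spotting repeat-contact language looks like the sketch below. The phrase list and openings are illustrative stand-ins for a trained language model that would also judge whether the interactions are actually related:

```python
# Toy sketch: spotting repeat-contact language in a customer's opening words.
# The phrase list stands in for a trained language model.
REPEAT_PHRASES = (
    "i'm calling again about",
    "i'm calling because last time",
    "still not fixed",
)

def looks_like_repeat(opening):
    """Return True if the customer's opening words signal a repeat contact."""
    text = opening.lower()
    return any(phrase in text for phrase in REPEAT_PHRASES)

openings = [
    "Hi, I'm calling again about my broadband outage",
    "Hello, I'd like to book a flight",
]
print([looks_like_repeat(o) for o in openings])
```

Unlike the fixed-window measure, this signal works however long ago the original contact was, because it comes from what the customer actually says.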
Result: No need to survey customers on their perception of resolution.
Who is doing this: About 10% of our clients have invested in real-time AI and analytics.
Staff measured demand reasons
What is broken?
Another mini-industry in some businesses is trying to figure out why customers are interacting or complaining. Often staff must try to classify contacts into narrow categories or a series of inaccurate drop-downs. That process often produces limited and untrusted classification of contact drivers.

Solutions:
AI can slice and dice the reasons for interaction and summarise interactions, just as it can summarise online meetings. It can provide insights on the apparent causes and the likely outcomes, and link the reasons for contact with the actions taken. It can even advise staff in real time on what information to access, which steps to take, and things they missed. Even if this demand classification is only 80-90% accurate, it saves time and can provide far richer information than limited drop-downs.
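The classification step can be sketched as below. The reason categories, keyword rules and summaries are invented for illustration; a real system would use a language model instead of keywords and accept that some share of contacts lands in an unclassified bucket:

```python
# Toy sketch: auto-classifying contact reasons instead of agent drop-downs.
# Keyword rules stand in for an AI classifier; the categories are invented.
RULES = {
    "billing": ("charge", "invoice", "bill"),
    "outage": ("outage", "no service", "down"),
    "cancellation": ("cancel", "close my account"),
}

def classify(summary):
    """Assign a contact-reason label to an interaction summary."""
    text = summary.lower()
    for reason, keywords in RULES.items():
        if any(k in text for k in keywords):
            return reason
    return "unclassified"  # the 10-20% an imperfect classifier won't catch

summaries = [
    "Customer disputes a charge on last month's bill",
    "Internet has been down since Tuesday",
    "Wants to understand roaming options",
]
print([classify(s) for s in summaries])
```

Even this crude version removes the agent's data-entry burden entirely; the AI version adds nuance the drop-down never had, such as multiple reasons per contact.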
AI can correlate this information with other insights. For example, it could analyse which types of chat or calls cause the most complaints, take longer to manage, or have the lowest resolution rate. Smart organisations are using customer interaction data to analyse the root causes of demand across the business and to understand the end-to-end costs. AI is also changing how we at Limebridge analyse opportunities for our clients, as we can diagnose large samples of contacts far faster.
Result: Manual classification of interactions should be no more.
Who is doing this: 15-20% of clients have some automated analysis in place, and more have it as a work in progress. We see automated interaction summaries at nearly 40% of our clients.
Why is this a better world?
What this should mean is a great reduction in surveys and customer effort. In the new world, we should only put customers to work when we want their opinion and open-ended feedback. We can reduce the cost of the tools and analysis behind some of these metrics and reinvest in training the AI to give us better analysis. We can have fewer QA assessors and more coaches and trainers. We can spend less time analysing demand and more time investigating and building solutions to bad demand. While the initial transition to AI comes at a cost, it can lead to turning off measurement software, which saves money.
Summary
AI has started changing not only how we measure but what we can measure. It enables much deeper and more precise analysis, and it reduces the cost of current measurement mechanisms, like surveying customers and manually assessing quality, that have really been hidden overheads. This is a whole new world that should be both cheaper and more insightful. If you’d like to discuss this in more detail, please get in touch at info@limebridge.com.au or call 0438 652 396.








































