An Executive’s Guide to Implementing AI and Machine Learning

Some lessons I’ve learned applying AI/Machine Learning to support business objectives

As a Chief Analytics Officer, I’ve had to bridge the gap between business needs and data scientists. How well that gap is bridged determines, in my experience, how much of the value and promise of artificial intelligence (AI) and machine learning is realized. Here are a few things I’ve learned.

AI = machine learning (at least in 2019)

Machine learning is a path to AI. As of 2019, it is the only viable path I’m aware of, though other approaches may emerge in the coming years. The two terms are not interchangeable, but for our purposes I will focus on machine learning.

Machine learning is a category of tools and approaches where a computer is given a large training set of data that includes an “answer key”. The machine then learns how to derive the answer key from combinations of the inputs. The model is then tested against a different testing data set to determine its accuracy.
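The train-then-test workflow can be sketched in a few lines of Python. Everything here is an invented stand-in, not the article’s actual data: a single synthetic input, a deterministic “answer key”, and a brute-force threshold search playing the role of the learning algorithm.

```python
import random

random.seed(0)

# 1,000 synthetic rows: a single input plus an "answer key" label.
data = [(x, int(x > 0.5)) for x in (random.random() for _ in range(1000))]

# Hold out a testing set the model never sees during training.
train, test = data[:800], data[800:]

def accuracy(threshold, rows):
    # Fraction of rows where "input > threshold" matches the answer key.
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

# "Learn" the best cut point from the training set by brute force.
learned = max((t / 100 for t in range(100)), key=lambda t: accuracy(t, train))

print(f"learned threshold={learned:.2f}, test accuracy={accuracy(learned, test):.2f}")
```

The important structural point is the split: the model is fit only on `train`, and its reported accuracy comes only from `test`.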

Machine learning as a category can include basic statistical tools (e.g. linear regression) that fit this approach. It also includes neural networks, decision trees, and several other tools.

Is machine learning the right tool for the problem you’re trying to solve?

This one has tripped me up in the past.

For example, I recently worked with a large data set collected from hospitals that contained, for each employee, fifty measurements (for example, whether they showed up for work on time or whether they were consistently the only experienced person on their shift) and an indicator of whether they resigned in the weeks and months that followed. The question was: given this data set, could we create a model that predicts which employees will resign before they do so, allowing hospitals to intervene early?

We spent months reviewing the data set and used basic data visualization approaches to determine a set of rules. For example, newly hired employees were twice as likely to resign as employees who had already worked at the hospital for ten years. Employees in certain clinical specialties had different average resignation rates, as did employees in specific age ranges. Employees who were asked to work over sixty hours a week for two or more weeks per month were 50% more likely to resign.
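The kind of rule-finding described above amounts to computing outcome rates by group. A minimal sketch, using a toy data set of invented rows (not the hospital data) and only the standard library:

```python
# Toy employee records; every row here is invented for illustration.
employees = [
    {"tenure_years": 0,  "resigned": True},
    {"tenure_years": 0,  "resigned": True},
    {"tenure_years": 0,  "resigned": False},
    {"tenure_years": 0,  "resigned": False},
    {"tenure_years": 10, "resigned": True},
    {"tenure_years": 10, "resigned": False},
    {"tenure_years": 10, "resigned": False},
    {"tenure_years": 10, "resigned": False},
]

def resignation_rate(rows):
    # Fraction of the group that resigned.
    return sum(r["resigned"] for r in rows) / len(rows)

new_hires = [r for r in employees if r["tenure_years"] == 0]
veterans = [r for r in employees if r["tenure_years"] == 10]

# In this toy data, new hires resign at twice the veterans' rate.
print(resignation_rate(new_hires), resignation_rate(veterans))
```

In practice this would be a groupby over the real data, but the shape of the analysis is the same: slice by a candidate factor, compare rates.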

Was this problem a good candidate for machine learning? Couldn’t we just assemble these rules and build a statistical model?

Ultimately, I believe it was a good candidate, but a similar problem might not have been. Caltech Professor Yaser Abu-Mostafa describes three qualities that problems need to have to be candidates for machine learning:

  1. There is enough data.
    • Machine learning becomes a better approach, relative to other options, the more data you have to train it. If you only have a few hundred rows of data, it may not work at all, or may not work effectively.
  2. There is a relationship between the inputs you have and what you’re trying to predict.
    • In my example, our visual review of the data showed there were a lot of those relationships.
  3. The pattern cannot be described in plain English.
    • In my example, this one is less clear. We were able to describe many of the patterns in plain English, so why couldn’t we just build a model on those patterns? In our case, there were both hidden patterns we could not discern and relationships between variables we could not describe. For example, if someone is both working many hours and the most experienced member of their shifts, is the impact of the two factors additive?

Starting here and making sure that the right tools are deployed might save you a lot of time. Even if the above are true for your data set, are they true enough to justify the investment in the approach?

From an ROI perspective, is the value of correctly predicting an event that happens equal to the cost of erroneously predicting one that does not?

Data analysts compare and optimize models based on accuracy. Accuracy reflects a tradeoff between correctly predicting as many of the events that will happen as possible and incorrectly predicting as few of the events that will not. There are ways to adjust the model thresholds to see this tradeoff after the models are run, but interpreting such results requires a bit of mental gymnastics.

A better approach for business owners is to frame the models, before running them, in terms of the business objective. In our example, the value of correctly predicting an employee who will resign might be worth thousands of dollars (i.e. the value of avoiding having to replace them with a new hire), whereas the cost of incorrectly predicting the resignation of an employee may be small. However, there are thresholds: if 10% of the workforce resigns every year, our understanding of our customers tells us that we can’t flag more than, say, 20% as high risks to intervene on. These criteria need to be translated from business requirements and vernacular (e.g. ROI) into model inputs (e.g. penalty matrices).

The important point is to determine the relative cost versus value of this tradeoff and to ensure the model is built to optimize on it.
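One way to operationalize this tradeoff is to choose the model’s decision threshold by expected business value rather than raw accuracy. A minimal sketch, with made-up dollar figures and made-up (score, outcome) pairs standing in for a scored employee list:

```python
# Illustrative business figures, not the article's actual numbers.
VALUE_TRUE_POSITIVE = 3000   # value of catching a real resignation early
COST_FALSE_POSITIVE = 150    # cost of intervening on a loyal employee

# (risk score, actually resigned?) pairs -- synthetic for illustration.
scored = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1),
          (0.4, 0), (0.3, 0), (0.2, 1), (0.1, 0)]

def expected_value(threshold):
    # Total value of intervening on everyone at or above the threshold.
    value = 0
    for score, resigned in scored:
        if score >= threshold:
            value += VALUE_TRUE_POSITIVE if resigned else -COST_FALSE_POSITIVE
    return value

# Pick the cutoff that maximizes business value, not accuracy.
best = max([t / 10 for t in range(1, 10)], key=expected_value)
print(best, expected_value(best))
```

With asymmetric costs like these, the value-maximizing threshold sits well below the one an accuracy-optimizing analyst would pick, which is exactly the point: the business objective, not the math, should set the cutoff.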

One of the most famous machine learning implementations, the Netflix Prize, overlooked this, I believe. The prize was given to the model that could most accurately predict the movies a viewer would like. Accuracy, in their case, was indifferent to whether the model gave a low score to a movie a viewer might love or a high score to a movie they might dislike. Is that the right business need, though? If Netflix recommends five movies to me and I know I will strongly dislike two of them, as a user, I may well discount any of their recommendations in the future. But if I see five movies recommended for me and the list doesn’t happen to include a movie I know I’ll love… that’s really not an issue for me at all.

Do you need stability in the model? In other words, if one small thing changes in the input, is it OK that the output swings wildly?

In my experience, the more accurate the model (at least by machine learning’s definition of accuracy), the more discontinuous the input-output relationship.

If you have a series of photos and you want a machine learning model to flag the ones that include a fire hydrant, you may well be OK with this level of instability because there’s no case where a user will change a pixel in a photo and expect that the output will remain similar.

In our case, though, some amount of stability is important. If we’re following an employee over time, we don’t want their predicted resignation risk to jump unpredictably every month when their age goes from 31.1 to 31.2 to 31.3.
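One simple way to damp month-to-month swings is to blend each new raw model score with the previous smoothed score (an exponential moving average). The scores and the 0.3 blending factor below are illustrative, not the article’s actual dampening method:

```python
def smooth(raw_scores, alpha=0.3):
    # Exponential moving average: each smoothed value blends the new
    # raw score (weight alpha) with the prior smoothed value.
    smoothed = [raw_scores[0]]
    for score in raw_scores[1:]:
        smoothed.append(alpha * score + (1 - alpha) * smoothed[-1])
    return smoothed

# A raw model output that spikes in month three, then settles back.
raw = [0.20, 0.22, 0.80, 0.25, 0.24]
damped = smooth(raw)
print([round(s, 2) for s in damped])
```

The cost of this stability is responsiveness: a genuine jump in risk shows up more slowly, so the blending factor is itself a business decision.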

Do you need to know which input fields are contributing to the predicted output? Or is it OK to have a complete black box?

Neural networks are black box models. You give one your inputs and it returns an output; that’s it. You get no insight into how it arrived at the output or which inputs were most heavily weighted in determining it. Neural networks tend to be more accurate than other models, but is the extra few percentage points of accuracy worth the complete lack of transparency, given your business needs?

Other models, such as linear regression and decision trees, will show you the combination of inputs that leads to the outputs. Data analysts may want to run multiple decision trees and average their outputs (an ensemble approach such as a random forest). This may increase model accuracy but may reduce the ability to see how the model is working.

I’ve found mixing models to be a way to get the best of both approaches. One model is a neural network model (with some dampening factor to maintain stability) that gives the turnover risk value. Another model is a linear regression that identifies the key drivers. We display the results of both to users. There’s some risk in that there’s no guarantee the models will match up (in other words, the neural network may identify a high-risk event that the other model does not identify drivers for) but there are ways to account for this.
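The two-model pairing can be sketched as follows. The “neural network” here is a stand-in nonlinear function, and the linear model’s coefficients are invented, as if estimated from historical data; only the division of labor between the two models reflects the approach described above.

```python
def risk_score(weeks_over_60, months_tenure):
    # Stand-in for the opaque neural network: nonlinear in tenure,
    # capped at 1.0. All weights are invented for illustration.
    return min(1.0, 0.10 + 0.15 * weeks_over_60 + 0.5 / (1 + months_tenure))

# Separately fit linear model, used only to surface the key drivers.
COEFFICIENTS = {"weeks_over_60_hours": 0.12, "is_new_hire": 0.20}

def key_drivers(features):
    # Contribution of each input = coefficient x feature value,
    # sorted so the biggest driver comes first.
    contributions = {k: COEFFICIENTS[k] * v for k, v in features.items()}
    return sorted(contributions.items(), key=lambda kv: -kv[1])

employee = {"weeks_over_60_hours": 3, "is_new_hire": 1}
score = risk_score(weeks_over_60=3, months_tenure=2)
drivers = key_drivers(employee)
print(score, drivers)
```

The user sees both outputs side by side: the risk value from the first model and the ranked drivers from the second, even though neither model was derived from the other.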

In conclusion …

The questions and discussion above were based on experiences I’ve had trying to bridge the gap between business objectives and analytics teams. Meeting business objectives requires a mix of both art and science. The science behind machine learning is increasingly commoditized: models are available for anyone to run, and there is a wealth of resources for getting to mathematically robust results. But the business you’re building is not a mathematical one. It’s a mix of user needs, marketing needs, and the practicalities of integrating and deploying solutions. The art of deploying machine learning solutions is particularly interesting to me, and hopefully these questions give you some points for consideration.
