by Frank Huang | October 22, 2019
In Part One of this article, we discussed the PEO landscape to see whether AI was truly as widespread as marketing and sales departments would have you believe. While we can’t see everything going on in the market, the term AI is often used loosely to mean other forms of analytics, usually those near the more foundational end of the analytics spectrum.
This raises a few more questions I’d like to address in Part Two of this article. Namely:
- What differences do I see even within the same category of analytics (e.g., predictive or prescriptive)?
- What is the potential for analytics in the PEO industry?
- What is the current and potential future state of regulation in the industry?
More Confusion Clarification
As a quick refresher, here is how we described the different types of analytics in the article’s Part One:
- Descriptive – “Something happened?”
- Diagnostic – “Oh, right. Something happened… what happened?”
- Predictive – “This is going to happen again. I’m guessing it’ll happen… here.”
- Prescriptive – “If it’s going to happen here, let’s do something about it!”
Before we move on to the rest of the article, we need to add some more definitions to the mix:
- Artificial Intelligence (AI) is usually defined as a computer being able to perform human-like tasks such as perception, reasoning, and learning.
- Machine Learning (ML) is a subset of AI and represents how computers can learn how to do something without explicit instructions. Within ML, there are both supervised and unsupervised learning.
- Supervised learning covers the building of models based on a dataset that has both inputs and a desired output. The modeler will give some direction but the computer is doing most of the computational labor. Classification and regression models fit in this category.
- Unsupervised learning covers the building of models based on a dataset that has only inputs and no identified output. The computer investigates and discovers patterns and groupings on its own. Examples of unsupervised learning include clustering, anomaly detection, dimension reduction techniques, certain neural network architectures, and more.
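To make the supervised/unsupervised distinction concrete, here is a minimal sketch in Python using only the standard library. The data, the workers’ comp framing, and the function names (`fit_line`, `two_means`) are illustrative assumptions, not anything drawn from a real PEO dataset or product.

```python
from statistics import mean

def fit_line(xs, ys):
    """Supervised: least-squares line, learned from inputs AND known outputs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def two_means(xs, iterations=20):
    """Unsupervised: split unlabeled inputs into two groups (1-D k-means, k=2)."""
    centers = [min(xs), max(xs)]  # start from the two extremes
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Move each center to the mean of its group (keep it if the group is empty)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical labeled data: payroll (in $ millions) and observed claim counts
payroll = [1.0, 2.0, 3.0, 4.0, 5.0]
claims = [2.0, 4.1, 5.9, 8.0, 10.0]
slope, intercept = fit_line(payroll, claims)

# Hypothetical unlabeled data: claim sizes with no outcome attached
claim_sizes = [1.2, 0.9, 1.1, 9.8, 10.3, 1.0, 10.1]
centers, groups = two_means(claim_sizes)

print(f"supervised fit: claims ~ {intercept:.2f} + {slope:.2f} * payroll")
print(f"unsupervised groups: {groups}")
```

In the first case the modeler tells the computer what to predict (claim counts); in the second the computer is only handed claim sizes and finds the small-claim and large-claim groupings on its own.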
AI, in short, covers a lot of ground and a lot of different techniques. And because the terms AI and ML are often used interchangeably, it’s easy for people to be confused about what constitutes “true” AI. For the purposes of this article – and perhaps to better clarify my Part One conclusion – I define AI as models driven mostly[1] by unsupervised learning.
Drivers of Model Performance
Now that we’re more informed about the different types of analytics and what AI really is, one would expect homogeneity of performance and results within any single category of analytics. But that’s not quite true. There can be great differences between two models for three main reasons.
What’s Under The Hood. It’s important to understand what type of engine is running under the hood. Marketing may want you to believe you have a Tesla Model S P100D all-electric engine that goes 0-60 in 2.3 seconds when in fact you have a Toyota Corolla engine humming to 60 mph in a very patient 9.9 seconds.
In nerd-speak, this is asking what the underlying statistical method is. Since we’re agreed that AI – more specifically unsupervised techniques – is not heavily utilized in the PEO industry, that leaves us with more common regression and classification models. The engine used here is predominantly either a linear model or a generalized linear model (GLM).
A full comparison of the two is beyond the scope of this article, but you can think of a linear model as a square and a GLM as a rectangle. Just as a rectangle isn’t restricted to having all four sides be the same length, a GLM isn’t restricted in the ways a linear model is. In practice, this allows the GLM to handle real-world data that a linear model would be ill-equipped to handle, such as claim counts that cannot be negative or claim costs that are heavily skewed. We see this play out in a variety of insurance problems[2], where GLMs are used to predict frequency, severity, and pure premium. We also see this across the spectrum of risks the PEO is involved in, such as workers’ compensation, health benefits, and EPLI.
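As a rough illustration of the difference, the sketch below fits both engines to the same made-up claim-frequency data using only the Python standard library. Everything here is an assumption for illustration: the hazard classes, the claim counts, the average severity, and the helper names `fit_linear` and `fit_poisson_glm`. The GLM shown is one common choice (a Poisson model with a log link, fitted by iteratively reweighted least squares), not a description of any particular vendor’s or rating bureau’s model.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares: y ~ a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

def fit_poisson_glm(xs, ys, iterations=50):
    """Poisson GLM with log link, log(mu) = a + b*x, fitted by IRLS."""
    a, b = math.log(sum(ys) / len(ys)), 0.0  # start from the overall mean
    for _ in range(iterations):
        mus = [math.exp(a + b * x) for x in xs]
        # Working response for the log link; the IRLS weights equal mu
        zs = [a + b * x + (y - mu) / mu for x, y, mu in zip(xs, ys, mus)]
        sw = sum(mus)
        x_bar = sum(w * x for w, x in zip(mus, xs)) / sw
        z_bar = sum(w * z for w, z in zip(mus, zs)) / sw
        b = (sum(w * (x - x_bar) * (z - z_bar) for w, x, z in zip(mus, xs, zs))
             / sum(w * (x - x_bar) ** 2 for w, x in zip(mus, xs)))
        a = z_bar - b * x_bar
    return a, b

# Hypothetical data: hazard class (0 = lowest risk) and observed claim counts
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 1, 1, 3, 4, 9]

a_lin, b_lin = fit_linear(xs, ys)
a_glm, b_glm = fit_poisson_glm(xs, ys)
fitted = [math.exp(a_glm + b_glm * x) for x in xs]

# Pure premium = expected frequency x average severity
avg_severity = 12_000.0  # hypothetical average cost per claim
pure_premiums = [mu * avg_severity for mu in fitted]

print(f"linear model at class 0: {a_lin:.2f} claims")
print(f"Poisson GLM at class 0:  {fitted[0]:.2f} claims")
```

On this toy data the linear model predicts a negative number of claims for the lowest hazard class, which is not a meaningful frequency. The log link keeps every GLM prediction positive, which is one reason this family of models suits count and cost data.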
How Good Is The Mechanic. Another major factor determining how effective a model can be is the experience and creativity of the modeler. All else equal, a modeler with more experience (and who was well trained) should be able to build a better model than a modeler with less experience.
Where experience and training speak to modeling being a science, creativity as a characteristic speaks to modeling being an art. Even with the same type of data (e.g. workers’ comp), there may be curveballs in the current data set that require an approach completely new to the modeler. A creative modeler will be able to identify and take advantage of these nuances to ensure a sound and effective model.
Who is Driving. The last factor that could drive a material difference in the effectiveness of a model is the model user.
Users do not need to understand all the math and statistics behind the development of the model, but they should understand the outputs of the model, including where there is certainty and where there is ambiguity. Much of this should be communicated by the modeler, but third parties can assist in this space.
In the end, putting an inexperienced driver behind the wheel of a vehicle with an underpowered engine hastily put together can lead to a less than fluid driving experience and maybe even prevent the driver from arriving at his/her final destination. In reality, such a situation could result in unprofitable decisions being made. Because of the potential consequences, PEOs should thoroughly vet any analytics being developed or procured and ensure all end users are properly trained.
The Potential for Analytics (and Regulation) in the PEO industry
As I mentioned in my inaugural blog post, the data available within a PEO – and within the PEO industry – is unique. No other industry has the breadth or depth of data that the PEO industry has. Where economists may have to lean on surveys and sample sets of employment and wage data, PEOs have actual data that has the potential to dwarf government sample sizes[3]. Tack on the fact that PEOs also hold data on HR, health benefits, 401(k), and other personal selections, and the opportunity to build truly robust and novel analytical solutions is virtually limitless.
The opportunity to do good also comes with the opportunity to do not-so-good. Notwithstanding existing state and federal regulations that govern data and its use in modeling, there are many other potential pitfalls for the PEO industry. For example, a modeler could use information that s/he is not allowed to use, there may be moral issues with using certain variables, and the model design may discriminate, intentionally or unintentionally[4].
Because of the many potential issues that can arise, it comes as a surprise to me that levels of regulation and oversight are relatively low. The insurance and mortgage industries have many similar functions to the PEO industry – such as underwriting and ratemaking – and yet both insurers and lenders are more heavily scrutinized at both state and federal levels[5]. As the PEO industry evolves and better utilizes its data and risk programs, I anticipate greater regulation will follow suit[6].
There is so much opportunity for analytics to improve how PEOs operate, so it is imperative that leadership understand what is being developed for and sold to them. We have found value in every model that we have seen in the PEO industry, which covers most, though perhaps not all, of the models in use. Our recommendation when considering your many options is to make sure that you know what the products are, and are not, so that you make the best decision possible for your PEO.
- What is under the hood of analytics you are currently using?
- What do you foresee as the next data-based product PEOs will provide to their clients?
- Do you think the PEO industry is over-regulated, under-regulated, or appropriately regulated? Where do you see this going in the future?
- Can the PEO industry be effectively self-regulated?
[1] It is common for a single model to contain both supervised and unsupervised techniques.
[2] For example, the National Council on Compensation Insurance (NCCI) utilizes GLMs in their rate making.
[3] For example, ADP by itself has data on 1 out of every 6 paychecks in the U.S.
[4] Please note that we are not lawyers by trade and thus are not aware of all the legal issues and ramifications involved in this discussion.
[5] Admittedly much of this is due to the involuntary nature of these products, but arguably the broader collection of PEO data could lead to more concerning scenarios.
[6] As an aside, NAPEO has done a good job being proactive at both federal and state levels to ensure the PEO is growing responsibly and finding continued success.
Frank Huang has more than 15 years of actuarial consulting experience serving a wide range of clients, including serving as ADP’s Chief Actuary. Learn more about our PEO consulting practice here.