“Compliance Sabermetrics” – Data Will Change Assumptions That Plague Compliance

In 1969, the computer systems company Information Concepts Incorporated transformed the world of baseball with their creation of “The Baseball Encyclopedia.” The book (affectionately nicknamed “Big Mac” in homage to both its heft and its publisher, Macmillan) gave the sport its first fully comprehensive and rigorously researched compendium of baseball statistics – and inspired a generation of fans interested in baseball history and statistical research. In 1971, sixteen of these “statistorians” formed the Society for American Baseball Research (aka SABR or “saber”) and began using the Big Mac to develop new and innovative measures to compare players and predict outcomes. By 1980, this practice had been given a new name: sabermetrics.

Today, sabermetrics is an integral part of Major League Baseball. Virtually every MLB team employs sabermetricians, who replace “experience” and “intuition” with empirical data analysis. The approach has enabled teams like the Tampa Bay Rays and Oakland Athletics to build winning records on modest budgets (most famously illustrated in the 2003 book and 2011 movie Moneyball). This disruptive quality, along with its tendency toward counterintuitive maxims, has helped this “big data” approach to decision making capture people’s imagination.

A similar data-driven wave of change has started in compliance. For a long time, mostly because of data limitations, there was little examination of the cause and effect of different efforts in compliance systems. Even worse, there was no evidence for what compliance data was actually telling us about those systems. For example, are elevated hotline report volumes a good thing, illustrating employees’ willingness to speak up, or a signal that the organization has deeper problems? Without additional information, this single statistic can’t tell the story. As a result, compliance officers have largely had to rely on their own experience and intuition when interpreting data. While these can provide valuable insights, they also open the door to incorrect assumptions and personal biases.

This uncertainty around decisions and data is beginning to change in compliance. We are entering a golden age of understanding compliance data, in which the information it provides can be used to discover signals and causal mechanisms.

Those at the cutting edge of this movement are giving birth to new rigorous data protocols in compliance, enabling compliance officers and organizations to more easily contextualize risk signals and better predict outcomes.

Hotline reporting data in particular is an incredibly valuable source of information for compliance leaders and executive leadership. Reporting information is the pulse check of organizational culture and should be weighed, analyzed, and acted on accordingly.

Data’s Counterintuitive Insights for Compliance

A few counterintuitive insights have emerged from the initial efforts in this area. One of the most frequently noted is that firms with more actively used compliance reporting programs (those receiving more reports per employee) perform “better” on almost every measure (e.g., higher profitability, better governance structures, less negative media coverage).

Moreover, organizations with the highest volume of reports per employee were the least likely to suffer lawsuits and fines, and those that did paid less on average in fines and settlements than their peers.

These metrics undercut the widely held assumption that more reports indicate more problems.

This may raise the question, “What does the data tell you about the quality of reports as report volume increases?” Compliance officers are certainly aware of the misuse of feedback systems and may be rightly concerned about time and resources wasted on bad-faith reports. Additionally, as with most business activities, there are almost always diminishing returns at larger scales of investment. The question is, has this happened with feedback systems?

In a crude analysis of report quality, we measured the prevalence of two factors as reporting volume increased: named (vs. anonymous) reporting and report completion. Previous analysis of reporting data has demonstrated that reports which include the identity of the reporter and those with fully completed information (e.g., fields such as management involved, how long the issue has been going on, and how it was discovered) were most likely to be useful or informative, and were less difficult to examine. Granted, we do not know how important any individual report in these two categories is, but a report whose author provides an identity for follow-up and includes standard details about the reported problem clearly scores higher on this measure of quality.

Ex ante, we would likely expect the information quality of reports to go down as volume increases. As more individuals make reports, it seems natural for there to be more problems and limitations in the data those users provide. The chart below shows this is not the case.

In the chart, we separate report volume into quintiles, with 1 being the lowest 20% of report volume in a given firm-year and 5 being the highest (top 20%), controlling for industry and other firm factors.

Instead of finding what we would assume, we observe that the quality of information in reports (i.e., completeness) increases with report volume. This pattern, in combination with other evidence, is consistent with the assertion that firms with higher reporting volume likely 1) have more training and information on effective use of reporting systems, resulting in 2) higher levels of information in those reports and 3) more problems being uncovered before they get worse.
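For readers who want to try a similar comparison on their own hotline data, the quintile approach described above can be sketched in a few lines of Python. This is a minimal illustration and not the analysis behind the chart: the field names (`reports`, `employees`, `named_share`, `complete_share`) and every value in `firm_years` are hypothetical, and the sketch does not control for industry or other firm factors.

```python
from statistics import mean

# Hypothetical firm-year records. All names and numbers are invented
# purely to make the sketch runnable; they are not real results.
firm_years = [
    {"firm": "A", "reports": 12, "employees": 4000, "named_share": 0.30, "complete_share": 0.55},
    {"firm": "B", "reports": 20, "employees": 5000, "named_share": 0.32, "complete_share": 0.58},
    {"firm": "C", "reports": 45, "employees": 6000, "named_share": 0.35, "complete_share": 0.60},
    {"firm": "D", "reports": 30, "employees": 3000, "named_share": 0.36, "complete_share": 0.61},
    {"firm": "E", "reports": 80, "employees": 5000, "named_share": 0.40, "complete_share": 0.64},
    {"firm": "F", "reports": 50, "employees": 2500, "named_share": 0.41, "complete_share": 0.66},
    {"firm": "G", "reports": 90, "employees": 3000, "named_share": 0.44, "complete_share": 0.69},
    {"firm": "H", "reports": 70, "employees": 2000, "named_share": 0.45, "complete_share": 0.70},
    {"firm": "I", "reports": 100, "employees": 2000, "named_share": 0.48, "complete_share": 0.73},
    {"firm": "J", "reports": 120, "employees": 2000, "named_share": 0.50, "complete_share": 0.75},
]


def assign_quintiles(values):
    """Return a label per value: 1 = lowest 20%, 5 = highest 20%.

    Ranks are split into five roughly equal buckets. Ties are broken by
    input order, which is a simplification.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n + 1
    return labels


def quality_by_quintile(records):
    """Average the two quality proxies within each reports-per-employee quintile."""
    rates = [r["reports"] / r["employees"] for r in records]
    labels = assign_quintiles(rates)
    summary = {}
    for q in range(1, 6):
        group = [r for r, label in zip(records, labels) if label == q]
        if group:  # a quintile can be empty when there are fewer than 5 records
            summary[q] = {
                "n": len(group),
                "named_share": mean(r["named_share"] for r in group),
                "complete_share": mean(r["complete_share"] for r in group),
            }
    return summary


for quintile, stats in quality_by_quintile(firm_years).items():
    print(quintile, stats)
```

Normalizing by headcount before ranking keeps large firms from dominating the top quintile simply because they have more employees. A fuller analysis would also adjust for industry and firm characteristics, as the text notes.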

This is just one example of the counterintuitive insights emerging from this field that are changing assumptions about compliance and employee feedback systems. The leading edge of the compliance industry is exploring what I call compliance sabermetrics – the use of big data to provide additional insights to management. These efforts are causing management to change their perspectives on compliance.

Under the old management mantra, when an audit committee observed higher-than-benchmark employee feedback through their reporting system, they might have wrongly asked, “Why do we have more problems than our peer firms?” The new wave of insights from data will cause a different question to be asked in the future: “Do we have the resources to make sure we are effectively investigating our increased information from employees?”

Internal reporting data should be treated as the wealth of information it is. Executive leadership should prioritize creating a culture that allows for honest reporting and appropriate follow-up to rectify problems.


The increasing collection and analysis of compliance data will further challenge long-held assumptions about which metrics warrant attention and what they indicate about a company’s organizational culture and health. Successful firms will invest in these efforts, de-emphasizing intuition in favor of empirical data analysis.

