Distrust and Verify

By Mary Pat Campbell

The Stepping Stone, November 2021


ASOP 23, “Data Quality,”[1] has long been one of my favorite Actuarial Standards of Practice, and it gives some guidance on checking data. There is an expectation that if we’re using data in a material way, we at least consider that there may be problems with them. Of course, before one even gets to the data behind a piece of research, there is a reliance on “experts,” whom we should also question. The new ASOP 56, “Modeling,”[2] may also be good to consult.

I learned a lesson about reliance on data from an article I published in February 2013,[3] “Everybody Cheats, at Least Just a Little Bit.” It was a review of Dan Ariely’s book, The (Honest) Truth About Dishonesty.[4]

It turns out that an insurance example I used to illustrate the theme of the book was based on fraudulent research.

Maybe I should have taken a hint from the title of the book.

Explanation of the Fraud and Its Detection

In the experiment, personal auto insurance applicants self-reported the number of miles they drove per year. There were two different application set-ups: in one, applicants signed a statement that all the information on the application was truthful in the usual place, at the end of the application; in the other, they signed that statement before filling in the information.

The claim was that those who attested first and then filled out the rest of the application self-reported, on average, 2,400 more miles than those who attested at the end.

In my review, there is a dismissive statement regarding what the insurance company did with those results (to wit: nothing), but I should not have been so dismissive. It turns out the data in the original research were faked.

Researchers trying to replicate the results looked at the original data and saw some anomalies. Additional researchers then examined the data from the attempted replication (published in 2020)[5] and the data released from the original experiment (published in 2012),[6] and found even more anomalies. The details can be found in a blog post at Data Colada.[7]

Pretty much all the anomalies, such as indications that a uniform pseudorandom number generator (PRNG) was used to generate odometer readings, pointed to data falsification.
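For the curious, here is a minimal sketch, in Python, of the sort of check that surfaces this particular anomaly; the file and column names are hypothetical, and this is not the Data Colada authors’ actual analysis. Real annual-mileage figures tend to cluster and taper off, while values drawn from a uniform PRNG spread evenly across their range, which a simple goodness-of-fit test (or even a histogram) will show.

```python
# Hypothetical sketch: test whether reported mileage looks suspiciously uniform.
import pandas as pd
from scipy import stats

df = pd.read_excel("odometer_study.xlsx")      # hypothetical file name
miles = df["miles_driven"].dropna()            # hypothetical column name

# Kolmogorov-Smirnov test against a uniform distribution over the observed range
lo, hi = miles.min(), miles.max()
ks_stat, p_value = stats.kstest(miles, "uniform", args=(lo, hi - lo))

print(f"KS statistic vs. uniform: {ks_stat:.3f} (p = {p_value:.3f})")
# Genuine mileage data should reject uniformity decisively; fabricated values
# generated by a uniform PRNG will look "too flat" and fail to reject it.
```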

The most hilarious “tell” was that two different default Excel fonts were used in the files: half the data used Calibri (the default sans serif font in Microsoft Office applications, and usually the default font in Excel), and half used Cambria (the default serifed font). It seems the original data were in Calibri; somebody copied the data over, used a PRNG to add to those numbers to try to hide the falsification, and somehow switched the font while doing so. Perhaps the font was switched deliberately, so that whoever was doing the falsifying could eyeball the originals against the copies.
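Checking for that sort of tell is also easy to automate. Below is a minimal sketch using openpyxl to tabulate which fonts appear in a spreadsheet; the file name is hypothetical, and this is not the investigators’ actual code.

```python
# Hypothetical sketch: count the fonts used across all cells of an .xlsx file.
from collections import Counter
from openpyxl import load_workbook

wb = load_workbook("odometer_study.xlsx")      # hypothetical file name
font_counts = Counter()

for ws in wb.worksheets:
    for row in ws.iter_rows():
        for cell in row:
            if cell.value is not None:         # skip empty cells
                font_counts[cell.font.name] += 1

print(font_counts)
# A file whose rows split roughly in half between Calibri and Cambria
# would be worth a much closer look.
```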

Distrust and Verify

As the authors wrote in the Data Colada post:

“We have worked on enough fraud cases in the last decade to know that scientific fraud is more common than is convenient to believe, and that it does not happen only on the periphery of science.
....
“A field that ignores the problem of fraud, or pretends that it does not exist, risks losing its credibility. And deservedly so.”[8]

I should have known to be suspicious of the claim that those shortsighted insurance companies ignored the oh-so-wise scientists and their results. Insurance companies do not leave money on the table without good reason. Yes, some individual insurers do less-than-optimal things, but the industry as a whole generally does not, at least not without consequences. If insurers were underpricing risk, especially in a line as short-tailed as personal auto, it would rapidly become clear.

So no, there wasn’t “one neat trick” insurers could use to get more truthful odometer readings from drivers. As it is, many insurers have developed ways to directly measure not only how many miles someone drives, but also where and how erratically they drive. That’s a far better measure of the risk involved than moving a signature line on an insurance application.

So, this is my first lesson from this episode: distrust and verify.

I had not looked into the research result myself. I went by the description the author (one of the co-authors of the original research) gave in his own book. If I had looked up the published paper, not only would I have found the details, but I would also have found the original data. Many of the anomalies in the data are not difficult to find, once you know to look for weirdness.

Consistent with ASOPs 23 and 56, if one were going to base one’s modeling on such results, it would be time to actually test the original research.

While ASOPs 23 and 56 apply to U.S. actuaries, they are broad professional standards that can be used beyond the U.S., and even beyond actuarial work. Internationally, many practitioners are seeing new requirements to disclose modeling results and tests, so testing assumptions will have to be part of our regular work.

Keeping Motives in Mind

As mentioned, a default mode for trying to include findings from outside our own work should be: distrust and verify. There are many different motives involved in publishing research, and I should have remembered that.

It is not clear (yet) who perpetrated the fraud. It is extremely unlikely that senior people at the insurance company would have any incentive to do the type of data manipulation involved.

In the insurance business, there are repercussions for denying reality. In business, a “negative” result—finding that there is no “there” there—is a positive result. When you realize there’s a factor that doesn’t make a difference, you can safely ignore it, leaving it out of your models and your operations.

I’ve often been the person checking arguments that certain dimensions were driving results. Through various modeling projects I’ve been involved in, I’ve had access to policyholder transaction files (for annuities), and could test out proposed models directly. These past two years, I’ve been dissecting various COVID-related data sets, testing out different hypotheses. When I find out that something people think is driving a result actually shows no correlation whatsoever, that’s a publishable result for me, especially if it’s a correlation that many people think does exist. If the lack of correlation is real, that is something insurers need to know.
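As an illustration, here is a minimal sketch of the kind of check I mean. The file and column names are hypothetical, and this is not code from any of the projects described above; it simply shows how quickly one can ask whether a proposed driver correlates with an outcome at all.

```python
# Hypothetical sketch: is there any relationship between a proposed driver and the result?
import pandas as pd
from scipy import stats

df = pd.read_csv("experience_data.csv")        # hypothetical file name
r, p_value = stats.pearsonr(df["proposed_driver"], df["observed_result"])

print(f"correlation r = {r:.3f}, p = {p_value:.3f}")
# A near-zero r with a large p-value is a "negative" result, and for pricing
# and modeling purposes, knowing a factor can safely be dropped is useful.
```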

But in academia, it is difficult to publish a paper that says: “We hypothesized that treatment A would make a difference, and there was no difference between the default and treatment A.” Even if it is true, and even if other people would like to know that information, academic journals do not want to publish these negative results. That there’s no difference is boring.

But boring is the nature of actuarial work. Or, rather, when actuarial work gets really interesting, that’s generally a bad time for everybody involved.

OK, I’m kidding about actuarial work being boring. But the point is, I trusted published results too much, and ignored the practical, real-world actions of insurers in light of this supposedly true result. The margin in the published research was a 9 percent difference in reported mileage (the claimed 2,400-mile gap relative to average reported mileage). Imagine personal auto insurers underpricing auto by 9 percent! Ouch.[9]

This is not to say that the academic papers that show a new correlation are always wrong, and that business is always right. I’m saying that the consequences for being right or wrong are different for academic researchers trying to get published and businesses trying to set pricing strategy.

The Consequences

So far, there have been only small consequences for Dan Ariely and his co-researchers, as far as I know. The editors of the journal issued a retraction of the 2012 paper.[10] Ariely had turned his research into a series of popular nonfiction books, which I have reviewed, and he still has a column at the Wall Street Journal. He maintains his professorship at Duke University, and he keeps his association with the Center for Advanced Hindsight, which he founded. It remains to be seen whether he will suffer professionally in any way. If the funding continues to flow in, what does it matter if one little experiment went awry?

However, I have learned to be a little more suspicious of such “one simple trick” methods for making insurance easier.

In the age of InsurTech and increased risks in a pandemic, a bit of skepticism in the face of gee-whiz claims is useful. The claim played to a bit of righteous thinking (“what myopic management!”), and I should have distrusted the result precisely because it flattered my own biases. Here I thought I was being a sophisticate, and I was just as gullible as people who believe clickbait articles.

One thing this episode did cement for me is the importance of our own professional standards as actuaries. If an actuary were involved in falsifying these data,[11] there would be real consequences: the danger of having one’s credentials suspended, or of being expelled entirely from actuarial organizations. Actuaries’ credibility has been hard-won through a long history not only of developing better practices and better standards over time, but also of policing those standards.

In light of those standards, I must respectfully retract part of my old article.

And I must remember to be more skeptical in the future.

Statements of fact and opinions expressed herein are those of the individual authors and are not necessarily those of the Society of Actuaries, the newsletter editors, or the respective authors’ employers.


Mary Pat Campbell, FSA, MAAA, is vice president, Insurance Research, for Conning in Hartford, Conn. She can be contacted at marypat.campbell@gmail.com. LinkedIn: https://www.linkedin.com/in/marypatcampbell/


Endnotes

[1] http://www.actuarialstandardsboard.org/asops/data-quality/, accessed Oct. 4, 2021.

[2] http://www.actuarialstandardsboard.org/asops/modeling-3/, accessed Oct. 4, 2021.

[3] SOA copy: https://www.soa.org/globalassets/assets/library/newsletters/stepping-stone/2013/february/stp-2013-iss49-campbell.pdf

[4] Amazon link to book: https://amzn.to/3l4Z7Hh

[5] PNAS, March 31, 2020, 117 (13): 7103-7107; first published March 16, 2020. https://doi.org/10.1073/pnas.1911695117.

[6] PNAS, Sept. 18, 2012, 109 (38): 15197-15200; first published Aug. 27, 2012. https://doi.org/10.1073/pnas.1209746109. A retraction was issued on Sept. 13, 2021.

[7] “[98] Evidence of Fraud in an Influential Field Experiment About Dishonesty,” Data Colada, published Aug. 17, 2021, http://datacolada.org/98. Accessed Oct. 4, 2021.

[8] Ibid.

[9] Of course, as P&C actuaries know, pricing in the P&C world is also hugely driven by the competitive environment, so “hard” versus “soft” pricing cycles can often have an amplitude swamping out this 9 percent difference.

[10] https://www.pnas.org/content/118/38/e2115397118, retraction notice based on the Data Colada posting, issued Sept. 13, 2021.

[11] Given incentives, I hugely doubt an actuary or even an actuarial student was involved in falsifying the data.