All We Know Is That We Don’t Know Nothing
It’s been a while since I last wrote one of these pieces. Then again, it’s been a while since I’ve been this angry. You should always write when you’re angry, though editing is probably best left until after you’ve calmed down.
One of the things you get used to as a statistician is that an awful lot of people don’t have any firm understanding of what your job entails. In general, of course, this is fine; there are any number of occupations and disciplines I’d be completely in the dark on myself. Fourteen years after my physics A-level, I still can’t offer any explanation of electricity more complicated than “Wizards, dude; wizards and fairies”. The guy who fixes my sockets might as well be Gandalf with a tool-kit.
The public comprehension gap only becomes the public comprehension problem when a statistician tries to tell that public – or elements within it – something that they don’t want to hear. And right now, as the United States decides who’ll be skipping round the Oval Office for the next four years, any number of statisticians are trying to tell any number of people any number of things that don’t seem to be going down too well.
Today’s question, then, is this: how do you respond to seeing a statistical model that shows you something you instinctively don’t like or don’t believe? We’re going to be using Monday’s piece by Dylan Byers as a case study, because it’s so wretched and unpleasant a response, so insulting to the intelligence of his readers and the professionalism of statisticians, that it constitutes an almost perfect example of exactly what not to do.
What has Byers so tied up in knots is the statistical model of Nate Silver, which attempts to predict the way each state will go in the presidential election, and hence predict the ultimate winner. The way Silver does this is quite complicated, and I don’t want to get into the ins and outs here, but crucially I could if I wanted to, because Silver is entirely upfront about how his model works.
Here is Byers’ dilemma, then: Silver has explained how his model functions – as well as the limits restricting just how accurate such a model can ever be – but Byers apparently can’t understand the explanation. He lacks the statistical background. Again, not having that background isn’t remotely bad in and of itself. What makes Byers’ piece so galling is that he doesn’t seem to think that lack of knowledge should be a bar to rubbishing Silver’s work.
Douglas Adams understood the problem with Byers’ approach very well. There’s a short section in The Salmon of Doubt in which he describes becoming uncomfortable watching a stand-up comedian mocking aeronautical engineers. If a black box is indestructible, asks the comic, then how stupid must you be if you don’t make the plane out of the same material?
The reason this wouldn’t work, as Adams points out, is that a titanium plane would be too heavy to fly. Even if that were overcome, the only difference between crashing in a tube of titanium and a tube of composite materials and aluminium alloy is how much of the surrounding area would be splattered with pieces of passenger. A deathtrap that doesn’t leak is only of benefit to the cleaning crew.
So Adams always distrusted the suggestion that a layman could uncover fundamental problems with complicated systems by putting ten minutes’ thought into the whole idea*. Byers, we can assume, never read Adams or, if he did, didn’t understand what Adams was saying any better than he understood Silver’s maths.
Or maybe he did, and this is all just a massive deception. A stand-up comedian can just pretend to be stupid for the sake of a laugh, after all. Maybe Byers is pretending to be stupid for the sake of a paycheck. The difference, of course, is that the guy behind the mic stand isn’t making any claims to be informing the public.
I’m not trying to claim these things are absolutely airtight, obviously. Occasionally everyone makes a dumb mistake, and that dumb mistake might even be obvious to people with little background in the field in question. We should be profoundly suspicious of Byers, but we can’t dismiss him until we work out if he’s wrong, and why.
Fortunately, Byers makes this process embarrassingly easy, given that he finishes his argument by saying that if something occurs which a model said had only a 1 in 4 chance of happening, that model shouldn’t be trusted (actually, he implies the guy who wrote the model shouldn’t be trusted, but that’s just a more unpleasant and sneering way to say the same thing). This sort of argument is actually very common, so it’s worth picking at it for a few moments. It all comes down to exactly what is stated. Silver is not saying that Obama will win. He is saying Obama is three times as likely to win as to lose, much as one could say that rolling a total of seven on two dice is three times as likely as rolling a total of eleven. If I tell you that, and then you do in fact roll an eleven, that makes you lucky. It doesn’t make me wrong. The people who claim they know the future are charlatans. The people who claim they have some kind of handle on what’s likely to happen are called analysts.
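For anyone who would rather check the dice than take my word for it, here is a minimal sketch in Python. It assumes two fair six-sided dice and simply counts the 36 equally likely rolls; the helper name p_total is mine, made up for illustration. The three-to-one comparison that matches a 3 in 4 forecast is seven against eleven (seven against ten comes out at two to one).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls of two fair dice give each total.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))

def p_total(t):
    """Exact probability of rolling a total of t (hypothetical helper)."""
    return Fraction(totals[t], 36)

p_seven = p_total(7)    # 6 ways out of 36
p_ten = p_total(10)     # 3 ways out of 36
p_eleven = p_total(11)  # 2 ways out of 36

ratio_seven_to_eleven = p_seven / p_eleven  # the three-to-one case
ratio_seven_to_ten = p_seven / p_ten        # only two-to-one

# If only a seven or an eleven counts, seven "wins" 3 times in 4.
share_seven = p_seven / (p_seven + p_eleven)

print(ratio_seven_to_eleven, ratio_seven_to_ten, share_seven)
```

The last figure is the useful one: the less likely outcome still turns up a quarter of the time, which is hardly the stuff of miracles.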
JM Keynes made this point very well in his Treatise on Probability: a model cannot be proved wrong by future events unless the model concluded those events were incredibly unlikely. Even that is not automatically disqualifying, since even incredibly unlikely events must happen occasionally, and with so many statisticians in the world, sooner or later someone’s perfectly functioning model will lead to a very surprising result. If I keep telling people they’re unwise to bet any money on pulling a royal flush from a shuffled deck, sooner or later someone’s going to manage to pull it off and wave the cards in my face. It’ll just take six hundred and fifty thousand goes or so to get there.
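The royal flush figure can be checked the same way. The sketch below assumes "pulling a royal flush" means being dealt exactly five cards from a standard, well-shuffled 52-card deck with no draws or wild cards; that reading is my assumption for the sake of the sum, and the variable names are illustrative only.

```python
from math import comb

# Assumption: one five-card hand dealt from a standard 52-card deck.
possible_hands = comb(52, 5)   # number of distinct five-card hands
royal_flushes = 4              # one per suit: ten, jack, queen, king, ace

p_royal = royal_flushes / possible_hands
one_in = possible_hands // royal_flushes  # "one in this many" deals

# Chance that at least one royal flush appears somewhere in n independent deals.
n = one_in
p_at_least_once = 1 - (1 - p_royal) ** n

print(possible_hands, one_in, round(p_at_least_once, 3))
```

So the odds are about one in 650,000 per deal, and even after that many deals the chance that somebody has waved the cards in my face is only a little under two in three. Rare things take their time, but they do arrive.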
Really, the only way you can show a model is incorrect is by getting inside it and fiddling with its assumptions or its mechanisms (which is why those statisticians who won’t explain their methods are often worth the kind of treatment Byers so unfairly offers here). Just deciding you don’t like what a model says and hoping that what happens next will bear you out really isn’t going to cut it. I’m not wrong to be surprised when you roll a double six. The weather forecast was not wrong if it suggested a 20% chance of rain, and you got washed out at your family picnic. We tell you your chances; what you do with them is your own business.
Unless what you do with them is to question the professional skills of someone in a field you don’t know a damn thing about. That’s something anyone with any class – or even a basic conception of self – should want to avoid. To answer today’s question, then: how do you respond to fairly-explained statistical models you don’t like or believe? You ask another statistician to take a look. You sit down and get a basic handle on statistics for yourself.
Or you shut the hell up.
*Another excellent example of this phenomenon came when climate change deniers jumped all over the fact that some climate models were comparing temperature readings from cities with those in the countryside (where it will be on average cooler), as though the scientists putting the models together couldn’t possibly have thought to compensate for that.