Thinking about thoughts, fourth downs, and the nature of evidence

When it happened, I knew the Belichick story would be big, but I think few could have anticipated the shape or dimension of the conversation. Some of this I credit to the rise of new media. The immediate reaction to the call on NBC and ESPN was: bad, awful, stupid call. But there was an undercurrent chorus of, “Hey, wait a minute. It actually kind of made sense.” I’d like to count myself as part of that chorus, but clearly the guy who quite nearly turned the entire debate on its head was my friend and New York Times co-blogger Brian Burke, whose post on Belichick’s call was cited everywhere from ESPN apparatchik Adam Schefter’s Twitter feed to a piece by the excellent (and decidedly mainstream) Joe Posnanski. (I’d like to think I helped, as I linked to Brian’s bit within about a half hour after the game, and my tweet of his piece was one of the most retweeted things I’ve ever sent.)

Credit where it is due, but the interesting thing is what happened after that: a mess. Some people ossified in their views: Trent Dilfer tried to back up his bombastic criticism of Belichick, though he had more passion than arguments. Peter King said the call “smacked of I’m-smarter-than-they-are hubris,” and compared Belichick to Grady Little. In the process, King messed up his math, but that was really beside the point for him. The call just didn’t feel right.

Although some stats junkies went the other way and proclaimed that it would have been affirmatively stupid for Belichick to have punted, most people, when faced with the compelling statistical evidence that the odds were roughly in Belichick’s favor (or at least so close as to be even with all the late game variables at play), were left in a fit of consternation. And this is why I think the decision has struck a national chord. It gets to the core of how people see themselves versus how they actually make decisions.

Most people fancy themselves as being driven by the evidence such that they will always follow it, but that’s not really true. As amazing and wonderful as the human brain is, it is full of inherent biases, and information, even compelling information, that does not comport with those biases is often devalued, even on a subconscious level. (One famous experiment confronts people with radios where the speaker is discussing views contrary to or similar to those already held by the listener, but the volume is set too low to be heard well. The listeners frequently turn up the volume when the speaker is saying things they already believe; they rarely turn the volume up if the speaker is discussing the contrary views.)

And so it was with the Belichick debate. It’s not that you must agree with the decision, but any reasonable person has to say, as Posnanski did, “Well, hmm, it seemed nuts at the time but I get it now, based on the evidence.” As Keyes said, “When the facts change, I change my mind – what do you do, sir?” Yet many people still refuse to reconsider their view on the subject. It was wrong and no degree of evidence can change my view or even make me reconsider. Consider Colin Cowherd’s admonition on SportsNation that “stats are overrated.” (Though I agree that many stats are.) The upshot is that, despite our best views of ourselves, it is very difficult to actually say that we are rational creatures in practice. As Jonah Lehrer wrote:

The reason I bring up this analysis is to demonstrate that even defensible decisions can have wrenching emotional consequences. Belichick’s call might have been statistically correct, but it felt horribly wrong.

. . . The point is that there’s often an indefatigable gap between the rigors of cost-benefit analyses and the emotional hunches that drive our decisions. We say we want to follow the evidence, but then the evidence rubs against a bias like loss aversion, and so we make an exception. We’ll follow the evidence next time.

It’s not really fair to pick on Tony Dungy, who was an excellent football coach, because his excellence had nothing to do with any training in statistics or probability. But his comment that “you have to play the percentages and punt” is symptomatic of a wider issue, which is that when something “feels horribly wrong” we inherently want the evidence to comport with that feeling and we convince ourselves that it does. Dungy is a conservative guy, he likely would say that punting gives him plenty of chances to win, he’s a defensive coach so he has no qualms about showing faith in his defense, and, bottom line, the idea of putting that much significance on one play just didn’t sit well with him. That’s all fine, but it has nothing to do with the percentages. Yet his brain and experience had told him that somehow the percentages supported it too, and thus Belichick’s move was the “risky gamble.”

The fourth down debate is significant (though I risk inflating its significance), because it forces you to consider how you actually tackle problems. Indeed, the entire point of probability, statistics, and science generally is to make progress in spite of, not because of or consistent with, our preconceived biases:

This does not mean that one should reject intuition and reflexive feeling. These stances often encapsulate the wisdom of evolution (e.g., aversion to sibling-sibling incest) and/or society (again, aversion to sibling-sibling incest). The totally rational life, where all acts and opinions are subject to deep and thorough criticism, is not the human life (even Karl Popper was more of a theoretical critical rationalist than an operational one judging by his private and personal actions and style of argument). But, serious problems emerge when our intuitive prejudices push themselves into the scientific domain. Natural science has over the past few centuries proven itself to be a marvel not by extension of our intuition, but contravention of that intuition resulting in an even closer fit to reality (contrast Newtonian physics with “folk physics”). Humans have always had engineering in the form of tinkering with technology. But the last two centuries of productivity growth through mechanical improvements have been based in part on the rise of science as a theoretical framework which allows for more than trial & error experimentation guided by intuition. Science allows us to stand on the shoulders of giants, no matter how bizarre or counterintuitive their theories are, because they are judged not on plausibility but predictivity.

Of course, one of the fascinating things about the brain is it can be trained. I do not think Belichick worked out the numbers as Burke had. Yet he didn’t have to. His intuition was the kind of specialist’s ingrained intuition that came from years of thought about just such issues. Belichick, an economics major, had long thought outside of the box in terms of fourth downs, and we know he is familiar with David Romer’s research on the subject. When presented with the possibility of the fourth down, his intuition, built on three decades of thinking about fourth downs and many, many trials where his team had succeeded and won the game in such circumstances, told him the odds. This is the difference between a specialist’s intuition and a layman’s. Yet this is also the point of doing the analysis like I tried to do and Burke and others did: It trains your mind. The more you think in terms of possibilities and potential outcomes, the less you are fooled when some rare or at least relatively unlikely outcome occurs anyway. Ask any poker player.
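For anyone who wants to move the variables around rather than take my word for it, here is a minimal sketch of the kind of comparison at issue. Everything in it is an assumption for illustration: the function names are made up, the model ignores timeouts, field goals, and clock details, and the input numbers are placeholders rather than figures taken from Burke's post or anyone else's.

```python
def win_prob_go(p_convert, p_opp_scores_short, p_win_if_convert=1.0):
    """Win probability for the leading team if it goes for it.

    Convert: keep the ball and (by assumption) essentially end the game.
    Fail: the opponent takes over on a short field and still has to score.
    """
    return (p_convert * p_win_if_convert
            + (1 - p_convert) * (1 - p_opp_scores_short))


def win_prob_punt(p_opp_scores_long):
    """Win probability if the team punts and the opponent needs a long drive."""
    return 1 - p_opp_scores_long


def break_even_conversion(p_opp_scores_short, p_opp_scores_long):
    """Conversion rate at which going for it and punting are equally good.

    Assumes a conversion wins the game outright (p_win_if_convert = 1).
    """
    return (p_opp_scores_short - p_opp_scores_long) / p_opp_scores_short


# Placeholder inputs, chosen only to show the mechanics.
p_convert = 0.60           # chance of gaining the two yards
p_short = 0.53             # chance the opponent scores from a short field
p_long = 0.30              # chance the opponent scores after a punt

print(f"go for it: {win_prob_go(p_convert, p_short):.3f}")
print(f"punt:      {win_prob_punt(p_long):.3f}")
print(f"break-even conversion rate: {break_even_conversion(p_short, p_long):.3f}")
```

The useful part is less the output than the exercise: nudge any one input toward your own gut feeling and see how far it has to move before the answer flips.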

As a counterpoint, I love the guys at Football Outsiders, but I was generally disappointed with their response to this. They are big “stats guys” in the sense that they track a lot of data and do a lot of good work to try to determine who the best teams, players, coaches, and the like are, but you don’t see a lot of probabilistic/stochastic reasoning in their work. Burke, on the other hand, is a big Nassim Taleb fan, and his reaction, like Belichick’s, was to think about the variables at play and to mentally move them around to figure out what really was the best call. Even stats guys can have faulty intuition on these issues. (Barnwell and others eventually went back and crunched some numbers and admitted that it was at least quite plausible that Belichick made the right call, which in my view is an understatement.)

And yet, the unemotional Belichick aside, humans are not machines and do not make decisions like them; nor should they. The emotional side of the brain — the side with all these crazy biases — is often our only hope for processing huge amounts of information in a limited time span, i.e. the seconds a coach has to make a fourth down call. Think of pilots, soldiers and their commanders during a firefight, lawyers being questioned by judges, doctors during surgery, or all manner of “learned” yet time-sensitive decisionmakers. And there is a human side to many stories. Even in this one, many have justified their rejection of Belichick’s call (mostly on an ex post basis, however) because “what message does that send to your defense.” And maybe there is something to that: Once Manning got the ball around the thirty, with the crowd and the frenzy of the moment, the defense gave up a huge run to Joseph Addai and Manning threw a touchdown pass shortly thereafter. (On the other hand, an already depleted New England defense had been decimated by injury throughout the game, and #18 is a very good quarterback, or so I have heard.)

So we don’t want decisions made only on the measurable evidence, always and forever. But this debate has reinforced a somewhat cynical view of people that I have. There are two basic types: Those who, when confronted with evidence that challenges their instinctive or “gut” reaction, are cynical of their gut, and those who are cynical about the nature of evidence itself. I think over the years of writing this blog I have shown that I am clearly in the former camp. Note that this does not mean you always and forever follow the first evidence that is shown to you: Often we have “gut” feelings for a reason, and some of the best work is done when some support is shown for a proposition that feels wrong, and then people try to figure out why they feel so differently about it. In those cases, the evidence either survives or is even improved (and hopefully some minds are swayed), or the rigorous testing shows that there was some flaw in the evidence. But this view almost always leads to a healthy, respectful debate, and we all learn through the process.

On the other side are those who distrust anything not in their gut. And these people, like Tony Dungy, might have very good instincts. But the result is the dismissal of many good ideas, along with any pretense at debate. “Why is that wrong?” “It just is. I’m telling you.” The sad fact is it is easier to dismiss or ignore arguments (and people) than it is to engage with them or to justify your own views.

Now, whether or not some coach went for it on fourth down is a pretty silly thing to get worked up about. Yet I think the reason people have is that this deep divide — between the instinct sceptics and the evidence sceptics — has become exposed again. To be fair, football is a fine place to leave rationality at the door, and most people, including me, no doubt occasionally operate in the opposing camp depending on the issue. But following the evidence is a lot harder than we usually allow. And for doing that here, Belichick deserves credit. May we all be so bold.

  • Chris G

    As a philosopher who has done some game theory/decision theory, I have to say that you are right on the money. Experiments have shown time and again that people’s instincts in complex decisions are seldom correct. I’m not against going with the gut, but seldom do we make the most “rational” choice as dictated by the numbers. Belichick made a good call (rationally speaking), even though it turned out bad.

  • wheaton4prez

    Is this article about the difference between liberals and conservatives? 😉

  • John Z

    Funny how you stick to facts and common sense with football, yet with politics you throw them out the window with your obvious slant towards liberalism.

  • Keyes?

    I think you mean Keynes.

  • Bill45.

    Another great post and discussion Chris.
    If BB’s calculation was that if the ball’s in Brady’s hands we win and if the ball’s in Manning’s hands we lose then he absolutely made the right decision to go for it. So they didn’t make it. BB needed to carry his original calculation on through to the end of the game and let the Colts score on their next play from scrimmage, putting the ball back in Brady’s hands for the final drive.
    If he had punted the calculation would have been to defend more or less conventionally.
    But what do I know, I can barely play checkers.

  • Shane M

    Is the decision to go for it on 4th and short deep in your own territory consistent with other decisions made throughout the game? I guess I’m trying to understand the universe of decisions where it does make sense to punt the ball to Peyton, or do you pretty much go for it on 4th and relatively short the entire game?

  • Bill45.

    Shane you can look at sort of the opposite situation where NE received the kickoff at the beginning of the game. It’s 2 minutes into the first Quarter and it’s 4th and 2 at the same position on the field. The calculation is vastly different. No ebb and flow has been established as it had already been in the 4th qtr. of the actual game. Winning or losing is still basically a 50/50 proposition at this point if he considers the 2 teams as roughly equal overall. So at this point he’s playing for field position and a punt is probably the call. In the real game with about 2 min. left field position is not as big a factor. The Colts offense seems to have the advantage over his NE defense and situation and history has informed him that NE is unlikely to be able to stop them from scoring a TD.

  • JeffW

    The real story, as noted by Kilgore, is the mismanagement of the last 3 minutes of the game, not the 4th and 2 call. The Patriots were clearly not decided on the course of action if they don’t get the 3rd down conversion. This is evidenced by the personnel confusion. The best call is to run the ball and keep the clock moving, and possibly, shorten a 4th down attempt (if they “knew” they were going to go for it). They may even convert it right then.

    The factors preceding the 4th down call (no timeouts, clock is not moving, potential turnover near the red zone) count in the call to go for it or not. The decision cannot be made in a vacuum.

    The Colts out play-called them on third down with the blitz. The offense’s calls for 3rd and 4th down leave much to be desired. While going for it seems courageous and maybe even smart, the game management before that point was pretty awful.