Thinking about thoughts, fourth downs, and the nature of evidence

When it happened, I knew the Belichick story would be big, but I think few could have anticipated the shape or dimension of the conversation. Some of this I credit to the rise of new media: The immediate reaction to the call on NBC and ESPN was: Bad, awful, stupid call. But there was an undercurrent chorus of, “Hey, wait a minute. It actually kind of made sense.” I’d like to count myself as part of that chorus, but clearly the guy who quite nearly turned the entire debate on its head was my friend and New York Times co-blogger Brian Burke, whose post on Belichick’s call was cited everywhere from ESPN apparatchik Adam Schefter’s Twitter feed to a piece by the excellent (and decidedly mainstream) Joe Posnanski on SI.com. (I’d like to think I helped, as I linked to Brian’s bit within about a half hour after the game, and my tweet of his piece was one of the most retweeted things I’ve ever sent.)

Credit where it is due, but the interesting thing is what happened after that: A mess. Some people ossified in their views: Trent Dilfer tried to back up his bombastic criticism of Belichick, though he had more passion than arguments. Peter King said the call “smacked of I’m-smarter-than-they-are hubris,” and compared Belichick to Grady Little. In the process, King messed up his math, but that was really beside the point for him. The call just didn’t feel right.

Although some stats junkies went the other way and proclaimed that it would have been affirmatively stupid for Belichick to have punted, most people, when faced with the compelling statistical evidence that the odds were roughly in Belichick’s favor (or at least so close as to be even with all the late game variables at play), were left in a fit of consternation. And this is why I think the decision has struck a national chord. It gets to the core of how people see themselves versus how they actually make decisions.
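For readers who want to see the shape of the argument, here is a minimal sketch of how this kind of comparison is usually set up. It is not Burke's actual model. The function names are hypothetical, and the inputs in the example are placeholder values chosen only to show the arithmetic, not real estimates from the game.

```python
def win_prob_go(p_convert, wp_after_convert, p_opp_scores_short):
    """Win probability if the offense goes for it on fourth down.

    p_convert: chance of converting the fourth down (assumed input).
    wp_after_convert: win probability once the first down is made.
    p_opp_scores_short: chance the opponent scores the winning points
        from a short field after a failed attempt.
    """
    win_if_fail = 1 - p_opp_scores_short
    return p_convert * wp_after_convert + (1 - p_convert) * win_if_fail


def win_prob_punt(p_opp_scores_long):
    """Win probability if the offense punts and the opponent needs a long drive."""
    return 1 - p_opp_scores_long


def break_even_conversion(wp_after_convert, p_opp_scores_short, p_opp_scores_long):
    """Conversion rate at which going for it and punting are equally good."""
    win_if_fail = 1 - p_opp_scores_short
    win_if_punt = 1 - p_opp_scores_long
    return (win_if_punt - win_if_fail) / (wp_after_convert - win_if_fail)


if __name__ == "__main__":
    # Placeholder inputs for illustration only; swap in your own estimates.
    go = win_prob_go(p_convert=0.60, wp_after_convert=1.0, p_opp_scores_short=0.50)
    punt = win_prob_punt(p_opp_scores_long=0.30)
    needed = break_even_conversion(1.0, 0.50, 0.30)
    print(f"go for it: {go:.2f}, punt: {punt:.2f}, break-even conversion: {needed:.2f}")
```

The break-even figure is the useful part: the debate then turns on whether the real conversion rate was plausibly above it, which is exactly where the late-game variables come in.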

Most people fancy themselves as being driven by the evidence such that they will always follow it, but that’s not really true. As amazing and wonderful as the human brain is, it is full of inherent biases, and information, even compelling information, that does not comport with those biases is often devalued, even on a subconscious level. (One famous experiment confronts people with radios where the speaker is discussing views contrary to or similar to those already held by the listener, but the volume is set too low to be heard well. The listeners frequently turn up the volume when the speaker is saying things they already believe; they rarely turn the volume up if the speaker is discussing the contrary views.)

And so it was with the Belichick debate. It’s not that you must agree with the decision, but any reasonable person has to say, as Posnanski did, “Well, hmm, it seemed nuts at the time but I get it now, based on the evidence.” As Keynes said, “When the facts change, I change my mind – what do you do, sir?” Yet many people still refuse to reconsider their view on the subject. It was wrong and no degree of evidence can change my view or even make me reconsider. Consider Colin Cowherd’s admonition on SportsNation that “stats are overrated.” (Though I agree that many stats are.) The upshot is that, despite our best views of ourselves, it is very difficult to actually say that we are rational creatures in practice. As Jonah Lehrer wrote:

The reason I bring up this analysis is to demonstrate that even defensible decisions can have wrenching emotional consequences. Belichick’s call might have been statistically correct, but it felt horribly wrong.

. . . The point is that there’s often an indefatigable gap between the rigors of cost-benefit analyses and the emotional hunches that drive our decisions. We say we want to follow the evidence, but then the evidence rubs against a bias like loss aversion, and so we make an exception. We’ll follow the evidence next time.

It’s not really fair to pick on Tony Dungy, who was an excellent football coach, because his excellence had nothing to do with any training in statistics or probability. But his comment that “you have to play the percentages and punt” is symptomatic of a wider issue, which is that when something “feels horribly wrong” we inherently want the evidence to comport with that feeling and we convince ourselves that it does. Dungy is a conservative guy, he likely would say that punting gives him plenty of chances to win, he’s a defensive coach so he has no qualms about showing faith in his defense, and, bottom line, the idea of putting that much significance on one play just didn’t sit well with him. That’s all fine, but it has nothing to do with the percentages. Yet his brain and experience had told him that somehow the percentages supported it too, and thus Belichick’s move was the “risky gamble.”

The fourth down debate is significant (though I risk inflating its significance), because it forces you to consider how you actually tackle problems. Indeed, the entire point of probability, statistics, and science generally is to make progress in spite of, not because of or consistent with, our preconceived biases:

This does not mean that one should reject intuition and reflexive feeling. These stances often encapsulate the wisdom of evolution (e.g., aversion to sibling-sibling incest) and/or society (again, aversion to sibling-sibling incest). The totally rational life, where all acts and opinions are subject to deep and thorough criticism, is not the human life (even Karl Popper was more of a theoretical critical rationalist than an operational one judging by his private and personal actions and style of argument). But, serious problems emerge when our intuitive prejudices push themselves into the scientific domain. Natural science has over the past few centuries proven itself to be a marvel not by extension of our intuition, but contravention of that intuition resulting in an even closer fit to reality (contrast Newtonian physics with “folk physics”). Humans have always had engineering in the form of tinkering with technology. But the last two centuries of productivity growth through mechanical improvements have been based in part on the rise of science as a theoretical framework which allows for more than trial & error experimentation guided by intuition. Science allows us to stand on the shoulders of giants, no matter how bizarre or counterintuitive their theories are, because they are judged not on plausibility but predictivity.

Of course, one of the fascinating things about the brain is it can be trained. I do not think Belichick worked out the numbers as Burke had. Yet he didn’t have to. His intuition was the kind of specialist’s ingrained intuition that came from years of thought about just such issues. Belichick, an economics major, had long thought outside of the box in terms of fourth downs, and we know he is familiar with David Romer’s research on the subject. When presented with the possibility of the fourth down, his intuition, built on three decades of thinking about fourth downs and many, many trials where his team had succeeded and won the game in such circumstances, told him the odds. This is the difference between a specialist’s intuition and a layman’s. Yet this is also the point of doing the analysis like I tried to do and Burke and others did: It trains your mind. The more you think in terms of possibilities and potential outcomes, the less you are fooled when some rare or at least relatively unlikely outcome occurs anyway. Ask any poker player.

As a counterpoint, I love the guys at Football Outsiders, but I was generally disappointed with their response to this. They are big “stats guys” in the sense that they track a lot of data and do a lot of good work to try to determine who the best teams, players, coaches, and the like are, but you don’t see a lot of probabilistic/stochastic reasoning in their work. Burke, on the other hand, is a big Nassim Taleb fan, and his reaction, like Belichick’s, was to think about the variables at play and to mentally move them around to figure out what really was the best call. Even stats guys can have faulty intuition on these issues. (Barnwell and others eventually went back and crunched some numbers and admitted that it was at least quite plausible that Belichick made the right call, which in my view is an understatement.)

And yet, the unemotional Belichick aside, humans are not machines and do not make decisions like them; and nor should they. The emotional side of the brain — the side with all these crazy biases — is often our only hope for processing huge amounts of information in a limited time span, i.e. the seconds a coach has to make a fourth down call. Think of pilots, soldiers and their commanders during a firefight, lawyers being questioned by judges, doctors during surgery, or all manner of “learned” yet time-sensitive decisionmakers. And there is a human side to many stories. Even in this one, many have justified their rejection of Belichick’s call (mostly on an ex post basis, however) because “what message does that send to your defense.” And maybe there is something to that: Once Manning got the ball around the thirty, with the crowd and the frenzy of the moment, the defense gave up a huge run to Joseph Addai and Manning threw a touchdown pass shortly thereafter. (On the other hand, an already depleted New England defense had been decimated by injury throughout the game, and #18 is a very good quarterback, or so I have heard.)

So we don’t want decisions made only on the measurable evidence, always and forever. But this debate has reinforced a somewhat cynical view of people that I have. There are two basic types: Those who, when confronted with evidence that challenges their instinctive or “gut” reaction, are cynical of their gut, and those who are cynical about the nature of evidence itself. I think over the years of writing this blog I have shown that I am clearly in the former camp. Note that this does not mean you always and forever follow the first evidence that is shown to you: Often we have “gut” feelings for a reason, and some of the best work is done when some support is shown for a proposition that feels wrong, and then people try to figure out why they feel so differently about it. In those cases, the evidence either survives or is even improved (and hopefully some minds are swayed), or the rigorous testing shows that there was some flaw in the evidence. But this view almost always leads to a healthy, respectful debate, and we all learn through the process.

On the other side are those who distrust anything not in their gut. And these people, like Tony Dungy, might have very good instincts. But the result is the dismissal of many good ideas, along with any pretense at debate. “Why is that wrong?” “It just is. I’m telling you.” The sad fact is it is easier to dismiss or ignore arguments (and people) than it is to engage with them or to justify your own views.

Now, whether or not some coach went for it on fourth down is a pretty silly thing to get worked up about. Yet I think the reason people have is that this deep divide — between the instinct sceptics and the evidence sceptics — has become exposed again. To be fair, football is a fair place to leave rationality at the door, and most people, including me, no doubt occasionally operate in the opposing camp depending on the issue. But following the evidence is a lot harder than we usually allow. And for doing that here, Belichick deserves credit. May we all be so bold.

  • kj

    The call was wrong because it failed. If they had got the first down and eventually won the game he’d be a genius.

    My biggest problem was the play call itself. It was the wrong play to run in that situation. 1 7/8 yard pass on 4th & 2. I also think they got robbed a bit on the spot but that was the call so be it. I’ll not get into what I think about the ref’s calls this year except to say it is worse than I’ve ever seen before.

  • Tom

    Chris-

    Amen to that. You’ve put together a very good examination of human rationality and risk aversion in the decision making process.

  • Topher

    Chris,

    Great story – this debate shows so much about ingrained biases and inability to see the other side or admit there was another way. I think that is part of the coarse debate where people can’t see both sides, they have to take one side and run it for all it’s worth.
    (Also a consequence of having too many former pros on TV, whose cocksure nature was a big part of athletic success but plays poorly in the booth).

    I think either call would have been defensible. The play call was whacked – having just been near-picked on a curl right, running another one is tempting fate. A play action might have opened the same route up, or allowed a deeper out running past the biting defender. Or overload one side and jump-ball Randy Moss on the other like they did for the earlier TD. Many options, poor choice of play.

    Also, it was a regular season game. These teams will likely play again. Next weekend we’ll have a new controversy to discuss. If it had been the Super Bowl it would be even a bigger debate that lasted all spring.

  • Topher

    kj – “I’ll not get into what I think about the ref’s calls this year except to say it is worse than I’ve ever seen before.”

    I have just come to accept that no game will be officiated decently. Once they started playing games with the rules to favor the passing game I knew that the refs were just another piece of marketing the product.

  • Bill

    All good points, but I think there’s a first principle that I haven’t seen discussed anywhere, notably: Given the situation, what is the desired outcome here? I think any statistical analysis does a good job of handling quantifiable, measurable outcomes (here, wins and losses) but does a poor job when the situation may be more nuanced and several goals given the time and place of the event are in play.

    Put another way, Belichick has already explained his reasoning: the goal was to win the game. I haven’t read any other reason for going after the first down. Assuming that was the only goal of import to him, then I think the evidence supports his choice of action.

    But, what if Dungy (and others like him) see more than one goal in play: yes, we want to win the game, but there are other goals we want to realize. Maybe our team needs the experience of defending against a long drive at the end of the game. Maybe our defense needs the coach to express confidence in their ability. Maybe the risk of losing a regular season game is worth the gain if these unmeasurable intangibles are achieved.

    Maybe I’m expressing myself poorly, but before you can reasonably choose, you better be certain about what you want. Maybe all analysis should start there. I think Belichick knew exactly what he wanted, and he was open enough to explain it.

  • PK

    I think it’s kinda ridiculous to even bring “equations” or “calculations” into this discussion. Football is not played inside of a calculator. There are far too many variables that must be taken into account before getting an accurate probability of success (play-call, fatigue level of each and every player on the field, situation, crowd noise, etc. etc. etc.).

    The other night, and in most similar situations, it’s a question of belief, not probability. Belichick believed his team could get one yard and win the game. I seriously doubt he said, “Hey, we’ve got a 73.7% chance of getting this first down. We should go for it.” His decision backfired. We’ll see how he decides next time.

  • stan

    One small quibble — no one knows what the real statistical odds are for either of the options. All of the estimates on making it on 4th down are based on league averages from plays that bear little, if any, relationship to the situation facing BB. I think 2 pt conversion attempts probably provide a better basis for a guesstimate, but it’s still guessing. The estimated odds on a potential drive after a punt are a little more realistic, but there remains a plethora of factors which add uncertainty for anyone trying to extrapolate from league averages to the case at hand.

    My problems with the number crunchers is that their pronouncements of certainty border on hubris. Way too often they build elaborate sophisticated statistical edifices upon faulty assumptions (see e.g. Romer, Wall Street risk models and global climate models). In the BB case, the number crunchers have provided an excellent example of extrapolating way too far from a tiny sample based on questionable assumptions.

    Of course, the “traditionalist” critics have mostly been an embarrassment. They don’t even have an analytical structure from which to reason.

    Do coaches often choose to punt when they ought to go for it on 4th down? No question about it. Was this one of those cases where a punt would have been a mistake? Perhaps. Do we have solid statistical evidence from which to make a judgment? No way, not even close.

  • Rusty Ward

    The most overlooked point is that Welker was open well beyond the stripe

  • Zach

    Chris,
    This is a fantastic story. Great legwork, citing, sources, and commentary. It really does seem to be a touchy and difficult subject and I’m fascinated not only by other people’s reactions, but my own. I’m in the camp that initially said “That’s the wrong call”. What interested me, however, was my reaction after I read your post and several others backing up his decision: I still wanted it to be the wrong call because it just FELT wrong. I’ve enjoyed your other posts on the decision making process, I’m definitely interested in more.

    Kind of an aside: can you recommend any entry-level books on the decision making process?

  • http://smartfootball.com Chris

    Zach: I cited a couple in the post. Predictably Irrational is good, and How We Decide is a very readable one sort of made for the mainstream. I’m sure there are others. I find this discussion good too.

    Stan: I’m not sure if the quibble was directed at me. I noted that some of the stats guys have gone off the deep end too. But I do think the data shows it wasn’t “the worst call of all time” or all that sort of thing, and that there was enough data that, if Belichick then thought it was the best call, he was perfectly justified in doing so. Had Belichick punted I wouldn’t be among those who would say it was horrible or stupid.

  • JP – Chicago

    TOM WROTE “You’ve put together a very good examination of human rationality and risk aversion in the decision making process.”

    100% Agree. Great article.

    Football is still a game of execution after the decision.

    The decision process is a science unto itself. Your same article can be applied to investing in the market, a business decision to purchase another company, or a President deciding to go to war.

    Love your blog.

  • jerry

    I think every action on field is independent. You can’t use stats for plays of past. That call has nothing to do with probability. The execution was the issue. Why not throw to the best 5 yard wr in league welker. That’s the problem not some gut feeling or .79 chance of making vs .60 chance of manning driving to score.

  • Topher

    “That’s the problem not some gut feeling or .79 chance of making vs .60 chance of manning driving to score.”

    The whole “gut feeling” discussion hinges on the following: football coaches are trained from their first days with a whistle to avoid losing games. More games are lost than won, and coaches seek to minimize bad outcomes by minimizing risk (power running games are about tiring out the defense, punting the ball is about preserving field position, etc).

    The ethos of football, especially in the risk-averse NFL, is “why decide the game now when you can defer the decision to later.” That’s essentially what we are arguing about today, and coaches are invariably criticized for taking the high-risk option when they could defer to a high-risk option later in the football game. Cf Raheem Morris against the Redskins – he took the low-risk kick and ended up losing to a bad team, but got far less criticism than Belichick who took the high-risk option with a dead defense and the game’s best quarterback running the play.

    My point is that the football “gut feeling” is a carefully honed and trained instinct, wildly influenced by one’s mentors and history, not a primordial, immutable evolutionary factor like the fight or flight response.

    Usually, going for it deep in your own territory is foolish and indefensible. We know this because teams that do it can quickly give up backbreaker scores – the “gut feeling” is ingrained by history. In some cases, with the game on the line and other opponent factors, it is defensible.

  • Dan

    Here’s a question I’ve never seen anyone address. If Belichick felt strongly that he should go for it on 4th down, wouldn’t he have been better served by at least introducing some sort of misdirection on the play rather than taking a timeout (giving Indy a chance to rest and strategize its defense) just to set up a conventional play?

    How about acting as if he is trying to draw Indy offsides, then as the clock ticks down to 0 and Indy begins to relax assuming a delay of game penalty is coming, run the play at the last second?

    Or just flat out fake the punt?

  • SRS

    I have to say that I usually enjoy the work on this site but this entry is disappointing. However, I should perhaps have tempered my expectations because the title is quite ambitious (at least, the implied discussion would have been).

    I agree that the interesting thing about exploring evidence and actual probabilities is that the results can be counter-intuitive. However, my problem with elevating this debate to the level which is attempted here is that the analysis of sports is not a science. Although quite a bit of work has been done to quantify sports and theorize about sports, there is no quantified theory of sports nor is there a general theory of sports.

    There’s absolutely nothing to tell us that our studies are right, wrong, accurate, precise…what have you. I’ll not get too far into epistemology and ontology here, but we should all take a step back and harken to the wisdom of our scientific forbears regarding “science” as an empiricist endeavor. Which is what the current state of sports analysis is, plain and simple. Amassing all the data of past events and crunching it into quantitative analysis does not a science make. Nor is it even the necessary step toward a theory of sports (which is really what is needed to do what I think some aspire to do).

    One last note in my already long-winded reply: human behavior is a famously messy, complex problem. Behavior here is meant in the social science sense as “what we do and how we do it.” Which includes activities such as sports. Economists might think they’ve cornered the market on quantifiable analysis (pun intended) but other social scientists have quite a bit to say about that.

  • Jim

    SRS: I don’t think you read the whole post, particularly at the end. I don’t think Chris ever says that Belichick was objectively right. You’re right it’s an ambitious subject about how people deal with data. I took what he was saying as a commentary on how do we work through this messy process. The people who refuse to consider any evidence don’t have a process at all. It’s not about economists versus everyone else (or at least I didn’t take it that way). Chris said several times that there were a lot of variables at play, and it was a complicated decision, made in a short time. Nobody can really say it was perfectly right or wrong. (Though I didn’t like the playcall itself.)

  • Topher

    SRS,

    “However, my problem with elevating this debate to the level which is attempted here is that the analysis of sports is not a science. Although quite a bit of work has been done to quantify sports and theorize about sports, there is no quantified theory of sports nor is there a general theory of sports.”

    I concur with your take. However, I think Chris is agreeing with you as well – there are several ways to evaluate the call, one is statistical, another is statistical but is called “gut feeling” (which is based on decades of maximum-likelihood football data that coaches have been trained in in order to pre-decide lots of snap choices) and others are interpersonal relations (what message does he send to his team). For Tony Dungy to say “play the percentages,” he has to have some percentages, and in that case the %’s seem to come out in Bill’s favor. The “message to the team” people have a point, Bill responded by saying his sole goal was to win the game right there.

    I think Chris’ point is that certain frames of analysis show the call to be at least defensible, and that the people screaming “it was the WRONG CALL, TONY!!!” on PTI are just going too far. I agree – as a coach I wouldn’t have made that call, but I understand why a coach would.

  • http://brophyfootball.blogspot.com brophy

    wonderfully written piece, as usual.

    With your analysis of the talking head noise / cross-chatter, how much of this (in your opinion) is actually genuine and how much is manufactured for viewership/audience? How much are these talking heads criticizing or blowing this issue up beyond proportions just to stoke the “fanboy” flames of viewership? Do we really believe the Trent Dilfers, Mark Mays, and Rich Eisens of the world are really campaigning decision-making analysis, or are they just reading a rehearsed Kabuki-esque entertainment ploy to gain audience interest?

    I’m still really unsure how this is that big of a deal or so “OMG” that it has to become such an “outrage”.

  • cerebral

    my question is… dang it.. didnt kevin faulk SEE the yellow line he needed to cross?? sheesh

  • http://smartfootball.com Chris

    Broph, no doubt much of the commentary is WWE style blustering for attention and to make news. I think a lot of Pats fans were upset by giving up the lead and got mad about this. (I’d be mad at Laurence Maroney for fumbling the ball on the two; if he scores a touchdown the game is over.) Football commentary is weird because it is a game for fans and entertainment, so you can’t get too upset if the arguments are frivolous; on one level, the game itself is frivolous.

    But one of the main reasons I like football is that, frivolous or not, all of life is in it. There are much more fundamental lessons, like how in football everyone, even the best in the world, gets knocked on their butt so it’s all about how you get up, but these decisionmaking things are really interesting to me too.

  • james

    I wouldn’t be so quick to assume Belichick didn’t work out the numbers. What’s to say he doesn’t have a stats guy in the booth running real-time analysis similar to Brian’s? The only evidence that he doesn’t is that, like every other coach, he sometimes makes a conservative decision instead of playing the percentages, but I think the last few days have made it abundantly clear that the conservative option is often simply much less of a hassle, even if the coach knows better.

  • Brad

    Chris,

    Loved the piece, and I have something to add, I think. Your criticism of Tony Dungy can be exemplified by last year’s Super Bowl. Dungy defended Mike Tomlin for kicking a FG when it was 4th and Goal on the 2 yard line, for the first score of the game (I think it was the first score). Dungy said something roughly to the effect, “It was the right decision because Tomlin knows that the strength of his team is its defense, so it’s right to play for a 3 pt lead, and then shut down the opponent.” This was said despite all the studies that you yourself have referenced many times that say it is overwhelmingly the wrong decision to NOT go for the TD in that situation.

    I think it comes down to that defensive/conservative mindset despite all evidence pointing to the contrary (ala “Tresselball”).

    By the way, I’m still waiting on your take of 2 deep zone blitzing…

    Brad

  • Aboojum

    I think you have way inflated the cold, statistical nature of his decision. If Belichick knew the stats and was planning to go for it on fourth down the whole time–which according to you he had to be thinking since he has in his back pocket evidence that to do otherwise is against the statistics of probability–then he would never have a) run that third down play or b) called timeout to “discuss it” which was his ultimate downfall as it limited him on challenging a call with a replay.

    I think Belichick made a very knee-jerk, emotional decision regardless of how you crunch the numbers. Also, he would have NEVER made this same decision in the Super Bowl and if he wouldn’t make it there, he should not have made it Sunday.

  • DM

    The brunt of the criticism of Belichick shouldn’t be on the decision to go for it, but for the misuse of timeouts and poor play calling. Had that been better, the fourth down situation might never have come up.

  • Topher

    Chris,

    Glad you mentioned the Maroney fumble – in each of the last three Colts games, the Pats have flat-out given away a sure touchdown (two dropped TD passes and a goal line fumble) that would have likely put the game away. Maroney was also weak in the fourth quarter; I wouldn’t be surprised if NE gets rid of him.

  • Topher

    “Also, he would have NEVER made this same decision in the Super Bowl and if he wouldn’t make it there, he should not have made it Sunday.”

    That doesn’t make any sense. I wouldn’t go out at night in southeast DC, but I go out all the time in my own neighborhood. They are wildly different situations requiring different decision-making processes – namely, nobody’s season is over due to this call.

  • james

    Who says he wouldn’t have done this in the Super Bowl? I’d go the other way–he might not have done it in a week 17 game with nothing on the line, but my emotional, subjective take is that this was Belichick’s shot across the bow.

    He’s 57, extremely accomplished and respected as a coach, and he’s probably known for a long time that many 4th down decisions, even his, are too conservative. It was time to make a statement in the boldest way possible, with the obvious bonus that it gave them the best chance to win.

    I won’t go as far as to say he also considered that the play would have a much bigger impact, in terms of starting the conversation, if it failed (or maybe I will–why not, it’s the internet), but it’s certainly ironic. I completely disagree, by the way, with the idea that he’d be hailed as a genius if it had worked. I think the mainstream media reaction would be more like “that didn’t make any sense to me, but he’s the one with the Super Bowl rings, so maybe he knows what he’s doing,” and then they would quickly forget the whole thing.

  • Tim James

    It’s been an interesting week, touching on human risk analysis (see flying vs. driving!), conservative NFL coaches concerned about job security, mainstream analysts that stick to the norm, and people who are unable to go beyond platitudes like “it was just dumb, he should have faith in his defense.”

  • Tim James

    Also, I’m glad to see folks here are mentioning that few are arguing that percentages and stats are the only way to call a game, and simply point out that it was NOT a “terrible call,” and did have a foundation of reasoning behind it to add in with other situational factors and “gut feelings.” Glad to see the response to the response is getting out there.

  • Drew

    Between the 4th and 2 call and several “heads” reactions to Jacksonville giving up a touchdown to kill the clock, I think there is an important mental crutch involved in the “gut” decision to do what the coaches didn’t.

    Put simply, if you don’t have the lead, get it. If you have it, hold on to the lead as long as possible. Which seems rational, but it really isn’t, because having the lead only matters at game end.

    Going for 4th and 2 moved the big “swing” point forward about 90 seconds, but at better odds than putting the infamously iffy New England 2 minute defense at Peyton’s mercy. Jacksonville taking a knee moved the moment of truth back about the same. Both are defensible as probabilities. For the Jags say 95% you make the field goal, versus 10% the Jets score late (not very likely but 2-1 odds). Neither “feels” right because you do not try to have the lead for the longest (irrelevant) part of the game.

  • Cee

    “why decide the game now when you can defer the decision to later.”

    Yes, to Topher. (and Drew too)

    I think a lot of the ‘bad gut feelings’ about this are due to the fact that by making the call he did, he was admitting that he was probably going to lose the game, and was just going to get it over with one way or the other.

    Facing bad things stinks. It is human nature to want to do our taxes tomorrow or put grandma on life support even if we know those to be bad decisions.

  • JM

    I’ve been fascinated by this debate, too. I think one reason people react so strongly to the decision is that it seems to go against a certain sense of “fairness” in football. The “fair” way to play is that the offense gets three downs to make a first down, and if they don’t make it, they give the ball to the other team in roughly the same field position so that team can have a similar chance. That way the offense and defense both face similar challenges, with the added element that teams that “control field position” slowly swing the field position to their favor. This style of football rewards the slow accumulation of advantage over the entire game, but it requires that both teams play the same way. If one team starts going for it on fourth down, the symmetry of the “fair” competition is off — one team has one extra down to continue their drives. To people who were raised in this style of football (which is, after all, the orthodox style), it’s almost as if going for it on fourth down changes the rules of the sport — as if a tennis player started taking every two out of three serves instead of alternating. I think it’s profoundly disconcerting for traditionalists to see this new style of football, even if it ends up helping their team. They feel like they won by cheating, as if not punting somehow makes football into a poker game where chance plays a role, rather than an arm-wrestling competition where the strongest man always wins. Wow, that went on for a long time. I’ll stop there.

  • KungFuPanda9

    The purpose of the punt is to give your defense some space. If the coach believed his defense was not going to be able to stop the opponent in the space provided by his punter, then he goes for the first down with his offense, if he thinks they can make the play. I believe that was Belichick’s intuitive reasoning.

  • Kings

    Easy question – If the opponent was an anemic offensive team like the Browns and the situation was the same, would the decision to go for it on 4th down still be the correct call?

  • CH

    This is a great post, and all I’ll add is that I also think it tells us something about people’s view of punting. I think a lot of the “newer school” football thinkers are starting to realize that punting is not always such a great solution.

    For years we’ve always believed that on 4th down, you punt. How many times do you hear an announcer say “and Team X is forced to punt?” All the time. That’s just how it’s always been. But of course you are never FORCED to punt. You always have the option of going for it, and I’m positive that in the long run you’d gain more than you’d lose if you went for it virtually every time.

    Bill Belichick is one of those people who doesn’t view punting as an automatic solution. I’ve always wondered why people don’t consider punts turnovers, even though that’s exactly what they are. Sure, they result in a gain of maybe 40 yards in field position, but you lose possession and the other team gains it. No one thinks of it this way, though. Except Belichick; he knew instinctively that handing the ball back to Manning, no matter the field position, was not an attractive option.

    I’ll be the first to admit that potentially giving the ball to Manning at the 30 isn’t attractive either; the fact that New England was so deep in its own territory is why I have trouble fully supporting what Belichick did. However, giving Manning the ball at his own 30 or 40 isn’t great either.

    Agree or disagree, I just wish more people would appreciate how Belichick put the ball in his best player’s hands and told him to win the game. What more could you ask from your coach? Are you more comfortable having Brady throw for a first down, or having Jonathan Wilhite try and cover Reggie Wayne? The play didn’t work (and only because Faulk bobbled the ball; had he possessed it the whole time, he had the first down), but I like that Belichick played to win. If you watched the Iowa-Ohio State game on Saturday, you know what it looks like when two coaches play not to lose. Tressel and Ferentz in the last two minutes of that game were like two fighters each trying to take a dive. “I don’t wanna score” “Well neither do I” At least Belichick played to win.

  • brian

    I want to add a clarification for those critiquing statistical analyses as lacking in context (or general theories of sports, for that matter). 4th and 2, for example, happens many times each week. A statistician studying this collects every single instance in which it happens. And the more of these instances you have, the more contexts you have from which the 4th and 2 came.

    What does this mean? We can begin to control for some contexts and then, as our data set grows over time, for many. In other words, if our data set becomes large enough we can get a more accurate probability of what will occur on 4th and 2 with an x-rated offense against a y-rated defense, at this location, with 2 minutes left (and so on). Then we do have an accurate (or at least much more accurate) probability of success.

    I don’t think all contexts can be controlled for (like tiredness of a defense), but we can still get a probability that’s fairly accurate. This is how statistical analysis deals with “messy” contexts.
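A minimal sketch of the conditioning brian describes, assuming a hypothetical play log. The `Play` record, its field names, and the six toy rows are all made up for illustration; they are not real NFL data or anyone's actual model. The idea is only that each extra context you filter on narrows the sample the estimate rests on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Play:
    # Hypothetical record of one 4th-and-2 attempt; field names are illustrative.
    offense_rating: str  # e.g. "strong" or "weak"
    defense_rating: str  # e.g. "strong" or "weak"
    late_game: bool      # True if inside the final two minutes
    converted: bool      # True if the offense got the first down


def conversion_rate(plays, **context):
    """Return (rate, sample_size) over plays matching every given context field."""
    matching = [
        p for p in plays
        if all(getattr(p, key) == value for key, value in context.items())
    ]
    if not matching:
        # No comparable plays: report that rather than guessing a rate.
        return None, 0
    made = sum(p.converted for p in matching)
    return made / len(matching), len(matching)


# Made-up toy data, not real results.
PLAYS = [
    Play("strong", "weak", True, True),
    Play("strong", "weak", True, True),
    Play("strong", "weak", False, False),
    Play("weak", "strong", True, False),
    Play("weak", "weak", False, True),
    Play("strong", "strong", True, False),
]

# All 4th-and-2 attempts, ignoring context.
overall = conversion_rate(PLAYS)

# Only attempts that resemble the situation of interest.
narrow = conversion_rate(
    PLAYS, offense_rating="strong", defense_rating="weak", late_game=True
)
```

Here `overall` rests on all six plays while `narrow` rests on only two, which is why the estimate for a specific context only becomes trustworthy as the data set grows.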

  • Tracey Gere

    Fascinating discussion! I agree that this has created a great opportunity for learning / discussing about how we make decisions. This was exactly the kind of post I have been hoping to find.

    I wonder if anyone has read the work of Gerd Gigerenzer? He has done a lot of work on heuristics and “gut instinct”. He has also written some very good critiques of how we think about probability. Among other things, he has conducted experiments to show that “rational” thinking (such as utility maximization) is not only unrealistic (as has been proven in countless experiments) but that using heuristic strategies that ignore information can actually provide better choices.

    Some books to check out:

    Calculated Risks: How to Know When Numbers Deceive You
    Gut Feelings: The Intelligence of the Unconscious
    Bounded Rationality: The Adaptive Toolbox

    I started reading the analysis about conservatism in football a few months back (such as the Romer paper and Brian Burke at Advanced NFL Stats) and was initially very skeptical. I have since come around, and am in the camp that Belichick made the right choice. However, the work by Gigerenzer is also very compelling, and offers a viewpoint that not only are we “boundedly rational” (explaining why people resist analysis such as Burke / Romer) but that the use of “simple heuristics” can actually be a BETTER decision strategy (at least in some cases).

  • Hypocrisy

    A couple of people already mentioned Tressel/Ferentz in the Iowa/Ohio State game. It was quite fortuitous that we had these events happen on the same weekend. Tressel pursued a conservative strategy and won – and was ripped apart for his conservatism. Belichick was aggressive and lost – and was ripped for his aggressiveness. win/lose, passive/aggressive, the same lame criticisms come screaming from the peanut gallery.

  • daddymag

    A fine analysis. None of which I believe Belichick considered. Considering all that Bill Belichick achieved, I suspect that he didn’t really spend so much thought delivering such a rigorous, unemotional analysis of the facts presented to him.
    He simply decided to go for it.
    He tried to win the game on the spot, and was not afraid of whether he lost or not. He tried to win, because he was not afraid to lose. The Patriots all wake up today, and they have another game to try to win this weekend.

  • Devin

    “That doesn’t make any sense. I wouldn’t go out at night in southeast DC, but I go out all the time in my own neighborhood. They are wildly different situations requiring different decision-making processes – namely, nobody’s season is over due to this call.”

    This is a poor reason to change the decision. If it was a good call in Week 10, it is still a good call in the Super Bowl, presuming all you care about is maximizing your chances of winning the game and don’t care at all about outside factors such as job security/media criticism/etc. An edge is an edge in the first week of the season and the last week of the season, even if making the decision in the Super Bowl would draw 10x the condemnation it received this week.

  • Todd

    Chris, I’ve been a fan of yours since your old spread/no huddle site and I have to say this is one of the best things you’ve written.

    This section in particular made me think of Malcolm Gladwell’s “Blink” and the concept of “thin slicing”:

    “I do not think Belichick worked out the numbers as Burke had. Yet he didn’t have to. His intuition was the kind of specialist’s ingrained intuition that came from years of thought about just such issues.”

  • http://smartfootball.com Chris

    I think one of the other reasons this decision is counterintuitive (or even the idea that it could remotely be the right decision is counterintuitive) is that Burke’s and others’ late game win probability models show that the team with the ball late in games, even if it actually trails, tends to have a better chance of winning than losing. For example, if you’re down two points and have the ball with a minute or more on your own 40 yardline, and someone offers you money that they lose, you should take it (unless they’re the Browns). That is a much simpler situation than everything that went on with Belichick’s decision, but it is counterintuitive to think that a team could EVER have a higher chance of winning the game than its opponent even though it trails on the scoreboard.

  • Jon

    Hypocrisy,

    Kirk Ferentz went ultra conservative by opting to run the clock out despite the fact that they had 1 timeout left. Iowa should’ve gone for the win by driving down the field and kicking a FG to win the game instead of giving up to go to OT.

    That’s why Jim Tressel’s conservative strategy won out. It’s because Ferentz’s strategy is even more conservative.

  • Nick

    To qualify myself: I have a Masters in Mathematics, specializing in stochastic processes, from one of the top few schools in the field. The reason I say that is because I went through the data and the percentages are probably closer to the truth than you would believe.

    At this level, it’s more of an art than a science and as a relatively unbiased and educated observer I have to say that you are looking at a ~8-15% increase in EV (i.e. winning the game) by going for it in that situation, with those players.

    I knew two things immediately when I saw the Pats run that play: i) I instantly had a deeper respect for BB, more so than any other coach in professional sports, and ii) he was going to get crucified for it. As a long time football fan, I had a similar intuition that the call was right mathematically but it takes an iron stomach to stick with that call in the face of unilateral post-fact condemnation.

    On a larger basis, what the author is referring to is pretty widely known. Not coincidentally, I work in finance with experience at some pretty major hedge funds doing quantitative work; the human biases mentioned are much clearer in financial markets.

    In fact, one major quantitative strategy set up at Goldman Sachs was essentially a systemized version of capturing the delta of true value caused by human biases. That fund (Global Alpha) at one point was the largest hedge fund in the world (until it blew up in late 2007); ironically, the synchronized quant blow-up was a symptom of yet another human bias.

    In addition to the great books suggested above, I would also suggest reading Market Wizards by Jack Schwager and Reminiscences of a Stock Operator by Edwin Lefevre. Unlike most finance-y books, these heavily deal with human psychology and what the implications are. Both are classics, and I frequently pull them out to read specific chapters over and over.

    For the more technical, I’d suggest looking into stochastic integration (unlike normal calculus, Brownian motion cannot be differentiated at a specific point in time) – if you play around with it for 30 min. it will truly change the way you think about probability.

  • http://leftrightandcentered.wordpress.com/ cory

    Belichick is an arrogant jerk but besides that he was completely inconsistent (not wrong).

    I didn’t think going for it was necessarily wrong – I wouldn’t have done it, but it’s not a bad call. I understand that he is either

    1. afraid of Manning
    2. has total confidence in his passing offense
    3. both
    (I think the answer is 3 – with 2 being greater than 1)

    Before analyzing it, let me say regardless of that call he got away with coaching malfeasance that people haven’t talked about because of the call. What he did with his time outs was terrible. No excuse for that, he needs those two timeouts he blew on nothing, especially if he is afraid of Manning. Bespeaks arrogance.
    As for the decision to go for it – if he makes the first down, game over. Takes it out of chance. Okay
    The counterargument not to go is just as good, his move works better against bad teams that can’t stop him or would have trouble from the 30. Won’t argue going for it – unknowable answer.

    But here are two questions I have that no one brought up because they are hung up on the call itself. Buy the premise that it’s not a bad move.

    1. Have you no confidence in your run game? If you, and no one else, know you will go for it on fourth down it means you have two downs to run for two yards. They can’t do that? This now tells everyone the Patriots are not a running team. Valuable info.

    2. More importantly – once the move failed, and the Colts went down to the one with 45 seconds to go, it makes no sense not to let them score. If you remember they put up a stand and stopped them on the first run from the one. Think about this. The Colts have three more plays and a timeout. You’ve just essentially said I don’t trust my defense to stop them from going 70 yards, you mean you think you are going to stop them three times for one yard? Come on. IF he lets them score with 45 seconds on Addai’s first run, at least he gives Brady 45 seconds to move for a field goal, not out of the realm of possibility – five or six plays before a kick. Otherwise the Colts will do exactly what they did, run the clock down so there is only time for three plays. I can’t see any rationale that is consistent there. Happened in the Jets game – and the running back messed them up by taking a knee at the one to run the clock to the field goal. But in this situation the Colts had to score a touchdown immediately; they couldn’t wait to kick a field goal. Pats could have had the ball with time to move it.

    What it says I think is that Belichick has unlimited confidence in his passing game. He can pass any time on anyone. Problem is he can’t do that against a really good team, who will put pressure on Brady. That’s how he lost against the Giants in the Super Bowl -unless his offensive line can contain the rush, then the fact that Brady is the best (and I think you saw that last night) and Moss and Welker can’t be covered is a moot point. The Colts just pressured Brady so that he couldn’t automatically complete the passes.

    Belichick figures that passing attack can win any game regardless of his running game or defense- it will beat bad teams but they can’t win the Super Bowl like that because good teams can stop them enough times to outscore them. The call proved he needs either a defense or a running game to complement his passing attack to beat a really good team.

  • http://spreadoffense.com Whajonahle

    One game, one play to win it. 2 yards to make. Against the best team (record-wise) in the league. I think anyone going into that game would take that, all things considered, not knowing what the outcome of the rest of the game would be.

    I have no problem with the decision, although my gut tells me I would have punted. But the fact is, they MADE the first down and the ref blew the call, so it really turned out to be a good call by BB, and a bad call by the refs.

    ~SAS

  • http://spreadoffense.com Whajonahle

    @Chris … here’s an example of a higher chance at winning if you’re losing. Under 2 minutes or so left in a game, winning by 2, no timeouts. Other team has the ball in chip shot field goal range and a 1st down. The best way you have a good chance to win is to let them score as quickly as possible so you can get the ball back with enough time on the clock to score. Otherwise, they run the time down, kick the field goal, and win. It’s why one of the NFL running backs took a knee on the one, pissing off many “fantasy” owners but doing the right thing to win the game. At that point, it was better to remain behind on the scoreboard and play the clock.

    ~SAS

  • Eduardo

    I think that this choice is just like when you are playing bridge, need to hit a finesse, know that everybody in the field will play from the dummy, but have a gut feeling that the winning play is to play from your hand.

    You have to choose between 1) win or lose with the entire field (and get an average board) or 2) go the other way and get a top or bottom result.

    Belichick chose the latter, and it went wrong. When you go against the field you always take heat, even if you are right. That’s human nature.

  • E.D.

    Del Rio would’ve gotten taken apart and endured all the ‘he insulted his defense’ BS if his move didn’t work. It’s the usual results-oriented mindset. You see those types at the blackjack table all the time. They stop playing correct basic strategy because it failed once and start playing like idiots.

  • feralboy12

    I’m one of a gazillion people who blogged about this, but I still haven’t seen anyone else make a couple of points.
    Football has too many variables to create a model using linear math like simple percentage calculations–the calculations refer to situations that can continually be broken down (are we using league averages? Patriots’ averages? All 4th and 2, or last two minutes percentages? etc.).
    Sometimes you put the game in the hands of your best player. Tom Brady needs two yards. Who’s betting against him?