Trouble with the Curve stars Clint Eastwood and Amy Adams. It is a movie with a predictable plot and typical characters, but it is still nice to watch. What is interesting is that it unintentionally makes a good point about the applicability of data analytics and statistics in decision making, and reveals some hidden dangers of relying on them too much.
Part of the plot is that an old baseball scout advises his employer not to recruit a promising star, while the team's management wants him badly. The management sees the player as a valuable acquisition based on his stats over time. One of them even says that he has been following the player for a few years on his computer screen. The old scout's position is based on the shaking of the player's hands when hitting and on the sound of the ball. To him, these signs mean the player is not good at hitting fastballs and curveballs. This conclusion comes from a man who has spent his life watching and evaluating baseball players. And here comes the stand-off between the methods of the old dog and the shiny new analytics. We risk missing important pieces of information if we focus on data only. The abundance of instantly available data, huge databases and powerful software to crunch them gives a false sense of security: the belief that we can build models that tell us everything we need to know to make the best decision. There are three points I would like to make, all nicely illustrated in the movie.
The first point is that the available data may not cover all the possible cases and outcomes that are out there. Before using any piece of data we have in hand, we need to analyze it critically - definitions, sample sizes, included value ranges and so on - to make sure that it is a full and undistorted capture of reality. In the movie, the player's stats looked perfect on paper, but it turned out that only "good" balls had been thrown at him - balls he handled very well. The statistics for the balls he could not hit were simply not in the data. The managerial decision to recruit him was reasonable given the KPIs, but the KPIs themselves were based on incomplete data.
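To make the first point a bit more concrete, here is a minimal sketch in Python of the kind of coverage check I have in mind. Everything in it is an assumption made for illustration: the at-bat records are invented, the list of pitch types a hitter "should" be tested against is my own, and the names `coverage_report` and `overall_hit_rate` are hypothetical helpers, not part of any real scouting tool. The idea is simply to look at the sample size behind each category before trusting the overall number.

```python
from collections import defaultdict

# Hypothetical at-bat records: (pitch_type, was_hit). All values are invented
# for illustration; they do not come from the movie or from any real player.
AT_BATS = [
    ("changeup", True), ("changeup", True), ("changeup", False),
    ("changeup", True), ("slider", True), ("slider", False),
]

# Pitch types we assume a hitter should be evaluated against (an assumption).
EXPECTED_PITCH_TYPES = ["changeup", "slider", "fastball", "curveball"]


def overall_hit_rate(records):
    """Share of at-bats that ended in a hit, ignoring the pitch type."""
    return sum(was_hit for _, was_hit in records) / len(records)


def coverage_report(records, expected_types, min_samples=3):
    """Return sample size, hit rate and a sufficiency flag per pitch type."""
    counts = defaultdict(int)
    hits = defaultdict(int)
    for pitch_type, was_hit in records:
        counts[pitch_type] += 1
        hits[pitch_type] += int(was_hit)

    report = {}
    for pitch_type in expected_types:
        n = counts[pitch_type]
        report[pitch_type] = {
            "samples": n,
            # No observations means no rate at all, not a rate of zero.
            "hit_rate": hits[pitch_type] / n if n else None,
            "enough_data": n >= min_samples,
        }
    return report


if __name__ == "__main__":
    print("overall hit rate:", round(overall_hit_rate(AT_BATS), 3))
    for pitch_type, row in coverage_report(AT_BATS, EXPECTED_PITCH_TYPES).items():
        print(pitch_type, row)
```

In this made-up sample the overall hit rate looks healthy, yet the report shows zero observations for two of the four pitch types. The threshold `min_samples=3` is arbitrary; the useful part is the habit of asking which cases the data never saw.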
The second point is that there may be many factors that are simply not measured or recorded but could have a significant impact on the outcome. Not accounting for these in a model or analytical framework could render its conclusions utter nonsense and push us toward unfavorable actions. In the movie it was the shaking of the hands and the sound of the ball. They spoke volumes to the scout, but they were not on the management's computer screens and were not an element in the player's evaluation model. A hint for the analyst is to review the case in hand in search of features or influencing factors that are not in the data at all.
The third point is that the wisdom of experience should not be undervalued. The young guns with their laptops and smart ways of dealing with data seem to put less and less trust in experience. But experience in the field is what really makes analytics a powerful tool. Numbers speak only when you can relate them to what happens off the computer screen.
I would not have expected such good points about analytics in a random lazy Sunday afternoon movie, but this is the thing about good movies - there are a lot of things to think about in them, aren't there? To be fair, baseball teams are well aware of the three points I make here (and much more). After a short Moneyball craze, baseball teams now employ both advanced analytics and seasoned scouts on the playing fields.