Baseball’s Billy Beane Shows Companies the Power of Data

Oakland Athletics General Manager Billy Beane brought a data-driven and unconventional approach to winning baseball games. By setting strategy and articulating the metric to evaluate and acquire the players who would ultimately implement his strategy on the field, Beane’s sabermetrics approach brought about a cultural shift in baseball from the players and managers to coaches and scouts. Professor Srikant Datar discusses how strategy and metrics work hand-in-hand, and how Beane’s story provides companies with important lessons in data science.

Brian Kenny: Intangibles. It's a word that by definition defies definition, and it's used all the time in the world of professional baseball. No one knows when the term was first used, but the idea that some players' contributions simply can't be measured goes back a long way. Maybe all the way back to Eddie Stanky, the shortstop who played on five major league teams over 10 years. Dodgers manager Leo Durocher said of Stanky in a 1950 scouting report, "He can't hit, can't run, can't field. He's no nice guy. All that little SOB can do is win."

In addition to hitting, fielding, and running, generations of baseball scouts have given credence to the “can't quite put your finger on it” gut feeling that this guy has it. These days, however, they're not the only ones in on the decision. Today we'll hear from Professor Srikant Datar about his case study entitled The Oakland Athletics: Strategy and Metrics for a Budget, co-written with Caitlin N. Bowler. I'm your host Brian Kenny, and you're listening to Cold Call.

Srikant Datar is an expert in the areas of cost management and management control, strategy implementation, and governance. He also developed two new courses at Harvard Business School on developing mindsets for innovative problem solving and managing with data science, and I think those courses are highly relevant to the case we’re going to talk about today. Srikant, thanks for joining me.

Srikant Datar: Pleasure, Brian. Lovely to be here.

Kenny: I think a lot of people will recognize this case from the movie that came out a few years ago, Moneyball, very funny movie with Brad Pitt. But here, you come at this from a very different perspective and one that makes it a topic worthy of discussion in the MBA classroom. Start by setting the case up for us. Who’s the protagonist and what's on his mind?

Datar: The protagonist is Billy Beane. He's the general manager of the Oakland A’s. The case is situated at the time of the 2002 player draft. Oakland is a low-budget team but has been competitive. Billy Beane wants to draft players that would help him win while keeping his budget low.

Kenny: What prompted you to write this case?

Datar: I use it as an introduction to my course on managing with data science to show that data can change the way you manage, but it raises all manner of organizational issues that need to be managed at the same time. It was interesting, as I was working on the case, that baseball, as it turns out, has a lot of data. And yet, looking at what Billy Beane was doing at that time, it seems that much of the data wasn't being used. The interesting question is, Could it be used? Could it be used profitably? What challenges does it pose? And if you were going to use it, would it make a difference? Those are the topics we talk about in the case.

Kenny: Billy had some specific things he was trying to accomplish here. What were some of his strategic goals?

Datar: It’s very interesting the way Billy Beane frames the question. He says, "I want to recruit players who will help the A's win." At one level you look at it and say, yeah sure, that's how everyone ought to be deciding how to recruit players. But at that time in baseball, and I'd say to some extent that continues now, there was a great deal of emphasis around, I don't think you can really do that. I don't think you can really figure out how to recruit players who can win. What you can do is figure out the players who are [the most] talented—just choose the best athlete and they will help you win. What is very interesting in the case is that there are players who seem to be very talented but who do not get picked by the A's. It is a very interesting way to think about data, and to think about organizations.

Kenny: This gets back a little bit to what I was saying in the introduction, the notion of intangibles. If you look at baseball scouting traditionally, how has it been done? How were decisions made?

Datar: The most common model used to be that you evaluate the player on five tools, and at some level there's really nothing wrong with those tools. Batters have a different set of tools than pitchers. For the batters it’s the ability to hit: what's your batting average? Can you hit for power? Can you hit home runs? Can you run? Can you field? That would include catching and throwing.

If you evaluated a player on those, why bother with the rest of these data statistics that are there? Billy Beane was challenging that notion in part, and this is always an interesting part of the case. One doesn't really know in the end whether it's the reason why he begins to think about it differently, but Billy Beane himself was a very good baseball player and he was seen to have all these five tools. He was seen by scouts at the time as a “can't miss” player. I mean, he was just terrific. Turns out he did pretty badly in the Major Leagues as a baseball player.

Kenny: Right.

Datar: He did fantastic as a manager afterwards. I always wondered whether the fact that he was rated so highly on those very tools, and that it didn't actually work out, caused him to rethink. Nothing like personal experience to learn from and grow from, but to his credit he does.

Kenny: You mentioned earlier that an awful lot of data is captured about baseball. There are whole almanacs that are devoted to the game. You've got people at games who actually keep their own scorecards and then submit them to Major League Baseball afterwards. So it's definitely a data-intensive crowd.

Datar: Yes.

Kenny: Tell us a little bit about Bill James and sabermetrics. That was really the core of the story behind Moneyball.

Datar: It's very interesting. Bill James actually worked as a warehouse clerk but he was fascinated by baseball. He was well educated and therefore also fascinated by what data might do for baseball. It was interesting to him that no one seemed to use that data systematically. That is one important part of how sabermetrics comes about. It's around the Society for American Baseball Research, that’s where the word “saber” comes from and metrics are the kind of data that these folks who are very interested in statistical research were looking at. James then starts producing these baseball abstracts, as he calls them, and they were very powerful treatises on what you could do with the statistics that were available in baseball.

He says, “The problem is that baseball statistics are not pure accomplishments of men against other men, which is what we're in the habit of seeing them as. They are accomplishments of men in combination with their circumstances.” Context matters, and so when you're thinking about wins, it's just not that a person's a very good player, but you've got to think about how would you put that in the context of a team and what you're doing.

And his other interesting question, which I use a lot in my course, is: although there's a lot of data available, do you have a good question to ask? Until you have a very good question to ask, data can help you to some extent … but it's much more effective when you have a good question to ask. Then you can really look at the data and say, does it make sense or not?

I have another short quote from James. "I do not start with the numbers any more than a mechanic starts with a monkey wrench. I start with the game, with the things that I see there, and the things that people say there, and I ask, is it true? Can you validate it? Can you measure it? How does it fit with the rest of the machinery? And for those answers, I go to the record books." Now it's remarkable that despite all the records that have been kept around baseball, not much thought had been given to how you might actually use them. He comes up with a number of very interesting metrics that were not being looked at.

Kenny: What are a couple of those metrics beyond the hitting and the running and the fielding?

Datar: The thing that he's most well known for, which becomes a bit of an adage afterwards, is that a walk is as good as a hit because you [still] get to first base, whether or not you hit the ball successfully.

Kenny: And you know, on that point, Eddie Stanky, whom I mentioned in the introduction, had the most walks five years in a row in the National League. He walked 100 times in one year. So if you looked at him just on a value basis, he was the most valuable player in the National League at that time.

Datar: At that time it wouldn't have shown up as a very [meaningful] statistic. So James, in his analysis of what it takes for a team to score runs and win, comes up with this analysis around how important on-base percentage is. This wasn't a very well-known statistic before. He begins to think about things that we now take for granted but at the time were not. So, slugging average. He's thinking, what does it take to score runs? How do you get runs scored, and how many bases can you get? So slugging, and then what later became known as OPS, where you take slugging and also on-base. He started looking at all these metrics and trying to figure out which metrics are better predictors of runs and wins than the metrics used before. He finds a remarkable correlation, which I think is very powerful evidence that some of these variables that he began looking at provided a better explanation of why teams score runs and win.
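[Editor's illustration] The metrics named above can be made concrete with a short sketch. It uses the standard definitions of batting average, on-base percentage, slugging average, and OPS; the `BattingLine` class, the function names, and both batters' numbers are made up for illustration and do not come from the case or from James's work.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats for one batter (hypothetical structure)."""
    at_bats: int
    hits: int          # all hits, including extra-base hits
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sacrifice_flies: int = 0


def batting_average(b: BattingLine) -> float:
    """Hits per at-bat; walks do not count at all."""
    return b.hits / b.at_bats if b.at_bats else 0.0


def on_base_percentage(b: BattingLine) -> float:
    """How often the batter reaches base; a walk counts the same as a hit."""
    reached = b.hits + b.walks + b.hit_by_pitch
    chances = b.at_bats + b.walks + b.hit_by_pitch + b.sacrifice_flies
    return reached / chances if chances else 0.0


def slugging_average(b: BattingLine) -> float:
    """Total bases per at-bat; extra-base hits are weighted by bases gained."""
    singles = b.hits - b.doubles - b.triples - b.home_runs
    total_bases = singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs
    return total_bases / b.at_bats if b.at_bats else 0.0


def ops(b: BattingLine) -> float:
    """On-base plus slugging."""
    return on_base_percentage(b) + slugging_average(b)


# Two invented batters: one walks a lot, one rarely walks.
patient = BattingLine(at_bats=500, hits=125, doubles=20, triples=2,
                      home_runs=5, walks=100)
swinger = BattingLine(at_bats=600, hits=168, doubles=25, triples=3,
                      home_runs=10, walks=20)

for name, b in (("patient", patient), ("swinger", swinger)):
    print(f"{name}: AVG {batting_average(b):.3f}  "
          f"OBP {on_base_percentage(b):.3f}  "
          f"SLG {slugging_average(b):.3f}  OPS {ops(b):.3f}")
```

With these invented numbers, the patient hitter trails in batting average (.250 against .280) but leads in on-base percentage (.375 against roughly .303), which is the kind of gap the "a walk is as good as a hit" adage points to.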

Kenny: So the case takes us sort of into the bowels of the Oakland Coliseum, the stadium where the A’s play baseball. You describe a conversation that's playing out between the traditional baseball scouts that the A’s still use and Billy Beane's new cadre of people who are using sabermetrics. How does this change the dynamic of that relationship?

Datar: This is one of the most important points that we dwell on in context of the case discussion because, as you start getting this data, what do the scouts add? Can I just look at data and forget about the scouts? Or, can it be done in addition to the scouts, and what are those challenges? That's a little bit of the tension that we try to discuss.

Students very quickly like to think of it as something that can be done in addition to [what the scouts do], and eventually that's where we'll end up. But we push quite hard to argue that that's not so easy to do because you've got these scouts who have a lot of expertise, a lot of experience. They've played the game. They know a good player when they see one, and they've done this repeatedly, many hundreds of times. They go from one little town to another trying to watch all the players they might recruit from college. And after watching hundreds of them, they have two or three about whom they can say, “That was my guy that we eventually took.” Each scout is competing against other scouts and trying to figure all this out. But scouts also have their own biases: you might have a bias around the status quo. You always evaluate a player in a particular way… So you're overconfident.

And it often happens (these are always interesting discussions, Brian) that the player may not look like what you associate with a good baseball player. Do you go with the statistics, or do you go with what you're seeing with your own eyes? That's the big tension in the case.

What we want students to pause and think about is, first, what are the benefits that the scouts bring, what are the benefits that data brings, and how might you bring them together? Because the data is not perfect either. That's true whether you're in baseball or in other organizations…

And then, of course, you always get into this interesting discussion that the guys doing the data have not played much baseball—they’re data guys.

Kenny: Now that's a stereotype.

Datar: Certainly in terms of their respective skills, they are very different even though they bring stuff to it. Organizations have a tough time trying to deal with that challenge. And in the very first class where this case is taught, I just want to expose students to that tension that we continue to revisit throughout the course.

Kenny: It sounds like what you're saying to some extent is, you know, there's validity to both of these approaches, and you can make the best decision if you kind of bring all of that information together to help you drive your decision.

Datar: At this point the case transitions, Brian, into thinking about three circles or three parts of a Venn diagram. We always say it's in the intersection of those three circles that the best data science is done, and the case really helps you get there because you can actually see each of these circles play out.

So one circle is the computer science folks. They know about the data. They can manipulate the data. They can extract data. They can scrape data. You know, they can do a lot of things with data. On the other circle are the statisticians, the knowledge of mathematics. How might you combine this data in interesting ways to get interesting insights, just from a data analytic point of view. But the third circle, which we emphasize a fair amount, is domain knowledge. What do you know about the problem you are trying to solve? What do you know about why this data might help you solve the problem?

If you miss any of the three, if you do not have all three of these attributes coming together in a good data science application, you'll either get irrelevant predictions [or] you'll get static research. You really do need the scouts bringing their domain knowledge. You do need the data folks coming up with the data, and you do need the statistical analysis, all working together.

Therein lies the challenge in modern organizations, because for the most part those two circles weren't as prominent until the last 10 years, in which data science has come to play a much, much bigger role.

Kenny: So there is the lesson for people who are not just general managers of baseball teams, but general managers generally speaking, right?

Datar: Very much so.

Kenny: Great. Srikant, thank you so much for joining us today.

Datar: Thank you, Brian, for having me.

Kenny: If you enjoyed hearing about the Oakland A's case, you might want to check out other episodes of Cold Call. Subscribe on Apple podcasts or wherever you listen. I'm your host Brian Kenny, and you've been listening to Cold Call, an official podcast of Harvard Business School.
