Developing Theory Using Machine Learning Methods

Author Abstract

We describe how to employ machine learning (ML) methods in theory development. Compared to traditional causal inference methods, ML methods make far fewer a priori assumptions about the functional form of the underlying model that best represents the data. Researchers could therefore use such methods to explore novel and robust patterns in the data that could lead to inductive theory building. ML strengths include replicable identification of novel patterns in the data. ML methods also address several concerns that scholars have raised about the norms of empirical research in strategy and management, such as “p-hacking” and mistaking local effects for global effects. We develop a step-by-step roadmap that illustrates how to use four ML methods (decision trees, random forests, K-nearest neighbors, and neural networks) to reveal patterns in data that could be used for theory building. We also illustrate how ML methods could illuminate interactions and non-linear effects better than traditional methods. In summary, ML methods could complement both existing inductive theory-building methods, such as multiple-case inductive studies, and traditional methods of causal inference.
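The claim about functional form can be illustrated with a small sketch. It is not taken from the paper: the data are synthetic, the U-shaped relationship, the noise level, and the tree depth are assumptions chosen for illustration, and the regression tree is a hand-rolled toy rather than any library's implementation. It compares the in-sample fit of a straight line with that of a shallow decision tree, one of the four methods named above.

```python
import random

def r_squared(y, y_hat):
    """Share of the variance in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - p) ** 2 for v, p in zip(y, y_hat))
    return 1.0 - ss_res / ss_tot

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(x, y, depth):
    """Greedy single-feature regression tree, stored as nested dicts."""
    if depth == 0 or len(set(x)) < 2:
        return {"value": sum(y) / len(y)}
    xs = sorted(set(x))
    best_cost, best_t = None, None
    # Try every midpoint between neighbouring x values as a threshold.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [b for a, b in zip(x, y) if a <= t]
        right = [b for a, b in zip(x, y) if a > t]
        cost = sse(left) + sse(right)
        if best_cost is None or cost < best_cost:
            best_cost, best_t = cost, t
    left = [(a, b) for a, b in zip(x, y) if a <= best_t]
    right = [(a, b) for a, b in zip(x, y) if a > best_t]
    return {
        "threshold": best_t,
        "left": fit_tree([a for a, _ in left], [b for _, b in left], depth - 1),
        "right": fit_tree([a for a, _ in right], [b for _, b in right], depth - 1),
    }

def predict_tree(node, a):
    """Walk from the root to a leaf and return that leaf's mean."""
    while "threshold" in node:
        node = node["left"] if a <= node["threshold"] else node["right"]
    return node["value"]

# Synthetic data (an assumption): a U-shaped effect plus mild noise.
rng = random.Random(0)
x = [i / 199 for i in range(200)]
y = [4 * (a - 0.5) ** 2 + rng.gauss(0, 0.05) for a in x]

intercept, slope = fit_line(x, y)
r2_linear = r_squared(y, [intercept + slope * a for a in x])

tree = fit_tree(x, y, depth=3)
r2_tree = r_squared(y, [predict_tree(tree, a) for a in x])

print(f"linear R^2: {r2_linear:.3f}")
print(f"tree   R^2: {r2_tree:.3f}")
```

Because the line imposes a functional form that the data do not follow, it explains almost none of the variance, while the tree recovers the U shape without being told to look for it. Both figures are in-sample; an actual analysis would evaluate the pattern on held-out data before treating it as a candidate for theory building.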

Paper Information

Full Working Paper Text
Working Paper Publication Date: September 2018
HBS Working Paper Number: 19-032
Faculty Unit(s): Technology and Operations Management
