Knowing when to use artificial intelligence and when to rely on the human mind is a shifting fine line, one delineated by new research that shows considerable benefit and speed from generative AI—if it’s applied to the right tasks.
What businesses need to know from a study of how 758 Boston Consulting Group employees work with AI: Humans are still needed to make that call. To operationalize AI successfully, managers must carefully select its applications, train workers in using it properly, and quickly move the line as AI improves.
A multidisciplinary team including several Harvard Business School faculty worked closely with BCG to test AI in real-world simulations. They found that consultants using AI complete certain kinds of tasks faster, with results that are 40 percent higher in quality (though AI may somewhat stifle the diversity of their ideas). But on more complex tasks, consultants using AI were 19 percentage points less likely to produce the right answer.
Telling which tasks fall on which side of what the researchers call the “jagged technological frontier” requires “studying it with your people, deeply in your context,” explains Edward McFowland III, assistant professor at Harvard Business School and one of the study’s authors. “The challenge right now is that this is brand new. Companies cannot simply ignore these tools, because they have tremendous value that their competitors will be exploiting. However, turning them loose on all use cases can have an array of detrimental consequences. So, organizations need an experimental mindset where they implement a methodical test-and-learn approach.”
The findings have broad implications for how companies think about using AI and the nuances of its limits. Since ChatGPT debuted a year ago, automation hopes and fears, previously limited to factory floors and supermarket checkout lines, have shaken the ranks of highly educated knowledge workers and provided new avenues for explosive growth.
The research team also includes Karim Lakhani, the Dorothy and Michael Hintze Professor of Business Administration at HBS, and Fabrizio Dell’Acqua, a postdoctoral research fellow at HBS. They are joined by Boston Consulting Group’s François Candelon, Lisa Krayer, and Saran Rajendran; Ethan Mollick from the University of Pennsylvania’s Wharton School; Katherine Kellogg from MIT’s Sloan School of Management; and Hila Lifshitz-Assaf of Warwick Business School.
Embedded inside a multinational consulting firm
The researchers tested how 758 consultants (some 7 percent of BCG’s individual contributor consulting workforce) performed highly skilled tasks with and without AI. After establishing a performance baseline, consultants were randomly divided into two groups.
The first group would “conceptualize and develop new product ideas, focusing on aspects such as creativity, analytical skills, persuasiveness, and writing skills,” the study explains. The second would engage in “business problem-solving tasks using quantitative data, customer and company interviews, and a persuasive writing component.”
The experiments represent typical activities for consultants; some mirrored situations the company uses to screen job applicants, who often have PhDs.
Within each group, some consultants had no AI access, others used GPT-4 alone, and a third set was given GPT-4 along with some training in how to use it.
Because the authors partnered with BCG, they were able to observe each interaction and evaluate it with specificity, down to each answer, McFowland notes.
Where AI excels: Design a shoe
The first half of the consultants tackled a series of 18 tasks selected to exist “inside the frontier” of what ChatGPT can do well.
They were asked to imagine that they worked in the development division of a footwear company and to come up with a new product. Consultants had to generate 10 ideas, describe a prototype, weigh at least four concepts, and explain why the final concept was chosen. They were also asked to write a 2,500-word Harvard Business Review-type essay to describe the process.
The results: Consultants using AI finished 12 percent more work at a 25 percent faster clip than their non-AI-using counterparts.
And using AI gave the biggest boost to consultants who had scored below the median on a related pre-task assessment. AI lifted the lower-end group’s performance by 43 percent compared to previous scores. For top-rated performers, that figure was 17 percent.
“There's still a skill distribution (among knowledge workers) and the question becomes how to help them work effectively with AI,” explains McFowland. “People in the lower half of that distribution had a much larger productivity bump. But on average, everyone seems to do better.”
Surprisingly, though, McFowland says, certain solutions using AI yielded a more limited variety of answers, suggesting a potential stifling of creative thinking that requires further study.
“You can imagine many industries and many contexts where variety and the diversity of ideas is really important,” he says. “Now, there might be answers on how to increase the variety [of answers], but the observation and documentation of the existence of this phenomenon and its implications is fascinating. We talk a lot about the productivity boosts from this kind of tool. But we are now also discovering some of the unforeseen consequences.”
Where AI struggles: “Tactical actions”
The “outside the frontier” assessment task involved more complex problem-solving. The researchers designed a task “where consultants would excel, but AI would struggle without extensive guidance,” the paper explains.
The BCG consultants in this group first had to understand a company’s distribution channels, which included franchise, fully owned, and online stores; advise which one to focus on for profit growth; and offer “tactical actions” to improve profit through that channel. They were then instructed to assess a company’s three brands (men’s, women’s, and kids’) with an eye toward boosting revenue growth.
Using spreadsheets and interviews, consultants were asked to write a short note to the company’s CEO outlining the rationale for their suggestions. Instructions told the consultants to be creative and to “feel free to rely on your own business judgement” when making their recommendations.
Here, consultants working without AI outperformed those using it by wide margins. Consultants using AI plus training saw a 24 percentage point decline in correct answers compared to non-AI-using peers, and those using AI alone saw a 13 percentage point drop.
Lessons from AI’s first year
So, how can knowledge-based businesses use AI well without wandering outside the frontier to stifle creativity or risk incorrect outcomes?
For organizations: Lean in on learning
- Move beyond whether to adopt AI. Instead, decide what specific tasks would benefit, and stay within those borders.
- Partner with academic or other experts to help design an evaluation. Experts can “help design a study to test them in your context and see what’s happening,” McFowland says.
- Once you decide to use it, train your workers to use it right. “With any new technology comes a need for training, a need for folks to use it the right way and avoid the pitfalls,” McFowland says.
For workers: Are you a centaur or cyborg?
Consultants who used AI most effectively behaved like either “cyborgs” or “centaurs,” the research finds.
- Centaurs, named after the half-horse, half-human creature from Greek mythology, split tasks between AI and themselves. They were able to “discern which tasks are best suited for human intervention and which can be efficiently managed by AI,” the paper explains.
- Cyborgs, referring to fictional beings who are part human, part machine, integrated technology with every task. “Cyborg users don’t just delegate tasks; they intertwine their efforts with AI at the very frontier of capabilities,” the researchers write. “This strategy might manifest as alternating responsibilities at the subtask level, such as initiating a sentence for the AI to complete or working in tandem with the AI.”