Are defect rates declining? Is customer satisfaction rising? You inspect the numbers, but you're not sure whether to believe them. It isn't that you fear fraud or manipulation; it's that you don't know how much faith to put in statistics.

You're right to be cautious. "The actual statistical calculations represent only 5 percent of the manager's work," says Frances Frei, an assistant professor at Harvard Business School who teaches two-day statistics seminars to corporate managers. "The other 95 percent should be spent determining the right calculations and interpreting the results."

Here are some guidelines for using statistics effectively, derived from Frei's seminar and other sources. Although the perspectives offered here won't qualify you to be a high-powered statistical analyst, they will help you decide what to ask of the analysts whose numbers you rely on.

1. Know what you know—and what you're only asserting

"In real life, managers don't do as much number crunching as they think," says Victor McGee, professor emeritus at Dartmouth College's Amos Tuck School of Business. "In fact, managers are primarily idea crunchers: They spend most of their time trying to persuade people with their assertions." But they rarely realize the extent to which their assertions rest on unproved assumptions. McGee recommends color-coding your "knowledge" so you know what needs to be tested. Red can represent what you know, green what you assume, and blue what you "know" because of what you assume. Assumptions and assertions—green and blue knowledge—shouldn't be taken seriously unless there is red knowledge supporting them.

2. Be clear about what you want to discover

Some management reports rely heavily on the arithmetic mean or average of a group of numbers. But look at Figure 1, a histogram analyzing customer-satisfaction survey results on a scale of 1-5. For this data set, the mean is 4. If that's all you saw, you might figure people are pretty satisfied. But as Figure 1 shows, no one actually gave your product a rating of 4; instead, the responses cluster around a group of very satisfied customers, who scored it at a 5, and moderately satisfied customers, who gave it a 3. Only by deciding beforehand that you wanted to look for subgroups within your customer base could you know that the mean would not be the most helpful metric for your search. "Ask the direct question," advises McGee: "What do you want to know?"
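A few lines of Python make the pitfall concrete. The ratings below are invented to echo the Figure 1 pattern: responses cluster at 3 and 5, so the mean lands on a value no customer actually chose.

```python
from statistics import mean
from collections import Counter

# Hypothetical survey responses on a 1-to-5 scale, invented to mirror
# the bimodal pattern in Figure 1: half moderately satisfied (3),
# half very satisfied (5).
ratings = [3, 3, 3, 3, 5, 5, 5, 5]

print(mean(ratings))     # the mean is 4...
print(Counter(ratings))  # ...yet not a single customer answered 4
```

Tallying the responses, as `Counter` does here, answers the question "are there subgroups?" that the mean alone conceals.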

3. Don't take causality for granted

Management is all about finding the levers that will affect performance, notes McGee: "If we do such-and-such, then such-and-such will happen." But this is the world of green and blue knowledge. Hypotheses depend on assumptions made about causes, and the only way to have confidence in the hypothetical course of action is to prove that the assumed causal connections do indeed hold.


Say you're trying to make a case for investing more heavily in sales training, and you've got numbers to show that sales revenues increase with training dollars. Have you established a cause-and-effect relationship? No, says Frei—all you have is a correlation. To establish genuine causation, you need to ask yourself three questions. Is there an association between the two variables? Is the time sequence accurate? Is there any other explanation that could account for the correlation? To establish the association, Frei cautions, it's often wise to look at the raw data, not just the apparent correlation.

Figure 2 shows a scatter diagram plotting all the individual data points derived from a study of the influence of training on performance. Line A, the "line of best fit" that comes as close as possible to connecting all the individual data points, has a gentle upward slope. But if you remove the point Z (which represents $45,000 in training and $100,000 in sales volume) from the data set, the line of best fit becomes Line B, whose slope is nearly twice as steep as that of Line A. "When removing a single data point causes the slope of the line [of best fit] to change significantly," Frei explains, "you know that point is unduly influencing your results. Depending on the question you're asking, you should consider removing it from the analysis."
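Frei's test is easy to run yourself. The sketch below uses invented, scaled-down numbers (not the article's actual data set) and the textbook least-squares slope, cov(x, y) / var(x), to show how a single high-leverage point like Z can halve the apparent slope:

```python
# Hypothetical training-spend vs. sales-volume data (units are arbitrary).
# The outlier z stands in for point Z: heavy training spend, modest sales.

def slope(points):
    """Least-squares slope of the line of best fit: cov(x, y) / var(x)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    return sxy / sxx

data = [(5, 110), (10, 120), (15, 130), (20, 140), (25, 150)]
z = (35, 135)  # the influential point

line_a = slope(data + [z])  # gentle slope with z included
line_b = slope(data)        # twice as steep once z is removed
print(line_a, line_b)
```

Rerunning the fit with each point left out in turn, and flagging any point whose removal changes the slope sharply, is a standard informal check for undue influence.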

For the second question—Is the time sequence accurate?—the challenge is to establish which variable in the correlation occurs first. Your hypothesis is that training precedes performance, but you must check the data closely to rule out the reverse—that it's the improved sales volume that is driving up training dollars. Question three—Can you rule out other plausible explanations for the correlation?—is the most time-consuming. Is there some hidden variable at work? For example, are you hiring more qualified salespeople, and is that why performance has improved? Have you made any changes in your incentive system? Only by eliminating other factors can you establish the link between training and performance with certainty.

4. With statistics, you can't prove things with 100 percent certainty

Only when you have recorded all the impressions of all the customers who have had an experience with a particular product can you establish certainty about customer satisfaction. But that would cost too much time and money, so you take random samples instead. A random sample means that every member of the population is equally likely to be chosen. Using a nonrandom sample is the number one mistake businesses make when sampling, says Frei, even though a random sample is simple to generate with (for instance) Microsoft Excel.
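In Python, for instance, an equal-probability sample is a single call (the Excel route the article mentions works similarly); the population here is a made-up set of customer IDs:

```python
import random

# Hypothetical population: 10,000 customer IDs. random.sample draws
# without replacement, giving every customer the same chance of
# selection -- the property a convenience sample (say, whoever happens
# to call the help line) lacks.
population = range(10_000)
sample = random.sample(population, k=200)

print(len(sample), len(set(sample)))  # 200 distinct customers
```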

"All sampling relies on the normal distribution and the central limit theorem," says Frei. These principles—found in any statistics textbook—enable you to calculate a confidence interval for an entire population based on a sample. Say you come up with a sample defect rate of 2.8 percent. Depending on the sample size and other factors, you might be able to say that you're 95 percent confident the actual number is between 2.5 percent and 3.1 percent. The fewer defects, incidentally, the larger your sample must be to establish a 95 percent confidence interval. "So as you get better," says Frei, "you need to spend more on quality-assurance sampling, not less."

5. A result that is numerically or statistically significant may be managerially useless

Take a customer satisfaction rating of 3.9. If you implemented a program to improve customer satisfaction, conducted some polling several months out to test the program's effectiveness, and found a new rating of 4.1, has your program been a success? Not necessarily—you have to check the confidence interval. In this case, 4.1 might not be statistically different from 3.9, because it falls within the confidence interval around that earlier figure. In other words, you could have no more confidence that 4.1 is the real customer rating than that 3.9 is.
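One common way to run that check is a two-sample z-test on the means; the sample sizes and standard deviations below are invented for illustration, and the article's actual survey details may differ.

```python
from math import sqrt

# Is the jump from 3.9 to 4.1 a real improvement, or sampling noise?
# A rough two-sample z-test; n and s values are hypothetical.
n1, m1, s1 = 250, 3.9, 1.2   # before the program
n2, m2, s2 = 250, 4.1, 1.2   # several months after

se_diff = sqrt(s1**2 / n1 + s2**2 / n2)
z = (m2 - m1) / se_diff
print(f"z = {z:.2f}")  # |z| < 1.96: not significant at the 95% level
```

With these (invented) numbers the shift falls short of the 1.96 threshold, so the honest conclusion is "no detectable change"—exactly the situation in which the VP below should neither reward nor punish.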

Because they're unaware of how confidence intervals work, managers tend to over-celebrate and over-punish. For example, a VP might believe the 4.1 rating indicates a genuine improvement and award a bonus to the manager who launched the new customer satisfaction program. Six months later, when the number has dropped back to 3.9, he might fire the manager. In both instances, the VP would be making decisions based on statistically insignificant shifts in the data.


**Figure 1: Histogram**

**Figure 2: Scatter diagram**

Reprinted with permission from "The Use and Misuse of Statistics," **Harvard Management Update,** Vol. 11, No. 3, March 2006.