16 Nov 2010  Lessons from the Classroom

Data.gov: Matching Government Data with Rapid Innovation

Data.gov is a young initiative of President Barack Obama for making raw data available on the Web. In an HBS executive education class for technology specialists, professor Karim Lakhani and the US Chief Information Officer, Vivek Kundra, sparked dialogue about new routes to innovation. Key concepts include:

  • Data.gov makes government data available on the Web in raw, machine-readable format, as long as releasing it does not compromise national security or individual privacy.
  • Data.gov is part of the Open Government initiative launched by President Barack Obama on his first day in office.
  • As a lean organization with a mandate to move fast, Data.gov posted the first datasets five months later.
  • Its goals are transparency, participation, collaboration, and management of systems and processes.
  • The HBS case study of Data.gov, coauthored by professor Karim R. Lakhani, highlights a number of useful applications sparked by the Web site. One in particular creates benefits for taxpayers by sharing information between the Internal Revenue Service and the Department of Education.

 

Innovation happens both quickly and slowly. The GPS applications so prevalent today to guide us from Point A to Point B took their first baby steps nearly three decades ago, when President Ronald Reagan encouraged the release of military GPS signals free of charge. Will a key initiative of President Barack Obama, moving government data to the Web, lead to public benefits much faster?

Data.gov, the subject of a new HBS case study taught for the first time this summer, highlights the potential of raw data to spur citizen creativity and practical applications. It also suggests that private-sector organizations could learn from Data.gov's example by unlocking data from individual silos within the firm, even while that data remains protected behind firewalls. HBS assistant professor Karim R. Lakhani, who specializes in the management of technological innovation and product development in firms and communities, co-wrote the case with former HBS professor Robert D. Austin and Yumi Yi to encourage further exploration of the benefits and tactics of open-data approaches.

Joined in class by the Chief Information Officer (CIO) of the United States, Vivek Kundra, who oversees Data.gov, Lakhani led the case discussion for 50 technology executives in a weeklong HBS executive education course, Delivering Information Services. The participants (CTOs, CIOs, and other top executives representing fields as diverse as telecommunications, financial services, and pharmaceuticals, as well as government entities in the US and overseas) debated the pluses and minuses of Data.gov's decisions, its organizational realities in the context of their own experience, and tactics to improve its reach and impact. Kundra joined the conversation near the end of class to answer questions and share insights.

"There is tremendous interest internationally" in the example of Data.gov, said Lakhani. When the initiative was less than a year old, it had already posted 118,000 datasets for public use. "I co-wrote the case in part to provide a field guide to suggest how to encourage data openness within organizations and even countries. Some countries, I think, would be better at it: Canada, Scandinavian countries, for instance, and Western democracies generally."

Goals of the case study

"I have several teaching goals," said Lakhani as he prepared for class. "One is to highlight the imperative for organizations to shift towards an open-data approach, especially in government where the default has been to keep data closed and secure.

"Second, to explore the organizational constraints and resistance to an open-data approach. All agencies will have issues, of course, about making data available, because historically they may have not.

"Third, to probe issues of strategy concerning the best way to launch a similar initiative, both in terms of technology as well as the buy-in needed from various agencies.

"Fourth, to ask executive participants in my class how Data.gov should reconcile both its public-citizen aspects of accountability and its potential to mediate private innovation."

Success for CIO Kundra is twofold, Lakhani added. It means fulfilling the public mission of an informed citizenry as well as the private mission of enabling innovation. In terms of building government IT infrastructure, Data.gov demonstrates a way to be rapid and agile, not bloated and bureaucratic.

"Kundra wants to post on the site any government data that does not have national security or privacy concerns. That's a lot of data. Anything that can be put online should be put online."

Pros and cons

In class, participants who had read the case pointed to a wealth of positive factors about Data.gov:

  • Transparency: Its official motive of transparency allows citizens more control of information that affects them. Giving "power to the people" puts a new set of eyes and ears on government and holds officials more accountable.
  • Business opportunities: Data.gov opens the door for the private sector to add value to government data. In particular, it may prove a boon to small businesses, which can devise creative applications.
  • Organizational agility: As a lean organization with minimal staff, Data.gov made the right move by posting, as a first step, varieties of data from the US Census Bureau, the Centers for Disease Control, the Environmental Protection Agency, and the Department of the Interior, without focusing on specific "customer" needs. One executive observed, "What customers do is up to the customers."
  • Changing the face of government: Its example could improve the culture of government. "Getting agencies into the habit of making data available is a good first step," said a CTO in the class. Other agencies want to look good, too. There is pressure on officials to not get left behind.
  • A go-to site for citizens: It centralizes datasets for citizen use. It may cut down on the volume of requests that local agencies need to field on a day-to-day basis.

Participants also probed questions of concern:

  • Political window dressing: Will Data.gov release controversial datasets or will it favor uncontroversial information such as health statistics over military casualties? If it releases what is construed by the public as sanitized data, will citizens view the site cynically?
  • Customer needs: As business people, some class participants wanted to see a clearly outlined customer perspective defining customer needs.
  • Tradeoffs for fast growth: Several participants wondered what work the endeavor had interrupted at local agencies as it began to fulfill its mandate of gathering data. "I don't believe there were no ripple effects," said one executive.
  • Public trust and consistency of data architecture: Does government data match across various agencies? Will inconsistencies raise doubts among the public about data veracity overall?

As instructor, Lakhani challenged the executive participants to consider Data.gov as a lean organization needing to fulfill President Obama's mandate quickly, without excessive discussion of pros and cons. They would experience similar pressure from an executive directive in any industry, he said. "We have to face tradeoffs when we design and execute. There are different ways to approach the same problem," he said.

Several participants in the class agreed. Said one, who works for a government agency, "I can attest that when huge initiatives come along, whatever 'seems impossible' soon becomes a fact of life. To say, 'I need to do a study first' is not a [wise] response."

Advice for the US CIO

Asked by Lakhani how Data.gov should grow strategically, executive participants suggested that while transparency of government data overall was an important goal in principle, Data.gov should prioritize its acquisition efforts and pursue specific high-value targets. One CTO recommended giving priority to environmental data in order to encourage the public to invent ways to help clean up the disastrous effects of the recent oil spill in the Gulf of Mexico.

Another class participant pointed to an example of high-value data use documented in the Data.gov case study: In Virginia, where there were problems with a bidding process, citizens were able to learn exactly where money was being wasted and take action to stem the tide.

A third said that Data.gov should focus less attention on data acquisition than on encouraging private industry to develop applications. "Brand them as 'powered by Data.gov.' The end user, rather than the average citizen, should be a key focus of your strategy," he advised.

These views were challenged by one participant, however. For the sake of public trust, he said, Data.gov should focus on transparency rather than commit too much organizational attention to the development of applications. "A problem we face in the United States today is a lack of trust in government officials," he explained. "There is no point in adding services over a foundation we don't trust. The number-one priority of Data.gov should be to restore confidence in our government. The average person should be able to interpret these data."

The US CIO weighs in

"This discussion has been about binary choices," observed Kundra with a smile as he rose to address the class. "I would like to step back a bit and share with you some of the motivations behind Data.gov."

Information is power, he began. By "democratizing data," government gives ordinary citizens the ability to shift the balance of power in positive ways that can encourage innovative ideas to be developed into practical goods and services. "Washington, DC does not have a monopoly on the best ideas," he told the executives. "The public has the ability to innovate."

Data.gov allows people to be watchdogs as well as innovators, he continued. One helpful innovation marries government data about recalls of baby products with the nifty RedLaser app that is available for the iPhone: Before considering a purchase for their child, parents with the app can scan the bar code of any product and immediately check for recalls, helping to protect their children's safety.

Releasing government data and allowing the public to innovate creates a process of continuous feedback, he said. People can see how the government spends taxpayer money. "Our goal is to create a runway, a platform for innovation. The government can't make the most innovative apps. But Data.gov can be a platform."

Transparency of information leads by necessity to controversy, Kundra acknowledged. People are bound to ask which data is on the site and which is not. "We release data on toxicity, but not on national security and privacy. It would be a mistake, for instance, to release zip-code level data about health care" because the privacy of individuals would be at stake, he said. Data.gov is seeking even more raw data from US agencies such as the Department of Defense, the Department of Health and Human Services, and the Environmental Protection Agency, but his organization does not expect to obtain controversial datasets.

Rather than focusing unduly on issues of data governance, the executive participants should think about innovation opportunities in data curation, he suggested. "Some US government data is still on COBOL-based platforms," Kundra reminded the class. "So we think an industry will form around data curation. The Internal Revenue Service, the Centers for Disease Control, and the National Institutes of Health are huge enterprises. There will not be a single governance model."

What can private industry learn from government?

Data.gov serves as a beacon for changing the IT culture in Washington, DC to focus more attention on execution, Kundra said. As CIO of the United States, he has a mandate from President Obama to identify troubled projects, hold CIOs accountable (there are 200 CIOs across various government agencies, he said), and practice relentless follow-up. Gone are the days of deliverables scheduled five to ten years from now, he promised. "A deliverable that is customer-facing should be ready within 6 months. On Data.gov we put up our IT Dashboard, an interactive site that tracks Federal IT investments over time, in 60 days."

A key aspect of the Data.gov case is the extent to which innovation can be encouraged within individual organizations by pursuing a similar model of openness of data, added Lakhani. Just as there are benefits for the public when data is unleashed, so are there benefits for innovation-minded employees in private enterprise when data that was formerly held within silos is made available throughout the organization.

"CIOs and CEOs could consider what data they should make available throughout their enterprise," Lakhani said. "It would be great if any employee could look at data and think about different ways to mash it up. Just as there are concerns about security and privacy with the government's data, there are security, privacy, and intellectual property concerns about private-sector data. But Data.gov shows that those things are manageable."