NOAA’s Data Heads for the Clouds


The VIIRS satellite sensor alone currently produces over 2 terabytes of data daily, and the launch of the next-generation GOES-R satellite in 2016 promises to add another 3.5 terabytes each day. (Photo by NASA/NOAA)

If you ask the National Oceanic and Atmospheric Administration (NOAA) about big data, they will give you some big numbers. Over 20 terabytes per day of observational data are produced by their satellite systems alone, and then there are the massive weather and climate models from the bureau’s 1.5 petaflops of computing power, incoming observations from a network of hundreds of buoys, and live streaming video from the research vessel Okeanos Explorer.

NOAA is America’s environmental intelligence agency, and its mission -- to protect life and property, provide the information communities need to become resilient to severe weather- and climate-related events, and conserve and protect national resources -- requires a great deal of number crunching.

The sharing of knowledge and information with others is also part of NOAA’s mission, and it has long been a leader and supporter of government open data efforts. As an agency within the Department of Commerce, NOAA appreciates the importance of private industry and economic growth, and is proud that its data already helps to support vital markets such as the commercial weather, aviation, and insurance industries. Expanding data access even further could create new markets, spur economic growth, and create jobs; research by the McKinsey Global Institute suggests that open data could add more than $3 trillion in total value annually to the education, transportation, consumer products, electricity, oil and gas, healthcare, and consumer finance sectors worldwide.

However, the effort and cost involved in distributing tens of terabytes of data daily is staggering, and so the agency has been seeking innovative ways to increase its data’s availability without exhausting its own resources in the process. In early 2014, NOAA reached out to the private sector through a Request for Information to enlist help in making data available to the public in a rapid, scalable, and inexpensive manner. The response was overwhelmingly positive, with over 70 responses to the initial RFI and over 200 companies represented at a subsequent Industry Day last October. It was clear that industry saw great untapped economic potential in making NOAA’s environmental data more accessible, and that this economic potential could far outweigh the data distribution costs.

Amid all of the support and enthusiasm, though, there were still a great many open questions about the specifics of the implementation and the business model, questions that could not be answered without further input and innovation from industry. Reaching out to the Infrastructure-as-a-Service (IaaS) providers mentioned most frequently by RFI respondents, NOAA suggested a joint experiment: If the IaaS providers could help to position NOAA’s data next to their own high performance computing, analytic, and storage services, would the rest of the private sector take advantage of that positioning to run algorithms, perform research, and create inventions? Might the revenue gained from new products, infrastructure services, and analytics be so great that it could help cover the cost of the original data dissemination, creating a self-sustaining ecosystem?

On April 21st, NOAA and the Department of Commerce, along with Amazon Web Services, Google Cloud Platform, IBM, Microsoft, and the Open Cloud Consortium, announced they had entered into Cooperative Research and Development Agreements to explore answers to these questions and bring the Department closer to its goal of unleashing its vast resources of environmental data. To support the participation of others in the private sector, collaborators will form “data alliances” composed of companies from existing major economic sectors, specialized small businesses, value-added resellers, entrepreneurs, researchers, non-profits, and individuals -- anyone who wants to work with NOAA data.

Data alliances will each serve as a prototype for the larger market, ensuring that while NOAA and its collaborators research and develop technology to efficiently distribute the data, they can also test out the hypothesis of self-sustainability within a fully representative market ecosystem. NOAA will maintain its existing data distribution sites and portals, and ensure that all of the data distributed through the new “data alliance” model is also in the public domain and accessible non-preferentially. And while collaborators may opt to recover their “shipping & handling” costs of distribution, the hope is that the profits and value the market recognizes through use of NOAA data will far outweigh and, potentially, help to subsidize those costs.

This is a brand new way of approaching data distribution, and of leveraging the value that publicly funded datasets bring to the commercial marketplace. If NOAA’s endeavor proves successful, it could lead to far greater availability of open data across the entire government, providing taxpayers and private industry with information vital to economic growth, innovation, and discovery.

Read more about NOAA and its collaborators’ Big Data Project in the Department of Commerce press release and the project site’s FAQs.

The Presidential Innovation Fellows program is currently on the lookout for the most talented innovators and technologists to work on our nation's most pressing challenges. Acting as a small team alongside federal agency “co-founders,” Fellows will serve for 12 months as entrepreneurs-in-residence, working quickly and iteratively to turn promising ideas into game-changing solutions. Interested? The first step is to apply online (use referral code “Andromeda” on your application). 

Maia Hansen and Alan Steremberg are Presidential Innovation Fellows based at the National Oceanic and Atmospheric Administration (NOAA).
