Flash floods are among the deadliest weather events in the world, killing more than 5,000 people each year. They're also among the toughest to predict. But Google thinks it has solved that problem in an unlikely way: by reading the news.
While humans have compiled extensive weather records, flash floods are too short-lived and localized to be measured comprehensively, the way temperature or even river flows are monitored over the years. That data gap means that deep learning models, which are increasingly capable of forecasting the weather, can't predict flash floods.
To solve that problem, Google researchers used Gemini, Google's large language model, to sort through 5 million news articles from around the world, isolating reports of 2.6 million distinct floods and turning those reports into a geo-tagged time series dubbed "Groundsource." It's the first time the company has used language models for this kind of work, according to Gila Loike, a Google Research product manager. The research and dataset were shared publicly Thursday morning.
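The article doesn't describe Google's pipeline in detail, but the core idea of the approach can be sketched: ask a language model whether an article describes a flood, and if so, extract a location and date, yielding one geo-tagged record per report. A minimal illustration, in which the JSON schema, field names, and the example coordinates are assumptions rather than anything Google has published:

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical structured answer an LLM might return when asked to
# extract flood information from one news article. The field names
# and the example location are illustrative assumptions.
llm_response = json.dumps({
    "is_flood_report": True,
    "location": {"lat": -19.83, "lon": 34.84},
    "event_date": "2019-03-15",
})

@dataclass
class FloodRecord:
    """One geo-tagged, dated flood report; millions of these
    aggregated together form a time series like Groundsource."""
    lat: float
    lon: float
    event_date: date

def parse_report(raw: str) -> Optional[FloodRecord]:
    """Keep only articles the model tagged as flood reports,
    and turn each into a geo-tagged record."""
    data = json.loads(raw)
    if not data.get("is_flood_report"):
        return None  # article was about something else; discard
    loc = data["location"]
    return FloodRecord(loc["lat"], loc["lon"],
                       date.fromisoformat(data["event_date"]))

record = parse_report(llm_response)
```

Binning such records by location and date would produce the kind of ground-truth time series a forecasting model can be trained against.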
With Groundsource as a real-world baseline, the researchers trained a model built on a Long Short-Term Memory (LSTM) neural network to ingest global weather forecasts and output the probability of flash floods in a given area.
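An LSTM is well suited to this because it reads a forecast as a sequence, carrying a memory state from one time step to the next, so that, say, heavy rain on top of already-saturated soil raises the risk more than either alone. A toy sketch of the mechanism with untrained random weights, where the input features and dimensions are illustrative assumptions, not Google's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

class TinyLSTM:
    """Minimal single-layer LSTM: steps through a sequence of
    weather-forecast features and emits one flood probability."""

    def __init__(self, n_in, n_hidden):
        # one weight matrix and bias per gate:
        # i = input, f = forget, c = candidate memory, o = output
        self.W = {g: rng.normal(0, 0.1, (n_hidden, n_in + n_hidden))
                  for g in "ifco"}
        self.b = {g: np.zeros(n_hidden) for g in "ifco"}
        self.w_out = rng.normal(0, 0.1, n_hidden)  # readout to one logit

    def step(self, x, h, c):
        z = np.concatenate([x, h])          # current input + previous state
        i = sigmoid(self.W["i"] @ z + self.b["i"])
        f = sigmoid(self.W["f"] @ z + self.b["f"])
        o = sigmoid(self.W["o"] @ z + self.b["o"])
        g = np.tanh(self.W["c"] @ z + self.b["c"])
        c = f * c + i * g                   # forget old memory, add new
        h = o * np.tanh(c)
        return h, c

    def flood_probability(self, sequence):
        n_hidden = self.w_out.shape[0]
        h = np.zeros(n_hidden)
        c = np.zeros(n_hidden)
        for x in sequence:                  # one step per forecast slice
            h, c = self.step(np.asarray(x, dtype=float), h, c)
        return float(sigmoid(self.w_out @ h))

# Three forecast time steps; the features (rainfall in mm, soil
# moisture fraction, temperature in C) are assumed for illustration.
forecast = [[12.0, 0.4, 21.0],
            [45.0, 0.7, 19.5],
            [80.0, 0.9, 18.0]]
p = TinyLSTM(n_in=3, n_hidden=8).flood_probability(forecast)
```

In practice the weights would be fitted so that the predicted probabilities match the Groundsource record of where and when floods actually occurred.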
Google's flash flood forecasting model is now highlighting risks for urban areas in 150 countries on the company's Flood Hub platform, and the company is sharing its data with emergency response groups around the world. António José Beleza, an emergency response official at the Southern African Development Community who trialed the forecasting models with Google, said it helped his organization respond to floods more quickly.
There are still limitations to the model. For one, it is fairly low resolution, assessing risk across 20-square-kilometer areas. And it isn't as precise as the U.S. National Weather Service's flood alert system, in part because Google's model doesn't incorporate local radar data, which allows real-time tracking of precipitation.
Part of the point, though, is that the project was designed to work in places where local governments can't afford to invest in costly weather-sensing infrastructure or don't have extensive meteorological records.
"Because we're aggregating millions of reports, the Groundsource dataset really helps rebalance the map," Juliet Rothenberg, a program manager on Google's Resilience team, told reporters this week. "It allows us to extrapolate to other regions where there isn't as much information."
Rothenberg said the team hopes that using LLMs to build quantitative datasets from written, qualitative sources could be applied to other ephemeral-but-essential-to-forecast phenomena, like heat waves and mudslides.
Marshall Moutenot, the CEO of Upstream Tech, a company that uses similar deep learning to forecast river flows for customers like hydropower companies, said Google's contribution is part of a growing effort to assemble data for deep learning-based weather forecasting models. Moutenot co-founded dynamical.org, a group curating a collection of machine learning-ready weather data for researchers and startups.
"Data scarcity is one of the most difficult challenges in geophysics," Moutenot said. "Simultaneously, there's too much Earth data, and then when you want to evaluate against truth, there's not enough. This was a truly creative approach to get that data."