We rely heavily on information gathered by satellites and weather instruments to plan our daily lives. Imagine what would happen if the data we received from these technologies went bad and foretold cataclysmic outcomes in the days or weeks ahead. Panic could induce scenes on our streets reminiscent of Hollywood disaster movies. To avert such events - or just help get things right even if the forecast is more mundane - scientists at the National Center for Atmospheric Research (NCAR) and the University of Colorado at Boulder (CU) have devised an innovative computational technique called the Intelligent Outlier Detection Algorithm, or IODA, that draws on statistics, imaging, and other disciplines to detect errors in sensitive technological systems.
Acting as a type of quality control for collected data, IODA can alert operators to faulty readings or other problems associated with failing sensors. If sensors malfunction and begin transmitting bad data, computers programmed with the algorithm could identify the problem and isolate that bad data well before things get out of hand.
The developers hope to extend the use of the algorithm’s principles to cars and other transportation systems, power plants, satellites and space exploration, and data from radars and other observing instruments.
"This could, at least in theory, enable operators to keep a system performing even while it's failing," says Andrew Weekley, a software engineer at NCAR who led the algorithm development effort. "When a system starts to fail, it's absolutely critical to be able to control it as long as possible. That can make the difference between disaster or not."
IODA performs quality control on data collected over time, such as wind speeds over the course of a month. The algorithm draws on statistics, graph theory, image processing, and decision trees, and can be applied in cases where the correct assessment of data is critical, the incoming data are too numerous for a human to easily review, or the consequences of a sensor failure would be significant.
At present the algorithm consists of several thousand lines of MATLAB, a technical computing language, but it could be translated into a programming language such as C for commercial use.
Ensuring the quality of incoming time series data is paramount for every organization involved in complex operations. If sensors begin relaying inaccurate information, it can be highly challenging to separate good data from bad, especially in cases involving enormous amounts of information.
IODA differs from many common analytical systems because it compares incoming data to common patterns of failure - an approach that can be applied broadly because it is independent of any specific sensor or measurement.
Weekley and his colleagues took a new approach to the problem when they began developing IODA ten years ago. Whereas existing methods treat the data as a function of time, Weekley conceived of an algorithm that treats the data as an image.
This approach mimics the way a person might look at a plot of data points to spot an inconsistency. For instance, if you were to study a line drawn between points on a graph that represented morning temperatures rising from 50-70°F, and then spotted a place where that smooth line was broken, dipping sharply because of numerous data points at 10°F, you would immediately suspect there was a bad sensor reading.
But imagine studying cases with thousands or even millions of data points about temperature or other variables. How hard would it be to pinpoint the bad ones?
Weekley believed that a computer could be programmed to recognize common patterns of failure through image processing techniques. Then, like a person looking at data, the computer could identify problems with data points such as jumps and intermittency; view patterns in the data; and not only determine whether a particular datum is bad but also characterize how it is inaccurate.
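To make the idea concrete, here is a minimal sketch in Python of what treating a time series as an image might look like. It is not IODA's code, which is written in MATLAB and is not reproduced here. The function names, bin counts, and the 10 percent threshold are all hypothetical choices made for illustration. The sketch rasterizes the readings into a grid of time bins by value bins, groups touching occupied cells into connected regions (a basic image processing step), and marks as suspect any reading that lands in a small, isolated region.

```python
import numpy as np

def to_cells(values, n_time_bins, n_value_bins):
    """Map each reading to a (row, col) pixel: row = value bin, col = time bin."""
    t = np.arange(values.size)
    col = np.minimum(t * n_time_bins // values.size, n_time_bins - 1)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros(values.size, dtype=int), col
    row = ((values - values.min()) / span * n_value_bins).astype(int)
    return np.minimum(row, n_value_bins - 1), col

def label_components(occupied):
    """Label 8-connected regions of occupied pixels with a simple flood fill."""
    labels = np.zeros(occupied.shape, dtype=int)
    n_rows, n_cols = occupied.shape
    current = 0
    for start in zip(*np.nonzero(occupied)):
        if labels[start]:
            continue
        current += 1
        labels[start] = current
        stack = [start]
        while stack:
            r, c = stack.pop()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and occupied[rr, cc] and not labels[rr, cc]):
                        labels[rr, cc] = current
                        stack.append((rr, cc))
    return labels

def flag_suspect_points(values, n_time_bins=20, n_value_bins=30, min_fraction=0.1):
    """Return a boolean mask of readings that sit in small, isolated regions.

    min_fraction is an illustrative threshold (an assumption): a region holding
    fewer than this share of all readings is treated as suspect.
    """
    values = np.asarray(values, dtype=float)
    row, col = to_cells(values, n_time_bins, n_value_bins)
    image = np.zeros((n_value_bins, n_time_bins), dtype=int)
    np.add.at(image, (row, col), 1)           # count readings per pixel
    labels = label_components(image > 0)
    point_labels = labels[row, col]           # region id of each reading
    counts = np.bincount(point_labels, minlength=labels.max() + 1)
    return counts[point_labels] < min_fraction * values.size

# Synthetic morning temperatures rising from 50 to 70 degrees F,
# with a run of bad readings stuck at 10 degrees F.
t = np.arange(200)
temps = np.linspace(50.0, 70.0, 200) + 0.3 * np.sin(t)
temps[80:90] = 10.0

suspect = flag_suspect_points(temps)
print(np.flatnonzero(suspect))
```

On this synthetic series the sketch should single out the ten readings at 10°F, because they form a small island far from the main band of data. A real system would need far more than this, including a way to tell a genuine sudden change from a failing sensor.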
"Our thought was to organize a sequence of data as an image and apply image processing techniques to identify a failure unambiguously," Weekley said. "We thought that, by using image processing, we could teach the system to detect inconsistencies, somewhat like a person would."
"[This] is a radical departure from the usual techniques found in the time series literature," says Kent Goodrich, a CU professor of mathematics and a co-author of the research. "The image processing and other techniques are not new, but the use of these images and techniques together in a time series application is new, and IODA is able to characterize good and bad points very well in some commonly encountered situations."
When the research team tested IODA, they applied the algorithm to wind readings from anemometers in Alaska that contained errors caused by a loose nut, which left the anemometers unable to consistently measure gusts in high-wind situations.
"This technique has very broad implications," Weekley says. "Virtually all control systems rely on time series data at some level, and the ability to identify suspect data along with the possible failure is very useful in creating systems that are more robust. We think it is a powerful methodology that could be applied to almost all sequences of measurements that vary over time."
IODA was funded by the National Science Foundation.