Environmental Sensor Anomaly Detection Using Learning Machines

Quality assurance/quality control (QA/QC) for real-time measurements of environmental and water quality variables has been widely studied in recent years. In situ sensors are now commonly used to acquire real-time measurements that provide the basis for important natural resources management decisions. However, these sensors are susceptible to failure caused by human factors, lack of necessary maintenance, flaws in the transmission line or in any part of the sensor, and unexpected changes in the sensors' surrounding conditions. This study used two types of machine learning techniques, Artificial Neural Networks (ANNs) and Relevance Vector Machines (RVMs), to assess the detection of anomalous data points in turbidity readings from the Paradise site on the Little Bear River in northern Utah. Both techniques were used to develop regression models that predict upcoming Paradise site turbidity measurements and estimate the confidence intervals associated with those predictions, which are later used to determine whether a real measurement is an anomaly. Three cases were evaluated as possible inputs for the regression models: (1) only the values reported by the sensor at previous time steps; (2) values reported by the sensor at previous time steps together with values from other types of water sensors at the same site as the target sensor; and (3) the addition of previous readings from sensors at upstream sites. The best model was chosen based on each model's ability to detect anomalous data points that had been identified in a QA/QC analysis performed manually by a human technician. False positive and false negative rates over a range of confidence intervals were used as the measure of model performance. The RVM models detected more anomalous points within narrower confidence intervals than the ANN models. The results also showed that incorporating as inputs measurements from other sensors at the same site, as well as measurements from upstream sites, can improve the performance of the models.
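
The detection scheme described in the abstract (predict the next reading, attach an interval to the prediction, flag readings that fall outside it, then score the flags against manually labelled anomalies) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a plain least-squares autoregressive model stands in for the ANN and RVM regressors, the interval is a fixed multiple of the training residual standard deviation rather than a model-derived confidence interval, the data are synthetic rather than Little Bear River turbidity, and all function names are hypothetical. Only previous values of the target series are used as inputs, which corresponds to input case (1).

```python
import numpy as np

def make_lagged(series, n_lags):
    """Return rows of n_lags consecutive values and the value that follows each row."""
    cols = [series[i:len(series) - n_lags + i] for i in range(n_lags)]
    return np.column_stack(cols), series[n_lags:]

def fit_linear(X, y):
    """Least-squares fit with intercept; a stand-in for the ANN/RVM regressors."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma = np.std(y - A @ coef)  # residual spread on the training data
    return coef, sigma

def flag_anomalies(X, y, coef, sigma, z):
    """Flag readings outside prediction +/- z * sigma (assumed interval form)."""
    pred = np.column_stack([np.ones(len(X)), X]) @ coef
    return np.abs(y - pred) > z * sigma

def error_rates(flags, labels):
    """False positive rate and false negative rate against reference labels."""
    fp = np.sum(flags & ~labels)
    fn = np.sum(~flags & labels)
    return fp / max(np.sum(~labels), 1), fn / max(np.sum(labels), 1)

# Synthetic "turbidity-like" series: a slow cycle plus noise (illustrative only).
rng = np.random.default_rng(0)
t = np.arange(600)
clean = 20 + 5 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 1, t.size)

n_lags = 4
train, test = clean[:400], clean[400:].copy()

# Inject isolated spikes that play the role of technician-labelled anomalies.
labels = np.zeros(test.size, dtype=bool)
labels[[30, 80, 130, 170]] = True
test[labels] += 50.0

coef, sigma = fit_linear(*make_lagged(train, n_lags))
X_test, y_test = make_lagged(test, n_lags)

results = {}
for z in (1.0, 2.0, 3.0):  # wider interval as z grows
    flags = flag_anomalies(X_test, y_test, coef, sigma, z)
    results[z] = error_rates(flags, labels[n_lags:])
    print(f"z={z}: false positive rate={results[z][0]:.3f}, "
          f"false negative rate={results[z][1]:.3f}")
```

Sweeping `z` traces the trade-off that the abstract uses as its performance measure: a narrower interval misses fewer anomalies but raises more false alarms. Because the lagged inputs include the spike itself, the readings just after each spike may also be flagged, which is one reason inputs from other sensors or upstream sites could help.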

Identifier: oai:union.ndltd.org:UTAHS/oai:digitalcommons.usu.edu:etd-2071
Date: 01 December 2011
Creators: Conde, Erick F.
Publisher: DigitalCommons@USU
Source Sets: Utah State University
Detected Language: English
Type: text
Format: application/pdf
Source: All Graduate Theses and Dissertations
Rights: Copyright for this work is held by the author. Transmission or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. Works not in the public domain cannot be commercially exploited without permission of the copyright owner. Responsibility for any use rests exclusively with the user. For more information contact Andrew Wesolek (andrew.wesolek@usu.edu).