Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2016. / Cataloged from PDF version of thesis. / Includes bibliographical references (pages [209]-215). / Air pollution is responsible for one in eight deaths around the world. While the importance of air quality has led to a boom in inexpensive air sensors, studies have shown that the status quo of sparse, fixed sensors cannot accurately capture personal exposure levels of nearby populations. Especially in urban landscapes, pollutant concentrations can vary over just a few seconds or a few meters. Unfortunately, the portable monitors that are capable of accurately measuring these pollutants cost thousands of dollars. That hasn't stopped a deluge of cheap, portable consumer devices from entering the market. These solutions frequently claim better accuracy, but universally fail under real-world validation. Instead of competing to build a more accurate sensor, we take the approach of trying to predict when we can trust the cheap sensor we have, based on ambient conditions and measurements. Well-designed, sub-$100 sensors have recently started to perform with high precision and accuracy. While their fundamental operation is sound, these affordable sensors cannot incorporate costly, industry-standard techniques for mitigating issues like cross-sensitivity, dynamic airflow, or high humidity. Fortunately, if the core principles of the device are robust, machine learning techniques should be able to predict systematic measurement failure based on a handful of related indicators. In this thesis, we test and demonstrate the potential for logistic regression machine learning techniques to predict and classify sensor measurements as 'correct' or 'incorrect' with high reliability. These techniques are also useful for quantifying sensor precision as well as cross-seasonal prediction strength.
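The classification idea described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the two features (relative humidity and airflow, both scaled to 0..1), the synthetic labeling rule, the 0.5 decision threshold, and the training hyperparameters are all assumptions made for illustration, and the data is randomly generated rather than measured.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n, seed):
    """Hypothetical data: [relative humidity, airflow], each scaled to 0..1.

    Label 1 means the cheap sensor agreed with a reference instrument.
    The labeling rule is invented purely for illustration.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        humidity, airflow = rng.random(), rng.random()
        X.append([humidity, airflow])
        y.append(1 if humidity + 0.3 * airflow < 0.8 else 0)
    return X, y


def train_logistic(X, y, lr=0.5, epochs=1500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def prob_correct(w, b, x):
    """Estimated probability that a reading taken under conditions x is correct."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def is_trustworthy(w, b, x, threshold=0.5):
    """Classify a reading as 'correct' (True) or 'incorrect' (False)."""
    return prob_correct(w, b, x) >= threshold


# Train on one synthetic set and evaluate on a separate one.
X_train, y_train = make_synthetic(200, seed=0)
X_test, y_test = make_synthetic(100, seed=1)
w, b = train_logistic(X_train, y_train)
hits = sum(is_trustworthy(w, b, x) == bool(t) for x, t in zip(X_test, y_test))
accuracy = hits / len(X_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In a real deployment the labels would presumably come from comparing the cheap sensor against a co-located reference monitor, and the threshold would be tuned to the cost of trusting a bad reading.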
After demonstrating the value of this approach, we implement a scalable database solution using a semantic web technology known as ChainAPI. The tools developed for this framework allow automatic learning algorithms to crawl through the database, access the most recent data, update their training model, and populate the database with the processed data for other crawling scripts to interact with. This backend has implications for air quality data storage, interaction, and exchange. Finally, we build a portable, Bluetooth-enabled air quality device that connects to ChainAPI through a mobile phone app, and takes advantage of the machine learning algorithms running in its backend. This device improves the reliability of sensor data compared with similar-cost systems. The learnAir device empowers individuals to trust their personal air quality data, and provokes a dialog about sensor reliability in the citizen sensing community. Its novel database architecture promotes new ways of interacting with large, dynamic datasets, and new tools to characterize affordable sensors and devices. Finally, applied logistic regression algorithms assure the accuracy of cheap, distributed sensor data, creating a trusted way for researchers to collaborate with citizen scientists from around the world. / by David B. Ramsay. / S.M.
Identifier | oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/107555 |
Date | January 2016 |
Creators | Ramsay, David B. (David Bradford) |
Contributors | Joseph A. Paradiso, Program in Media Arts and Sciences (Massachusetts Institute of Technology) |
Publisher | Massachusetts Institute of Technology |
Source Sets | M.I.T. Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 215 pages, application/pdf |
Rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission., http://dspace.mit.edu/handle/1721.1/7582 |