  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
681

Classification of Faults in Railway Ties Using Computer Vision and Machine Learning

Kulkarni, Amruta Kiran 30 June 2017 (has links)
This work focuses on automated classification of railway ties based on their condition using aerial imagery. Four approaches are explored and compared to achieve this goal: handcrafted features, HOG features, transfer learning, and a proposed CNN architecture. Mean test accuracy per class and the Quadratic Weighted Kappa score are used as performance metrics, both particularly suited to the ordered classification in this work. The transfer learning approach outperforms the handcrafted and HOG features by a significant margin. The proposed CNN architecture caters to the unique nature of the railway tie images and their defects. Its performance is superior to the handcrafted and HOG features, and it requires significantly fewer parameters than the transfer learning approach. Data augmentation boosts the performance of all approaches. The problem of label noise is also analyzed. The techniques proposed in this work will help reduce the time, cost, and dependency on experts involved in traditional railway tie inspections and will facilitate efficient documentation and planning for the maintenance of railway ties. / Master of Science
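The Quadratic Weighted Kappa used above rewards predictions that land close to the true class on the ordinal scale (a tie rated "moderate" when it is "poor" is penalized less than one rated "good"). A minimal, self-contained sketch of the metric; the labels below are hypothetical:

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Agreement metric that penalizes errors by squared distance on the ordinal scale."""
    N = len(y_true)
    O = [[0] * n_classes for _ in range(n_classes)]  # observed confusion counts
    for t, p in zip(y_true, y_pred):
        O[t][p] += 1
    row = [sum(O[i]) for i in range(n_classes)]  # true-label marginals
    col = [sum(O[i][j] for i in range(n_classes)) for j in range(n_classes)]  # predicted marginals
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2  # quadratic disagreement weight
            num += w * O[i][j]                        # weighted observed disagreement
            den += w * row[i] * col[j] / N            # weighted expected disagreement
    return 1.0 - num / den

# one off-by-one error out of five ordinal predictions still scores highly
k = quadratic_weighted_kappa([0, 1, 2, 2, 3], [0, 1, 1, 2, 3], 4)
```

A plain per-class accuracy would treat an off-by-one error and an off-by-three error identically; the quadratic weighting is what makes the metric suitable for ordered condition grades.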
682

Natural Language Driven Image Edits using a Semantic Image Manipulation Language

Mohapatra, Akrit 04 June 2018 (has links)
Language provides us with a powerful tool to articulate and express ourselves! Understanding and harnessing the expressions of natural language can open the doors to a vast array of creative applications. In this work we explore one such application: natural language based image editing. We propose a novel framework to go from free-form natural language commands to performing fine-grained image edits. Recent progress in the field of deep learning has motivated solving most tasks using end-to-end deep convolutional frameworks. Such methods have been shown to be very successful, even achieving super-human performance in some cases. Although such progress shows significant promise for the future, we believe there is still progress to be made before these methods can be effectively applied to a task like fine-grained image editing. We approach the problem by dissecting the inputs (image and language query) and focusing on understanding the language input using traditional natural language processing (NLP) techniques. We start by parsing the input query to identify the entities, attributes, and relationships and generate a command entity representation. We define our own high-level image manipulation language that serves as an intermediate programming language, connecting natural language requests that represent a creative intent over an image to the lower-level operations needed to execute them. The semantic command entity representations are mapped into this high-level language to carry out the intended execution. / Master of Science
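The parse-to-command-entity step described above can be illustrated with a toy rule-based parser. The grammar, action names, and entity lists below are illustrative assumptions, not the thesis's actual image manipulation language:

```python
import re

# hypothetical mini-grammar: a few actions, entities, and modifiers
ACTIONS = {"brighten": "adjust_brightness", "blur": "apply_blur", "sharpen": "sharpen"}
ENTITIES = {"sky", "face", "background"}
MODIFIERS = {"slightly", "strongly"}

def parse_edit_command(text):
    """Map a free-form edit request to a command entity representation."""
    cmd = {"action": None, "entity": None, "attributes": []}
    for tok in re.findall(r"[a-z]+", text.lower()):
        if tok in ACTIONS and cmd["action"] is None:
            cmd["action"] = ACTIONS[tok]
        elif tok in ENTITIES:
            cmd["entity"] = tok
        elif tok in MODIFIERS:
            cmd["attributes"].append(tok)
    return cmd

cmd = parse_edit_command("Slightly brighten the sky")
```

The resulting dictionary plays the role of the command entity representation: a structured intermediate that downstream code can map onto low-level pixel operations, decoupling language understanding from image processing.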
683

Synthesizing a Hybrid Benchmark Suite with BenchPrime

Wu, Xiaolong 09 October 2018 (has links)
This paper presents BenchPrime, an automated benchmark analysis toolset that is systematic and extensible to analyze the similarity and diversity of benchmark suites. BenchPrime takes multiple benchmark suites and their evaluation metrics as inputs and generates a hybrid benchmark suite comprising only essential applications. Unlike prior work, BenchPrime uses linear discriminant analysis rather than principal component analysis, and selects the best clustering algorithm and the optimized number of clusters in an automated and metric-tailored way, thereby achieving high accuracy. In addition, BenchPrime ranks the benchmark suites in terms of their application set diversity and estimates how unique each benchmark suite is compared to other suites. As a case study, this work for the first time compares the DenBench with the MediaBench and MiBench using four different metrics to provide a multi-dimensional understanding of the benchmark suites. For each metric, BenchPrime measures to what degree DenBench applications are irreplaceable with those in MediaBench and MiBench. This provides a means for identifying an essential subset from the three benchmark suites without compromising the application balance of the full set. The experimental results show that the necessity of including DenBench applications varies across the target metrics and that significant redundancy exists among the three benchmark suites. / Master of Science / Representative benchmarks are widely used in research to achieve an accurate and fair evaluation of hardware and software techniques. However, redundant applications in the benchmark set can skew the average towards redundant characteristics, overestimating the benefit of any proposed research. This work proposes a machine learning-based framework, BenchPrime, to generate a hybrid benchmark suite comprising only essential applications.
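BenchPrime's automated, metric-tailored choice of cluster count can be sketched as picking the number of clusters that maximizes a clustering-quality score. The synthetic benchmark feature vectors and the silhouette-based selection below are illustrative assumptions, not BenchPrime's actual pipeline:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# hypothetical feature vectors for 60 benchmark applications in three tight groups
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(20, 4)) for c in (0.0, 1.0, 2.0)])

# pick the cluster count that maximizes the silhouette score
best_k, best_score = None, -1.0
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score
```

Once applications are clustered, picking one representative per cluster yields the "essential subset" the abstract describes; redundant applications fall into the same cluster and are dropped.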
684

Sensitivity of Feedforward Neural Networks to Harsh Computing Environments

Arechiga, Austin Podoll 08 August 2018 (has links)
Neural Networks have proven themselves very adept at solving a wide variety of problems; in particular, they excel at image processing. However, it remains unknown how well they perform under memory errors. This thesis focuses on the robustness of neural networks under memory errors, specifically single event upset style errors where single bits flip in a network's trained parameters. The main goal of these experiments is to determine if different neural network architectures are more robust than others. Initial experiments show that MLPs are more robust than CNNs. Within MLPs, deeper MLPs are more robust, and for CNNs larger kernels are more robust. Additionally, the CNNs displayed bimodal failure behavior, where memory errors would either not affect the performance of the network, or they would degrade its performance to be on par with random guessing. VGG16, ResNet50, and InceptionV3 were also tested for their robustness. ResNet50 and InceptionV3 were both more robust than VGG16. This could be due to their use of Batch Normalization or the fact that ResNet50 and InceptionV3 both use shortcut connections in their hidden layers. After determining which networks were most robust, estimated error rates from neutrons were calculated for space environments to determine if these architectures were robust enough to survive. It was determined that large MLPs, ResNet50, and InceptionV3 could survive in Low Earth Orbit on commercial memory technology using only software error correction. / Master of Science / Neural networks are a new kind of algorithm that is revolutionizing the field of computer vision. Neural networks can be used to detect and classify objects in pictures or videos with accuracy on par with human performance. Neural networks achieve such good performance after a long training process during which many parameters are adjusted until the network can correctly identify objects such as cats, dogs, trucks, and more.
These trained parameters are then stored in a computer's memory and recalled whenever the neural network is used for a computer vision task. Some computer vision tasks are safety critical, such as a self-driving car's pedestrian detector. An error in that detector could lead to loss of life, so neural networks must be robust against a wide variety of errors. This thesis will focus on a specific kind of error: bit flips in the parameters of a neural network stored in a computer's memory. The main goal of these bit flip experiments is to determine if certain kinds of neural networks are more robust than others. Initial experiments show that MLP (Multilayer Perceptron) style networks are more robust than CNNs (Convolutional Neural Networks). For MLP style networks, making the network deeper with more layers increases the accuracy and the robustness of the network. However, for the CNNs, increasing the depth only increased the accuracy, not the robustness. The robustness of the CNNs displayed an interesting trend of bimodal failure behavior, where memory errors would either not affect the performance of the network, or they would degrade its performance to be on par with random guessing. A second set of experiments was run to focus more on CNN robustness because CNNs are much more capable than MLPs. The second set of experiments focused on the robustness of VGG16, ResNet50, and InceptionV3. These CNNs are all very large and have very good performance on real world datasets such as ImageNet. Bit flip experiments showed that ResNet50 and InceptionV3 were both more robust than VGG16. This could be due to their use of Batch Normalization or the fact that ResNet50 and InceptionV3 both use shortcut connections within their network architecture. However, all three networks still displayed the bimodal failure mode seen previously. After determining which networks were most robust, estimated error rates were calculated for a real world environment.
The chosen environment was the space environment, because it naturally causes a high number of bit flips in memory; if NASA were to use neural networks on any rovers, they would need to make sure the networks are robust enough to survive. It was determined that large MLPs, ResNet50, and InceptionV3 could survive in Low Earth Orbit on commercial memory technology using only software error correction. Using only software error correction will allow satellite makers to build more advanced satellites without paying extra money for radiation-hardened electronics.
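The single-event-upset fault injection described above amounts to flipping one bit of a stored float32 parameter. A minimal sketch of such an injector (the array and bit positions are illustrative, not the thesis's experimental harness):

```python
import numpy as np

def flip_bit(weights, index, bit):
    """Flip a single bit of the float32 value at flat position `index` (SEU-style)."""
    flat = weights.ravel()                              # view onto the same buffer
    view = flat[index : index + 1].view(np.uint32)      # reinterpret the 4 bytes as uint32
    view ^= np.uint32(1) << np.uint32(bit)              # in-place XOR flips the chosen bit
    return weights

w = np.array([1.0, 2.0], dtype=np.float32)
flip_bit(w, 0, 31)  # bit 31 is the IEEE-754 sign bit: 1.0 becomes -1.0
```

Flipping a sign or exponent bit can change a weight by orders of magnitude while a low mantissa bit barely perturbs it, which is one intuition for the bimodal failure behavior the experiments observed.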
685

ALJI: Active Listening Journal Interaction

Sullivan, Patrick Ryan 29 October 2019 (has links)
Depression is a crippling burden on a great many people, and it is often well hidden. Mental health professionals are able to treat depression, but the general public is not well versed in recognizing depression symptoms or assessing their own mental health. Active Listening Journal Interaction (ALJI) is a computer program that seeks to identify and refer people suffering from depression to mental health support services. It does this by analyzing personal journal entries using machine learning, and then privately responding to the author with proper guidance. In this thesis, we focus on determining the feasibility and usefulness of the machine learning models that drive ALJI. With heavy data limitations, we cautiously report that with a single journal entry, our model detects when a person's symptoms warrant professional intervention with 61% accuracy. Extensive discussion of the proposed solution, methods, results, and future directions of ALJI is included. / Master of Science / An incredibly large number of people suffer from depression, and they can rightfully feel trapped or imprisoned by this illness. A very simple way to understand depression is to first imagine looking at the most beautiful sunset you've ever seen, and then imagine feeling absolutely nothing while looking at that same sunset, and you can't explain why. When a person is depressed, they are likely to feel like a burden to those around them. This causes them to avoid social gatherings and friends, isolating them from the people who could support them. This worsens their depression and a terrible cycle begins. One of the best ways out of this cycle is to reveal the depression to a doctor or psychologist, and to ask them for guidance. However, many people don't see or realize this excellent option is open to them, and will continue to suffer from depression for far longer than needed. This thesis describes an idea called the Active Listening Journal Interaction, or ALJI.
ALJI acts just like someone's personal journal or diary, but it also offers some protection from illnesses like depression. First, ALJI searches a journal entry for indicators about the author's health; then ALJI asks the author a few questions to better understand them; and finally ALJI gives that author information and guidance on improving their health. We begin creating ALJI as a computer program by first building and testing the detector for the author's health. Instead of writing the detector directly, we show the computer examples of health indicators from journals we know very well, and then let the computer find the patterns that would reveal those health indicators in any journal. This is called machine learning, and in our case, ALJI's machine learning is difficult because we have very few example journals where we know all of the health indicators. However, we believe that resolving this issue would complete the first step of ALJI. The end of this thesis also discusses the next steps going forward with ALJI.
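The text-classification core described above — learning patterns in journal text that indicate when to refer someone to support — can be sketched with a bag-of-words model. The journal entries and labels below are invented toy data, and the pipeline is an assumed stand-in, not ALJI's actual model:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# toy journal entries; labels are hypothetical (1 = refer to professional support)
entries = [
    "felt hopeless and tired all week",
    "nothing matters anymore and I cannot sleep",
    "had a great hike and dinner with friends",
    "excited about the new semester and my classes",
]
labels = [1, 1, 0, 0]

# TF-IDF turns each entry into word-weight features; logistic regression learns
# which words raise or lower the probability of needing a referral
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(entries, labels)
pred = model.predict(["tired and hopeless, cannot sleep"])[0]
```

With only a handful of labeled journals, as the thesis notes, such a model would be data-starved; the sketch only shows the shape of the supervised learning step.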
686

Credential Theft Powered Unauthorized Login Detection through Spatial Augmentation

Burch, Zachary Campbell 29 October 2018 (has links)
Credential theft is a network intrusion vector that subverts traditional defenses of a campus network, with a malicious login being the act of an attacker using those stolen credentials to access the target network. Historically, this approach is simple for an attacker to conduct and hard for a defender to detect. Alternative mitigation strategies require an in-depth view of the network hosts, an untenable proposition in a campus network. We introduce a method of spatial augmentation of login events, creating a user and source IP trajectory for each event. These location mappings, built using user wireless activity and network state information, provide the features needed for login classification. From this, we design and build a real-time data collection, augmentation, and classification system for generating alerts on malicious events. With a relational database for data processing and a trained weighted random forests ensemble classifier, generated alerts are both timely and few enough to allow human analyst review of all generated events. We evaluate this design against a defined threat model with three levels of attacker ability, using a proof-of-concept system on weeks of live data collected from the Virginia Tech campus under an IRB-approved research protocol. / Master of Science / For a computer network, a common mode of access is a login: the entering of a valid username and password for authentication. Attackers use a variety of methods to steal user login credentials, and several of these approaches are unnoticeable by network defenders. Providing further complications, a higher educational campus network, such as Virginia Tech's, inherently has less information about the state of the network, since students and teachers bring their privately owned devices. To prevent this attack method, we determine the class, authorized or unauthorized, of login events using data that can be consistently provided by a campus network.
After classification, alerts are generated for security analysts, helping to further defend the network. Spatial augmentation is a process we introduce to allow login classification with machine learning algorithms. For every login event at the campus, a history of user locations and source event locations can be provided, using data collected from the campus network infrastructure. Location data provides stronger classification of login events, since studies show that attackers are typically physically distant from the normal user of an account when performing an unauthorized login. For evaluation, we build a system to augment and classify login events, while limiting the number of false alerts to a usable level.
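The classification step above — a random forest fed with spatially augmented features — can be sketched on synthetic data. The two features (distance between login source and the user's last known location, and time since the user was last seen) and the generated labels are illustrative assumptions, not the thesis's real feature set:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 200
# hypothetical features per login event:
#   dist = km between login source and the user's last observed location
#   gap  = minutes since the user was last seen on the campus network
dist = np.concatenate([rng.uniform(0, 2, n), rng.uniform(50, 500, n)])
gap = np.concatenate([rng.uniform(0, 60, n), rng.uniform(0, 60, n)])
X = np.column_stack([dist, gap])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 1 = unauthorized login

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
# a login from 300 km away, 10 minutes after the user was last seen locally
pred = clf.predict(np.array([[300.0, 10.0]]))[0]
```

The sketch mirrors the thesis's intuition: an attacker reusing stolen credentials is usually far from where the legitimate user actually is, so location-derived features separate the classes well.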
687

Bounded Expectation of Label Assignment: Dataset Annotation by Supervised Splitting with Bias-Reduction Techniques

Herbst, Alyssa Kathryn 20 January 2020 (has links)
Annotating large unlabeled datasets can be a major bottleneck for machine learning applications. We introduce a scheme for inferring labels of unlabeled data at a fraction of the cost of labeling the entire dataset. We refer to the scheme as Bounded Expectation of Label Assignment (BELA). BELA greedily queries an oracle (or human labeler) and partitions a dataset to find data subsets that have mostly the same label. BELA can then infer labels by majority vote of the known labels in each subset. BELA makes the decision to split or label from a subset by maximizing a lower bound on the expected number of correctly labeled examples. BELA improves upon existing hierarchical labeling schemes by using supervised models to partition the data, thereby avoiding reliance on unsupervised clustering methods that may not accurately group data by label. We design BELA with strategies to avoid bias that could be introduced through this adaptive partitioning. We evaluate BELA on the labeling of four datasets and find that it outperforms existing strategies for adaptive labeling. / Master of Science / Most machine learning classifiers require data with both features and labels. The features of the data may be the pixel values for an image, the words in a text sample, the audio of a voice clip, and more. The labels of a dataset define the data. They place the data into one of several categories, such as determining whether an image is of a cat or a dog, or adding subtitles to YouTube videos. The labeling of a dataset can be expensive, and usually requires a human annotator. Human-labeled data can be even more expensive if the data requires an expert labeler, as in the labeling of medical images, or when labeling data is particularly time-consuming. We introduce a scheme for labeling data that aims to lessen the cost of human-labeled data by labeling a subset of an entire dataset and making an educated guess on the labels of the remaining unlabeled data.
The labeled data generated from our approach may then be used toward the training of a classifier, an algorithm that maps the features of data to a guessed label. This is based on the intuition that data with similar features will also have similar labels. Our approach uses a game-like process of, at any point, choosing between one of two possible actions: we may either label a new data point, thus learning more about the dataset, or we may split the dataset into multiple subsets of data. We eventually guess the labels of the unlabeled data by assigning each unlabeled data point the majority label of the data subset that it belongs to. The novelty in our approach is that we use supervised classifiers, or splitting techniques that use both the features and the labels of data, to split a dataset into new subsets. We use bias-reduction techniques that enable us to use supervised splitting.
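The final inference step described above — each unlabeled point inheriting the majority label of its subset — can be sketched in a few lines. This omits BELA's split-or-label decision rule and bias-reduction machinery; the subsets and labels below are toy data:

```python
from collections import Counter

def infer_labels(subsets):
    """BELA-style inference sketch: each unlabeled point receives the majority
    label among the queried (known) labels of its subset."""
    inferred = {}
    for subset in subsets:
        known = [label for _, label in subset if label is not None]
        majority = Counter(known).most_common(1)[0][0]
        for point, label in subset:
            inferred[point] = label if label is not None else majority
    return inferred

# two subsets; "b" and "e" were never shown to the human labeler
subsets = [[("a", 0), ("b", None), ("c", 0)], [("d", 1), ("e", None)]]
labels = infer_labels(subsets)
```

The quality of this inference rests entirely on how pure each subset is, which is why the thesis invests in supervised splitting to produce subsets that mostly share a label.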
688

Voltage Regulation of Smart Grids using Machine Learning Tools

Jalali, Mana 23 September 2019 (has links)
Smart inverters have been considered the primary fast solution for voltage regulation in power distribution systems. Optimizing the coordination between inverters can be computationally challenging. Reactive power control using fixed local rules has been shown to be subpar. Here, nonlinear inverter control rules are proposed by leveraging machine learning tools. The designed control rules can be expressed by a set of coefficients. These control rules can be nonlinear functions of both remote and local inputs. The proposed control rules are designed to jointly minimize the voltage deviation across buses. By using support vector machines, control rules with sparse representations are obtained, which decreases the communication between the operator and the inverters. The designed control rules are tested under different grid conditions and compared with other reactive power control schemes. The results show promising performance. / With the advent of renewable energy in power systems, innovative and automatic monitoring and control techniques are required. More specifically, voltage regulation for distribution grids with solar generation can be a challenging task. Moreover, due to the frequency and intensity of voltage changes, traditional utility-owned voltage regulation equipment is not useful in the long term. On the other hand, smart inverters installed with solar panels can be used for regulating the voltage. Smart inverters can be programmed to inject or absorb reactive power, which directly influences the voltage. The utility can monitor, control, and synchronize the inverters across the grid to maintain the voltage within the desired limits. Machine learning and optimization techniques can be applied to automate voltage regulation in smart grids using the smart inverters installed with solar panels. In this work, voltage regulation is addressed by reactive power control.
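The idea of learning control rules with sparse coefficients — so each inverter needs only a few remote measurements communicated to it — can be sketched with L1 regularization. The thesis uses support vector machines; the Lasso below is a stand-in for the sparsity mechanism, and the measurements, bus count, and target rule are invented:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
# hypothetical local and remote measurements at 10 buses; only buses 0 and 3
# actually drive the desired reactive power setpoint
X = rng.normal(size=(500, 10))
q_target = 0.8 * X[:, 0] - 0.5 * X[:, 3] + 0.01 * rng.normal(size=500)

# L1 regularization yields a sparse rule: most inputs get a zero coefficient,
# so the operator communicates only a few measurements to each inverter
rule = Lasso(alpha=0.1).fit(X, q_target)
active = np.flatnonzero(np.abs(rule.coef_) > 1e-3)
```

The learned rule is exactly "a set of coefficients" as the abstract puts it; sparsity is what makes the communication burden between operator and inverters manageable.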
689

Quantifying the Role of Vulnerability in Hurricane Damage via a Machine Learning Case Study

Szczyrba, Laura Danielle 10 June 2020 (has links)
Pre-disaster damage predictions and post-disaster damage assessments are challenging because they result from complicated interactions between multiple drivers, including exposure to various hazards as well as differing levels of community resiliency. Certain societal characteristics, in particular, can greatly magnify the impact of a natural hazard; however, they are frequently ignored in disaster management because they are difficult to incorporate into quantitative analyses. In order to more accurately identify areas of greatest need in the wake of a disaster, both the hazards and the vulnerabilities need to be carefully assessed, since they have been shown to be positively correlated with damage patterns. This study evaluated the contribution of eight drivers of structural damage from Hurricane María in Puerto Rico, leveraging machine learning algorithms to determine the role that societal factors played. Random Forest and Stochastic Gradient Boosting Trees algorithms analyzed a diverse set of data including wind, flooding, landslide, and vulnerability measures. These data were used to train models to predict the structural damage caused by Hurricane María in Puerto Rico, and the importance of each predictive feature was calculated. Results indicate that vulnerability measures are the leading predictors of damage in this case study, followed by wind, flood, and landslide measures. Each predictive variable exhibits unique, often nonlinear, relationships with damage. These results demonstrate that societal-driven vulnerabilities play critical roles in damage pattern analysis and that targeted, pre-disaster mitigation efforts should be enacted to reinforce household resiliency in socioeconomically vulnerable areas. Recovery programs may need to be reworked to focus on the highly impacted vulnerable populations to avoid the persistence, or potential enhancement, of preexisting social inequalities in the wake of a disaster.
/ Master of Science / Disasters are not entirely natural phenomena. Rather, they occur when natural hazards interact with the man-made environment and negatively impact society. Most risk and impact assessment studies focus on natural hazards (processes beyond human control) and do not incorporate the role of societal circumstances (within human agency). However, it has been shown that certain socioeconomic, demographic, and structural characteristics increase the severity of disaster impacts. These characteristics define the susceptibility of a community to negative disaster impacts, known as vulnerability. This study quantifies the role of vulnerability via a case study of Hurricane María. A variety of statistical models, collectively known as machine learning, analyzed flood, wind, and landslide hazards along with the aforementioned vulnerabilities. These variables were correlated with a damage assessment database, and the models calculated the strength of each variable's relationship with damage. Results indicate that vulnerability measures exhibit the strongest predictive correlations with the damage caused by Hurricane María, followed by wind, flood, and landslide measures, respectively, suggesting that efforts to improve societal equality and improvements to infrastructure in vulnerable areas can mitigate the impacts of future hazardous events. In addition, societal information is critical to include in future risk and impact assessment efforts in order to prioritize areas of greatest need and allocate resources to those who would benefit from them most.
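The feature-importance analysis described above — fitting a tree ensemble on hazard and vulnerability variables, then ranking each variable's contribution to predicted damage — can be sketched on synthetic data. The variable names and the generating equation below are invented to make vulnerability dominate, mirroring the study's finding, and are not the study's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 400
# hypothetical per-household drivers; the synthetic damage signal is built so
# the vulnerability index dominates, as the study found
vuln = rng.uniform(0, 1, n)
wind = rng.uniform(0, 1, n)
flood = rng.uniform(0, 1, n)
slide = rng.uniform(0, 1, n)
damage = 3.0 * vuln + 1.0 * wind + 0.3 * flood + 0.1 * slide + 0.1 * rng.normal(size=n)

X = np.column_stack([vuln, wind, flood, slide])
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, damage)
# impurity-based importances sum to 1; argsort descending ranks the drivers
ranking = np.argsort(model.feature_importances_)[::-1]
```

Because tree ensembles capture nonlinear relationships, this workflow matches the abstract's observation that each driver relates to damage in a unique, often nonlinear way.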
690

Data Imputation For Loss Reserving

Zhai, Yilong January 2024 (has links)
This master's thesis delves into machine learning predictive modelling to predict missing values in loss reserving, focusing on predicting missing values for individual features (age, accident year, etc.) and annual insurance payments. Leveraging machine learning techniques such as random forests and decision trees, we explore their performance for missing-value prediction compared to traditional regression models. Moreover, the study transforms individual payments into run-off triangles and uses the imputed and complete datasets to compare the performance of different data-imputation models via the loss reserve estimates from the Mack and GLM reserving models. By evaluating the performance of these diverse techniques, this research aims to contribute valuable insights to the evolving landscape of predictive analytics in insurance, guiding industry practices toward more accurate and efficient modelling approaches. / Thesis / Master of Science (MSc)
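The run-off triangle and Mack reserving model mentioned above rest on chain-ladder development factors, which can be sketched in a few lines. The triangle below is a toy cumulative-payments example, not data from the thesis:

```python
# cumulative run-off triangle: rows = accident years, columns = development years
# (None marks the not-yet-observed lower-right cells)
triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 160.0, None],
    [120.0, None, None],
]

def chain_ladder_factors(tri):
    """Volume-weighted development factors, the chain-ladder (Mack) estimators:
    for each development year j, sum of column j+1 over sum of column j,
    restricted to rows where column j+1 is observed."""
    n = len(tri)
    factors = []
    for j in range(n - 1):
        num = sum(row[j + 1] for row in tri if row[j + 1] is not None)
        den = sum(row[j] for row in tri if row[j + 1] is not None)
        factors.append(num / den)
    return factors

f = chain_ladder_factors(triangle)
```

Multiplying each row's latest observed value by the remaining factors fills in the lower-right cells, and the difference between those projected ultimates and the current diagonal is the loss reserve; imputation quality in the upper triangle feeds directly into these estimates.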
