21. Pytracks: A Tool for Visualizing Fish Movement Tracks on Different Scales
Fossum, Ross, 08 March 2016 (has links)
A fundamental problem in conservation biology and fisheries management is making educated decisions based on the data collected. Fish populations and their spatial distributions need to be represented accurately to support conservation efforts and management decisions. Methods such as modeling, surveying, and tracking can all be used to collect data on a particular fishery. To include movement patterns in conservation and management, one needs to work with and process fish tracking data or data exported from fish movement simulation models, which can be difficult to process. Interest in this topic is growing because the technology to accurately track and log fish did not exist until recently. With so much data being generated, real or simulated, and few existing tools to handle it, new tools need to be developed to process it efficiently. Pytracks attempts to fill this gap and help programmers who work with simulated and observed data by allowing them to visualize and analyze their data more efficiently. Pytracks, as presented in this thesis, is a tool written in Python that wraps raw data files from field observations or simulation models with an easy-to-use API. This allows programmers to spend less time on trivial raw file processing and more time on data visualization and computation. The code to visualize sample data can also be much shorter and easier to interpret. In this thesis, pytracks was used to help solve a problem related to interpreting different movement algorithms. The work focuses on fish movement models, but it is also relevant to other animal types if the data are compatible. Many examples are included in this thesis to demonstrate the effectiveness of pytracks, and additional online documentation shows how to further utilize it.
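To make the kind of wrapper pytracks provides concrete, the following is a minimal, self-contained Python sketch, not the actual pytracks API: the CSV column names, the Track class, and load_tracks are illustrative assumptions.

    import csv
    import math
    from dataclasses import dataclass

    @dataclass
    class Track:
        """A single fish track: an ordered list of (time, x, y) fixes."""
        fish_id: str
        points: list

        def total_distance(self):
            # Sum straight-line distances between consecutive fixes.
            return sum(
                math.hypot(x2 - x1, y2 - y1)
                for (_, x1, y1), (_, x2, y2) in zip(self.points, self.points[1:])
            )

    def load_tracks(path):
        """Group rows of a CSV with columns fish_id,time,x,y into Track objects."""
        tracks = {}
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                tracks.setdefault(row["fish_id"], []).append(
                    (float(row["time"]), float(row["x"]), float(row["y"]))
                )
        return [Track(fid, sorted(pts)) for fid, pts in tracks.items()]

    # Example: summarize movement per fish without touching the raw file again.
    for track in load_tracks("tracks.csv"):
        print(track.fish_id, round(track.total_distance(), 1))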
22. Learning from Access Log to Mitigate Insider Threats
Zhang, Wen, 17 March 2016 (has links)
As the quantity of data collected, stored, and processed in information systems has grown, so too have insider threats. This type of threat is realized when authorized individuals misuse their privileges to violate privacy or security policies. Over the past several decades, various technologies have been introduced to mitigate the insider threat, which can be roughly partitioned into two categories: 1) prospective and 2) retrospective. Prospective technologies are designed to specify and manage a user's rights, such that misuse can be detected and prevented before it transpires. Conversely, retrospective technologies permit users to invoke privileges as needed, but investigate the legitimacy of such actions after the fact.
Despite the existence of such strategies, administrators need to answer several critical questions to put them into practice. First, given a specific circumstance, which type of strategy (i.e., prospective vs. retrospective) should be adopted? Second, given the type of strategy, what is the best approach to support it in an operational manner? Existing approaches to these questions neglect the fact that the data captured by information systems can inform the decision making. As such, the overarching goal of this dissertation is to investigate how best to answer these questions using data-driven approaches.
This dissertation makes three technical contributions. The first contribution is a novel approach to quantifying the tradeoffs between prospective and retrospective strategies, under which each strategy is translated into a classification model and the misclassification costs of the models are compared to facilitate decision support. This dissertation then introduces several data-driven approaches to realize the strategies. The second contribution is for prospective strategies, with a specific focus on role-based access control (RBAC). This dissertation introduces an approach to evolve an existing RBAC system based on evidence in an access log, which relies on a strategy for promoting roles from candidates. The third contribution is for retrospective strategies, whereby this dissertation introduces an auditing framework that can leverage workflow information to facilitate misuse detection. These methods are empirically validated on three months of access logs (a million accesses) derived from a real-world information system.
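As an illustration of the tradeoff quantification in the first contribution, the following toy Python sketch treats each strategy as a binary classifier over access requests and compares expected misclassification costs; the error rates, misuse prevalence, and cost weights are made-up placeholders, not values from the dissertation.

    def expected_cost(false_positive_rate, false_negative_rate,
                      prevalence, cost_fp, cost_fn):
        """Expected per-access cost of a strategy modeled as a binary classifier.
        False positives = legitimate accesses blocked or flagged; false negatives =
        misuse that slips through."""
        return ((1 - prevalence) * false_positive_rate * cost_fp
                + prevalence * false_negative_rate * cost_fn)

    # Hypothetical error rates and cost weights for the two strategy types.
    prospective = expected_cost(0.05, 0.20, prevalence=0.01, cost_fp=1.0, cost_fn=50.0)
    retrospective = expected_cost(0.01, 0.35, prevalence=0.01, cost_fp=1.0, cost_fn=50.0)

    print("deploy prospective" if prospective < retrospective else "deploy retrospective")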
23. Visual Representations for Fine-grained Categorization
Zhang, Ning, 08 April 2016 (has links)
In contrast to basic-level object recognition, fine-grained categorization aims to distinguish between subordinate categories, such as different animal breeds or species, plant species, or man-made product models. The problem can be extremely challenging due to the subtle differences in the appearance of certain parts across related categories, and it often requires distinctions that must be conditioned on the object pose for reliable identification. Discriminative markings are often highly localized, leading traditional object recognition approaches to struggle with the large pose variations often present in these domains. Face recognition is the classic case of fine-grained recognition, and it is noteworthy that the best face recognition methods jointly discover facial landmarks and extract features from those locations. We propose pose-normalized representations, which align training exemplars, either piecewise by part or globally for the whole object, effectively factoring out differences in pose and in camera viewing angle.

I first present methods that use the idea of pose normalization for two related applications: human attribute classification and person recognition beyond the frontal face. Following the recent success of deep learning, we use deep convolutional features as the feature representations. Next, I introduce the part-based RCNN method as an extension of the state-of-the-art RCNN object detection method for fine-grained categorization. The model learns both whole-object and part detectors, and enforces learned geometric constraints between them. I also show the results of using recent compact bilinear features to generate the pose-normalized representations. However, bottom-up region proposals are limited by hand-engineered features, and in the final work I present a fully convolutional deep network, trained end-to-end for part localization and fine-grained classification.
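The joint scoring idea behind the part-based detection approach can be sketched as follows; this toy Python snippet is not the author's code, the candidate boxes and scores are fabricated, and the geometric term is a crude stand-in for the learned constraints. Whole-object and part candidates are scored together, and features from the winning boxes would be concatenated into the pose-normalized representation.

    import itertools

    def box_center(box):
        x0, y0, x1, y1 = box
        return ((x0 + x1) / 2, (y0 + y1) / 2)

    def geometric_prior(whole, part):
        """Toy constraint: a part's center should lie inside the whole-object box."""
        cx, cy = box_center(part)
        x0, y0, x1, y1 = whole
        return 0.0 if (x0 <= cx <= x1 and y0 <= cy <= y1) else -10.0

    # Hypothetical detector outputs: candidate boxes with scores.
    whole_candidates = [((10, 10, 200, 150), 2.1), ((50, 40, 220, 180), 1.4)]
    head_candidates = [((30, 20, 80, 60), 1.7), ((150, 120, 190, 145), 0.9)]
    body_candidates = [((60, 50, 190, 140), 1.2), ((5, 5, 40, 30), 0.4)]

    # Pick the configuration maximizing detector scores plus geometric consistency.
    best = max(
        itertools.product(whole_candidates, head_candidates, body_candidates),
        key=lambda cfg: sum(score for _, score in cfg)
            + sum(geometric_prior(cfg[0][0], part_box) for part_box, _ in cfg[1:]),
    )
    whole_box, head_box, body_box = (box for box, _ in best)
    # Features extracted from these three boxes would be concatenated into the
    # pose-normalized representation fed to the final classifier.
    print(whole_box, head_box, body_box)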
24. Deep Learning for Brain Tumor Classification
Paul, Justin Stuart, 13 April 2016 (has links)
Deep learning has been used successfully in supervised classification tasks in order to learn complex patterns. The purpose of this study is to apply this machine learning technique to classifying images of brains with different types of tumors: meningioma, glioma, and pituitary. The image dataset contains 233 patients with a total of 3064 brain images with either meningioma, glioma, or pituitary tumors. The images are T1-weighted contrast-enhanced MRI (CE-MRI) images in the axial (transverse), coronal (frontal), or sagittal (lateral) plane. This research focuses on the axial images, and expands upon this dataset with the addition of axial images of brains without tumors in order to increase the number of images provided to the neural network. Training neural networks on this data has proven accurate, achieving an average five-fold cross-validation accuracy of 91.43%.
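The evaluation protocol can be sketched as below in Python; a small scikit-learn MLP on random placeholder data stands in for the CNN and CE-MRI images used in the study, so only the five-fold cross-validation structure is meaningful.

    import numpy as np
    from sklearn.model_selection import StratifiedKFold
    from sklearn.neural_network import MLPClassifier

    # Placeholder data: 300 flattened "images" with 4 classes
    # (no tumor, meningioma, glioma, pituitary). Real input would be axial CE-MRI slices.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 64 * 64))
    y = rng.integers(0, 4, size=300)

    accuracies = []
    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for train_idx, test_idx in folds.split(X, y):
        model = MLPClassifier(hidden_layer_sizes=(128,), max_iter=200)  # stand-in for the CNN
        model.fit(X[train_idx], y[train_idx])
        accuracies.append(model.score(X[test_idx], y[test_idx]))

    print("mean five-fold accuracy:", np.mean(accuracies))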
25. Visual analytics techniques for exploration of spatiotemporal data
Ferreira, Nivan, 26 March 2016 (has links)
Spatial and temporal interactions are central and fundamental to nearly all activities in our world and society. Every day, people and goods travel around the world at different speeds and scales; migratory animals engage in long-distance travels that demonstrate the biological integration around the globe; weather phenomena, like typhoons and hurricanes, form and move around the Earth and may have a large socio-economic impact. In all these examples, proper understanding of the underlying phenomena can produce insights with the potential to shape future development in those domains.

The rapid development of acquisition technology and the popularization of GPS-enabled mobile devices have resulted in spatiotemporal data being produced at massive rates. These data create opportunities for data-driven analysis that can strongly influence decision making in a diverse set of domains. In order to take advantage of all these data and realize their potential, it is crucial to be able to extract knowledge from them. Interactive visualization systems are acknowledged to be important tools in this scenario: they leverage the human cognitive system and the power of interactive graphical tools to enable quick hypothesis testing and exploration. However, the volume and inherent complexity of spatiotemporal data make designing such systems a difficult problem. In fact, such complex data collections pose challenges both in managing the data for interactive exploration and in designing visual metaphors that enable effective data exploration. Such visual metaphors are also limited by constraints imposed by the display and data dimensions, often resulting in extremely cluttered visualizations that are hard to interpret. While filtering and aggregation strategies are often applied to eliminate clutter, they may hide interesting patterns. Therefore, purely visual/interaction methods need to be complemented with techniques that help in the process of pattern discovery. This dissertation presents novel visual analytics contributions for the analysis of spatiotemporal data to address these challenges. Visual analytics combines interactive visualization with efficient pattern mining techniques to enable analysts to explore large amounts of complex data. The first contribution is the design of the TaxiVis visual exploration system. This system couples a novel visual query model with an efficient custom-built data layer. These two components enable easy query composition via visual methods as well as interactive query response times. TaxiVis also makes use of coordinated views and rendering strategies to generate informative visual summaries for query results even when those are large.

The remaining contributions in this thesis consist of two pattern mining techniques that help in navigating the data via pattern discovery. These two techniques aim to enhance the analytical power of tools such as TaxiVis. Furthermore, they have in common the use of concepts and techniques widely applied in scientific visualization and computer graphics. This approach allows us to take novel perspectives on the problem of finding patterns in spatiotemporal data that, to the best of our knowledge, have not been considered in the machine learning and data mining fields.

The first technique is a topology-based technique whose main objective is to help users find the "needle in the haystack", i.e., to guide users towards interesting slices (spatiotemporal regions) of the data. We call this process event-guided exploration. The overall idea behind this technique is to treat topological features of time-varying scalar functions derived from spatiotemporal data as events. Via visual exploration of the collection of extreme points extracted over time, important events in the data can be found with a relatively small amount of work by the user. The second pattern mining technique is a novel model-based clustering technique designed for trajectory datasets. This technique, called Vector Field K-Means, models trajectories as streamlines of vector fields. One important feature of this modeling strategy is that it discourages overlapping trajectories from having discrepant directions at their intersections. Clustering is achieved by using the spatial component of trajectories to fit a collection of vector fields to the given trajectories. This technique achieves richness and expressivity of features, simplicity of implementation and analysis, and computational efficiency. Furthermore, the obtained vector fields serve as a visual summary of the movement patterns in each cluster. Finally, Vector Field K-Means can be naturally generalized to also consider trajectories with attributes. This is achieved by using a different modeling strategy based on scalar fields, which we call Attribute Field K-Means.
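A drastically simplified sketch of the alternating fit/assign loop behind Vector Field K-Means is given below; the real method fits smooth vector fields by solving a regularized least-squares problem, whereas this toy Python version just averages segment directions per grid cell, and the trajectories are synthetic.

    import numpy as np

    GRID = 8          # cells per axis on the unit square
    K = 2             # number of clusters / vector fields
    rng = np.random.default_rng(1)

    def cell(p):
        return tuple(np.clip((p * GRID).astype(int), 0, GRID - 1))

    def segments(traj):
        """(midpoint, unit direction) for each consecutive pair of points."""
        d = np.diff(traj, axis=0)
        mid = (traj[:-1] + traj[1:]) / 2
        return mid, d / (np.linalg.norm(d, axis=1, keepdims=True) + 1e-9)

    def fit_field(trajs):
        """Average the segment directions falling into each grid cell."""
        field = np.zeros((GRID, GRID, 2))
        count = np.zeros((GRID, GRID, 1))
        for t in trajs:
            mids, dirs = segments(t)
            for m, d in zip(mids, dirs):
                field[cell(m)] += d
                count[cell(m)] += 1
        return field / np.maximum(count, 1)

    def error(traj, field):
        mids, dirs = segments(traj)
        return np.mean([np.linalg.norm(field[cell(m)] - d) for m, d in zip(mids, dirs)])

    # Toy data: left-to-right trajectories and bottom-to-top trajectories.
    trajs = [np.column_stack([np.linspace(0.1, 0.9, 10), np.full(10, y)]) for y in (0.3, 0.6)]
    trajs += [np.column_stack([np.full(10, x), np.linspace(0.1, 0.9, 10)]) for x in (0.4, 0.7)]

    labels = rng.integers(0, K, len(trajs))
    for _ in range(5):  # alternate: refit each cluster's field, then reassign trajectories
        fields = [fit_field([t for t, l in zip(trajs, labels) if l == k]) for k in range(K)]
        labels = np.array([np.argmin([error(t, f) for f in fields]) for t in trajs])
    print(labels)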
26. Fostering Synergistic Learning of Computational Thinking and Middle School Science in Computer-based Intelligent Learning Environments
Basu, Satabdi, 08 April 2016 (has links)
Recent advances in computing are transforming our lives at an astonishing pace. Computational Thinking (CT) is a term used to describe the representational practices and behaviors involved in formulating problems and their solutions so that the solutions can be carried out by a computer or a computing agent. Driven by the needs of a 21st century workforce, there is currently a great emphasis on teaching students to think computationally from an early age. Computer science education is gradually being incorporated into K-12 curricula, but a more feasible approach to make CT accessible to all students may be to integrate it with components of existing K-12 curricula. While CT is considered a vital ingredient of science learning, successfully leveraging the synergy between the two in middle school classrooms is non-trivial. This dissertation research presents Computational Thinking using Simulation and Modeling (CTSiM), a computer-based environment that integrates learning of CT concepts and practices with middle school science curricula. CTSiM combines the use of an agent-based visual language for conceptual and computational modeling of science topics, hypertext resources for information acquisition, and simulation tools to study and analyze the behaviors of the modeled science topics. We discuss assessment metrics developed to study the computational artifacts students build and the CT practices and learning strategies they employ in the CTSiM environment. These metrics can be used online to interpret students' behavior and performance, and provide the framework for adaptively scaffolding students based on their observed deficiencies. Results from a classroom study with ninety-eight middle school students demonstrate the effectiveness of the CTSiM environment and the adaptive scaffolding framework. Students display better understanding of important science and CT concepts, improve their modeling performance over time, adopt useful modeling behaviors, and are able to transfer their modeling skills to new scenarios. In addition, students' modeling performance and use of CT practices during modeling are significantly correlated with their science learning, demonstrating the synergy between CT and science learning.
27. A Novel Recurrent Convolutional Neural Network for Ocean and Weather Forecasting
Firth, Robert James, 12 May 2016 (has links)
Numerical weather prediction is a computationally expensive task that requires not only the numerical solution to a complex set of non-linear partial differential equations, but also the creation of a parameterization scheme to estimate sub-grid scale phenomena.
The proposed method is an alternative approach to developing a mesoscale meteorological model: a modified recurrent convolutional neural network that learns to simulate the solution to these equations.
Along with an appropriate time integration scheme and learning algorithm, this method can be used to create multi-day forecasts for a large region. The learning method presented is an extended form of Backpropagation Through Time for a recurrent network with outputs that feed back through as inputs only after undergoing a fixed transformation.
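The recurrence structure described above can be sketched as follows; this toy numpy snippet is not the LM3/LOM architecture, only an illustration of rolling a learned step forward with its output fed back as the next input through a fixed transformation, which training via Backpropagation Through Time would leave unmodified.

    import numpy as np

    rng = np.random.default_rng(0)
    state_dim, steps = 6, 4

    W = rng.normal(scale=0.3, size=(state_dim, state_dim))   # learned recurrent weights (toy)
    T = np.eye(state_dim) * 0.9                              # fixed output-to-input transformation

    def step(x):
        """One forecast step of the toy recurrent model."""
        return np.tanh(W @ x)

    x = rng.normal(size=state_dim)      # initial state at one grid column
    forecast = []
    for _ in range(steps):              # roll forward: output -> fixed transform -> next input
        y = step(x)
        forecast.append(y)
        x = T @ y

    print(np.round(forecast, 3))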
An initial implementation of this approach has been created that forecasts for 2,744 locations across the southeastern United States at 36 vertical levels of the atmosphere, and 119,000 locations across the Atlantic Ocean at 39 vertical levels. These models, called LM3 and LOM, forecast wind speed, temperature, geopotential height, and rainfall for weather forecasting and water current speed, temperature, and salinity for ocean forecasting.
Experimental results show that the new approach is 3.6 times more efficient at forecasting the ocean and 16 times more efficient at forecasting the atmosphere.
The new approach showed forecast skill by beating the accuracy of two models, persistence and climatology, and was more accurate than the Navy NCOM model on 16 of the first 17 layers of the ocean below the surface (2 meters to 70 meters) for forecasting salinity and 15 of the first 17 layers for forecasting temperature. The new approach was also more accurate than the RAP model at forecasting wind speed on 7 layers, specific humidity on 7 layers, relative humidity on 6 layers, and temperature on 3 layers, with competitive results elsewhere.
28. Secure system simulation - Internet of Things
Verma, Yukti, 08 July 2016 (has links)
Internet of Things (IoT) can be defined as a collection of smart devices interacting with each other autonomously to fulfill a common goal. The real-world data collected from the Internet of Things can be made an integral part of the web, known as the Web of Things (WoT). With the help of the Web of Things architecture, users can leverage simple web mechanisms such as browsing, searching, and caching to interact with the smart devices. This thesis aims to create an entire system simulating the Web of Things architecture, including sensors, edge routers, web interfaces, endpoints to the IoT network, and access control. Several technologies such as CoAP, 6LoWPAN, IEEE 802.15.4, Contiki, and DTLS have been evaluated before inclusion in the implementation. A complete web portal utilizing the Californium framework and Role-Based Access Control has been created for accessing and interacting with the sensors and their data. This thesis provides an end-to-end approach towards IoT device security by implementing the Constrained Application Protocol (CoAP) over Datagram Transport Layer Security (DTLS) in the system. The performance of the secured system is analyzed in a constrained environment, based on which it is observed that the DTLS implementation increases the RAM usage, code size, packet overhead, and power consumption by a significant amount. Finally, the future work that needs to be considered in order to iterate towards better security is specified.
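For flavor, a minimal client-side read of a constrained device's resource might look like the Python sketch below, using the aiocoap library over plain CoAP; the thesis system itself is built on Californium (Java) with CoAP secured by DTLS, which is not reproduced here, and the node address and resource path are hypothetical.

    import asyncio
    import aiocoap

    async def read_temperature(uri):
        # Create a CoAP client context and issue a GET to the sensor resource.
        context = await aiocoap.Context.create_client_context()
        request = aiocoap.Message(code=aiocoap.GET, uri=uri)
        response = await context.request(request).response
        return response.payload.decode()

    # Hypothetical 6LoWPAN node address and resource path.
    print(asyncio.run(read_temperature("coap://[fd00::212:4b00:0:1]/sensors/temperature")))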
29. Prioritized Grammar Enumeration: A novel method for symbolic regression
Worm, Anthony, 14 July 2016 (has links)
The main thesis of this work is that computers can be programmed to derive mathematical formulas and relationships from data in an efficient, reproducible, and interpretable way. This problem is known as Symbolic Regression, the data-driven search for mathematical relations as performed by a computer. In essence, this is a search over all possible equations to find those which best model the data on hand.

We propose Prioritized Grammar Enumeration (PGE) as a deterministic machine learning algorithm for solving Symbolic Regression. PGE works with a grammar's rules and input data to prioritize the enumeration of expressions in that language. By making large reductions to the search space and introducing mechanisms for memoization, PGE can explore the space of all equations efficiently. Most notably, PGE provides reproducibility, a key aspect of any system used by scientists at large.

We then enhance the PGE algorithm in several ways. We enrich the equation types and application domains PGE can operate on. We deepen equation abstractions and relationships, add configuration to search operators, and enrich the fitness metrics. We enable PGE to scale by decoupling its subroutines into a set of services.

Our experiments cover a range of problem types from a multitude of domains, as well as a variety of architectural and parameter configurations. Our results show PGE to have great promise and efficacy in automating the discovery of equations at the scales needed by tomorrow's scientific data problems.

Additionally, reproducibility has been a significant factor in the formulation and development of PGE. All supplementary materials, code, and data can be found at github.com/verdverm/pypge.
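The prioritized-enumeration idea can be sketched in a few dozen lines of Python; this toy version is not the PGE implementation: it orders candidate expressions in a priority queue by fit-plus-complexity, expands the best one with simple grammar rules, and memoizes already-seen forms, whereas real PGE also canonicalizes algebraic forms and fits constants.

    import heapq
    import numpy as np

    # Target data: y = x**2 + x (the relationship we hope to rediscover).
    x = np.linspace(-2, 2, 50)
    y = x**2 + x

    def evaluate(expr):
        # Toy expression evaluation via eval; fine for a self-contained demo only.
        try:
            return eval(expr, {"x": x, "np": np})
        except Exception:
            return None

    def score(expr):
        pred = evaluate(expr)
        if pred is None or np.ndim(pred) == 0:
            pred = np.full_like(x, float(pred) if pred is not None else np.inf)
        err = float(np.mean((pred - y) ** 2))
        return err + 0.1 * len(expr)          # fit error plus a complexity penalty

    def expand(expr):
        """Grammar rules: combine the expression with x or 1 via +, -, *."""
        for op in ("+", "-", "*"):
            for atom in ("x", "1"):
                yield f"({expr}{op}{atom})"

    seen = {"x"}                               # memoization of already-enumerated forms
    queue = [(score("x"), "x")]
    best = (score("x"), "x")
    for _ in range(2000):                      # bounded, priority-ordered search
        s, expr = heapq.heappop(queue)
        best = min(best, (s, expr))
        for child in expand(expr):
            if child not in seen:
                seen.add(child)
                heapq.heappush(queue, (score(child), child))

    print("best expression:", best[1], "score:", round(best[0], 4))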
30. Secure learning in adversarial environments
Li, Bo, 14 July 2016 (has links)
Machine learning has become ubiquitous in the modern world, varying from enterprise applications to personal use cases and from image annotation and text recognition to speech captioning and machine translation. Its capabilities for inferring patterns from data have found great success in the domains of prediction and decision making, including in security-sensitive applications such as intrusion detection, virus detection, biometric identity recognition, and spam filtering. However, the strengths of traditional machine learning systems rest on the assumption of distributional stationarity, and this assumption can become a vulnerability when there are adversarial manipulations during the training process (poisoning attacks) or the testing process (evasion attacks).
Considering the fact that the traditional learning strategies are potentially vulnerable to security faults, there is a need for machine learning techniques that are secure against sophisticated adversaries in order to fill the gap between the distributional stationarity assumption and deliberate adversarial manipulations. These techniques will be referred to as secure learning throughout this thesis.
To conduct systematic research on this secure learning problem, my study is based on three components. First, I model different kinds of attacks against learning systems by evaluating the adversaries' capabilities, goals, and cost models. Second, I theoretically study secure learning algorithms that counter targeted malicious attacks by considering the specific goals of the learners and their resource and capability limitations. Concretely, I model the interactions between the defender (learning system) and attackers as different forms of games. Based on the game-theoretic analysis, I evaluate the utilities and constraints for both participants, and optimize the secure learning system with respect to adversarial responses. Third, I design and implement practical algorithms to efficiently defend against multi-adversarial attack strategies.
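As a concrete miniature of the learner-versus-attacker interaction, the Python sketch below shows the classic minimum-norm evasion move against a linear "spam filter"; the weights and feature vector are fabricated, and the dissertation's game-theoretic models are far richer than this single closed-form attack.

    import numpy as np

    # A toy linear spam filter: score = w.x + b, flagged as spam if score > 0.
    w = np.array([1.5, -0.5, 2.0])
    b = -1.0
    spam = np.array([2.0, 0.5, 1.0])           # feature vector of a spam message

    def evade(x, w, b, margin=0.01):
        """Minimum-norm perturbation that pushes x just below the decision boundary
        (the closed-form evasion move against a linear classifier)."""
        score = w @ x + b
        if score <= 0:
            return x                           # already classified as benign
        delta = -(score + margin) * w / (w @ w)
        return x + delta

    x_adv = evade(spam, w, b)
    print("original score:", w @ spam + b)     # > 0 -> flagged
    print("evaded score:  ", w @ x_adv + b)    # just below 0 -> slips through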
My thesis focuses on examining and answering theoretical questions about the limits of classifier evasion (evasion attacks), adversarial contamination (poisoning attacks), and privacy preservation in adversarial environments, as well as on how to design practical, resilient learning algorithms for a wide range of applications, including spam filters, malware detection, network intrusion detection, and recommendation systems. In my study, I tailor my approaches to building scalable machine learning systems, as demanded by modern big data applications.