51 |
Novel Learning-Based Task Schedulers for Domain-Specific SoCs. January 2020 (has links)
abstract: This Master's thesis includes the design, on-chip integration, and evaluation of a set of imitation learning (IL)-based scheduling policies: deep neural network (DNN) and decision tree (DT). We first developed IL-based scheduling policies for heterogeneous systems-on-chips (SoCs). Then, we tested these policies using a system-level domain-specific system-on-chip simulation framework [11]. Finally, we transformed them into efficient code using a cloud engine [1] and implemented them on a user-space emulation framework [61] on a Unix-based SoC. IL is an area of machine learning (ML) and a useful method for training artificial intelligence (AI) models by imitating the decisions of an expert or Oracle that knows the optimal solution. The primary focus of this thesis is to adapt an ML model to work on-chip and to optimize resource allocation for a set of domain-specific wireless and radar applications. Evaluation results with four streaming applications from the wireless communications and radar domains show that the proposed IL-based scheduler approximates an offline Oracle expert with more than 97% accuracy and 1.20× faster execution time. The models have been implemented as an add-on, making them easy to port to other SoCs. / Dissertation/Thesis / Masters Thesis Computer Engineering 2020
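As a hedged sketch of the imitation-learning setup described above (not the thesis's actual code), the snippet below trains a decision tree to mimic an Oracle scheduler's choice of processing element; the feature layout and the stand-in Oracle policy are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical IL setup: each row describes a ready task plus SoC state
# (e.g., task type, execution-time estimates, processing-element loads),
# and the label is the processing element the Oracle would pick.
rng = np.random.default_rng(0)
X = rng.random((5000, 6))
oracle_pe = (X[:, 1] < X[:, 2]).astype(int)   # stand-in for the Oracle policy

clf = DecisionTreeClassifier(max_depth=8).fit(X, oracle_pe)
print(clf.score(X, oracle_pe))                # imitation accuracy vs. the Oracle
```

A shallow tree like this is attractive on-chip because each scheduling decision costs only a handful of comparisons rather than a full DNN inference.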
|
52 |
Detekce objektů pro kamerový dohled pomocí SSD přístupu / Object detection for video surveillance using the SSD approach. Dobranský, Marek, January 2019 (has links)
Surveillance cameras serve various purposes, ranging from security to traffic monitoring and marketing. However, with the increasing number of deployed cameras, manual video monitoring has become too laborious. In recent years, a lot of development in artificial intelligence has been focused on processing video data automatically and then outputting the desired notifications and statistics. This thesis studies the state-of-the-art deep learning models for object detection in surveillance video and takes an in-depth look at the SSD architecture. We aim to enhance the performance of SSD by updating its underlying feature extraction network. We propose to replace the initially used VGG model with a selection of modern ResNet, Xception and NASNet classification networks. The experiments show that the ResNet50 model offers the best trade-off between speed and precision, while significantly outperforming VGG. With a series of modifications, we improved the Xception model to match the ResNet performance. On top of the architecture-based improvements, we analyze the relationship between SSD and the number of detected classes and their selection. We also designed and implemented a new detector that uses the temporal context provided by video frames. This detector delivers enhanced precision while...
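As a rough illustration of the backbone swap discussed above (a sketch, not the thesis implementation), the snippet below truncates a torchvision ResNet50 so its layer3 feature map can serve as the attachment point for SSD detection heads; the cut point and input size are assumptions.

```python
import torch
import torch.nn as nn
import torchvision

class ResNet50Backbone(nn.Module):
    """ResNet50 truncated after layer3; SSD detection heads would attach here."""
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet50(weights=None)
        # Drop layer4, avgpool and fc; keep the convolutional trunk.
        self.features = nn.Sequential(*list(resnet.children())[:-3])
        self.out_channels = 1024  # channel count of layer3's output

    def forward(self, x):
        return self.features(x)

backbone = ResNet50Backbone()
feats = backbone(torch.randn(1, 3, 300, 300))  # SSD300-sized input
print(feats.shape)                             # torch.Size([1, 1024, 19, 19])
```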
|
53 |
Efektivní implementace hlubokých neuronových sítí / Efficient implementation of deep neural networks. Kopál, Jakub, January 2020 (has links)
In recent years, algorithms in the area of object detection have been constantly improving. The success of these algorithms has reached a level where much of the development is focused on increasing speed at the expense of accuracy. As a result of recent improvements in deep learning and new hardware architectures optimized for deep learning models, it is possible to detect objects in an image several hundred times per second using only embedded and mobile devices. The main objective of this thesis is to study and summarize the most important methods in the area of efficient object detection and apply them to a given real-world problem. By using state-of-the-art methods, we developed a tracking-by-detection algorithm, based on our own object detection models, that tracks transport vehicles in real-time using embedded and mobile devices.
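A minimal sketch of the tracking-by-detection idea follows (greedy IoU matching between per-frame detections and existing tracks); the matching threshold and box format are illustrative, and real trackers add motion models and track termination.

```python
import numpy as np

def iou(a, b):  # boxes as (x1, y1, x2, y2)
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def update_tracks(tracks, detections, thresh=0.3):
    # Greedily match each detection to the best-overlapping track;
    # unmatched detections open new tracks.
    next_id = max(tracks, default=-1) + 1
    for det in detections:
        best = max(tracks, key=lambda t: iou(tracks[t], det), default=None)
        if best is not None and iou(tracks[best], det) >= thresh:
            tracks[best] = det          # extend the matched track
        else:
            tracks[next_id] = det       # open a new track
            next_id += 1
    return tracks

tracks = update_tracks({}, [(0, 0, 10, 10)])
tracks = update_tracks(tracks, [(1, 1, 11, 11), (50, 50, 60, 60)])
print(tracks)  # track 0 extended, track 1 created
```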
|
54 |
On the Use of Model-Agnostic Interpretation Methods as Defense Against Adversarial Input Attacks on Tabular Data. Kanerva, Anton; Helgesson, Fredrik, January 2020 (has links)
Context. Machine learning is a constantly developing subfield of artificial intelligence. The number of domains in which we deploy machine learning models is constantly growing, and the systems using these models spread almost unnoticeably into our daily lives through different devices. In previous years, much time and effort has been put into increasing the performance of these models, overshadowing the significant risks of attacks targeting the very core of the systems: the trained machine learning models themselves. A specific attack aimed at fooling the decision-making of a model, called the adversarial input attack, has almost exclusively been researched for models processing image data. However, the threat of adversarial input attacks stretches beyond systems using image data to, e.g., the tabular domain, which is the most common data domain used in industry. Methods used for interpreting complex machine learning models can help humans understand the behavior and predictions of these complex machine learning systems. Understanding the behavior of a model is an important component in detecting, understanding, and mitigating its vulnerabilities. Objectives. This study aims to reduce the research gap around adversarial input attacks and defenses targeting machine learning models in the tabular data domain. The goal of this study is to analyze how model-agnostic interpretation methods can be used to mitigate and detect adversarial input attacks on tabular data. Methods. The goal is reached by conducting three consecutive experiments in which model interpretation methods are analyzed and adversarial input attacks are evaluated as well as visualized in terms of perceptibility. Additionally, a novel method for adversarial input attack detection based on model interpretation is proposed, together with a novel way of defensively using feature selection to reduce the attack vector size. Results. The adversarial input attack detection showed state-of-the-art results with an accuracy over 86%. The proposed feature selection-based mitigation technique successfully hardened the model against adversarial input attacks, reducing their scores by 33% without decreasing the performance of the model. Conclusions. This study contributes useful methods for adversarial input attack detection and mitigation, as well as methods for evaluating and visualizing the imperceptibility of attacks on tabular data.
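As a hedged sketch of the feature-selection defense (not the thesis code), the snippet below ranks features by permutation importance and retrains on only the top ones, shrinking the attack vector an adversary can perturb; the data, model, and number of retained features are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Illustrative tabular data; in practice this would be the real dataset.
rng = np.random.default_rng(0)
X, y = rng.random((2000, 20)), rng.integers(0, 2, 2000)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=5, random_state=0)
keep = np.argsort(imp.importances_mean)[-8:]      # retain top-8 features only
# Retraining on the reduced feature set leaves fewer inputs to perturb.
hardened = RandomForestClassifier(random_state=0).fit(X[:, keep], y)
```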
|
55 |
Identifying signatures in scanned paper documents: A proof-of-concept at Bolagsverket. Norén, Björn, January 2022 (has links)
Bolagsverket, a Swedish government agency, receives cases in paper form via mail, as documents via e-mail, and as digital forms. These cases may concern registering people in a company, changing the share capital, etc. However, handling and confirming all these papers can be time-consuming, and it would be beneficial for Bolagsverket if this process could be automated with as little human input as possible. This thesis investigates whether it is possible to identify if a paper contains a signature by using artificial intelligence (AI) and convolutional neural networks (CNN), and also whether it is possible to determine how many signatures a given paper has. If these problems prove to be solvable, it could be of great benefit to Bolagsverket. In this work, a residual neural network (ResNet) was implemented and trained on sample data provided by Bolagsverket. The results demonstrate that it is possible to determine whether a paper has a signature with 99% accuracy, tested on 1000 images with a model trained on 8787 images. A second ResNet architecture was implemented to identify the number of signatures, and the results show that this was possible with an accuracy of 94.6%.
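A minimal sketch of the kind of ResNet-based binary classifier described above, fine-tuned to decide signature vs. no signature; the backbone depth, input size, and training loop are illustrative assumptions, not Bolagsverket's setup.

```python
import torch
import torch.nn as nn
import torchvision

# Pretrained backbone with a 2-class head: signature vs. no signature.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

batch = torch.randn(8, 3, 224, 224)   # stand-in for scanned document pages
labels = torch.randint(0, 2, (8,))    # stand-in for signature annotations
loss = criterion(model(batch), labels)
loss.backward()
optimizer.step()
```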
|
56 |
Towards provably safe and robust learning-enabled systems. Fan, Jiameng, 26 August 2022 (has links)
Machine learning (ML) has demonstrated great success in numerous complicated tasks. Fueled by these advances, many real-world systems like autonomous vehicles and aircraft are adopting ML techniques by adding learning-enabled components. Unfortunately, ML models widely used today, like neural networks, lack the necessary mathematical framework to provide formal guarantees on safety, causing growing concerns over these learning-enabled systems in safety-critical settings. In this dissertation, we tackle this problem by combining formal methods and machine learning to bring provable safety and robustness to learning-enabled systems.
We first study the robustness verification problem of neural networks on classification tasks. We focus on providing provable safety guarantees on the absence of failures under arbitrarily strong adversaries. We propose an efficient neural network verifier, LayR, to compute a guaranteed, overapproximated range for the output of a neural network given an input set that contains all possible adversarially perturbed inputs. LayR relaxes nonlinear units in neural networks using linear bounds and refines these relaxations with mixed integer linear programming (MILP) to iteratively improve the approximation precision, achieving tighter output range estimations than prior neural network verifiers. However, such a verifier analyzes an already-trained neural network; it does not by itself produce provably safe ones. To tackle this problem, we study verifiable training, which incorporates verification into the training procedure to train provably safe neural networks and scales to larger models and datasets. We propose a novel framework, AdvIBP, for combining adversarial training and provable robustness verification. We show that the proposed framework can learn provably robust neural networks at a sublinear convergence rate.
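For intuition, here is a much coarser relative of the linear-bound relaxation that LayR refines with MILP: plain interval bound propagation through an affine layer followed by ReLU, the simplest way to overapproximate a network's output range over a perturbation box.

```python
import numpy as np

# Interval bounds for W @ x + b over the box l <= x <= u, obtained by
# splitting W into its positive and negative parts.
def affine_bounds(W, b, l, u):
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return Wp @ l + Wn @ u + b, Wp @ u + Wn @ l + b

W = np.array([[1.0, -2.0], [0.5, 3.0]])
b = np.zeros(2)
l, u = np.array([-0.1, -0.1]), np.array([0.1, 0.1])  # eps-ball around 0

lo, hi = affine_bounds(W, b, l, u)
lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)    # ReLU is exact on boxes
print(lo, hi)
```

Linear relaxations plus MILP refinement, as in the text, tighten exactly these kinds of boxes at the cost of more computation.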
In the second part of the dissertation, we study the verification of system-level properties in neural-network controlled systems (NNCS). We focus on proving bounded-time safety properties by computing reachable sets. We first introduce two efficient NNCS verifiers, ReachNN* and POLAR, that construct polynomial-based overapproximations of neural-network controllers. We transform NNCSs into tractable closed-loop systems with approximated polynomial controllers, so that reachable sets can be computed using existing reachability analysis tools for dynamical systems. The combination of polynomial overapproximations and reachability analysis tools opens promising directions for NNCS verification. We also include a survey and experimental study of existing NNCS verification methods, including combinations of state-of-the-art neural network verifiers with reachability analysis tools, to discuss which overapproximations are suitable for NNCS reachability analysis. While these verifiers enable proving safety properties of NNCS, the nonlinearity of neural-network controllers is the main bottleneck that limits their efficiency and scalability. We propose a novel knowledge distillation framework to control "the degree of nonlinearity" of NN controllers to ease NNCS verification, which improves the provable safety of NNCSs, especially when they are safe but cannot be verified due to their complexity. For the verification community, this opens up the possibility of reducing verification complexity by influencing how a system is trained.
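A toy sketch of the polynomial-overapproximation step (illustrative only; tools like POLAR compute rigorous remainder intervals rather than the sampled bound used here): fit a low-degree polynomial to a one-dimensional controller over a state box and record the worst observed error.

```python
import numpy as np

def controller(x):
    # Stand-in for a trained NN controller on the state box [-1, 1].
    return np.tanh(2.0 * x) - 0.3 * x

xs = np.linspace(-1.0, 1.0, 400)
coeffs = np.polyfit(xs, controller(xs), deg=3)   # cubic approximation
poly = np.poly1d(coeffs)
err = np.max(np.abs(controller(xs) - poly(xs)))  # sampled remainder bound

# The closed loop is then analyzed with poly(x) +/- err in place of the NN.
print(coeffs, err)
```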
Though NNCS verification can prove safety when system models are known, modern deep learning, e.g., deep reinforcement learning (DRL), often targets tasks with unknown system models, also known as the model-free setting. To tackle this issue, we first focus on safe exploration of DRL and propose a novel Lyapunov-inspired method. Our method uses Gaussian Process models to provide probabilistic guarantees on the policies, and guide the exploration of the unknown environment in a safe fashion. Then, we study learning robust visual control policies in DRL to enhance the robustness against visual changes that were not seen during training. We propose a method DRIBO, which can learn robust state representations for RL via a novel contrastive version of the Multi-View Information Bottleneck (MIB). This approach enables us to train high-performance visual policies that are robust to visual distractions, and can generalize well to unseen environments.
|
57 |
[en] SEISMIC IMAGE SUPER RESOLUTION / [pt] SUPER RESOLUÇÃO DE IMAGENS SÍSMICAS. PEDRO FERREIRA ALVES PINTO, 06 December 2022 (has links)
[en] Super resolution (SR) is a topic of notable importance in a variety of domains, such as the medical, monitoring, and security areas. The use of deep neural networks to solve this task is extremely recent in the seismic field, with few references, the first of which were published less than two years ago. The literature, however, presents a wide range of methods that use neural networks for the super resolution of natural images. With this in mind, the objective of this work is to explore such approaches applied to synthetic seismic data from reservoirs. To this end, models of chronological importance in the literature were used and compared with a classic interpolation method and with models from the seismic image super-resolution literature. These models are: SRCNN, RDN, the Deep Image Prior approach, and SAN. The results show that the PSNR obtained by architectures developed for the seismic domain is 38.23, while the best of the proposed architectures reaches 38.62, demonstrating the progress that such models bring to the seismic domain.
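For reference, a minimal SRCNN-style model (one of the architectures compared above) together with the PSNR metric the results are reported in; layer sizes follow the original SRCNN paper and are not tuned for seismic data.

```python
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    """Three-layer SRCNN: maps a bicubic-upscaled low-res image to high-res."""
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, 9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(),
            nn.Conv2d(32, channels, 5, padding=2),
        )

    def forward(self, x):
        return self.body(x)

def psnr(pred, target, max_val=1.0):
    # PSNR = 10 * log10(MAX^2 / MSE), the metric quoted in the abstract.
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)

x = torch.rand(1, 1, 64, 64)  # stand-in for a seismic section
print(psnr(SRCNN()(x), x))
```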
|
58 |
A Deep Learning-based Dynamic Demand Response Framework. Haque, Ashraful, 02 September 2021 (has links)
The electric power grid is evolving in terms of generation, transmission and distribution network architecture. On the generation side, distributed energy resources (DER) are participating at a much larger scale. Transmission and distribution networks are transforming to a decentralized architecture from a centralized one. Residential and commercial buildings are now considered as active elements of the electric grid which can participate in grid operation through applications such as the Demand Response (DR). DR is an application through which electric power consumption during the peak demand periods can be curtailed. DR applications ensure an economic and stable operation of the electric grid by eliminating grid stress conditions. In addition to that, DR can be utilized as a mechanism to increase the participation of green electricity in an electric grid.
DR applications, in general, are passive in nature. During peak demand periods, common practice is to shut down the operation of pre-selected electrical equipment, such as heating, ventilation and air conditioning (HVAC) and lights, to reduce power consumption. This approach, however, is not optimal and does not take any user preference into consideration. Furthermore, it does not provide any advance information about demand flexibility. Under the broad concept of grid modernization, the focus is now on applying data analytics to grid operation to ensure an economic, stable and resilient electric grid. The work presented here utilizes data analytics to transform the DR application from a static, look-up-based reactive function into a dynamic, context-aware proactive solution.
The dynamic demand response framework presented in this dissertation performs three major functionalities: electrical load forecast, electrical load disaggregation and peak load reduction during DR periods. The building-level electrical load forecasting quantifies required peak load reduction during DR periods. The electrical load disaggregation provides equipment-level power consumption. This will quantify the available building-level demand flexibility. The peak load reduction methodology provides optimal HVAC setpoint and brightness during DR periods to reduce the peak demand of a building. The control scheme takes user preference and context into consideration. A detailed methodology with relevant case studies regarding the design process of the network architecture of a deep learning algorithm for electrical load forecasting and load disaggregation is presented. A case study regarding peak load reduction through HVAC setpoint and brightness adjustment is also presented. To ensure the scalability and interoperability of the proposed framework, a layer-based software architecture to replicate the framework within a cloud environment is demonstrated. / Doctor of Philosophy / The modern power grid, known as the smart grid, is transforming how electricity is generated, transmitted and distributed across the US. In a legacy power grid, the utilities are the suppliers and the residential or commercial buildings are the consumers of electricity. However, the smart grid considers these buildings as active grid elements which can contribute to the economic, stable and resilient operation of an electric grid.
Demand Response (DR) is a grid application that reduces electrical power consumption during peak demand periods. The objective of DR is to relieve stress conditions of the electric grid. The current DR practice is to shut down pre-selected electrical equipment, such as HVAC and lights, during peak demand periods. However, this approach is static, pre-fixed, and does not consider any consumer preference. The framework proposed in this dissertation transforms the DR application from a look-up-based function into a dynamic, context-aware solution.
The proposed dynamic demand response framework performs three major functionalities: electrical load forecasting, electrical load disaggregation and peak load reduction. The electrical load forecasting quantifies building-level power consumption that needs to be curtailed during the DR periods. The electrical load disaggregation quantifies demand flexibility through equipment-level power consumption disaggregation. The peak load reduction methodology provides actionable intelligence that can be utilized to reduce the peak demand during DR periods. The work leverages functionalities of a deep learning algorithm to increase forecasting accuracy. An interoperable and scalable software implementation is presented to allow integration of the framework with existing energy management systems.
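As a hedged sketch of the load-forecasting component (the dimensions, horizon, and architecture are illustrative assumptions, not the dissertation's network), an LSTM that maps a 24-step consumption history to the next interval's demand:

```python
import torch
import torch.nn as nn

class LoadForecaster(nn.Module):
    """LSTM that forecasts the next interval's load from a 24-step history."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):              # x: (batch, 24, 1) past load readings
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # next-interval demand forecast

model = LoadForecaster()
history = torch.randn(16, 24, 1)       # stand-in for hourly kW readings
print(model(history).shape)            # torch.Size([16, 1])
```

The forecast quantifies the peak load that must be shed during a DR period; the disaggregation and setpoint-control components then decide where to shed it.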
|
59 |
Deep Learning for Ordinary Differential Equations and Predictive Uncertainty. Yijia Liu (17984911), 19 April 2024 (has links)
Deep neural networks (DNNs) have demonstrated outstanding performance in numerous tasks such as image recognition and natural language processing. However, in dynamic systems modeling, the tasks of estimating and uncovering the potentially nonlinear structure of systems represented by ordinary differential equations (ODEs) pose a significant challenge. In this dissertation, we employ DNNs to enable precise and efficient parameter estimation of dynamic systems. In addition, we introduce a highly flexible neural ODE model to capture both nonlinear and sparse dependent relations among multiple functional processes. Nonetheless, DNNs are susceptible to overfitting and often struggle to accurately assess predictive uncertainty despite their widespread success across various AI domains. The challenge of defining meaningful priors for DNN weights and characterizing predictive uncertainty persists. In this dissertation, we present a novel neural adaptive empirical Bayes framework with a new class of prior distributions to address weight uncertainty.

In the first part, we propose a precise and efficient approach utilizing DNNs for estimation and inference of ODEs given noisy data. The DNNs are employed directly as a nonparametric proxy for the true solution of the ODEs, eliminating the need for numerical integration and resulting in significant computational time savings. We develop a gradient descent algorithm to estimate both the DNNs solution and the parameters of the ODEs by optimizing a fidelity-penalized likelihood loss function. This ensures that the derivatives of the DNNs estimator conform to the system of ODEs. Our method is particularly effective in scenarios where only a set of variables transformed from the system components by a given function are observed. We establish the convergence rate of the DNNs estimator and demonstrate that the derivatives of the DNNs solution asymptotically satisfy the ODEs determined by the inferred parameters. Simulations and real data analysis of COVID-19 daily cases are conducted to show the superior performance of our method in terms of accuracy of parameter estimates, system recovery, and computational speed.

In the second part, we present a novel sparse neural ODE model to characterize flexible relations among multiple functional processes. This model represents the latent states of the functions using a set of ODEs and models the dynamic changes of these states utilizing a DNN with a specially designed architecture and sparsity-inducing regularization. Our new model is able to capture both nonlinear and sparse dependent relations among multivariate functions. We develop an efficient optimization algorithm to estimate the unknown weights for the DNN under the sparsity constraint. Furthermore, we establish both algorithmic convergence and selection consistency, providing theoretical guarantees for the proposed method. We illustrate the efficacy of the method through simulation studies and a gene regulatory network example.

In the third part, we introduce a class of implicit generative priors to facilitate Bayesian modeling and inference. These priors are derived through a nonlinear transformation of a known low-dimensional distribution, allowing us to handle complex data distributions and capture the underlying manifold structure effectively. Our framework combines variational inference with a gradient ascent algorithm, which serves to select the hyperparameters and approximate the posterior distribution. Theoretical justification is established through both posterior and classification consistency. We demonstrate the practical applications of our framework through extensive simulation examples and real-world datasets. Our experimental results highlight the superiority of our proposed framework over existing methods, such as sparse variational Bayesian and generative models, in terms of prediction accuracy and uncertainty quantification.
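A toy sketch of the fidelity-penalized training idea from the first part, under strong simplifying assumptions (a scalar ODE $x'(t) = \theta x$ with synthetic data, not the dissertation's setup): the network is fit to observations while its autograd derivative is penalized for violating the ODE, and $\theta$ is learned jointly.

```python
import torch

# Network acting as a nonparametric proxy for the ODE solution x(t).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)
theta = torch.nn.Parameter(torch.tensor(0.5))   # unknown ODE parameter

t_obs = torch.linspace(0.0, 2.0, 50).unsqueeze(1)
y_obs = torch.exp(1.3 * t_obs) + 0.05 * torch.randn_like(t_obs)  # noisy data

opt = torch.optim.Adam(list(net.parameters()) + [theta], lr=1e-2)
lam = 10.0  # fidelity penalty weight
for step in range(2000):
    t = t_obs.clone().requires_grad_(True)
    u = net(t)
    du = torch.autograd.grad(u.sum(), t, create_graph=True)[0]  # du/dt
    data_loss = ((u - y_obs) ** 2).mean()        # fit the observations
    ode_loss = ((du - theta * u) ** 2).mean()    # derivatives obey the ODE
    loss = data_loss + lam * ode_loss
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(theta))  # should move toward the true value 1.3
```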
|
60 |
On the use of $\alpha$-stable random variables in Bayesian bridge regression, neural networks and kernel processes. Jorge E Loria (18423207), 23 April 2024 (has links)
The first chapter considers $\ell_\alpha$-regularized linear regression, also termed Bridge regression. For $\alpha \in (0, 1)$, Bridge regression enjoys several statistical properties of interest, such as sparsity and near-unbiasedness of the estimates (Fan & Li, 2001). However, the main difficulty lies in the non-convex nature of the penalty for these values of $\alpha$, which makes optimization challenging; usually it is only possible to find a local optimum. To address this issue, Polson et al. (2013) took a sampling-based fully Bayesian approach to this problem, using the correspondence between the Bridge penalty and a power exponential prior on the regression coefficients. However, their sampling procedure relies on Markov chain Monte Carlo (MCMC) techniques, which are inherently sequential and not scalable to large problem dimensions. Cross-validation approaches are similarly computation-intensive. To this end, our contribution is a novel non-iterative method to fit a Bridge regression model. The main contribution lies in an explicit formula for Stein's unbiased risk estimate for the out-of-sample prediction risk of Bridge regression, which can then be optimized to select the desired tuning parameters, allowing us to completely bypass MCMC as well as computation-intensive cross-validation approaches. Our procedure yields results in a fraction of the computational time of iterative schemes, without any appreciable loss in statistical performance.

Next, we build upon the classical and influential works of Neal (1996), who proved that the infinite-width scaling limit of a Bayesian neural network with one hidden layer is a Gaussian process when the network weights have bounded prior variance. Neal's result has been extended to networks with multiple hidden layers and to convolutional neural networks, also with Gaussian process scaling limits. The tractable properties of Gaussian processes then allow straightforward posterior inference and uncertainty quantification, considerably simplifying the study of the limit process compared to a network of finite width. Neural network weights with unbounded variance, however, pose unique challenges. In this case, the classical central limit theorem breaks down, and it is well known that the scaling limit is an $\alpha$-stable process under suitable conditions. However, current literature is primarily limited to forward simulations under these processes, and the problem of posterior inference under such a scaling limit remains largely unaddressed, unlike in the Gaussian process case. To this end, our contribution is an interpretable and computationally efficient procedure for posterior inference, using a conditionally Gaussian representation that allows full use of the Gaussian process machinery for tractable posterior inference and uncertainty quantification in the non-Gaussian regime.

Finally, we extend the previous chapter by considering a natural extension to deep neural networks through kernel processes. Kernel processes (Aitchison et al., 2021) generalize the notion proved by Neal (1996) to deeper networks by describing the non-linear transformation in each layer as a covariance matrix (kernel) of a Gaussian process. In this way, each successive layer transforms the covariance matrix of the previous layer by a covariance function. However, the covariance obtained by this process loses any possibility of representation learning, since the covariance matrix is deterministic. To address this, Aitchison et al. (2021) proposed deep kernel processes using Wishart and inverse Wishart matrices for each layer in deep neural networks. Nevertheless, the approach they propose requires a process that does not emerge from the limit of a classic neural network structure. We introduce $\alpha$-stable kernel processes ($\alpha$-KP) for learning posterior stochastic covariances in each layer. Our results show that our method performs much better than the approach proposed by Aitchison et al. (2021) on both simulated data and the benchmark Boston dataset.
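As a small illustrative experiment (not the paper's construction), the snippet below contrasts the scaling regimes: with infinite-variance $\alpha$-stable weights, a wide layer's output is normalized by width$^{1/\alpha}$ rather than the $\sqrt{\text{width}}$ of the bounded-variance Gaussian case.

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(0)
alpha, width = 1.5, 50_000

# Symmetric alpha-stable output weights: heavy tails, infinite variance,
# so the classical CLT (and the Gaussian process limit) no longer applies.
w = levy_stable.rvs(alpha, beta=0.0, size=width, random_state=rng)
h = np.tanh(rng.standard_normal(width))          # hidden-unit activations

out = (w * h).sum() / width ** (1.0 / alpha)     # stable scaling, not sqrt(width)
print(out)  # draws of this quantity look alpha-stable, not Gaussian
```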
|