191 |
Earthquake Detection using Deep Learning Based Approaches
Audretsch, James, 17 March 2020 (has links)
Earthquake detection is an important task that focuses on detecting seismic events, either in past data or in real time, from seismic time series. In the past few decades, thanks to the increasing amount of available seismic data, research in seismic event detection has shown remarkable success using neural networks and other machine learning techniques. However, creating high-quality labeled data sets is still a manual process that demands a tremendous amount of time and expert knowledge, and this is stifling big-data innovation. When compiling a data set, it is unclear how many earthquake and noise examples are mislabeled. Another challenge is ensuring that machine-learning-based models generalize across geographical regions: models trained on data sets from one location should be applicable to detection at other locations. This thesis explores the most popular deep learning model, the convolutional neural network (CNN), to build a single-location detection model. In addition, we build more robust, generalized earthquake detection models using transfer learning and meta learning. We also introduce a process for generating high-quality labeled data sets. Our technique achieves high detection accuracy even on low signal-to-noise-ratio events.
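For orientation, a minimal sketch of the kind of single-station CNN detector described above is given below; the framework (PyTorch), three-component input, window length, and layer sizes are illustrative assumptions rather than the thesis's actual architecture.

```python
# Hypothetical single-station detector: a fixed-length, three-component seismic
# window is classified as "earthquake" or "noise". All sizes are assumptions.
import torch
import torch.nn as nn

class SeismicCNN(nn.Module):
    def __init__(self, in_channels=3, window_len=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(64 * (window_len // 16), 2)  # event vs. noise

    def forward(self, x):                  # x: (batch, channels, window_len)
        return self.classifier(self.features(x).flatten(1))

# One batch of 8 random three-component windows.
logits = SeismicCNN()(torch.randn(8, 3, 1000))
print(logits.shape)  # torch.Size([8, 2])
```

A transfer-learning variant of this sketch would reuse the convolutional features and retrain only the classifier head on data from a new station.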
The AI techniques explored in this research have the potential to transfer to other domains that rely on signal processing. There are myriad potential applications, with audio processing probably being one of the most directly relevant. Any field that deals with waveforms (e.g., seismic, audio, light) can make use of the developed techniques.
|
192 |
Domain adaptive learning with disentangled features
Peng, Xingchao, 18 February 2021 (has links)
Recognizing visual information is crucial for many real-world artificial-intelligence applications, ranging from domestic robots to autonomous vehicles. However, the success of deep learning methods on visual recognition tasks is highly dependent on access to large-scale labeled datasets, which are expensive and cumbersome to collect. Transfer learning provides a way to alleviate the burden of annotating data by transferring the knowledge learned from a label-rich source domain to a label-scarce target domain. However, the performance of deep learning models degrades significantly when testing on novel domains due to the presence of domain shift. To tackle domain shift, conventional domain adaptation methods diminish the discrepancy between the two domains with a distribution-matching loss or an adversarial loss. These models align the domain-specific and domain-invariant feature distributions simultaneously, which is sub-optimal for deep domain adaptation tasks, given that deep neural networks are known to extract features in which multiple hidden factors are highly entangled.
This thesis explores how to learn effective transferable features by disentangling the deep features. The following questions are studied: (1) how to disentangle the deep features into domain-invariant and domain-specific features? (2) how would feature disentanglement help to learn transferable features under a synthetic-to-real domain adaptation scenario? (3) how would feature disentanglement facilitate transfer learning with multiple source or target domains? (4) how to leverage feature disentanglement to boost the performance in a federated system?
To address these questions, this thesis proposes deep adversarial feature disentanglement: a class/domain identifier is trained on the labeled source domain, and a disentangler generates features that fool this identifier. Extensive experiments and empirical analysis demonstrate the effectiveness of the feature disentanglement method on many real-world domain adaptation tasks. Specifically, the following three unsupervised domain adaptation scenarios are explored: (1) domain-agnostic learning with disentangled representations, (2) unsupervised federated domain adaptation, and (3) multi-source domain adaptation.
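As a rough illustration of the adversarial idea, not the thesis's exact model, the sketch below alternates between training a domain identifier on the supposedly domain-invariant features and updating the disentangler to fool it; the PyTorch framing, layer sizes, and confusion objective are assumptions.

```python
# Sketch of adversarial feature disentanglement (illustrative assumptions only):
# the invariant branch is pushed to carry no domain information by maximizing
# the domain identifier's loss, while the identifier itself is trained normally.
import torch
import torch.nn as nn

feat_extractor = nn.Sequential(nn.Linear(256, 128), nn.ReLU())
invariant_head = nn.Linear(128, 64)      # intended to be domain-invariant
specific_head = nn.Linear(128, 64)       # absorbs domain-specific factors (unused in this minimal step)
domain_identifier = nn.Linear(64, 2)     # source vs. target

d_opt = torch.optim.Adam(domain_identifier.parameters(), lr=1e-3)
g_opt = torch.optim.Adam(list(feat_extractor.parameters()) +
                         list(invariant_head.parameters()), lr=1e-3)

def adversarial_step(x, domain_label):
    z_inv = invariant_head(feat_extractor(x))

    # 1) Train the identifier to predict the domain from the invariant features.
    d_loss = nn.functional.cross_entropy(domain_identifier(z_inv.detach()), domain_label)
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Update the disentangler to fool it by maximizing the identifier's loss
    #    (a gradient-reversal-style confusion objective).
    g_loss = -nn.functional.cross_entropy(domain_identifier(z_inv), domain_label)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

# One step on a dummy mixed batch: 0 = source, 1 = target.
print(adversarial_step(torch.randn(16, 256), torch.randint(0, 2, (16,))))
```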
|
193 |
An intelligent flood evacuation model based on deep learning of various flood scenarios / 様々な洪水シナリオに対する深層学習に基づく水害避難行動モデル
Li, Mengtong, 23 March 2021 (has links)
Kyoto University / New degree system, doctoral program / Doctor of Engineering / 甲第23173号 / 工博第4817号 / 新制||工||1753 (Main Library) / Department of Urban Management, Graduate School of Engineering, Kyoto University / (Chief examiner) Professor Tomoharu Hori, Professor Shigenobu Tanaka, Professor Tetsuya Sumi / Qualifies under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Philosophy (Engineering) / Kyoto University / DFAM
|
194 |
Efficient serverless resource scheduling for distributed deep learning
Sundkvist, Johan, January 2021 (has links)
Stemming from the growth and increased complexity of computer vision, natural language processing, and speech recognition algorithms, the need for scalability and fault tolerance in machine learning systems has risen. To meet these demands, many have turned their focus toward implementing machine learning on distributed systems. When running time-demanding and resource-intensive tasks such as machine learning training on a cluster, resource efficiency is essential to keeping training time low. To achieve efficient resource allocation, a cluster scheduler is used. Standard scheduling frameworks, however, are not designed for deep learning because of their static resource allocation. Most frameworks also do not make use of a serverless architecture, despite its ease of management and rapid scalability making it a fitting choice for deep learning tasks. We therefore present Coach, a serverless job scheduler specialized for parameter-server-based deep learning models. Coach makes decisions that maximize resource efficiency and minimize training time by using regression techniques to fit functions to data from previous training epochs. With Coach we attempt to answer three questions concerning the training speed (epochs per second) of deep learning models on a distributed system when using a serverless architecture. One: does the addition of more workers and parameter servers have a positive impact on training speed when running a varying number of concurrent training jobs? Two: can we see improved training speed when training is done in a distributed manner on a cluster with limited resources, compared to when it is done on a single node? Three: how accurate are predictions made from functions fitted to previous training data at estimating the optimal number of workers and parameter servers to use during training in order to maximize training speed? Due to limitations of the cluster used for testing, we find that a minimal setup of a single worker and a single parameter server is almost always optimal. The results indicate that an additional server can have a slight positive effect in some situations, and that an additional worker only appears beneficial in high-variance situations where many jobs run at the same time, which we theorize is caused by choices made by the Kubernetes scheduler.
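To make the prediction step concrete, here is a hedged sketch of the general idea: fit a regression to hypothetical (workers, parameter servers) → epochs-per-second measurements gathered in earlier epochs, then pick the configuration with the highest predicted speed. The feature map and data points are invented for illustration and are not Coach's actual model.

```python
# Illustrative only: a tiny least-squares fit over assumed measurements of
# training speed for a few (workers, parameter servers) configurations.
import numpy as np

observed = {(1, 1): 0.42, (2, 1): 0.45, (2, 2): 0.44, (4, 2): 0.40}  # epochs/second

X = np.array([[1.0, w, s, w * s] for (w, s) in observed])  # intercept + interaction term
y = np.array(list(observed.values()))
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predicted_speed(workers, servers):
    return float(np.array([1.0, workers, servers, workers * servers]) @ coef)

candidates = [(w, s) for w in range(1, 5) for s in range(1, 3)]
best = max(candidates, key=lambda c: predicted_speed(*c))
print("predicted best (workers, servers):", best)
```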
|
195 |
A Deep Learning Approach to Detect Alzheimer’s Disease Based on the Dementia Level in Brain MRI Images
Pellakur Rajasekaran, Shrish, 04 October 2021 (has links)
No description available.
|
196 |
Classification of glomerular pathological findings using deep learning and nephrologist-AI collective intelligence approach / 深層学習および腎臓内科医と人工知能との集合知アプローチを用いた糸球体病理所見の分類
Uchino, Eiichiro, 24 September 2021 (has links)
Kyoto University / New degree system, doctorate by dissertation / Doctor of Medical Science / 乙第13440号 / 論医博第2239号 / 新制||医||1054 (Main Library) / Department of Medicine, Graduate School of Medicine, Kyoto University / (Chief examiner) Professor Tomohiro Kuroda, Professor Michiyuki Matsuda, Professor Kenji Osafune / Qualifies under Article 4, Paragraph 2 of the Degree Regulations / Doctor of Medical Science / Kyoto University / DFAM
|
197 |
TRACE: A Differentiable Approach to Line-Level Stroke Recovery for Offline Handwritten Text
Archibald, Taylor Neil, 01 December 2020 (has links)
Stroke order and velocity are helpful features in the fields of signature verification, handwriting recognition, and handwriting synthesis. Recovering these features from offline handwritten text is a challenging and well-studied problem. We propose a new model called TRACE (Trajectory Recovery by an Adaptively-trained Convolutional Encoder). TRACE is a differentiable approach using a convolutional recurrent neural network (CRNN) to infer temporal stroke information from long lines of offline handwritten text with many characters. TRACE is perhaps the first system to be trained end-to-end on entire lines of text of arbitrary width and does not require the use of dynamic exemplars. Moreover, the system does not require images to undergo any pre-processing, nor do the predictions require any post-processing. Consequently, the recovered trajectory is differentiable and can be used as a loss function for other tasks, including synthesizing offline handwritten text. We demonstrate that temporal stroke information recovered by TRACE from offline data can be used for handwriting synthesis and establish the first benchmarks for a stroke trajectory recovery system trained on the IAM online database.
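For readers unfamiliar with the CRNN pattern, the sketch below shows one plausible shape of such a model (not TRACE's published architecture): a convolutional encoder collapses the image height, and a bidirectional LSTM emits one (x, y, pen-state) triple per horizontal step, so the whole mapping stays differentiable.

```python
# Assumed shapes and layer sizes for illustration only.
import torch
import torch.nn as nn

class StrokeCRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(                       # shrink height, keep width resolution
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 2)),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
        )
        self.rnn = nn.LSTM(input_size=64 * 16, hidden_size=128,
                           bidirectional=True, batch_first=True)
        self.head = nn.Linear(256, 3)                   # (x, y, pen-up) per step

    def forward(self, img):                             # img: (batch, 1, 64, width)
        f = self.cnn(img)                               # (batch, 64, 16, width/2)
        f = f.permute(0, 3, 1, 2).flatten(2)            # (batch, width/2, 64*16)
        out, _ = self.rnn(f)
        return self.head(out)                           # (batch, width/2, 3)

print(StrokeCRNN()(torch.randn(2, 1, 64, 512)).shape)   # torch.Size([2, 256, 3])
```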
|
198 |
DEEP LEARNING FOR STATISTICAL DATA ANALYSIS: DIMENSION REDUCTION AND CAUSAL STRUCTURE INFERENCE
Siqi Liang (11799653), 19 December 2021 (has links)
During the past decades, deep learning has proven to be an important tool for statistical data analysis. Motivated by the promise of deep learning in tackling the curse of dimensionality, this dissertation proposes three innovative methods that apply deep learning techniques to high-dimensional data analysis.

Firstly, we propose a nonlinear sufficient dimension reduction method, the so-called split-and-merge deep neural network (SM-DNN), which employs the split-and-merge technique on deep neural networks to obtain a nonlinear sufficient dimension reduction of the input data and then learns a deep neural network on the dimension-reduced data. We show that the DNN-based dimension reduction is sufficient for data drawn from the exponential family, retaining all information on the response contained in the explanatory data. Our numerical experiments indicate that the SM-DNN method can lead to significant improvement in phenotype prediction for a variety of real data examples. In particular, with only rare variants, we achieved a remarkable prediction accuracy of over 74% for the Early-Onset Myocardial Infarction (EOMI) exome sequence data.

Secondly, we propose another nonlinear sufficient dimension reduction method based on a new type of stochastic neural network under a rigorous probabilistic framework and show that it can be used for sufficient dimension reduction of high-dimensional data. The proposed stochastic neural network can be trained using an adaptive stochastic gradient Markov chain Monte Carlo algorithm. Through extensive experiments on real-world classification and regression problems, we show that the proposed method compares favorably with existing state-of-the-art sufficient dimension reduction methods and is computationally more efficient for large-scale data.

Finally, we propose a structure learning method for uncovering the causal structure hidden in high-dimensional data, which consists of two stages: we first conduct Bayesian sparse learning for variable screening to build a primary graph, and then we perform conditional independence tests to refine the primary graph. Extensive numerical experiments and quantitative tests confirm the generality, effectiveness and power of the proposed methods.
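To make the refinement stage of the structure-learning method concrete, here is a hedged sketch that prunes a screened "primary graph" with a simple partial-correlation (Fisher z) conditional independence test; the screening step and the dissertation's actual test are not reproduced, and the neighbourhood conditioning rule is an assumption.

```python
# Illustrative edge pruning via partial-correlation CI tests (Gaussian assumption).
import numpy as np
from scipy import stats

def partial_corr_pvalue(data, i, j, cond):
    """p-value for 'X_i independent of X_j given X_cond' via Fisher's z."""
    n = data.shape[0]
    if cond:
        Z = np.column_stack([np.ones(n), data[:, cond]])
        ri = data[:, i] - Z @ np.linalg.lstsq(Z, data[:, i], rcond=None)[0]
        rj = data[:, j] - Z @ np.linalg.lstsq(Z, data[:, j], rcond=None)[0]
    else:
        ri, rj = data[:, i], data[:, j]
    r = np.corrcoef(ri, rj)[0, 1]
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(cond) - 3)
    return 2 * (1 - stats.norm.cdf(abs(z)))

def refine_graph(data, primary_edges, alpha=0.05):
    """Keep edge (i, j) only if i and j stay dependent given their other neighbours."""
    refined = []
    for i, j in primary_edges:
        cond = sorted({k for a, b in primary_edges for k in (a, b)
                       if (i in (a, b) or j in (a, b)) and k not in (i, j)})
        if partial_corr_pvalue(data, i, j, cond) < alpha:
            refined.append((i, j))
    return refined

# Example: X2 is X0 plus noise, X1 is independent; only edge (0, 2) should survive.
rng = np.random.default_rng(0)
x0, x1 = rng.normal(size=(2, 500))
data = np.column_stack([x0, x1, x0 + 0.1 * rng.normal(size=500)])
print(refine_graph(data, [(0, 1), (0, 2), (1, 2)]))
```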
|
199 |
Accelerating Emerging Neural Workloads
Jacob R Stevens (11805797), 20 December 2021 (has links)
Due to a combination of algorithmic advances, widespread availability of rich data sets, and tremendous growth in compute availability, Deep Neural Networks (DNNs) have seen considerable success in a wide variety of fields, achieving state-of-the-art accuracy in a number of perceptual domains, such as text, video and audio processing. Recently, there have been many efforts to extend this success in the perceptual, Euclidean-based domain to non-perceptual tasks, such as task planning or reasoning, as well as to non-Euclidean domains, such as graphs. While several DNN accelerators have been proposed in the past decade, they largely focus on traditional DNN workloads, such as Multi-Layer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). These accelerators are ill-suited to the unique computational needs of emerging neural networks. In this dissertation, we aim to close this gap by proposing novel hardware architectures that are specifically tailored to emerging neural workloads.

First, we consider memory-augmented neural networks (MANNs), a new class of neural networks that exhibits capabilities such as one-shot learning and task planning well beyond those of traditional DNNs. MANNs augment a traditional DNN with an external differentiable memory that is used to store dynamic state. This dissertation proposes a novel accelerator that targets the main bottleneck of MANNs: the soft reads and writes to this external memory, each of which requires access to all the memory locations.

We then focus on Transformer networks, which have become very popular for Natural Language Processing (NLP). A key to the success of these networks is a technique called self-attention, which employs a softmax operation. Softmax is poorly supported in modern, matrix-multiply-focused accelerators since it accounts for a very small fraction of traditional DNN workloads. We propose a hardware/software co-design approach that realizes softmax efficiently by utilizing a suite of approximate computing techniques.

Next, we address graph neural networks (GNNs). GNNs are achieving state-of-the-art results in a variety of fields such as physics modeling, chemical synthesis, and electronic design automation. GNNs are a hybrid between graph processing workloads and DNN workloads: they use DNN-based feature extractors to form hidden representations for each node in a graph and then combine these representations through some form of graph traversal. As a result, existing hardware specialized for either graph processing workloads or DNN workloads is insufficient. Instead, we design a novel architecture that balances the needs of these two heterogeneous compute patterns. We also propose a novel feature-dimension-blocking dataflow to further increase performance by mitigating the memory bottleneck.

Finally, we address the growing difficulty of tightly coupling new DNNs and a hardware platform. Given the extremely large DNN-HW design space, consisting of DNN selection, hardware operating condition, and DNN-to-HW mapping, it is infeasible to exhaustively search this space by running each sample on a physical hardware device. This has led to the need for highly accurate, machine-learning-based performance models that can predict latency, power, and energy even faster than direct execution. We first present a taxonomy to characterize the possible approaches to these performance estimators. Based on the insights from this taxonomy, we present a new performance estimator that combines coarse-grained and fine-grained approaches to achieve superior accuracy with a limited number of training samples. Finally, we propose a flexible framework for creating these DNN-HW performance estimators.

In summary, this dissertation identifies the growing gap between current hardware and new, emerging neural networks. We first propose three novel hardware architectures that address this gap for MANNs, Transformers, and GNNs. We then propose a novel hardware-aware DNN performance estimator and framework to ease addressing this gap for new networks in the future.
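As one hedged example of the approximate-computing direction mentioned for softmax above (not necessarily the technique used in this dissertation), a common hardware-friendly trick is to replace e^x with a power-of-two exponent whose fractional part is quantized to a small lookup table:

```python
# Illustrative approximate softmax: 2^t with a quantized fractional exponent
# stands in for e^x, mimicking a shift-plus-LUT hardware implementation.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def approx_softmax(x, lut_bits=4):
    t = (x - x.max()) * np.log2(np.e)                    # 2^t equals e^(x - max)
    ipart = np.floor(t)                                  # handled by a shift in hardware
    fpart = np.round((t - ipart) * (1 << lut_bits)) / (1 << lut_bits)  # small LUT
    e = np.exp2(ipart + fpart)
    return e / e.sum()

scores = np.random.randn(16)
print(np.abs(softmax(scores) - approx_softmax(scores)).max())  # small approximation error
```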
|
200 |
Deep Parameter Selection For Classic Computer Vision Applications
Whitney, Michael, 13 December 2021 (has links)
A trend in computer vision today is to retire older, so-called "classic" methods in favor of ones based on deep neural networks. This has led to tremendous improvements in many areas, but for some problems deep neural solutions may not yet exist or may not be practical. For this and other reasons, classic methods are still widely used in a variety of applications. This paper explores the possibility of using deep neural networks to improve these older methods instead of replacing them. In particular, it addresses the issue of parameter selection in these algorithms by using a neural network to predict effective settings on a per-input basis. Specifically, we look at a straightforward and well-understood algorithm with one primary parameter: interactive graph-cut segmentation. This parameter balances region and boundary influences and heavily affects the resulting segmentation. Many practitioners tune this parameter with an ad hoc or empirically selected static setting, while others pre-analyze images to determine effective settings on a per-image basis. The best setting for each image, or even for each target selection within an image, is highly sensitive to properties of the image and object, suggesting that a network might be able to recognize these properties and predict settings that would improve performance. We employ a lightweight network with minimal layers to avoid adding significant computational overhead with this pre-analysis step. The network predicts the segmentation performance for each of a set of discretely sampled values of this parameter and selects the one with the highest predicted performance. Results demonstrate that this per-image prediction and tuning performs better than a single empirically selected setting.
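A minimal sketch of this per-image prediction step might look like the following; the candidate parameter grid, layer sizes, and the predicted quality measure (e.g., IoU) are assumptions for illustration, not the thesis's exact configuration.

```python
# Lightweight scorer: predict segmentation quality for each candidate setting of
# the region/boundary trade-off parameter, then hand the best one to graph cut.
import torch
import torch.nn as nn

LAMBDA_CANDIDATES = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0, 50.0]  # hypothetical grid

class LambdaScorer(nn.Module):
    def __init__(self, n_candidates=len(LAMBDA_CANDIDATES)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, n_candidates),       # one predicted quality score per candidate
        )

    def forward(self, image):                  # image: (batch, 3, H, W)
        return self.net(image)

def select_lambda(model, image):
    scores = model(image.unsqueeze(0))         # predicted segmentation quality per candidate
    return LAMBDA_CANDIDATES[int(scores.argmax())]

print("chosen lambda:", select_lambda(LambdaScorer(), torch.randn(3, 128, 128)))
```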
|