301.
Sentiment Classification with Deep Neural Networks
Kalogiras, Vasileios, January 2017
Sentiment analysis is a subfield of natural language processing (NLP) that attempts to analyze the sentiment of written text. It is a complex problem that entails many challenges, and for this reason it has been studied extensively. In past years, traditional machine learning algorithms and handcrafted methodologies provided state-of-the-art results. However, the recent deep learning renaissance has shifted interest towards end-to-end deep learning models. On the one hand, this has resulted in more powerful models; on the other hand, clear mathematical reasoning or intuition behind distinct models is still lacking. This thesis therefore attempts to shed some light on recently proposed deep learning architectures for sentiment classification. A study of their differences is performed, along with empirical results on how changes in the structure or capacity of a model affect its accuracy and the way it represents and "comprehends" sentences.
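A toy illustration of the end-to-end idea described in this abstract: token embeddings are pooled into a sentence representation and passed through a classifier head, with no handcrafted features in between. The vocabulary, embeddings, and weights below are hypothetical and untrained; a real model would learn all of them from labeled data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary with randomly initialised 3-dim embeddings.
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3, "terrible": 4}
embeddings = rng.normal(size=(len(vocab), 3))

# Untrained linear classifier head: weight vector and bias.
w = rng.normal(size=3)
b = 0.0

def sentiment_score(tokens):
    """Embed known tokens, mean-pool them, apply a linear layer and a sigmoid."""
    ids = [vocab[t] for t in tokens if t in vocab]
    pooled = embeddings[ids].mean(axis=0)    # fixed-size sentence representation
    logit = pooled @ w + b
    return 1.0 / (1.0 + np.exp(-logit))      # probability of "positive"

score = sentiment_score(["the", "movie", "was", "great"])
print(round(float(score), 3))
```

In a trained model the pooling would typically be replaced by a recurrent or convolutional encoder, which is exactly the kind of architectural choice the thesis compares.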
302.
Homography Estimation using Deep Learning for Registering All-22 Football Video Frames
Fristedt, Hampus, January 2017
Homography estimation is a fundamental task in many computer vision applications, but many estimation techniques rely on complicated feature extraction pipelines. We extend research in direct homography estimation (i.e. without explicit feature extraction) by implementing a convolutional network capable of estimating homographies. Previous work in deep learning based homography estimation calculates homographies between pairs of images, whereas our network takes a single image as input and registers it to a reference view for which no image data is available. The application of the work is registering frames from American football video to a top-down view of the field. Our model manages to register frames in a test set with an average corner error equivalent to less than 2 yards.
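The corner-error metric quoted in the abstract can be sketched as follows: map the frame corners through the predicted and the ground-truth homographies and average the Euclidean distances between the results. The 1280x720 resolution and the example homographies are assumptions for illustration, not values from the thesis.

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2-D points through a 3x3 homography via homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])   # lift to homogeneous
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]              # back to Euclidean

def mean_corner_error(H_pred, H_true, corners):
    """Average distance between corners mapped by predicted vs. true homography."""
    diff = apply_homography(H_pred, corners) - apply_homography(H_true, corners)
    return float(np.linalg.norm(diff, axis=1).mean())

# Frame corners of a hypothetical 1280x720 video frame.
corners = np.array([[0, 0], [1280, 0], [1280, 720], [0, 720]], dtype=float)

H_true = np.eye(3)
H_pred = np.eye(3)
H_pred[0, 2] = 2.0   # predicted registration shifted 2 units along x

print(mean_corner_error(H_pred, H_true, corners))  # 2.0
```

When the reference view's units are yards on the field, this is the quantity the "less than 2 yards" result refers to.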
303.
Decoding Electrocorticography Signals by Deep Learning for Brain-Computer Interface
Jubien, Guillaume, January 2019
A Brain-Computer Interface (BCI) offers paralyzed patients the opportunity to control their movements without any neuromuscular activity. Signal processing of neuronal activity makes it possible to decode movement intentions, and a patient's ability to control an effector is closely tied to this decoding performance. In this study, I tackle a recent way to decode neuronal activity: deep learning. The study is based on public data extracted by Schalk et al. for BCI Competition IV. Electrocorticogram (ECoG) data from three epileptic patients were recorded. During the experiment, the team asked subjects to move their fingers and recorded the finger movements with a data glove. An artificial neural network (ANN) was built based on a common BCI feature extraction pipeline made of successive convolutional layers. The network first mimics spatial filtering with a spatial reduction of sources; it then carries out a time-frequency analysis and extracts the log power of the band-pass-filtered signals. The first investigation concerned the optimization of the network. The same architecture was then used on each subject, and decoding performance was computed for a 6-class classification problem, with particular attention to the spatial and temporal filtering. This study demonstrated that deep learning can be an effective way to decode brain signals. For 6-class classification, the results showed performance similar to traditional decoding algorithms. As the spatial and temporal weights obtained after training are only sparsely described in the literature, we worked especially on their interpretation. The spatial weight study demonstrated that the network is able to select the specific ECoG channels identified in the literature as the most informative. Moreover, the network converges to the same spatial solution independently of the initialization.
Finally, a preliminary study on predicting movement position was conducted and gave encouraging results.
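The feature extraction pipeline the network mimics (spatial filtering, band-pass filtering, log power extraction) can be sketched in plain NumPy. The channel counts, mixing matrix, and FIR kernel below are made up for illustration; in the thesis these operations are learned as convolutional layers rather than fixed.

```python
import numpy as np

rng = np.random.default_rng(1)

def spatial_filter(x, W):
    """Mix ECoG channels into fewer virtual sources: (channels x time) -> (sources x time)."""
    return W @ x

def band_power_log_features(x, kernel):
    """Band-pass each source with an FIR kernel, then take the log of the mean power."""
    filtered = np.array([np.convolve(row, kernel, mode="same") for row in x])
    return np.log(np.mean(filtered ** 2, axis=1) + 1e-12)

n_channels, n_sources, n_samples = 48, 8, 1000
ecog = rng.normal(size=(n_channels, n_samples))       # fake ECoG segment
W = rng.normal(size=(n_sources, n_channels)) / n_channels
kernel = np.array([0.25, 0.5, 0.25])                  # toy low-pass FIR kernel

features = band_power_log_features(spatial_filter(ecog, W), kernel)
print(features.shape)  # (8,)
```

A classifier head on top of such log-power features is the standard BCI baseline the learned network is compared against.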
304.
Domain-Independent Moving Object Depth Estimation using Monocular Camera
Nassir, Cesar, January 2018
Today, automotive companies across the world strive to create vehicles with fully autonomous capabilities. There are many benefits to developing autonomous vehicles, such as reduced traffic congestion, increased safety and reduced pollution. To achieve that goal there are many challenges ahead; one of them is visual perception. Estimating depth from a 2D image has been shown to be a key component for 3D recognition, reconstruction and segmentation. Estimating depth in an image from a monocular camera is an ill-posed problem, since the mapping from colour intensity to depth value is ambiguous. Depth estimation from stereo images has come far compared to monocular depth estimation and was initially what depth estimation relied on. However, being able to exploit monocular cues is necessary in scenarios where stereo depth estimation is not possible. We present a novel CNN, BiNet, inspired by ENet, that tackles depth estimation of moving objects in real time using only a monocular camera. It performs better than ENet on the Cityscapes dataset while adding only a small overhead in complexity.
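The abstract does not spell out BiNet's training objective, but a common choice for monocular depth regression is the scale-invariant log error of Eigen et al. (2014), sketched below on synthetic depth maps. With lam=1, a global rescaling of the prediction incurs no penalty, which is the sense in which the metric is scale-invariant and why it suits the ambiguous colour-to-depth mapping mentioned above.

```python
import numpy as np

def scale_invariant_log_error(pred, target, lam=0.5):
    """Scale-invariant error in log-depth space (Eigen et al., 2014)."""
    d = np.log(pred) - np.log(target)
    n = d.size
    return float(np.mean(d ** 2) - lam * (np.sum(d) / n) ** 2)

rng = np.random.default_rng(2)
target = rng.uniform(1.0, 50.0, size=(4, 4))   # fake metric depth map
pred_same = target.copy()
pred_scaled = 2.0 * target                     # prediction off by a global factor

print(scale_invariant_log_error(pred_same, target))            # 0.0
print(scale_invariant_log_error(pred_scaled, target, lam=1.0)) # ~0: scale drops out
```

This is a generic sketch of a standard depth loss, not a claim about the specific loss used to train BiNet.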
305.
A New Era for Wireless Communications Physical Layer: A Data-Driven Learning-Based Approach
Al-Baidhani, Amer, 23 August 2022
No description available.
306.
Spatio-Temporal Analysis of EEG using Deep Learning
Sudalairaj, Shivchander, 22 August 2022
No description available.
307.
More is Better than One: The Effect of Ensembling on Deep Learning Performance in Biochemical Prediction Problems
Stern, Jacob A., 07 August 2023
This thesis presents two papers addressing important biochemical prediction challenges. The first paper focuses on accurate protein distance prediction and introduces updates to the ProSPr network. We evaluate its performance in the Critical Assessment of Techniques for Protein Structure Prediction (CASP14) competition, investigating how its accuracy depends on sequence length and multiple sequence alignment depth. The ProSPr network, an ensemble of three convolutional neural networks (CNNs), demonstrates superior performance compared to the individual networks. The second paper addresses the issue of accurate ligand ranking in virtual screening for drug discovery. We propose MILCDock, a machine learning consensus docking tool that leverages predictions from five traditional molecular docking tools. MILCDock, an ensemble of eight neural networks, outperforms single-network approaches and other consensus docking methods on the DUD-E dataset. However, we find that LIT-PCBA targets remain challenging for all methods tested. Furthermore, we explore the effectiveness of training machine learning tools on the biased DUD-E dataset, emphasizing the importance of mitigating its biases during training. Collectively, this work emphasizes the power of ensembling in deep learning-based biochemical prediction problems, highlighting the improved performance obtained by combining multiple models. Our findings contribute to the development of robust protein distance prediction tools and more accurate virtual screening methods for drug discovery.
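The simplest form of the ensembling studied in this thesis is probability averaging: each model outputs class probabilities, and the ensemble averages them before taking the argmax. The three probability vectors below are invented to show how an ensemble can overrule a single confident model.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-class probabilities from several models, then take the argmax."""
    avg = np.mean(prob_list, axis=0)
    return avg, int(np.argmax(avg))

# Hypothetical per-class probabilities from three models for one input.
p1 = np.array([0.6, 0.3, 0.1])   # this model alone would pick class 0
p2 = np.array([0.2, 0.5, 0.3])
p3 = np.array([0.3, 0.5, 0.2])

avg, label = ensemble_predict([p1, p2, p3])
print(label)  # class 1 wins, although model 1 preferred class 0
```

ProSPr's three-CNN ensemble and MILCDock's eight-network ensemble both follow this general pattern of combining member predictions, whatever the exact aggregation rule.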
308.
Towards Explainable Event Detection and Extraction
Mehta, Sneha, 22 July 2021
Event extraction refers to extracting specific knowledge of incidents from natural language text and consolidating it into a structured form. Important applications of event extraction include search, retrieval, question answering and event forecasting. However, before events can be extracted it is imperative to detect them, i.e., identify which documents in a large collection contain events of interest and, from those, extract the sentences that might contain the event-related information. This task is challenging because it is easier to obtain labels at the document level than fine-grained annotations at the sentence level. Current approaches for this task are suboptimal because they directly aggregate sentence probabilities estimated by a classifier into document probabilities, resulting in error propagation. To alleviate this problem we propose to leverage recent advances in representation learning by using attention mechanisms. Specifically, for event detection we propose a method that computes document embeddings from sentence embeddings by leveraging attention, and trains a document classifier on those embeddings to mitigate the error-propagation problem. However, we find that existing attention mechanisms are ill-suited for this task, because they are either suboptimal or use a large number of parameters. To address this problem we propose a lean attention mechanism that is effective for event detection. Current approaches for event extraction rely on fine-grained labels in specific domains. Extending extraction to new domains is challenging because of the difficulty of collecting fine-grained data.
Machine reading comprehension (MRC) based approaches, which enable zero-shot extraction, struggle with syntactically complex sentences and long-range dependencies. To mitigate this problem, we propose a syntactic sentence simplification approach that is guided by the MRC model to improve its performance on event extraction. / Doctor of Philosophy / Event extraction is the task of extracting events of societal importance from natural language texts. The task has a wide range of applications, from search, retrieval and question answering to forecasting population-level events like civil unrest and disease occurrences with reasonable accuracy. Before events can be extracted it is imperative to identify the documents that are likely to contain the events of interest and extract the sentences that mention those events. This is termed event detection. Current approaches for event detection are suboptimal. They assume that events are neatly partitioned into sentences and obtain document-level event probabilities directly from predicted sentence-level probabilities. In this dissertation, under the same assumption, we leverage representation learning to mitigate some of the shortcomings of previous event detection methods. Current approaches to event extraction are limited to restricted domains and require fine-grained labeled corpora for their training. One way to extend event extraction to new domains is by enabling zero-shot extraction. Machine reading comprehension (MRC) based approaches provide a promising way forward for zero-shot extraction. However, this approach suffers from the long-range dependency problem and faces difficulty in handling syntactically complex sentences with multiple clauses. To mitigate this problem we propose a syntactic sentence simplification algorithm that is guided by the MRC system to improve its performance.
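One common way to compute a document embedding from sentence embeddings with attention (a generic single-context-vector variant, not necessarily the lean mechanism proposed in the dissertation) is to score each sentence against a learned context vector, normalize the scores with a softmax, and take the weighted sum.

```python
import numpy as np

rng = np.random.default_rng(3)

def attention_pool(S, w):
    """Score each sentence embedding against a context vector, softmax, weighted sum."""
    scores = S @ w                           # one scalar score per sentence
    scores = scores - scores.max()           # shift for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ S, alpha                  # document embedding, attention weights

n_sentences, dim = 5, 8
S = rng.normal(size=(n_sentences, dim))      # fake sentence embeddings
w = rng.normal(size=dim)                     # context vector (learned in practice)

doc, alpha = attention_pool(S, w)
print(doc.shape)  # (8,)
```

Training a document classifier on `doc` instead of aggregating per-sentence probabilities is what avoids the error propagation described above; the attention weights `alpha` also indicate which sentences drove the document-level decision, which is where the explainability angle comes in.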
309.
Visual Analytics for High Dimensional Simulation Ensembles
Dahshan, Mai Mansour Soliman Ismail, 10 June 2021
Recent advancements in data acquisition, storage, and computing power have enabled scientists from various scientific and engineering domains to simulate more complex and longer phenomena. Scientists are usually interested in understanding the behavior of a phenomenon under different conditions. To do so, they run multiple simulations with different configurations (i.e., parameter settings, boundary/initial conditions, or computational models), resulting in an ensemble dataset. An ensemble empowers scientists to quantify the uncertainty in the simulated phenomenon in terms of the variability between ensemble members, the parameter sensitivity and optimization, and the characteristics and outliers within the ensemble members, which can lead to valuable insights about the simulated model.
The size, complexity, and high dimensionality (e.g., simulation input and output parameters) of simulation ensembles pose a great challenge to their analysis and exploration. Ensemble visualization provides a convenient way to convey the main characteristics of the ensemble for enhanced understanding of the simulated model. The majority of current ensemble visualization techniques focus on analyzing either the ensemble space or the parameter space. Most parameter space visualizations are not designed for high-dimensional data sets or do not show the intrinsic structures in the ensemble. Conversely, ensemble space has been visualized either as a comparative visualization of a limited number of ensemble members or as an aggregation of multiple ensemble members that omits potential details of the original ensemble. Thus, to unfold the full potential of simulation ensembles, we designed and developed an approach to the visual analysis of high-dimensional simulation ensembles that merges sensemaking, human expertise, and intuition with machine learning and statistics.
In this work, we explore how semantic interaction and sensemaking can be used to build interactive and intelligent visual analysis tools for simulation ensembles. Specifically, we focus on the complex processes that derive meaningful insights from exploring and iteratively refining the analysis of high-dimensional simulation ensembles when prior knowledge about ensemble features and correlations is limited or unavailable. We first developed GLEE (Graphically-Linked Ensemble Explorer), an exploratory visualization tool that enables scientists to analyze and explore correlations and relationships between non-spatial ensembles and their parameters. Then, we developed Spatial GLEE, an extension to GLEE that explores spatial data while simultaneously considering spatial characteristics (i.e., autocorrelation and spatial variability) and the dimensionality of the ensemble. Finally, we developed Image-based GLEE to explore exascale simulation ensembles produced by in-situ visualization. We collaborated with domain experts to evaluate the effectiveness of GLEE using real-world case studies and experiments from different domains.
The core contribution of this work is a visual approach that enables the simultaneous exploration of parameter and ensemble spaces for 2D/3D high-dimensional ensembles, three interactive visualization tools that support exploring, searching, filtering, and making sense of non-spatial, spatial, and image-based ensembles, and the use of real-world cases from different domains to demonstrate the effectiveness of the proposed approach. The aim of the proposed approach is to help scientists gain insights by answering questions or testing hypotheses about different aspects of the simulated phenomenon, and to facilitate knowledge discovery in complex datasets. / Doctor of Philosophy / Scientists run simulations to understand complex phenomena and processes that are expensive, difficult, or even impossible to reproduce in the real world. Current advancements in high-performance computing have enabled scientists from various domains, such as climate, computational fluid dynamics, and aerodynamics, to run more complex simulations than before. However, a single simulation run is not enough to capture all features of a simulated phenomenon. Therefore, scientists run multiple simulations using perturbed input parameters, initial and boundary conditions, or different models, resulting in what is known as an ensemble. An ensemble empowers scientists to understand a model's behavior by studying relationships between and among ensemble members, the optimal parameter settings, and the influence of input parameters on the simulation output, which can lead to useful knowledge and insights about the simulated phenomenon.
To effectively analyze and explore simulation ensembles, visualization techniques play a significant role in facilitating knowledge discovery through graphical representations. Ensemble visualization offers scientists a better way to understand the simulated model. Most current ensemble visualization techniques are designed to analyze and explore either the ensemble space or the parameter space. Therefore, we designed and developed a visual analysis approach for exploring and analyzing high-dimensional parameter and ensemble spaces simultaneously by integrating machine learning and statistics with sensemaking and human expertise.
The contribution of this work is to explore how semantic interaction and sensemaking can be used to explore and analyze high-dimensional simulation ensembles. To do so, we designed and developed a visual analysis approach manifested in an exploratory visualization tool, GLEE (Graphically-Linked Ensemble Explorer), that allows scientists to explore, search, filter, and make sense of high-dimensional 2D/3D simulation ensembles. GLEE's visualization pipeline and interaction techniques use deep learning, feature extraction, spatial regression, and Semantic Interaction (SI) techniques to support the exploration of non-spatial, spatial, and image-based simulation ensembles. GLEE's different visualization tools were evaluated with domain experts from different fields using real-world case studies and experiments.
310.
Neural Enhancement Strategies for Robust Speech Processing
Nawar, Mohamed Nabih Ali Mohamed, 10 March 2023
In real-world scenarios, speech signals are often contaminated with environmental noise and reverberation, which degrade speech quality and intelligibility. Lately, the development of deep learning algorithms has marked milestones in speech-based research fields, e.g. speech recognition and spoken language understanding. As one of the crucial topics in the speech processing research area, speech enhancement aims to restore clean speech signals from noisy signals. In the last decades, many conventional statistics-based speech enhancement algorithms have been proposed. However, the performance of these approaches is limited in non-stationary noisy conditions. The rise of deep learning-based approaches for speech enhancement has led to revolutionary advances in performance. In this context, speech enhancement is formulated as a supervised learning problem, which tackles the open challenges left by the conventional approaches. In general, deep learning speech enhancement approaches are categorized into frequency-domain and time-domain approaches. In particular, we experiment with the Wave-U-Net model, a solid and superior time-domain approach to speech enhancement.
First, we attempt to improve the performance of back-end speech-based classification tasks in noisy conditions. In detail, we propose a pipeline that integrates the Wave-U-Net (later modified into the Dilated Encoder Wave-U-Net) as a pre-processing stage for noise elimination, followed by a temporal convolution network (TCN) for the intent classification task. Both models are trained independently of each other. The reported experimental results show that the modified Wave-U-Net model not only improves speech quality and intelligibility, measured in terms of the PESQ and STOI metrics, but also improves back-end classification accuracy. Later, we observed that the disjoint training approach often introduces signal distortion in the output of the speech enhancement module and can therefore deteriorate back-end performance. Motivated by this, we introduce a set of fully time-domain joint training pipelines that combine the Wave-U-Net model with the TCN intent classifier. The difference between these architectures lies in the interconnections between the front-end and the back-end. All architectures are trained with a loss function that combines the MSE loss for the front-end with the cross-entropy loss for the classification task. Based on our observations, we claim that the jointly trained architecture that balances both components' contributions equally yields better classification accuracy.
Lately, the release of large-scale pre-trained feature extraction models has considerably simplified the development of speech classification and recognition algorithms. However, environmental noise and reverberation still negatively affect performance, making robustness in noisy conditions mandatory in real-world applications. One way to mitigate the noise effect is to integrate a speech enhancement front-end that removes artifacts from the desired speech signals. Unlike state-of-the-art enhancement approaches that operate either on the speech spectrogram or directly on time-domain signals, we study how enhancement can be applied directly to the speech embeddings extracted using the Wav2Vec and WavLM models. We investigate a variety of training approaches, considering different flavors of joint and disjoint training of the speech enhancement front-end and of the classification/recognition back-end. We perform exhaustive experiments on the Fluent Speech Commands and Google Speech Commands datasets, contaminated with noises from the Microsoft Scalable Noisy Speech Dataset, as well as on LibriSpeech, contaminated with noises from the MUSAN dataset, considering intent classification, keyword spotting, and speech recognition tasks respectively. The results show that enhancing the speech embeddings is a viable and computationally effective approach, and provide insights into the most promising training approaches.
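The joint front-end/back-end objective described above (an MSE enhancement term plus a cross-entropy classification term) can be sketched as a weighted sum. The waveforms, logits, and alpha weighting below are illustrative placeholders, not the thesis's actual models or settings; the thesis studies how to balance the two terms' contributions.

```python
import numpy as np

def joint_loss(enhanced, clean, logits, label, alpha=0.5):
    """Weighted sum of an enhancement MSE term and a classification CE term."""
    mse = np.mean((enhanced - clean) ** 2)
    # Softmax cross-entropy for a single example, computed stably.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    ce = -log_probs[label]
    return float(alpha * mse + (1.0 - alpha) * ce)

rng = np.random.default_rng(4)
clean = rng.normal(size=1600)                    # toy clean waveform
enhanced = clean + 0.1 * rng.normal(size=1600)   # imperfectly enhanced output
logits = np.array([2.0, 0.5, -1.0])              # fake intent-classifier logits

loss = joint_loss(enhanced, clean, logits, label=0)
print(round(loss, 4))
```

Setting alpha to 0 or 1 recovers the two disjoint training extremes; the observation in the thesis is that an equal balance of the two terms gave the best classification accuracy.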