21 |
Advances in malware detection through deep learning: a software-behavior-based approach / Torres Mendoza, Manuel / 11 July 2024
Cybercriminals are constantly developing new techniques to evade the security measures deployed by experts and researchers, allowing malware to evolve rapidly. Moreover, detecting malware across different systems is challenging because each computing environment has its own unique characteristics. Traditional techniques, such as signature-based malware detection, have lost effectiveness and have largely been replaced by more modern approaches, including ML and behavior-based threat detection that is robust across multiple platforms. Researchers apply these techniques to diverse data sources, such as network traffic, binaries, and behavioral data, to extract relevant features and feed models that make accurate predictions. The goal of this research is to provide a new dataset composed of a substantial number of high-quality samples based on software behavior. Given the lack of a standard representation format for malware behavior in current research, we also present a novel method for representing malware behavior by converting API calls into 2D images. In addition, we propose and describe the implementation of a new ML model for binary classification (malware or benign software) using the aforementioned novel dataset, establishing a baseline for its evaluation. We have carried out extensive experiments, validating the proposed model both on the proposed dataset and on real-world data. In terms of metrics, it outperforms a well-known model that is also based on behavioral analysis and has a similar architecture.
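The abstract does not fix a specific encoding, but the core idea of rasterizing an API-call trace into a 2D image can be illustrated with a minimal sketch; the vocabulary, image size, and pixel mapping below are illustrative assumptions, not the thesis's actual scheme:

```python
import numpy as np

# Hypothetical vocabulary: each monitored API call gets an integer id.
API_VOCAB = {"NtCreateFile": 0, "NtWriteFile": 1, "RegSetValueExW": 2,
             "CreateRemoteThread": 3, "NtQuerySystemInformation": 4}

def api_trace_to_image(trace, side=32):
    """Map a sequence of API-call names onto a side x side grayscale image.

    Each call becomes one pixel: its position follows the call order
    (row-major) and its intensity encodes the call's vocabulary id.
    Traces longer than side*side are truncated; shorter ones stay zero-padded.
    """
    img = np.zeros((side, side), dtype=np.uint8)
    scale = 255 // max(len(API_VOCAB) - 1, 1)
    for i, call in enumerate(trace[: side * side]):
        row, col = divmod(i, side)
        img[row, col] = API_VOCAB.get(call, 0) * scale
    return img

trace = ["NtCreateFile", "NtWriteFile", "CreateRemoteThread"] * 5
image = api_trace_to_image(trace)  # ready to feed a 2D CNN classifier
```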
|
22 |
Multimodal Affective Computing Using Temporal Convolutional Neural Network and Deep Convolutional Neural Networks / Ayoub, Issa / 24 June 2019
Affective computing has gained significant attention from researchers in the last decade due to the wide variety of applications that can benefit from this technology. Often, researchers describe affect using emotional dimensions such as arousal and valence. Valence refers to the spectrum of negative to positive emotions, while arousal determines the level of excitement. Describing emotions through continuous dimensions (e.g. valence and arousal) allows us to encode subtle and complex affects, as opposed to discrete emotions such as the six basic ones: happiness, anger, fear, disgust, sadness and neutral.
Recognizing spontaneous and subtle emotions remains a challenging problem for computers. In our work, we employ two modalities of information: video and audio. Hence, we extract visual and audio features using deep neural network models. Given that emotions are time-dependent, we apply the Temporal Convolutional Neural Network (TCN) to model the variations in emotions. Additionally, we investigate an alternative model that combines a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). Because the latter deep model does not fit into main memory, we divide the RNN into smaller segments and propose a scheme to back-propagate gradients across all segments. We configure the hyperparameters of all models using Gaussian processes to obtain a fair comparison between the proposed models. Our results show that the TCN outperforms the RNN for the recognition of the arousal and valence emotional dimensions. Therefore, we propose the adoption of the TCN for emotion detection problems as a baseline method for future work. Our experimental results show that the TCN outperforms all RNN-based models, yielding a concordance correlation coefficient of 0.7895 (vs. 0.7544) on valence and 0.8207 (vs. 0.7357) on arousal on the validation set of the SEWA dataset for emotion prediction.
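The concordance correlation coefficient reported above has a closed form, so the metric itself is easy to reproduce; a small numpy sketch:

```python
import numpy as np

def concordance_cc(y_true, y_pred):
    """Concordance correlation coefficient (Lin, 1989):
    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2).
    It equals 1 only for perfect agreement, penalizing both low
    correlation and systematic bias, which is why it is preferred over
    Pearson's r for continuous valence/arousal prediction.
    """
    x, y = np.asarray(y_true, float), np.asarray(y_pred, float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * cov / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)
```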
|
23 |
Sparse Gaussian process approximations and applications / van der Wilk, Mark / January 2019
Many tasks in machine learning require learning some kind of input-output relation (function), for example, recognising handwritten digits (from image to number) or learning the motion behaviour of a dynamical system like a pendulum (from positions and velocities now to future positions and velocities). We consider this problem using the Bayesian framework, where we use probability distributions to represent the state of uncertainty that a learning agent is in. In particular, we will investigate methods which use Gaussian processes to represent distributions over functions. Gaussian process models require approximations in order to be practically useful. This thesis focuses on understanding existing approximations and investigating new ones tailored to specific applications.

We advance the understanding of existing techniques first through a thorough review. We propose desiderata for non-parametric basis function model approximations, which we use to assess the existing approximations. Following this, we perform an in-depth empirical investigation of two popular approximations (VFE and FITC). Based on the insights gained, we propose a new inter-domain Gaussian process approximation, which can be used to increase the sparsity of the approximation, in comparison to regular inducing point approximations. This allows GP models to be stored and communicated more compactly.

Next, we show that inter-domain approximations can also allow the use of models which would otherwise be impractical, as opposed to improving existing approximations. We introduce an inter-domain approximation for the Convolutional Gaussian process - a model that makes Gaussian processes suitable to image inputs, and which has strong relations to convolutional neural networks. This same technique is valuable for approximating Gaussian processes with more general invariance properties. Finally, we revisit the derivation of the Gaussian process State Space Model, and discuss some subtleties relating to their approximation. We hope that this thesis illustrates some benefits of non-parametric models and their approximation in a non-parametric fashion, and that it provides models and approximations that prove to be useful for the development of more complex and performant models in the future.
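As a rough illustration of the inducing-point sparsity discussed above (not the thesis's own code), the predictive mean of a DTC/VFE-style approximation can be written in a few lines of numpy; the kernel and hyperparameters are placeholder choices:

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix k(A, B)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def sparse_gp_mean(X, y, Z, Xstar, noise=0.1):
    """Predictive mean of an inducing-point (DTC/VFE-style) approximation.

    With M inducing inputs Z, cost is O(N M^2) instead of the O(N^3)
    of exact GP regression:
        mu(x*) = k(x*, Z) @ Sigma^{-1} @ K_ZX @ y / noise
        Sigma  = K_ZZ + K_ZX @ K_XZ / noise
    """
    Kzz = rbf(Z, Z) + 1e-8 * np.eye(len(Z))   # jitter for stability
    Kzx = rbf(Z, X)
    Sigma = Kzz + Kzx @ Kzx.T / noise
    return rbf(Xstar, Z) @ np.linalg.solve(Sigma, Kzx @ y) / noise

X = np.linspace(0, 10, 200)[:, None]
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(200)
Z = np.linspace(0, 10, 15)[:, None]           # 15 inducing points vs 200 data
mu = sparse_gp_mean(X, y, Z, X)
```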
|
24 |
Efficient Edge Intelligence in the Era of Big Data / Wong, Jun Hua / 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Smart wearables, an emerging paradigm for capturing vital big data, have been attracting intensive attention. However, one crucial problem is their high power consumption: continuous data streaming drains energy rapidly and requires the devices to be charged frequently. Targeting this obstacle, we propose to investigate the biodynamic patterns in the data and design a data-driven approach for intelligent data compression. We leverage Deep Learning (DL), more specifically a Convolutional Autoencoder (CAE), to learn a sparse representation of the vital big data. The minimized energy need, even taking the CAE-induced overhead into consideration, is tremendously lower than the original requirement. Further, compared with a state-of-the-art wavelet compression-based method, our method can compress the data with a dramatically lower error for a similar energy budget. Our experiments and the validated approach are expected to boost the energy efficiency of wearables and thus greatly advance ubiquitous big data applications in the era of smart health.
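A minimal sketch of the compression idea, assuming a 1-D convolutional autoencoder in PyTorch; the layer sizes and 4x compression ratio are illustrative, not the thesis configuration:

```python
import torch
import torch.nn as nn

class CAE1d(nn.Module):
    """Toy 1-D convolutional autoencoder for a vital-sign stream.

    The encoder maps a 256-sample window to a 2x32 latent code (a 4x
    reduction) -- the code is what the wearable would transmit; the
    decoder reconstructs the waveform on the receiving side.
    """
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 8, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(8, 4, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(4, 2, 5, stride=2, padding=2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(2, 4, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(4, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(8, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = CAE1d()
signal = torch.randn(1, 1, 256)               # one 256-sample window
recon = model(signal)
loss = nn.functional.mse_loss(recon, signal)  # reconstruction error
```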
In recent years, there has also been growing interest in edge intelligence for emerging instantaneous big data inference. However, inference algorithms, especially deep learning, usually have heavy computational requirements, which greatly limits their deployment on the edge. We take special interest in smart-health wearable big data mining and inference. Given deep learning's high computational complexity and large memory and energy requirements, new approaches are needed to make deep learning algorithms ultra-efficient for wearable big data analysis. We propose to leverage knowledge distillation to achieve an ultra-efficient, edge-deployable deep learning model. More specifically, by transferring knowledge from a teacher model to the on-edge student model, the soft target distribution of the teacher model can be effectively learned by the student model. In addition, we propose to introduce adversarial robustness to the student model by stimulating it to correctly identify inputs that have adversarial perturbations. Experiments demonstrate that the knowledge distillation student model has comparable performance to the heavy teacher model but a substantially smaller model size. With adversarial learning, the student model effectively preserves its robustness. In this way, we demonstrate that a framework combining knowledge distillation and adversarial learning can not only advance ultra-efficient edge inference, but also preserve robustness in the face of perturbed inputs.
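The soft-target transfer described above follows the standard temperature-scaled distillation recipe; a hedged PyTorch sketch, where the temperature and mixing weight are assumed values rather than the thesis's settings:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T=4.0, alpha=0.7):
    """Hinton-style knowledge distillation objective.

    Blends (a) KL divergence between the temperature-softened teacher
    and student distributions -- the 'soft targets' the student learns
    from -- and (b) ordinary cross-entropy on the hard labels.
    The T*T factor keeps gradient magnitudes comparable across T.
    """
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Example: 8 samples, 10 classes
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)            # frozen teacher outputs
y = torch.randint(0, 10, (8,))
loss = distillation_loss(s, t, y)
loss.backward()
```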
|
25 |
Implementation of Parallel and Serial Concatenated Convolutional Codes / Wu, Yufei / 27 April 2000
Parallel concatenated convolutional codes (PCCCs), called "turbo codes" by their discoverers, have been shown to perform close to the Shannon bound at bit error rates (BERs) between 1e-4 and 1e-6. Serial concatenated convolutional codes (SCCCs), which perform better than PCCCs at BERs lower than 1e-6, were developed borrowing the same principles as PCCCs, including code concatenation, pseudorandom interleaving and iterative decoding.
The first part of this dissertation introduces the fundamentals of concatenated convolutional codes. The theoretical and simulated BER performance of PCCC and SCCC are discussed. Encoding and decoding structures are explained, with emphasis on the Log-MAP decoding algorithm and the general soft-input soft-output (SISO) decoding module. Sliding window techniques, which can be employed to reduce memory requirements, are also briefly discussed.
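At the heart of the Log-MAP algorithm mentioned above is the max* (Jacobian logarithm) operation, whose correction term is exactly what the Max-Log-MAP variant drops and what fixed-point implementations typically approximate with a small lookup table; a short sketch:

```python
import math

def max_star(a, b):
    """Jacobian logarithm used throughout Log-MAP decoding:
    max*(a, b) = ln(e^a + e^b) = max(a, b) + ln(1 + e^-|a-b|).
    """
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

# Exact log-sum-exp vs. the Max-Log-MAP shortcut:
a, b = 1.2, 0.9
print(max_star(a, b))          # ~1.754 (exact)
print(max(a, b))               # 1.2   (Max-Log-MAP approximation)
```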
The second part of this dissertation presents four major contributions to the field of concatenated convolutional coding developed through this research. First, the effects of quantization and fixed-point arithmetic on decoding performance are studied. Analytic bounds and modular renormalization techniques are developed to improve the efficiency of SISO module implementation without compromising performance. Second, a new stopping criterion, SDR, is introduced; compared with existing criteria on complexity and performance, it performs well at the lowest cost. Third, a new type-II code-combining automatic repeat request (ARQ) technique is introduced which makes use of the related PCCC and SCCC. Fourth, a new code-assisted synchronization technique is presented, which uses a list approach to leverage the simplicity of the correlation technique and the soft information of the decoder. In particular, the variant that uses the SDR criterion achieves superb performance with low complexity.
Finally, the third part of this dissertation discusses the FPGA-based implementation of the turbo decoder, which is the fruit of cooperation with fellow researchers. / Ph. D.
|
26 |
VITERBI DECODER FOR NASA'S SPACE SHUTTLE'S TELEMETRY DATA / Mayer, Robert, McDaniels, James, Kalil, Lou F. / October 1992
International Telemetering Conference Proceedings / October 26-29, 1992 / Town and Country Hotel and Convention Center, San Diego, California / In the event of a NASA Space Shuttle mission landing at the White Sands Missile Range, White Sands, New Mexico, a data communications system for processing the Shuttle's telemetry data has been installed there in the Master Control Telemetry Station, JIG-56. This data system required a Viterbi decoder, since the Shuttle's data is convolutionally encoded. However, the Shuttle uses a nonstandard code, and the manufacturer that in the past provided decoders for Shuttle support no longer produces them. Since no other company produced a Viterbi decoder designed to decode the Shuttle's data, it was necessary to develop the required decoder.

The purpose of this paper is to describe the functional performance requirements and design of this decoder.
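As an illustration of what such a decoder computes (using a textbook rate-1/2, constraint-length-3 code with generators 7, 5 octal, not the Shuttle's nonstandard code), a minimal hard-decision Viterbi decoder might look like this:

```python
import itertools

G = [0b111, 0b101]         # generators 7, 5 (octal) -- textbook code
K = 3                      # constraint length; 2**(K-1) = 4 trellis states

def encode(bits):
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        out += [bin(reg & g).count("1") & 1 for g in G]
        state = reg >> 1
    return out

def viterbi_decode(received):
    """Hard-decision Viterbi decoding: per trellis state, keep the
    minimum-Hamming-distance survivor path."""
    n_states = 1 << (K - 1)
    INF = float("inf")
    metric = [0.0] + [INF] * (n_states - 1)
    paths = [[] for _ in range(n_states)]
    for i in range(0, len(received), len(G)):
        seg = received[i:i + len(G)]
        new_metric = [INF] * n_states
        new_paths = [None] * n_states
        for state, b in itertools.product(range(n_states), (0, 1)):
            if metric[state] == INF:
                continue
            reg = (b << (K - 1)) | state
            expect = [bin(reg & g).count("1") & 1 for g in G]
            cost = metric[state] + sum(r != e for r, e in zip(seg, expect))
            nxt = reg >> 1
            if cost < new_metric[nxt]:
                new_metric[nxt] = cost
                new_paths[nxt] = paths[state] + [b]
        metric, paths = new_metric, new_paths
    return paths[metric.index(min(metric))]

msg = [1, 0, 1, 1, 0, 0]             # last K-1 zeros flush the trellis
noisy = encode(msg)
noisy[3] ^= 1                        # inject one channel bit error
assert viterbi_decode(noisy) == msg  # error corrected
```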
|
27 |
Face Recognition with Preprocessing and Neural Networks / Habrman, David / January 2016
Face recognition is the problem of identifying individuals in images. This thesis evaluates two methods used to determine whether pairs of face images belong to the same individual or not. The first method is a combination of principal component analysis and a neural network, and the second method is based on state-of-the-art convolutional neural networks. They are trained and evaluated using two different data sets. The first set contains many images with large variations in, for example, illumination and facial expression. The second consists of fewer images with small variations. Principal component analysis allowed the use of smaller networks. The largest network has 1.7 million parameters, compared to the 7 million used in the convolutional network. The use of smaller networks lowered the training and evaluation times significantly. Principal component analysis proved to be well suited for the data set with small variations, outperforming the convolutional network, which needs larger data sets to avoid overfitting. The reduction in data dimensionality, however, led to difficulties classifying the data set with large variations. The generous amount of images in this set allowed the convolutional method to reach higher accuracies than the principal component method.
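The PCA preprocessing step that enables the smaller networks can be sketched in a few lines of numpy; the 100-component size below is an assumed value, not the thesis's choice:

```python
import numpy as np

def fit_pca(faces, n_components=100):
    """Learn a PCA basis ('eigenfaces') from flattened face images.

    faces: (n_samples, n_pixels) matrix. Projecting onto the top
    principal components shrinks each image to n_components numbers,
    which is what allows the downstream network to be small.
    """
    mean = faces.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(image, mean, components):
    return components @ (image - mean)

faces = np.random.rand(500, 64 * 64)        # stand-in for a face dataset
mean, comps = fit_pca(faces, n_components=100)
code = project(faces[0], mean, comps)       # 4096 pixels -> 100 features
```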
|
28 |
Situated face detection / Espinosa-Romero, Arturo / January 2001
In the last twenty years, important advances have been made in the field of automatic face processing, given the importance of human faces for personal identification, emotional expression and verbal and non-verbal communication. The very first step in a face processing algorithm is the detection of faces; while this is a trivial problem in controlled environments, the detection of faces in real environments is still a challenging task. Until now, the most successful approaches for face detection represent the face as a grey-level pattern, and the problem itself is considered as the classification between "face" and "non-face" patterns. Satisfactory results have been achieved in this area. The main disadvantage is that an exhaustive search has to be done on each image in order to locate the faces. This search normally involves testing every single position on the image at different scales, and although this does not represent an important drawback in off-line face processing systems, in those cases where a real-time response is needed it is still a problem. In the different proposed methods for face detection, the "observer" is a disembodied entity which holds no relationship with the observed scene.

This thesis presents a framework for the efficient location of faces in real scenes in which, by considering both the observer as situated in the world and the relationships that hold between the two, a set of constraints on the search space can be defined. The constraints rely on two main assumptions: first, the observer can purposively interact with the world (i.e. change its position relative to the observed scene), and second, the camera is fully calibrated. The first source of constraint is the structural information about the observer's environment, represented as a depth map of the scene in front of the camera. From this representation the search space can be constrained in terms of the range of scales at which a face might be found at different positions in the image. The second source of constraint is the geometrical relationship between the camera and the scene, which allows us to project a model of the subject into the scene in order to eliminate those areas where faces are unlikely to be found.

In order to test the proposed framework, a system based on the premises stated above was constructed. It is based on three different modules: a face/non-face classifier, a depth estimation module and a search module. The classifier is composed of a set of convolutional neural networks (CNNs) trained to differentiate between face and non-face patterns; the depth estimation module uses a multilevel algorithm to compute the scene depth map from a sequence of captured images; and the search module projects the depth information and the subject model into the image where the search will be performed, in order to constrain the search space. Finally, the proposed system was validated by running a set of experiments on the individual modules and then on the whole system.
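The first constraint, deriving an admissible face scale at each image position from the depth map, follows from the pinhole camera model; a small sketch with assumed calibration values (not the thesis's):

```python
import numpy as np

# Assumed pinhole-camera quantities (illustrative, not the thesis
# calibration): focal length in pixels and a nominal face width.
FOCAL_PX = 800.0
FACE_WIDTH_M = 0.15

def admissible_scales(depth_map, detector_window=20, tol=0.25):
    """For each pixel, derive the face size (in pixels) implied by the
    scene depth: size = f * face_width / depth. A sliding-window
    detector then only needs to test scales within +/- tol of this
    prediction, instead of every scale at every position.
    """
    expected_px = FOCAL_PX * FACE_WIDTH_M / np.maximum(depth_map, 1e-3)
    scale = expected_px / detector_window        # window resize factor
    return scale * (1 - tol), scale * (1 + tol)  # per-pixel scale range

depth = np.full((240, 320), 2.0)                 # flat scene 2 m away
lo, hi = admissible_scales(depth)
# At 2 m a face spans ~60 px, so only scales near 3x the 20 px window
# need to be searched at those positions.
```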
|
29 |
Construction of ternary convolutional codes / Ansari, Muhammad Khizar / 14 August 2019
Error control coding is employed in modern communication systems to reliably transfer data through noisy channels. Convolutional codes are widely used for this purpose because they are easy to encode and decode and so have been employed in numerous communication systems. The focus of this thesis is a search for new and better ternary convolutional codes with large free distance so more errors can be detected and corrected. An algorithm is developed to obtain ternary convolutional codes (TCCs) with the best possible free distance. Tables are given of binary and ternary convolutional codes with the best free distance for rate 1/2 with encoder memory up to 14, rate 1/3 with encoder memory up to 9 and rate 1/4 with encoder memory up to 8. / Graduate
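As a rough sketch of the objects being searched over (not the thesis's search algorithm), a rate-1/n ternary convolutional encoder and a naive free-distance bound can be written as follows; the generator taps are illustrative:

```python
import itertools

def ternary_conv_encode(info, gens):
    """Encode a GF(3) information sequence with a rate-1/n ternary
    convolutional code. Each generator is a list of tap coefficients
    over GF(3); output symbol = sum(tap * input) mod 3.
    """
    m = len(gens[0]) - 1                       # encoder memory
    padded = list(info) + [0] * m              # flush the register
    out = []
    for i in range(len(padded)):
        window = [(padded[i - j] if i - j >= 0 else 0)
                  for j in range(m + 1)]
        out += [sum(g * w for g, w in zip(gen, window)) % 3
                for gen in gens]
    return out

def free_distance_bound(gens, max_len=6):
    """Naive upper bound on free distance: minimum Hamming weight over
    codewords produced by all nonzero inputs of up to max_len symbols.
    (An exact value on larger codes needs a proper trellis search.)
    """
    best = float("inf")
    for L in range(1, max_len + 1):
        for info in itertools.product(range(3), repeat=L):
            if info[0] == 0:
                continue                       # skip duplicate shifts
            w = sum(s != 0 for s in ternary_conv_encode(info, gens))
            best = min(best, w)
    return best

gens = [[1, 0, 2], [1, 1, 1]]   # illustrative rate-1/2, memory-2 TCC
print(free_distance_bound(gens))
```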
|
30 |
vU-net: edge detection in time-lapse fluorescence live cell images based on convolutional neural networks / Zhang, Xitong / 23 April 2018
Time-lapse fluorescence live cell imaging has been widely used to study various dynamic processes in cell biology. As the initial step of image analysis, it is important to localize and segment cell edges with high accuracy. However, fluorescence live-cell images usually suffer from low contrast, noise, and uneven illumination in comparison to immunofluorescence images. Deep convolutional neural networks, which learn features directly from training images, have been successfully applied to natural image analysis problems. However, the limited number of training samples prevents their routine application in fluorescence live-cell image analysis. In this thesis, by exploiting the temporal coherence in time-lapse movies together with the VGG-16 [1] pre-trained model, we demonstrate that we can train a deep neural network using a limited number of image frames to segment entire time-lapse movies. We propose a novel framework, vU-net, which combines the advantages of VGG-16 [1] in feature extraction and U-net [2] in feature reconstruction. Moreover, we design an auxiliary convolutional block at the end of the architecture to enhance edge detection. We evaluate our framework using the Dice coefficient and the distance between the predicted edge and the ground truth on high-resolution image datasets of an adhesion marker, paxillin, acquired by a Total Internal Reflection Fluorescence (TIRF) microscope. Our results demonstrate that, on difficult datasets: (i) the testing Dice coefficient of vU-net is 3.2% higher than that of U-net with the same amount of training images; (ii) vU-net can achieve the best prediction results of U-net with one third of the training images needed by U-net; and (iii) vU-net produces more robust predictions than U-net. Therefore, vU-net can be applied more practically to challenging live cell movies than U-net, since it requires a small training set and achieves accurate segmentation.
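The Dice coefficient used for evaluation has a direct definition; a small numpy sketch with a toy pair of edge masks:

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks:
    Dice = 2 * |A intersect B| / (|A| + |B|); 1.0 means perfect overlap.
    """
    pred, target = pred.astype(bool), target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0                  # both masks empty: define as perfect
    return 2.0 * np.logical_and(pred, target).sum() / denom

pred = np.zeros((64, 64), dtype=np.uint8); pred[30:34, :] = 1
gt = np.zeros((64, 64), dtype=np.uint8);   gt[31:35, :] = 1
print(dice_coefficient(pred, gt))   # 0.75: 3 of 4 edge rows overlap
```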
|