Global ETD Search

441	Local Part Model for Action Recognition in Realistic Videos Shi, Feng January 2014 (has links) This thesis presents a framework for automatic recognition of human actions in uncontrolled, realistic video data such as movies, internet and surveillance videos. The human action recognition problem is approached from the perspective of local spatio-temporal features and a bag-of-features representation. The bag-of-features model contains only statistics of unordered low-level primitives, so any information concerning temporal ordering and spatial structure is lost. To address this issue, we proposed a novel multiscale local part model that maintains both the structure information and the ordering of local events for action recognition. The method includes both a coarse primitive-level root feature covering event-content statistics and higher-resolution overlapping part features incorporating local structure and temporal relationships. To extract the local spatio-temporal features, we investigated a random sampling strategy for efficient action recognition. We also introduced the idea of using very high sampling density for efficient and accurate classification. We further explored the potential of the method with the joint optimization of two constraints: classification accuracy and efficiency. On the performance side, we proposed a new local descriptor, called GBH, based on spatial and temporal gradients. It significantly improved on the performance of the purely spatial gradient-based HOG descriptor for action recognition while preserving high computational efficiency. We have also shown that the performance of the state-of-the-art MBH descriptor can be improved with a discontinuity-preserving optical flow algorithm. In addition, a new method based on the histogram intersection kernel was introduced to combine multiple channels of different descriptors. 
This method has the advantages of improving recognition accuracy with multiple descriptors and speeding up the classification process. On the efficiency side, we applied PCA to reduce the feature dimension which resulted in fast bag-of-features matching. We also evaluated the FLANN method on real-time action recognition. We conducted extensive experiments on real-world videos from challenging public action datasets. We showed that our methods achieved the state-of-the-art with real-time computational potential, thus highlighting the effectiveness and efficiency of the proposed methods. bag-of-features (BoF) action recognition random sampling local part model multi-channel SVM
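As an illustration for entry 441, the histogram intersection kernel can be sketched in a few lines. The abstract does not say how the descriptor channels are combined, so the unweighted sum over channels below is an assumption, and the function and channel names are hypothetical.

```python
from typing import Dict, Sequence


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Histogram intersection kernel: K(h1, h2) = sum_i min(h1[i], h2[i])."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def multi_channel_kernel(
    x: Dict[str, Sequence[float]], y: Dict[str, Sequence[float]]
) -> float:
    """Combine descriptor channels by summing per-channel intersections.

    Assumption: an unweighted sum; the thesis may weight or normalise
    the channels differently.
    """
    if x.keys() != y.keys():
        raise ValueError("both samples must provide the same channels")
    return sum(histogram_intersection(x[c], y[c]) for c in x)
```

Each sample is a mapping from a channel name (for example a HOG or MBH bag-of-features histogram) to its bin values; the resulting kernel value could then be fed to a kernel SVM.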
442	Možnosti srovnávání obrázků v mobiních aplikacích / Possibilities of image comparison in mobile applications Jírů, Michaela January 2015 (has links) This thesis deals with methods of image comparison. The goal is to create a mobile app that allows the user to compare images in real time. The first part covers the theoretical basis, in particular image similarity algorithms. The practical part describes the app implementation, including use case analysis, user interface design and functional requirements. It is followed by source code samples and a description of the frameworks used. The last part tests the implemented algorithms for speed and precision.
443	Predicting "Essential" Genes in Microbial Genomes: A Machine Learning Approach to Knowledge Discovery in Microbial Genomic Data Palaniappan, Krishnaveni 01 January 2010 (has links) Essential genes constitute the minimal gene set of an organism that is indispensable for its survival under the most favorable conditions. The problem of accurately identifying and predicting genes essential for the survival of an organism has both theoretical and practical relevance in genome biology and medicine. From a theoretical perspective, it provides insight into the minimal requirements for cellular life and plays a key role in the emerging field of synthetic biology; from a practical perspective, it facilitates efficient identification of potential drug targets (e.g., for antibiotics) in novel pathogens. However, characterizing the essential genes of an organism requires sophisticated experimental studies that are expensive and time consuming. The goal of this research study was to investigate machine learning methods to accurately classify/predict "essential genes" in newly sequenced microbial genomes based solely on their genomic sequence data. This study formulates the prediction of essential genes as a binary classification problem and systematically investigates the applicability of three different supervised classification methods for this task. In particular, Decision Tree (DT), Support Vector Machine (SVM), and Artificial Neural Network (ANN) based classifier models were constructed and trained on genomic features derived solely from gene sequence data of 14 experimentally validated microbial genomes whose essential genes are known. A set of 52 relevant genomic sequence derived features (including gene and protein sequence features, protein physico-chemical features and protein sub-cellular features) was used as input for the learners to learn the classifier models. The training and test datasets used in this study reflected between-class imbalance (i.e. 
skewed majority class vs. minority class) that is intrinsic to this data domain and essential genes prediction problem. Two imbalance reduction techniques (homology reduction and random under sampling of 50% of the majority class) were devised without artificially balancing the datasets and compromising classifier generalizability. The classifier models were trained and evaluated using 10-fold stratified cross validation strategy on both the full multi-genome datasets and its class imbalance reduced variants to assess their predictive ability of discriminating essential genes from non-essential genes. In addition, the classifiers were also evaluated using a novel blind testing strategy, called LOGO (Leave-One-Genome-Out) and LOTO (Leave-One-Taxon group-Out) tests on carefully constructed held-out datasets (both genome-wise (LOGO) and taxonomic group-wise (LOTO)) that were not used in training of the classifier models. Prediction performance metrics, accuracy, sensitivity, specificity, precision and area under the Receiver Operating Characteristics (AU-ROC) were assessed for DT, SVM and ANN derived models. Empirical results from 10 X 10-fold stratified cross validation, Leave-One-Genome-Out (LOGO) and Leave-One-Taxon group-Out (LOTO) blind testing experiments indicate SVM and ANN based models perform better than Decision Tree based models. On 10 X 10-fold cross validations, the SVM based models achieved an AU-ROC score of 0.80, while ANN and DT achieved 0.79 and 0.68 respectively. Both LOGO (genome-wise) and LOTO (taxonwise) blind tests revealed the generalization extent of these classifiers across different genomes and taxonomic orders. This study empirically demonstrated the merits of applying machine learning methods to predict essential genes in microbial genomes by using only gene sequence and features derived from it. 
It also demonstrated that it is possible to predict essential genes based on features derived from gene sequence without using homology information. LOGO and LOTO blind test results reveal that the trained classifiers do generalize across genomes and taxonomic boundaries and provide a first critical estimate of predictive performance on microbial genomes. Overall, this study provides a systematic assessment of applying DT, ANN and SVM to this prediction problem. An important potential application of this study will be to integrate the resultant predictive model/approach as a genome annotation pipeline method for comparative microbial genome and metagenome analysis resources such as the Integrated Microbial Genome Systems (IMG and IMG/M). Computational biology Essential Genes Genomic features Machine learning Microbial genomes Supervised learning Computer Sciences
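The Leave-One-Genome-Out (LOGO) protocol in entry 443 amounts to grouped cross-validation: every gene from one genome is held out for testing while the classifier is trained on genes from all other genomes. A minimal sketch of the split generation follows; the function name and the genome labels in the usage note are hypothetical, and the same routine would give LOTO splits if the labels were taxonomic groups instead of genomes.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices) for each group.

    groups[i] is the group label (e.g. source genome) of sample i.
    Groups are visited in sorted order so the splits are reproducible.
    """
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test
```

For example, with the made-up labels `["genome_a", "genome_a", "genome_b"]`, the generator produces two splits, one per genome, and no gene index ever appears in both the training and test side of the same split.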
444	Perceived features and similarity of images: An investigation into their relationships and a test of Tversky's contrast model. Rorissa, Abebe 05 1900 (has links) The creation, storage, manipulation, and transmission of images have become less costly and more efficient. Consequently, the numbers of images and their users are growing rapidly. This poses challenges to those who organize and provide access to them. One of these challenges is similarity matching. Most current content-based image retrieval (CBIR) systems which can extract only low-level visual features such as color, shape, and texture, use similarity measures based on geometric models of similarity. However, most human similarity judgment data violate the metric axioms of these models. Tversky's (1977) contrast model, which defines similarity as a feature contrast task and equates the degree of similarity of two stimuli to a linear combination of their common and distinctive features, explains human similarity judgments much better than the geometric models. This study tested the contrast model as a conceptual framework to investigate the nature of the relationships between features and similarity of images as perceived by human judges. Data were collected from 150 participants who performed two tasks: an image description and a similarity judgment task. Qualitative methods (content analysis) and quantitative (correlational) methods were used to seek answers to four research questions related to the relationships between common and distinctive features and similarity judgments of images as well as measures of their common and distinctive features. Structural equation modeling, correlation analysis, and regression analysis confirmed the relationships between perceived features and similarity of objects hypothesized by Tversky (1977). 
Tversky's (1977) contrast model, based upon a combination of two methods for measuring common and distinctive features and two methods for measuring similarity, produced statistically significant structural coefficients between the independent latent variables (common and distinctive features) and the dependent latent variable (similarity). This model fit the data well for a sample of 30 (435 pairs of) images and 150 participants (χ² = 16.97, df = 10, p = .07508, RMSEA = .040, SRMR = .0205, GFI = .990, AGFI = .965). The goodness of fit indices showed the model did not significantly deviate from the actual sample data. This study is the first to test the contrast model in the context of information representation and retrieval. It is hoped that the results will provide a foundation for future research that further tests the contrast model, and will assist designers of image organization and retrieval systems by pointing toward alternative document representations and similarity measures that more closely match human similarity judgments. Image files. Information retrieval. image representation and retrieval perceived similarity image features contrast model
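The contrast model tested in entry 444 scores similarity as a linear combination of common and distinctive features: S(a, b) = θ·f(A ∩ B) − α·f(A − B) − β·f(B − A). The sketch below assumes that f is simply the number of features in a set and uses illustrative weights; the study's own feature measures and coefficients are not given in the abstract.

```python
from typing import Set


def tversky_contrast(
    a: Set[str],
    b: Set[str],
    theta: float = 1.0,
    alpha: float = 0.5,
    beta: float = 0.5,
) -> float:
    """Contrast-model similarity of feature sets a and b.

    Assumption: the salience function f is set cardinality, and the
    default weights are illustrative only.
    """
    common = len(a & b)       # features shared by both stimuli
    only_a = len(a - b)       # features distinctive to a
    only_b = len(b - a)       # features distinctive to b
    return theta * common - alpha * only_a - beta * only_b
```

When alpha and beta differ, the score is asymmetric, so the similarity of a to b need not equal the similarity of b to a. This is one way the model departs from the metric axioms of geometric similarity measures.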
445	Machine Learning Methods for Visual Object Detection / Apprentissage machine pour la détection des objets Hussain, Sabit ul 07 December 2011 (has links) Le but de cette thèse est de développer des méthodes pratiques plus performantes pour la détection d'instances de classes d'objets de la vie quotidienne dans les images. Nous présentons une famille de détecteurs qui incorporent trois types d'indices visuelles performantes – histogrammes de gradients orientés (Histograms of Oriented Gradients, HOG), motifs locaux binaires (Local Binary Patterns, LBP) et motifs locaux ternaires (Local Ternary Patterns, LTP) – dans des méthodes de discrimination efficaces de type machine à vecteur de support latent (Latent SVM), sous deux régimes de réduction de dimension – moindres carrées partielles (Partial Least Squares, PLS) et sélection de variables par élagage de poids SVM (SVM Weight Truncation). Sur plusieurs jeux de données importantes, notamment ceux du PASCAL VOC2006 et VOC2007, INRIA Person et ETH Zurich, nous démontrons que nos méthodes améliorent l'état de l'art du domaine. Nos contributions principales sont : – Nous étudions l'indice visuelle LTP pour la détection d'objets. Nous démontrons que sa performance est globalement mieux que celle des indices bien établies HOG et LBP parce qu'elle permet d'encoder à la fois la texture locale de l'objet et sa forme globale, tout en étant résistante aux variations d'éclairage. Grâce à ces atouts, LTP fonctionne aussi bien pour les classes qui sont caractérisées principalement par leurs structures que pour celles qui sont caractérisées par leurs textures. En plus, nous démontrons que les indices HOG, LBP et LTP sont bien complémentaires, de sorte qu'un jeux d'indices étendu qui intègre tous les trois améliore encore la performance. 
– Les jeux d'indices visuelles performantes étant de dimension assez élevée, nous proposons deux méthodes de réduction de dimension afin d'améliorer leur vitesse et réduire leur utilisation de mémoire. La première, basée sur la projection moindres carrés partielles, diminue significativement le temps de formation des détecteurs linéaires, sans réduction de précision ni perte de vitesse d'exécution. La seconde, fondée sur la sélection de variables par l'élagage des poids du SVM, nous permet de réduire le nombre d'indices actives par un ordre de grandeur avec une réduction minime, voire même une petite augmentation, de la précision du détecteur. Malgré sa simplicité, cette méthode de sélection de variables surpasse toutes les autres approches que nous avons mis à l'essai. – Enfin, nous décrivons notre travail en cours sur une nouvelle variété d'indice visuelle – les « motifs locaux quantifiées » (Local Quantized Patterns, LQP). LQP généralise les indices existantes LBP / LTP en introduisant une étape de quantification vectorielle – ce qui permet une souplesse et une puissance analogue aux celles des approches de reconnaissance visuelle « sac de mots », qui sont basées sur la quantification des régions locales d'image considérablement plus grandes – sans perdre la simplicité et la rapidité qui caractérisent les approches motifs locales actuelles parce que les résultats de la quantification puissent être pré-compilés et stockés dans un tableau. LQP permet une augmentation considérable de la taille du support local de l'indice, et donc de sa puissance discriminatoire. Nos expériences indiquent qu'elle a la meilleure performance de toutes les indices visuelles testés, y compris HOG, LBP et LTP. / The goal of this thesis is to develop better practical methods for detecting common object classes in real world images. 
We present a family of object detectors that combine Histogram of Oriented Gradient (HOG), Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) features with efficient Latent SVM classifiers and effective dimensionality reduction and sparsification schemes to give state-of-the-art performance on several important datasets including PASCAL VOC2006 and VOC2007, INRIA Person and ETHZ. The three main contributions are as follows. Firstly, we pioneer the use of Local Ternary Pattern features for object detection, showing that LTP gives better overall performance than HOG and LBP, because it captures both rich local texture and object shape information while being resistant to variations in lighting conditions. It thus works well both for classes that are recognized mainly by their structure and ones that are recognized mainly by their textures. We also show that HOG, LBP and LTP complement one another, so that an extended feature set that incorporates all three of them gives further improvements in performance. Secondly, in order to tackle the speed and memory usage problems associated with high-dimensional modern feature sets, we propose two effective dimensionality reduction techniques. The first, feature projection using Partial Least Squares, allows detectors to be trained more rapidly with negligible loss of accuracy and no loss of run time speed for linear detectors. The second, feature selection using SVM weight truncation, allows active feature sets to be reduced in size by almost an order of magnitude with little or no loss, and often a small gain, in detector accuracy. Despite its simplicity, this feature selection scheme outperforms all of the other sparsity enforcing methods that we have tested. 
Lastly, we describe work in progress on Local Quantized Patterns (LQP), a generalized form of local pattern features that uses lookup table based vector quantization to provide local pattern style pixel neighbourhood codings that have the speed of LBP/LTP and some of the flexibility and power of traditional visual word representations. Our experiments show that LQP outperforms all of the other feature sets tested including HOG, LBP and LTP. Détection des objets Apprentissage automa Indices visuelles Object detection Machine learning methods Features Sparsity
446	Articulation modelling of vowels in dysarthric and non-dysarthric speech Albalkhi, Rahaf 25 May 2020 (has links) People with motor function disorders that cause dysarthric speech find it difficult to use state-of-the-art automatic speech recognition (ASR) systems. These systems are developed based on non-dysarthric speech models, which explains their poor performance when used by individuals with dysarthria. Thus, a solution is needed to compensate for the poor performance of these systems. This thesis examines the possibility of quantifying vowels of dysarthric and non-dysarthric speech into codewords, regardless of inter-speaker variability, in a way that can be implemented on machines with limited processing capability. I show that it is possible to model all possible vowels and vowel-like sounds that a North American speaker can produce if the frequencies of the first and second formants are used to encode these sounds. The proposed solution is aligned with the use of neural networks and hidden Markov models to build an acoustic model in conventional ASR systems. A secondary finding of this study is the feasibility of reducing the set of the ten most common vowels in North American English to only eight vowels. / Graduate / 2021-05-11 Dysarthric Speech Recognition Articulation Modelling Acoustic Model Automatic Speech Recognition ASR Articulatory Features
447	Quantitative Correlation Analysis of Motor and Dysphonia Features of Parkinsons Disease Koduri, Balaram 05 1900 (has links) The research reported here deals with the early characterization of Parkinson’s disease (PD), the second most common degenerative disease of the human motor system after Alzheimer’s. PD results from the death of dopaminergic neurons in the substantia nigra region of the brain. Its occurrence is highly correlated with the aging population whose numbers increase with the healthcare benefits of a longer life. Observation of motor control symptoms associated with PD, such as gait and speech analysis, is most often used to evaluate, detect, and diagnose PD. Since speech and some delicate motor functions have provided early detection signs of PD, reliable analysis of these features is a promising objective diagnostic technique for early intervention with any remedial measures. We implement and study here three PD diagnostic methods and their correlation between each other’s results and with the motor functions in subjects diagnosed with and without PD. One initial test documented well in the literature deals with feature analysis of voice during phonation to determine dysphonia measures. Features of the motor function of two fingers were extracted in tests titled “Motor function of alternating finger tapping on a computer keyboard” and “Motor function of the index and thumb finger tapping with an accelerometer”, that we objectively scripted. The voice dysphonia measures were extracted using various software packages like PRAAT, Wavesurfer, and Matlab. In the initial test, several robust feature selection algorithms were used to obtain an optimally selected subset of features. We were able to program distance classifiers, support vector machine (SVM), and hierarchical clustering discrimination approaches for the dichotomous identification of non-PD control subjects and people with Parkinson’s (PWP). 
Validation tests were implemented to verify the accuracy of the classification processes. We determined the extent of functional agreement between voice and motor functions by correlating test results. Parkinson's disease correlation dysphonia features Parkinson's disease -- Diagnosis. Motor ability. Speech disorders.
448	Neural Network Based Automatic Essay Scoring for Swedish / Neurala nätverk för automatisk bedömning av uppsatser i nationella prov i svenska Ruan, Rex Dajun January 2020 (has links) This master's thesis presents a novel method of automatic essay scoring for Swedish national tests written by upper secondary school students, deploying neural network architectures and linguistic feature extraction in the framework of Swegram. Four kinds of linguistic aspects are involved in our feature extraction: count-based, lexical, morphological and syntactic. One of three variants of recurrent network (vanilla RNN, GRU or LSTM), together with a specific model parameter setting, is implemented in the Automatic Essay Scoring (AES) modelling, with extracted features measuring linguistic complexity as the text representation. The AES model is evaluated through inter-rater agreement with the human-assigned grade as target label, in terms of quadratic weighted kappa (QWK) and exact percent agreement. The best observed averaged QWK and averaged exact percent agreement among all the models we experimented with are 0.50 and 52% over 10 folds. Automatic Essay Scoring Swedish Linguistic Features Machine Learning
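Quadratic weighted kappa, the agreement metric reported in entry 448, can be computed from the confusion matrix of two raters as 1 − Σ w·O / Σ w·E, with weights w = (i − j)² / (k − 1)² over k ordered grade levels. The sketch below assumes grades are already encoded as integers 0..k−1; the function name and that encoding are illustrative and not taken from the thesis.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    rater_a: Sequence[int], rater_b: Sequence[int], num_labels: int
) -> float:
    """Quadratic weighted kappa for two equal-length lists of integer grades."""
    if num_labels < 2:
        raise ValueError("need at least two grade levels")
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    k = num_labels
    # Observed confusion matrix O[i][j]: rater A gave i, rater B gave j.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = 0.0
    den = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            num += weight * observed[i][j]
            # Expected count under independent raters with the same marginals.
            den += weight * hist_a[i] * hist_b[j] / n
    if den == 0.0:
        # Both raters used one identical grade throughout: perfect agreement.
        return 1.0
    return 1.0 - num / den
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. Because the weights grow with the squared distance between grades, a prediction that misses by two grade levels is penalised four times as heavily as one that misses by one.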
449	Simplification of CAD Geometries to perform CEM Simulations Bashir, Farrukh January 2021 (has links) The purpose of this study is to investigate small features in CAD geometries that have a high influence on computational time and memory consumption during electromagnetic simulations using CAE software. Computational Electromagnetics (CEM) simulations of complex CAD geometries such as passenger cars often crash or remain incomplete because of a large number of irrelevant details. The key to eliminating these limitations is to remove irrelevant details from the CAD geometries within an acceptable margin of accuracy. A maximum error of 5% in accuracy is accepted in the CEM simulations. The CAD geometry is transferred into CAE geometry. The geometry is modified by removing irrelevant details to obtain an optimal mesh and to improve the computational time and memory consumption of simulations, using the ANSA and HYPERMESH software. Finally, electromagnetic simulations are run on the original and simplified CAD models in CST and in COMSOL. Only the magnetic flux density distribution across the modified and unmodified CAD models, sampled with 3D cut points at different coordinate positions, is analysed. The results are compared with the mesh quality in terms of accuracy and reduction in computational time and memory. Features in CAD geometries are identified that can be removed to reduce computational time and memory consumption with minimal loss of accuracy during simulation. Defeaturing Modification Blending Features FEA CEM CAD CAE. Other Mechanical Engineering Annan maskinteknik
450	Leveraging the multimodal information from video content for video recommendation Almeida, Adolfo Ricardo Lopes De January 2021 (has links) Since the popularisation of media streaming, a number of video streaming services are continually buying new video content to mine the potential profit. As such, newly added content has to be handled appropriately to be recommended to suitable users. In this dissertation, the new item cold-start problem is addressed by exploring the potential of various deep learning features to provide video recommendations. The deep learning features investigated include features that capture the visual appearance, as well as audio and motion information from video content. Different fusion methods are also explored to evaluate how well these feature modalities can be combined to fully exploit the complementary information captured by them. Experiments on a real-world video dataset for movie recommendations show that deep learning features outperform hand-crafted features. In particular, it is found that recommendations generated with deep learning audio features and action-centric deep learning features are superior to those generated with Mel-frequency cepstral coefficients (MFCC) and state-of-the-art improved dense trajectory (iDT) features. It was also found that the combination of various deep learning features with textual metadata and hand-crafted features provides a significant improvement in recommendations, as compared to combining only deep learning and hand-crafted features. / Dissertation (MEng (Computer Engineering))--University of Pretoria, 2021. / The MultiChoice Research Chair of Machine Learning at the University of Pretoria / UP Postgraduate Masters Research bursary / Electrical, Electronic and Computer Engineering / MEng (Computer Engineering) / Unrestricted Video recommendation item cold-start deep learning features multimodal feature fusion matrix scaling
