41

Genomic sequence processing: gene finding in eukaryotes

Akhtar, Mahmood, Electrical Engineering & Telecommunications, Faculty of Engineering, UNSW January 2008 (has links)
Of the many existing eukaryotic gene-finding programs, none can guarantee accurate identification of genomic protein-coding regions and the other biological signals central to the pathway from DNA to protein. Eukaryotic gene finding is difficult mainly because of the noncontiguous and non-continuous nature of genes. Existing approaches are heavily dependent on the compositional statistics of the sequences they learn from and are not equally suitable for all types of sequences. This thesis first develops efficient digital signal processing-based methods for the identification of genomic protein-coding regions, and then combines the optimum signal processing-based non-data-driven technique with an existing data-driven statistical method in a novel system demonstrating improved identification of acceptor splice sites. Most existing well-known DNA symbolic-to-numeric representations map the DNA information into three or four numerical sequences, potentially increasing the computational requirement of the sequence analyzer. The proposed mapping schemes, to be used for signal processing-based gene and exon prediction, incorporate DNA structural properties in the representation, in addition to reducing complexity in subsequent processing. A detailed comparison of all DNA representations, in terms of computational complexity and relative accuracy for the gene and exon prediction problem, reveals the newly proposed "paired numeric" to be the best DNA representation. Existing signal processing-based techniques rely mostly on the period-3 behaviour of exons to obtain one-dimensional gene and exon prediction features, and are not well equipped to capture the complementary properties of exonic / intronic regions or to deal with background noise when detecting exons at the nucleotide level. These issues are addressed in this thesis by proposing six one-dimensional and three multi-dimensional signal processing-based gene and exon prediction features.
All one-dimensional and multi-dimensional features have been evaluated using standard datasets such as Burset/Guigo1996, HMR195, and the GENSCAN test set. This is the first time that different gene and exon prediction features have been compared using substantial databases and using nucleotide-level metrics. Furthermore, the first investigation of the suitability of different window sizes for period-3 exon detection is performed. Finally, the optimum signal processing-based gene and exon prediction scheme from our evaluations is combined with a data-driven statistical technique for the recognition of acceptor splice sites. The proposed DSP-statistical hybrid is shown to achieve a 43% reduction in false positives over WWAM, as used in GENSCAN.
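The period-3 behaviour mentioned in this abstract is commonly measured as the spectral power at frequency 1/3 inside a sliding window. The sketch below is only an illustration of that general idea, not the method of the thesis: it uses four binary indicator sequences (one per base) instead of the proposed paired-numeric mapping, and the function name `period3_power` and the default window of 351 nucleotides are assumptions made for the example.

```python
import cmath


def period3_power(seq, window=351, step=1):
    """Sliding-window spectral power at frequency 1/3.

    Assumption: the sequence is mapped to four binary indicator
    sequences (A, C, G, T); the thesis's own mapping is not reproduced.
    Returns one score per window position.
    """
    seq = seq.upper()
    # e^{-j*2*pi*n/3} only takes three distinct values, so precompute them.
    twiddle = [cmath.exp(-2j * cmath.pi * k / 3) for k in range(3)]
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        total = 0.0
        for base in "ACGT":
            acc = 0j
            for n in range(window):
                if seq[start + n] == base:
                    acc += twiddle[n % 3]
            # Power of this base's indicator sequence at frequency 1/3.
            total += abs(acc) ** 2
        scores.append(total)
    return scores


if __name__ == "__main__":
    periodic = "ATG" * 20   # strong period-3 structure
    flat = "A" * 60         # no period-3 structure
    print(period3_power(periodic, window=30)[0])
    print(period3_power(flat, window=30)[0])
```

Windows with a strong three-base periodicity score high, while a homopolymer scores close to zero; a real detector would threshold or smooth these scores, and the choice of window size is one of the questions the abstract says the thesis investigates.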
42

Markov Random Field Based Road Network Extraction From High Resolution Satellite Images

Ozturk, Mahir 01 February 2013 (has links) (PDF)
Road networks play an important role in applications such as urban and rural planning, infrastructure planning, transportation management, and vehicle navigation. Extraction of roads from remotely sensed satellite images, for updating road databases in geographical information systems (GIS), is generally done manually by a human operator. However, manual extraction of roads is a time-consuming and labor-intensive process. A great number of studies have been published with the aim of automating the road extraction process, but automated processes still yield some erroneous and incomplete results, and human intervention is still required. The aim of this research is to propose a framework for road network extraction from high spatial resolution multi-spectral imagery (MSI) that improves the accuracy of road extraction systems. The proposed framework begins with a spectral classification using One-class Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) classifiers. Spectral classification exploits the spectral signature of road surfaces to classify road pixels. Then, an iterative template matching filter is proposed to refine the spectral classification results. The K-medians clustering algorithm is employed to detect candidate road centerline points. Final road network formation is achieved by Markov Random Fields. The extracted road network is evaluated against a reference dataset using a set of quality metrics.
43

Improved GMM-Based Classification Of Music Instrument Sounds

Krishna, A G 05 1900 (has links)
This thesis is concerned with the recognition of music instruments from isolated notes. Music instrument recognition is a relatively nascent problem that is fast gaining importance, not only because of its academic value but also because of its potential to enable applications such as music content analysis and music transcription. Line spectral frequencies are proposed as features for music instrument recognition and shown to perform better than Mel filtered cepstral coefficients and linear prediction cepstral coefficients. Assuming a linear model of sound production, features based on the prediction residual, which represents the excitation signal, are proposed. Four improvements are proposed for classification using Gaussian mixture model (GMM) based classifiers. One of them involves characterizing the regions of overlap between classes in the feature space to improve classification. Applications to music instrument recognition and speaker recognition are shown. An experiment is proposed for discovering the hierarchy of music instruments in a data-driven manner. The hierarchy thus discovered closely corresponds to the hierarchy defined by musicians and experts, and therefore shows that the feature space has successfully captured the features required for music instrument characterization.
44

Human Action Recognition In Video Data For Surveillance Applications

Gurrapu, Chaitanya January 2004 (has links)
Detecting human actions using a camera has many possible applications in the security industry. When a human performs an action, his/her body goes through a signature sequence of poses. To detect these pose changes and hence the activities performed, a pattern recogniser needs to be built into the video system. Due to the temporal nature of the patterns, Hidden Markov Models (HMM), used extensively in speech recognition, were investigated. Initially a gesture recognition system was built using novel features. These features were obtained by approximating the contour of the foreground object with a polygon and extracting the polygon's vertices. A Gaussian Mixture Model (GMM) was fit to the vertices obtained from a few frames and the parameters of the GMM itself were used as features for the HMM. A more practical activity detection system using a more sophisticated foreground segmentation algorithm immune to varying lighting conditions and permanent changes to the foreground was then built. The foreground segmentation algorithm models each of the pixel values using clusters and continually uses incoming pixels to update the cluster parameters. Cast shadows were identified and removed by assuming that shadow regions were less likely to produce strong edges in the image than real objects and that this likelihood further decreases after colour segmentation. Colour segmentation itself was performed by clustering together pixel values in the feature space using a gradient ascent algorithm called mean shift. More robust features in the form of mesh features were also obtained by dividing the bounding box of the binarised object into grid elements and calculating the ratio of foreground to background pixels in each of the grid elements. These features were vector quantized to reduce their dimensionality and the resulting symbols presented as features to the HMM to achieve a recognition rate of 62% for an event involving a person writing on a white board. 
The recognition rate increased to 80% for the "seen" person sequences, i.e. the sequences of the person used to train the models. With a fixed lighting position, the lack of a shadow removal subsystem improved the detection rate. This is because of the consistent profile of the shadows in both the training and testing sequences due to the fixed lighting positions. Even with a lower recognition rate, the shadow removal subsystem was considered an indispensable part of a practical, generic surveillance system.
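The mesh features described in this abstract (a grid laid over the bounding box of the binarised object, with one value per grid element) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the function name `mesh_features`, the 4x4 default grid, and the list-of-lists mask format are hypothetical, and each cell reports the fraction of foreground pixels rather than the foreground-to-background ratio named in the abstract, purely to avoid division by zero in cells with no background.

```python
def mesh_features(mask, rows=4, cols=4):
    """Grid features over the bounding box of a binary foreground mask.

    mask: rectangular list of lists holding 0 (background) or 1 (foreground).
    Returns rows*cols values in row-major order, each the fraction of
    foreground pixels in that grid cell (an assumption; see lead-in).
    """
    fg = [(r, c) for r, line in enumerate(mask)
          for c, value in enumerate(line) if value]
    if not fg:
        return [0.0] * (rows * cols)

    # Bounding box of the foreground object (bottom/right are exclusive).
    top = min(r for r, _ in fg)
    bottom = max(r for r, _ in fg) + 1
    left = min(c for _, c in fg)
    right = max(c for _, c in fg) + 1
    height, width = bottom - top, right - left

    counts = [[0, 0] for _ in range(rows * cols)]  # [foreground, total]
    for r in range(top, bottom):
        for c in range(left, right):
            cell_row = (r - top) * rows // height
            cell_col = (c - left) * cols // width
            cell = counts[cell_row * cols + cell_col]
            cell[1] += 1
            if mask[r][c]:
                cell[0] += 1
    # Cells that received no pixels (box smaller than the grid) yield 0.0.
    return [f / t if t else 0.0 for f, t in counts]


if __name__ == "__main__":
    mask = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1],
    ]
    print(mesh_features(mask, rows=2, cols=2))
```

In the pipeline the abstract describes, vectors like this would then be vector quantized into symbols before being passed to the HMM; that step is not shown here.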
46

Foreground Segmentation of Moving Objects

Molin, Joel January 2010 (has links)
Foreground segmentation is a common first step in tracking and surveillance applications.  The purpose of foreground segmentation is to provide later stages of image processing with an indication of where interesting data can be found.  This thesis is an investigation of how foreground segmentation can be performed in two contexts: as a pre-step to trajectory tracking and as a pre-step in indoor surveillance applications. Three methods are selected and detailed: a single Gaussian method, a Gaussian mixture model method, and a codebook method.  Experiments are then performed on typical input video using the methods.  It is concluded that the Gaussian mixture model produces the output which yields the best trajectories when used as input to the trajectory tracker.  An extension is proposed to the Gaussian mixture model which reduces shadow, improving the performance of foreground segmentation in the surveillance context.
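Of the three methods named in this abstract, the single Gaussian method is the simplest to illustrate. The sketch below assumes a common textbook formulation (one running mean and variance per pixel, with a pixel flagged as foreground when it lies more than a few standard deviations from the mean); the class name, the parameter defaults, and the flat-list grayscale frame format are assumptions for the example and are not taken from the thesis.

```python
class SingleGaussianBackground:
    """Per-pixel single Gaussian background model (illustrative sketch)."""

    def __init__(self, first_frame, learning_rate=0.05, threshold=2.5,
                 initial_var=100.0, min_var=4.0):
        # One mean and one variance per pixel; frames are flat lists
        # of grayscale intensities (an assumption for simplicity).
        self.mean = [float(p) for p in first_frame]
        self.var = [initial_var] * len(first_frame)
        self.lr = learning_rate
        self.threshold = threshold
        self.min_var = min_var

    def apply(self, frame):
        """Return a 0/1 foreground mask and update the background model."""
        mask = []
        for i, pixel in enumerate(frame):
            diff = pixel - self.mean[i]
            # Foreground if the pixel is more than `threshold` standard
            # deviations away from the background mean.
            is_foreground = diff * diff > (self.threshold ** 2) * self.var[i]
            mask.append(1 if is_foreground else 0)
            if not is_foreground:
                # Only background pixels update the model, so that
                # foreground objects are not absorbed immediately.
                self.mean[i] += self.lr * diff
                self.var[i] = max(
                    self.min_var,
                    (1 - self.lr) * self.var[i] + self.lr * diff * diff,
                )
        return mask


if __name__ == "__main__":
    model = SingleGaussianBackground([10, 10, 10])
    print(model.apply([10, 200, 12]))
```

A Gaussian mixture model method generalises this by keeping several such Gaussians per pixel, which lets it represent backgrounds that switch between a few typical values.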
47

Phoneme duration modelling for speaker verification

Van Heerden, Charl Johannes 26 June 2009 (has links)
Higher-level features are considered to be a potential remedy against transmission line and cross-channel degradations, currently some of the biggest problems associated with speaker verification. Phoneme durations in particular are not altered by these factors; thus a robust duration model will be a particularly useful addition to traditional cepstral-based speaker verification systems. In this dissertation we investigate the feasibility of phoneme durations as a feature for speaker verification. Simple speaker-specific triphone duration models are created to statistically represent the phoneme durations. Durations are obtained from a hidden Markov model (HMM) based automatic speech recognition system and are modeled using single-mixture Gaussian distributions. These models are applied in a speaker verification system (trained and tested on the YOHO corpus) and found to be a useful feature, even when used in isolation. When fused with acoustic features, verification performance increases significantly. A novel speech rate normalization technique is developed in order to remove some of the inherent intra-speaker variability (due to differing speech rates). Speech rate variability has a negative impact on both speaker verification and automatic speech recognition. Although the duration modelling seems to benefit only slightly from this procedure, the fused system performance improvement is substantial. Other factors known to influence the duration of phonemes are incorporated into the duration model. Utterance-final lengthening is known to be a consistent effect and thus “position in sentence” is modeled. “Position in word” is also modeled since triphones do not provide enough contextual information. This is found to improve performance since the durations of some vowels are particularly sensitive to their position in the word. Data scarcity becomes a problem when building speaker-specific duration models.
By using information from available data, unknown durations can be predicted in an attempt to overcome the data scarcity problem. To this end we develop a novel approach to predict unknown phoneme durations from the values of known phoneme durations for a particular speaker, based on the maximum likelihood criterion. This model is based on the observation that phonemes from the same broad phonetic class tend to co-vary strongly, but that there are also significant cross-class correlations. This approach is tested on the TIMIT corpus and found to be more accurate than using back-off techniques. / Dissertation (MEng)--University of Pretoria, 2009. / Electrical, Electronic and Computer Engineering / unrestricted
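The idea of speaker-specific triphone duration models built from single Gaussians can be sketched as follows. This is a minimal illustration only: the function names, the variance floor, the triphone label format, and the use of an average log-likelihood as the score are assumptions for the example, and the dissertation's normalisation, position modelling, fusion, and duration prediction steps are not reproduced.

```python
import math
from collections import defaultdict


def fit_duration_models(observations, var_floor=1e-4):
    """Fit one Gaussian (mean, variance) per triphone label.

    observations: iterable of (triphone_label, duration_in_seconds).
    var_floor is a hypothetical safeguard for sparsely observed triphones.
    """
    grouped = defaultdict(list)
    for triphone, duration in observations:
        grouped[triphone].append(duration)
    models = {}
    for triphone, durations in grouped.items():
        mean = sum(durations) / len(durations)
        var = sum((d - mean) ** 2 for d in durations) / len(durations)
        models[triphone] = (mean, max(var, var_floor))
    return models


def average_log_likelihood(models, observations):
    """Average Gaussian log-likelihood of test durations under a speaker's models.

    Triphones never seen for this speaker are skipped, which is where the
    data scarcity problem mentioned in the abstract shows up.
    """
    total, count = 0.0, 0
    for triphone, duration in observations:
        if triphone not in models:
            continue
        mean, var = models[triphone]
        total += -0.5 * (math.log(2 * math.pi * var) + (duration - mean) ** 2 / var)
        count += 1
    return total / count if count else float("-inf")


if __name__ == "__main__":
    train = [("a-b+c", 0.10), ("a-b+c", 0.12), ("a-b+c", 0.08)]
    models = fit_duration_models(train)
    print(average_log_likelihood(models, [("a-b+c", 0.10)]))
    print(average_log_likelihood(models, [("a-b+c", 0.30)]))
```

A verification decision would compare such a score against a threshold or against a background model's score; how the dissertation does this is not stated in the abstract.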
48

Probabilistic Diagnostic Model for Handling Classifier Degradation in Machine Learning

Gustavo A. Valencia-Zapata (8082655) 04 December 2019 (has links)
Several studies point out different causes of performance degradation in supervised machine learning. Problems such as class imbalance, overlapping, small disjuncts, noisy labels, and sparseness limit accuracy in classification algorithms. Even though a number of approaches, in the form of either a methodology or an algorithm, try to minimize performance degradation, they have been isolated efforts with limited scope. This research consists of three main parts. First, a novel probabilistic diagnostic model based on identifying signs and symptoms of each problem is presented. Second, the behavior and performance of several supervised algorithms are studied when training sets have such problems, so that the success of treatments can be predicted across classifiers. Finally, a probabilistic sampling technique based on training set diagnosis is proposed for avoiding classifier degradation.
49

Speech to Text for Swedish using KALDI / Tal till text, utvecklandet av en svensk taligenkänningsmodell i KALDI

Kullmann, Emelie January 2016 (has links)
During the last decade, the field of speech recognition has left the research stage and found its way into the public market. Most computers and mobile phones sold today support dictation and transcription in a number of chosen languages. Swedish is often not one of them. In this thesis, which is carried out on behalf of the Swedish Radio, an Automatic Speech Recognition model for Swedish is trained and its performance evaluated. The model is built using the open source toolkit Kaldi. Two approaches to training the acoustic part of the model are investigated: first, using Hidden Markov Models and Gaussian Mixture Models, and second, using Hidden Markov Models and Deep Neural Networks. The latter approach, using deep neural networks, is found to achieve better performance in terms of Word Error Rate. / In recent years, various applications in human-computer interaction, chiefly speech recognition, have found their way onto the general market. Many systems and technical products today support transcribing speech and dictating text. However, this applies mainly to the larger languages, and the same support is rarely available for smaller languages such as Swedish. In this degree project, a speech recognition model for Swedish has been developed. It was carried out on behalf of Sveriges Radio, which would benefit greatly from a working speech recognition model for Swedish. The model is developed in the Kaldi framework. Two approaches to the acoustic training of the model are implemented, and their performance is evaluated and compared. First a model is trained using Hidden Markov Models and Gaussian Mixture Models, and finally a model in which Hidden Markov Models and Deep Neural Networks are used; the latter turns out to achieve a better result in terms of Word Error Rate.
50

Understanding people movement and detecting anomalies using probabilistic generative models / Att förstå personförflyttningar och upptäcka anomalier genom att använda probabilistiska generativa modeller

Hansson, Agnes January 2020 (has links)
As intelligent access solutions begin to dominate the world, the statistical learning methods that account for their behavior need attention, as there is no clear answer to how an algorithm could learn and predict exactly how people move. This project investigates whether, with the help of unsupervised learning methods, it is possible to distinguish anomalies from normal events in an access system, and whether the cylinder most likely to be unlocked by a user can be calculated. The available data consist of previous events in an access system, together with the access configurations; the algorithms used were an auto-encoder and a probabilistic generative model. The auto-encoder successfully encoded the high-dimensional data set into one of significantly lower dimension, and the probabilistic generative model, chosen to be a Gaussian mixture model, identified clusters in the data and assigned a measure of unexpectedness to the events. Lastly, the probabilistic generative model was used to compute the conditional probability that the user, given all the details of an event except which cylinder was chosen, would choose a certain cylinder. The result was a correct guess in 65.7 % of the cases, which can be seen as a satisfactory number for something originating from an unsupervised problem.
/ As intelligent access solutions take over in society, it is necessary to give the statistical learning methods behind them sufficient attention, since there is no obvious answer to how an algorithm could learn and predict people's exact movement patterns. This project aims to investigate, using unsupervised learning, whether it is possible to distinguish anomalies from normal observations, and whether the lock cylinder that a user is most likely to try to unlock can be calculated. The data given for this project consist of events from an access system, together with the associated access configurations. The algorithms used in the project were an auto-encoder and a probabilistic generative model. The auto-encoder succeeded, with satisfactory results, in encoding the high-dimensional data into data of significantly lower dimension, and the probabilistic generative model, chosen to be a Gaussian mixture model, succeeded in identifying clusters in the data and in assigning each observation a measure of its unexpectedness. Finally, the probabilistic generative model was used to compute the conditional probability that the user, given all attributes of an event except which lock cylinder they tried to open, would choose a given cylinder. The result was a correct guess in 65.7 % of the cases, which can be seen as a satisfactory figure for something that stems from an unsupervised problem.
