Global ETD Search

1	Pattern recognition systems design on parallel GPU architectures for breast lesions characterisation employing multimodality images
Sidiropoulos, Konstantinos, January 2014
The aim of this research was to address the computational complexity of designing multimodality Computer-Aided Diagnosis (CAD) systems for characterising breast lesions, by harnessing the general-purpose computational potential of consumer-level Graphics Processing Units (GPUs) through parallel programming methods. The complexity of designing such systems lies in the increased dimensionality of the problem, due to the multiple imaging modalities involved; in the inherent complexity of the optimal design methods needed to secure high precision; and in assessing the performance of the design, prior to deployment in a clinical environment, with unbiased system evaluation methods. For the purposes of this research, a Pattern Recognition (PR) system was designed to provide the highest possible precision by programming in parallel the multiprocessors of NVIDIA GPU cards (GeForce 8800GT or 580GTX), using the CUDA programming framework and C++. The PR-system was built around the Probabilistic Neural Network classifier. Its performance was evaluated by the re-substitution method, to estimate the system's highest accuracy, and by the external cross-validation method, to assess the system's unbiased accuracy on new data “unseen” by the system. The data comprised images of patients with histologically verified (benign or malignant) breast lesions who underwent both ultrasound (US) and digital mammography (DM). Lesions were outlined on the images by an experienced radiologist, and textural features were calculated. The accuracies for discriminating malignant from benign lesions were 85.5% using US features alone, 82.3% using DM features alone, and 93.5% combining US and DM features. Mean accuracy on new “unseen” data for the combined US and DM features was 81%. These classification accuracies were about 10% higher than those achieved on a single CPU using sequential programming methods, and the design was 150-fold faster. In addition, benign lesions were found to be smoother, more homogeneous, and to contain larger structures. The PR-system design was also adapted to other medical problems, as a proof of its generalisation. These included classification of rare brain tumours (78.6% overall accuracy (OA) and 73.8% estimated generalisation accuracy (GA), with 267-fold design acceleration), discrimination of patients with micro-ischemic and multiple sclerosis lesions (90.2% OA and 80% GA, with 32-fold design acceleration), classification of normal and pathological knee cartilages (93.2% OA and 89% GA, with 257-fold design acceleration), and separation of low-grade from high-grade laryngeal cancer cases (93.2% OA and 89% GA, with 130-fold design acceleration). The proposed PR-system improves breast-lesion discrimination accuracy, may be redesigned on site when new verified data are incorporated in its depository, and may serve as a second-opinion tool in a clinical environment.
610.28
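The abstract above names a Probabilistic Neural Network evaluated by re-substitution and by external cross-validation. As a rough illustration only, the following Python sketch implements a textbook Gaussian-kernel PNN and compares re-substitution with a plain leave-one-out loop. The leave-one-out loop is a simpler stand-in and does not reproduce the external cross-validation of the thesis; the function names, the smoothing parameter `sigma`, and the toy data are all assumptions made for this example, and nothing here reflects the thesis's CUDA implementation.

```python
import math


def class_scores(x, train, sigma):
    """Mean Gaussian kernel response of each class's training patterns to x."""
    scores = {}
    for label, patterns in train.items():
        total = 0.0
        for p in patterns:
            dist2 = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-dist2 / (2.0 * sigma ** 2))
        scores[label] = total / len(patterns)
    return scores


def classify(x, train, sigma=0.5):
    """Assign x to the class with the largest mean kernel response."""
    scores = class_scores(x, train, sigma)
    return max(scores, key=scores.get)


def resubstitution_accuracy(train, sigma=0.5):
    """Test on the training patterns themselves (optimistic estimate)."""
    cases = [(p, label) for label, ps in train.items() for p in ps]
    hits = sum(classify(p, train, sigma) == label for p, label in cases)
    return hits / len(cases)


def leave_one_out_accuracy(train, sigma=0.5):
    """Hold each pattern out in turn; needs at least two patterns per class."""
    hits = total = 0
    for label, ps in train.items():
        for i, p in enumerate(ps):
            reduced = dict(train)
            reduced[label] = ps[:i] + ps[i + 1:]
            hits += classify(p, reduced, sigma) == label
            total += 1
    return hits / total


# Invented two-feature toy data, not taken from the thesis.
TOY = {
    "benign": [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30)],
    "malignant": [(0.90, 0.80), (0.80, 0.95), (0.85, 0.70)],
}

print(classify((0.2, 0.2), TOY))
print(resubstitution_accuracy(TOY), leave_one_out_accuracy(TOY))
```

Each kernel evaluation in `class_scores` is independent of the others, which is the kind of work that lends itself to parallel execution; how the thesis actually distributes it across GPU threads is not stated in the abstract.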
2	Automação e Otimização de Controle via MQ e RNA para Redução das Emissões de Gases Causadores de Efeito Estufa (GHG) Geradas por Plantas de Alumínio. / Automation and optimization of control via least squares (MQ) and artificial neural networks (RNA) for reducing greenhouse gas (GHG) emissions generated by aluminum plants.
NAGEM, Nilton Freixo, 06 February 2009
Tighter regulation and growing global concern for the environment are leading the aluminum industry to develop a sustainable production model, with the aim of reducing the environmental impact of its economic activity. Improvements in the operational and control practices of aluminum production are therefore essential. Their main objectives are to reduce greenhouse gas (GHG) emissions, reduce energy consumption, and increase productivity. Two technological alternatives are viable ways to mitigate GHG emissions: “smart feeders” for Point Feeder pots, and new controls that automatically adjust the number of manifolds to be broken in the next feeding cycle for Side Break pots. The smart feeders produced a significant decrease in anode effect frequency and, consequently, in the time a pot spends in anode effect. For the VSS Side Break pots, it was possible to create a decision matrix that adjusts the number of manifolds from least squares (LS) estimates of the slope and curvature of the resistance. Another approach, which showed promising results in simulation, used neural networks for pattern recognition, specifically probabilistic neural networks, to determine the shape of the resistance curve.
Environmental control
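The abstract above describes least squares estimates of the slope and curvature of the pot resistance feeding a decision matrix. The sketch below, in Python, shows one way such estimates could be obtained: an ordinary least squares fit of a quadratic to resistance samples, followed by a table lookup on the signs of the fitted slope and curvature. The quadratic model, the function names, the tolerance, and every value in `PLACEHOLDER_MATRIX` are assumptions invented for illustration; the thesis's actual decision matrix is not given in the abstract.

```python
def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / m[r][r]
    return x


def slope_and_curvature(ts, rs):
    """Fit r(t) = a + b*t + c*t^2 by least squares; return (b, 2*c) at t = 0."""
    s = [sum(t ** k for t in ts) for k in range(5)]
    rhs = [sum(r * t ** k for t, r in zip(ts, rs)) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    _, b, c = solve_linear(normal, rhs)
    return b, 2.0 * c


def sign(value, tol=1e-6):
    """Three-way sign with a dead band around zero."""
    return 0 if abs(value) < tol else (1 if value > 0 else -1)


# Hypothetical lookup: (sign of slope, sign of curvature) -> change in the
# number of manifolds. The entries are placeholders, not plant settings.
PLACEHOLDER_MATRIX = {
    (1, 1): 1, (1, 0): 1, (1, -1): 0,
    (0, 1): 0, (0, 0): 0, (0, -1): 0,
    (-1, 1): 0, (-1, 0): -1, (-1, -1): -1,
}


def manifold_adjustment(slope, curvature):
    return PLACEHOLDER_MATRIX[(sign(slope), sign(curvature))]


# Synthetic resistance samples following 1 + 0.5*t + 0.25*t^2 exactly.
times = [0, 1, 2, 3, 4]
resistance = [1 + 0.5 * t + 0.25 * t ** 2 for t in times]
slope, curvature = slope_and_curvature(times, resistance)
print(slope, curvature, manifold_adjustment(slope, curvature))
```

A lookup table keeps the control rule easy to inspect and retune, which is one plausible reason to prefer a decision matrix over a formula here.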
3	Μελέτη με MRI μετακτινικών αλλοιώσεων στα οστά ασθενών με μεταστατικούς ή πρωτοπαθείς όγκους που υποβάλλονται σε ακτινοθεραπεία / MRI study of post-radiation changes in the bones of patients with metastatic or primary tumours undergoing radiotherapy
Ρωμανός, Οδυσσεύς, 10 June 2014
Bone marrow can be affected by lymphoproliferative disorders and metastatic disease, but also by several therapeutic approaches. MRI is the most suitable method for the detection of metastases and for post-treatment follow-up. Image analysis techniques are now also used to extract additional diagnostic information. This study focuses on the early radiation-induced changes in bone marrow that can be detected by MRI, and compares the established methods for identifying and characterising these lesions with an automated classification system.
METHODS: 36 patients with histologically confirmed primary malignancy and associated bone metastases were included in the study. All patients underwent radiation therapy (RT) to treat bone metastases to the spinal column or the pelvis. Magnetic resonance imaging (MRI) was performed just before the start of RT, 12 to 18 days after the start of RT, and up to 3 months after the start of RT. Images were obtained within, adjacent to, and outside the radiation field. Qualitative assessment was performed independently by two experienced radiologists. For quantitative assessment, specific measurements were selected and evaluated by the region of interest (ROI) method. In addition, first- and second-order textural features were extracted and fed into a probabilistic neural network classifier, in order to create an automatic classification system for these lesions.
RESULTS: According to the qualitative and quantitative assessments respectively, within the radiation field 22.22% and 33.33% of patients showed fatty conversion of the bone marrow, 19.44% and 16.67% showed haemorrhage, and 11.11% and 16.67% showed bone marrow oedema. Adjacent to the radiation field, 11.11% and 19.44% of patients showed fatty conversion, 8.33% showed haemorrhage, and 2.78% and 8.33% showed bone marrow oedema. Outside the radiation field, 5.56% of patients showed changes compatible with fatty conversion, while the remaining 94.44% showed no significant change. There was no statistically significant change in the enhancement index after gadolinium administration. In multivariate analysis, none of the studied parameters appeared to significantly affect the appearance of any of the radiation-induced lesions. The highest overall classification accuracy of the system designed to distinguish between pre-radiation and post-radiation images was 93.02%, using the LSFT-PNN classification system of multiple sequences and the ECV method. The accuracy of the classification system designed to distinguish between the three main types of post-radiation lesions was 86.67%.
CONCLUSIONS: This study shows that a significant proportion of patients undergoing RT will experience at least one of the common radiation-induced bone marrow changes. Fatty marrow conversion is the most frequent change in the examined period. Qualitative analysis of the MRI images is less sensitive than quantitative measurement. The proposed classification system, based on the neural network, may prove a useful tool for the characterisation of these lesions.
616.994 410 642; Bone metastases; Radiation therapy; Magnetic resonance imaging; Pattern recognition; Probabilistic neural networks
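The percentages reported in the abstract above can be read back as patient counts. The short Python check below assumes that every percentage is a fraction of the full cohort of 36 patients (an assumption, since the abstract does not state the denominator for each figure) and verifies that each reported value matches a whole number of patients after rounding to two decimals. The names `N_PATIENTS`, `implied_count` and `is_consistent` are introduced here for illustration.

```python
N_PATIENTS = 36  # cohort size stated in the abstract

# Percentages quoted in the RESULTS paragraph, in order of first appearance.
REPORTED = [22.22, 33.33, 19.44, 16.67, 11.11, 8.33, 2.78, 5.56, 94.44]


def implied_count(percent, n=N_PATIENTS):
    """Nearest whole number of patients corresponding to a percentage."""
    return round(percent * n / 100)


def is_consistent(percent, n=N_PATIENTS):
    """True if the implied count reproduces the percentage to two decimals."""
    return abs(100 * implied_count(percent, n) / n - percent) < 0.005


counts = [implied_count(p) for p in REPORTED]
for percent, count in zip(REPORTED, counts):
    print(f"{percent:6.2f}% -> {count:2d} of {N_PATIENTS}")
```

Under that assumption, for example, 22.22% corresponds to 8 of 36 patients and 94.44% to 34 of 36.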
4	Στατιστική και υπολογιστική νοημοσύνη / Statistical and computational intelligence
Γεωργίου, Βασίλειος, 12 April 2010
This thesis deals with the study and development of classification models based on Probabilistic Neural Networks (PNNs). The proposed models were developed by incorporating into PNNs statistical methods as well as methods from several fields of Computational Intelligence (CI). In particular, Differential Evolution optimization algorithms and Particle Swarm Optimization algorithms were employed to search for promising values of the PNNs' parameters. Moreover, the bagging technique was incorporated to develop an ensemble of classification models. Another approach was the construction of a Bayesian model that estimates the PNN's parameters using the Gibbs sampler. Furthermore, a Fuzzy Membership Function was incorporated to achieve an improved weighting of the PNN's neurons. A new scheme for decomposing the training set is also proposed for multi-class classification problems in which a two-class classifier is employed. The proposed classification models were applied to a series of real-world problems from several scientific areas, with encouraging results.
519; Probabilistic neural networks; Particle swarm optimization; Bayesian analysis; Fuzzy membership function
5	Independent component analysis of evoked potentials for the classification of psychiatric patients and normal controls / Ανάλυση ανεξάρτητων συνιστώσων προκλητών δυναμικών για ταξινόμηση ψυχιατρικών ασθενών και υγιών μαρτύρων
Κοψαύτης, Νικόλαος Ι., 18 February 2009
The last twenty years have seen increased interest in the study of cerebral processes caused by external events (stimuli). One of the most significant endogenous components of Evoked Potentials (EPs) is the P600 component. It may be defined as the most positive peak in the time window between 500 and 800 ms after an eliciting stimulus, and it is thought to reflect the response selection stage of information processing. The P600 component is usually less pronounced than other components, such as the N100 or the P300. It frequently appears as a not easily discernible secondary peak overlying the negative-going slope of the P300 waveform. In our study we used ERP data from various groups of patients and healthy controls. Patients were recruited from the outpatient university clinic of Eginition Hospital of the University of Athens. The controls were recruited from hospital staff and local volunteer groups. The aim of the study is the implementation of classification systems for these groups using P600 features. This is usually not achieved well when the ERP amplitude and latency are used as features. For that reason, we extract new features by processing the original ERPs with advanced techniques such as Independent Component Analysis (ICA). Principal Component Analysis (PCA), regarded as a precursor of ICA, was used for comparison. In applying ICA, we decomposed the recorded signals into independent components (ICs), assuming temporally independent components, and investigated three IC selection techniques with which we recomposed the P600 component. The next stage was classification based on the features extracted from the original data, the PCA-processed data and the ICA-processed data. First we applied the Kolmogorov-Smirnov test to check the normality of the distribution of the features, then we used Logistic Regression for classification, and finally we built two classification implementations using Probabilistic Neural Networks. The first implementation used 15 features created from the P600 peak amplitudes in the subjects' data, and the second used four meta-features created from the subjects' P600 amplitude data. The results show that the application of ICA, combined with the logistic regression classification technique, provides a notable improvement over the classification performance based on the original ERPs. The main merit of the application is that classification achieves rates above 80% using a single parameter at a time, i.e. the amplitude of the P600 component, its latency, or its termination latency, all of which are directly related to the brain mechanisms of ERP generation and to pathological processes.
Independent Component Analysis (ICA); Evoked potentials (EPs); P600 component; 616.804 754 7; Logistic regression; Probabilistic Neural Networks (PNN)
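The abstract above defines the P600 as the most positive peak in the window between 500 and 800 ms after the stimulus. A minimal Python sketch of that definition follows. It assumes that sample 0 coincides with stimulus onset and that the waveform is uniformly sampled; the function names and the synthetic waveform (a large bump near 300 ms and a smaller one near 600 ms) are invented for this example and are not the thesis's data or code.

```python
import math


def p600_peak(samples, sampling_rate_hz, window_ms=(500, 800)):
    """Return (amplitude, latency_ms) of the most positive sample in the window.

    Assumes samples[0] is recorded at stimulus onset.
    """
    first = math.ceil(window_ms[0] * sampling_rate_hz / 1000)
    last = math.floor(window_ms[1] * sampling_rate_hz / 1000)
    last = min(last, len(samples) - 1)
    best = max(range(first, last + 1), key=lambda i: samples[i])
    return samples[best], best * 1000 / sampling_rate_hz


def synthetic_erp(sampling_rate_hz=100, duration_ms=1000):
    """Toy waveform: a large P300-like bump plus a smaller P600-like bump."""
    n = duration_ms * sampling_rate_hz // 1000
    wave = []
    for i in range(n):
        t = i * 1000 / sampling_rate_hz
        p300 = 8.0 * math.exp(-((t - 300) / 50) ** 2)
        p600 = 3.0 * math.exp(-((t - 600) / 40) ** 2)
        wave.append(p300 + p600)
    return wave


wave = synthetic_erp()
amplitude, latency_ms = p600_peak(wave, sampling_rate_hz=100)
print(amplitude, latency_ms)
```

Restricting the search to the window is what keeps the larger P300-like bump from being picked, which mirrors the abstract's remark that the P600 is often a secondary, less pronounced peak.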
6	Αναγνώριση ομιλητή / Speaker recognition
Ganchev, Todor, 25 June 2007
This dissertation deals with speaker recognition in real-world conditions. The main emphasis is on: (1) evaluation of various speech feature extraction approaches, (2) reduction of the impact of environmental interference on speaker recognition performance, and (3) study of classification techniques that are alternatives to the present state of the art. Specifically, within (1), a novel wavelet packet-based speech feature extraction scheme, fine-tuned for speaker recognition, is proposed. It is derived in an objective manner with respect to speaker recognition performance, in contrast to the state-of-the-art MFCC scheme, which is based on an approximation of human auditory perception. Next, within (2), an advanced noise-robust feature extraction scheme based on MFCC is offered for improving speaker recognition performance in real-world environments. In brief, a model-based noise reduction technique, adapted to the specifics of the speaker verification task, is incorporated directly into the MFCC computation scheme. This approach demonstrated a significant advantage in real-world, fast-varying environments. Finally, within (3), two novel classifiers are introduced, referred to as the Locally Recurrent Probabilistic Neural Network (LR PNN) and the Generalized Locally Recurrent Probabilistic Neural Network (GLR PNN). They are hybrids between the Recurrent Neural Network (RNN) and the Probabilistic Neural Network (PNN), and combine the virtues of the generative and discriminative classification approaches. Moreover, these novel neural networks are sensitive to temporal and spatial correlations among consecutive inputs, and are therefore able to exploit the inter-frame correlations among speech features derived from successive speech frames. Experiments demonstrated that the LR PNN and GLR PNN architectures perform better than the original PNN.
006.454; Speaker recognition; Speaker verification; Hybrid classifiers; Probabilistic neural networks; Recurrent neural networks; Speech features; Wavelet packets; Noise suppression
