  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

The automatic and unconstrained segmentation of speech into subword units

Van Vuuren, Van Zyl 03 1900
Thesis (MEng)--Stellenbosch University, 2014. / ENGLISH ABSTRACT: We develop and evaluate several algorithms that segment a speech signal into subword units without using phone or orthographic transcripts. These segmentation algorithms rely on a scoring function, termed the local score, that is applied at the feature level and indicates where the characteristics of the audio signal change. The predominant approach to segmentation in the literature is to apply a threshold to the local score; local maxima (peaks) above the threshold result in the hypothesis of a segment boundary. The scoring mechanisms of a selection of such algorithms are investigated, and it is found that these local scores frequently exhibit clusters of peaks near phoneme transitions that cause spurious segment boundaries. As a consequence, very short segments are sometimes postulated by the algorithms. To counteract this, ad hoc remedies have been proposed in the literature. We propose a dynamic programming (DP) framework for speech segmentation that employs a probabilistic segment length model in conjunction with the local scores. DP offers an elegant way to deal with peak clusters by choosing only the most probable combinations of segment length and local score as boundary positions. It is shown to offer a clear performance improvement over selected methods from the literature that serve as benchmarks. Multilayer perceptrons (MLPs) can be trained to generate local scores by using groups of feature vectors centred around phoneme boundaries and midway between phoneme boundaries in suitable training data. The MLPs are trained to produce a high output value at a boundary and a low value at continuity. It was found that the more accurate local scores generated by the MLP, which rarely exhibit clusters of peaks, made the additional application of DP less effective than before.
However, a hybrid approach in which DP is used only to resolve smaller, more ambiguous peaks in the local score was found to offer a substantial improvement on all prior methods. Finally, restricted Boltzmann machines (RBMs) were applied as feature detectors. This provided a means of building multilayer networks that are capable of detecting highly abstract features. It is found that when local scores are estimated by such deep networks, additional performance gains are achieved. / National Research Foundation (NRF)
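The DP idea summarised above, combining a local score with a probabilistic segment-length model, can be sketched in a few lines: choose boundary positions that jointly maximise the local scores at the chosen peaks plus a log-probability for each resulting segment length. This is a generic illustration of the technique, not the thesis's exact model; the Gaussian length prior, its parameters, and the toy scores are all assumptions.

```python
import math

def log_length_prior(length, mean=8.0, std=3.0):
    # Log-density of a segment length under an assumed Gaussian model.
    return -0.5 * ((length - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def dp_segment(local_score, max_len=20):
    """Choose boundary frames maximising local scores plus a length prior.

    local_score[t] is high where the characteristics of the signal change.
    Returns the interior boundary frame indices.
    """
    n = len(local_score)
    best = [-math.inf] * (n + 1)  # best[t]: best total with a boundary at t
    back = [0] * (n + 1)
    best[0] = 0.0                 # a boundary is assumed at the signal start
    for t in range(1, n + 1):
        for prev in range(max(0, t - max_len), t):
            score = best[prev] + log_length_prior(t - prev)
            if t < n:             # the signal end is not a scored peak
                score += local_score[t]
            if score > best[t]:
                best[t], back[t] = score, prev
    bounds, t = [], n             # trace the chosen boundaries back
    while t > 0:
        t = back[t]
        if t > 0:
            bounds.append(t)
    return sorted(bounds)

# Toy local score: a genuine peak at frame 8 with a spurious cluster
# neighbour at frame 9, and another peak at frame 16.
ls = [0.0] * 24
ls[8], ls[9], ls[16] = 5.0, 4.5, 5.0
print(dp_segment(ls))  # → [8, 16]: the 8-9 cluster yields a single boundary
```

Note how the length prior suppresses the second peak of the cluster: accepting both 8 and 9 would create a length-1 segment with a very low prior probability.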
2

Biomedical Image Segmentation and Object Detection Using Deep Convolutional Neural Networks

Liming Wu (6622538) 11 June 2019
Quick and accurate segmentation and object detection in biomedical images are the starting point of most disease analysis and of understanding biological processes in medical research. They can accelerate drug development and advance medical treatment, especially for cancer-related diseases. However, identifying and labeling the objects in CT or MRI images usually takes time, even for an experienced person. Currently, there is no fully automatic detection technique for nucleus identification, pneumonia detection, and fetal brain segmentation. Fortunately, with the successful application of artificial intelligence (AI) to image processing, many challenging tasks can be solved with deep convolutional neural networks. In light of this, in this thesis, deep-learning-based object detection and segmentation methods were implemented to perform nucleus segmentation, lung segmentation, pneumonia detection, and fetal brain segmentation. Semantic segmentation is achieved with a customized U-Net model, and instance localization with Faster R-CNN. U-Net was chosen because it can be trained end-to-end: its architecture is simple, straightforward, and fast to train. Moreover, the availability of data for this project is limited, which makes U-Net a more suitable choice. Faster R-CNN was also implemented to achieve object localization. Finally, the performance of the two models was evaluated and their pros and cons compared. Preliminary results show that the deep-learning-based techniques outperform the traditional segmentation algorithms evaluated.
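Model comparisons like the one above are usually reported with an overlap metric; the abstract does not name one, so as a generic illustration here is the widely used Dice coefficient over binary masks (a sketch, not the thesis's evaluation code).

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences.

    Dice = 2|A ∩ B| / (|A| + |B|); 1.0 means perfect overlap.
    """
    assert len(pred) == len(truth)
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total

# A 1-D toy comparison: 3 overlapping pixels out of masks of size 4 and 4.
print(dice_coefficient([1, 1, 1, 1, 0, 0], [0, 1, 1, 1, 1, 0]))  # → 0.75
```

The same function applies unchanged to flattened 2-D masks, which is how it is typically used to score segmentations of CT or MRI slices.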
3

Classificação de imagens de plâncton usando múltiplas segmentações / Plankton image classification using multiple segmentations

Fernandez, Mariela Atausinchi 27 March 2017
Plankton are microscopic organisms that form the base of the food chain in aquatic ecosystems. They play an important role in the carbon cycle, as they are responsible for the absorption of carbon at the ocean surface. Detecting, estimating, and monitoring the distribution of plankton species are important activities for understanding the role of plankton and the consequences of changes in their environment. Part of such studies is based on the analysis of water volumes by means of imaging techniques. Due to the large quantity of images generated, computational methods for helping the process of image analysis are in demand. In this work we address the problem of species identification. We follow the conventional pipeline consisting of target detection, segmentation (contour delineation), feature extraction, and classification steps. In the first part of this work we address the problem of choosing an appropriate segmentation algorithm. Since evaluating segmentation results is a subjective and time-consuming task, we propose a method to evaluate segmentation algorithms by evaluating the classification results at the end of the pipeline. Experiments with this method showed that distinct segmentation algorithms might be appropriate for identifying species of distinct classes. Therefore, in the second part of this work we propose a classification method that takes multiple segmentations into consideration. Specifically, multiple segmentations are computed and classifiers are trained individually for each segmentation; these are then combined to build the final classifier. Experimental results show that the accuracy obtained with the combined classifier exceeds that of classifiers using a fixed segmentation by more than 2%. 
The proposed methods can be useful for building plankton identification systems that are able to adjust quickly to changes in the characteristics of the images.
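The combination step described above, one classifier per segmentation merged into a final classifier, can be sketched with simple majority voting. The abstract does not specify the combination rule, so voting is an assumption here, and the per-segmentation classifiers below are hypothetical stand-ins.

```python
from collections import Counter

def combined_predict(classifiers, image):
    """Classify `image` by majority vote over per-segmentation classifiers.

    Each classifier is assumed to be a callable that segments the image in
    its own way and returns a class label; ties break toward the label that
    appears first in the vote list.
    """
    votes = [clf(image) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical classifiers, each trained on a different segmentation.
clfs = [lambda img: "copepod", lambda img: "diatom", lambda img: "copepod"]
print(combined_predict(clfs, image=None))  # → copepod
```

In practice each callable would wrap a segmentation algorithm plus a trained classifier; the voting layer is what lets different segmentations dominate for the classes they handle best.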
5

Explicit Segmentation Of Speech For Indian Languages

Ranjani, H G 03 1900
Speech segmentation is the process of identifying the boundaries between words, syllables or phones in the recorded waveforms of spoken natural languages. The lowest level of speech segmentation is the breakup and classification of the sound signal into a string of phones. The difficulty of this problem is compounded by the phenomenon of co-articulation of speech sounds. The classical solution to this problem is to manually label and segment spectrograms. In the first step of this two-step process, a trained person listens to a speech signal, recognizes the word and phone sequence, and roughly determines the position of each phonetic boundary. The second step involves examining several features of the speech signal to place a boundary mark at the point where these features best satisfy a certain set of conditions specific to that kind of phonetic boundary. Manual segmentation of speech into phones is a highly time-consuming and painstaking process. Since segmentation is required for a variety of applications, such as acoustic analysis or building speech-synthesis databases for high-quality speech output systems, the time required to carry out this process even for relatively small speech databases can rapidly accumulate to prohibitive levels. This calls for automating the segmentation process. The state-of-the-art segmentation techniques use Hidden Markov Models (HMMs) for phone states. They give an average accuracy of over 95% within 20 ms of manually obtained boundaries. However, HMM-based methods require large training data for good performance. Another major disadvantage of such recognition-based segmentation techniques is that they cannot handle very long utterances, which are necessary for prosody modeling in speech synthesis applications. Development of Text-to-Speech (TTS) systems in Indian languages has been difficult to date owing to the non-availability of sizeable segmented speech databases of good quality. 
Further, no prosody models exist for most of the Indian languages. Therefore, long utterances (at the paragraph level, and monologues) have been recorded as part of this work for creating the databases. This thesis aims at automating the segmentation of very long speech sentences recorded for the application of corpus-based TTS synthesis for multiple Indian languages. In this explicit segmentation problem, we need to force-align boundaries in any utterance from its known phonetic transcription. The major disadvantage of forcing boundary alignments on the entire speech waveform of a long utterance is the accumulation of boundary errors. To overcome this, we force boundaries between two known phones (here, two successive stop consonants) at a time. The approach used is silence detection as a marker for stop consonants. This method gives around 89% accuracy (for the Hindi database) and is language-independent and training-free. These stop consonants act as anchor points for the next stage. Two methods for explicit segmentation have been proposed. Both rely on the accuracy of the above stop consonant detection stage. Another common stage is the recently proposed implicit method, which uses a Bach scale filter bank to obtain the feature vectors. The Euclidean Distance of the Mean of the Logarithm (EDML) of these feature vectors shows peaks at the points where the spectrum changes. The method performs with an accuracy of 87% within 20 ms of manually obtained boundaries and achieves low deletion and insertion rates of 3.2% and 21.4% respectively, for 100 sentences of the Hindi database. The first method is a three-stage approach: the stop consonant detection stage, followed by a stage that uses Quatieri's sinusoidal model to classify sounds as voiced/unvoiced between two successive stop consonants, and a final stage that uses the EDML function of Bach scale feature vectors to obtain further boundaries within the voiced and unvoiced regions. 
It gives a Frame Error Rate (FER) of 26.1% for the Hindi database. The second proposed method uses duration statistics of the phones of the language. It again uses the EDML function of the Bach scale filter bank to obtain peaks at the phone transitions, and uses the duration statistics to assign to each peak a probability of being a boundary. With this method, the FER improves to 22.8% for the Hindi database. Both methods are promising in that they give low frame error rates. Results show that the second method outperforms the first because it incorporates knowledge of durations. For the proposed approaches to be useful, manual intervention is required at the output of each stage. However, this intervention is less tedious than full manual segmentation and reduces the time taken to segment each sentence by around 60%. The approaches have been successfully tested on three different languages, 100 sentences each: Kannada, Tamil and English (the TIMIT database was used for validating the algorithms). In conclusion, a practical solution to the segmentation problem is proposed. The algorithm being training-free, language-independent (ES-SABSF method) and speaker-independent makes it useful for developing TTS systems in multiple languages, reducing the segmentation overhead. This method is currently being used in the lab for segmenting long Kannada utterances, spoken by reading a set of 1115 phonetically rich sentences.
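One plausible reading of the EDML function described above (an assumption, since the exact formulation is not reproduced in this abstract): average the logarithm of the filter-bank outputs over a short window on each side of every frame and take the Euclidean distance between the two means, so that peaks mark spectral change. The window size and the toy two-band features below are placeholders.

```python
import math

def edml(features, win=3):
    """Euclidean distance between mean log-features left and right of each frame.

    features: list of per-frame filter-bank output vectors (positive values).
    Returns one score per frame; high values suggest a phone transition.
    """
    logf = [[math.log(x) for x in frame] for frame in features]
    dims = len(features[0])
    scores = []
    for t in range(len(features)):
        left = logf[max(0, t - win):t]   # window just before frame t
        right = logf[t:t + win]          # window from frame t onwards
        if not left or not right:
            scores.append(0.0)           # edges: no full context, no peak
            continue
        mean_l = [sum(f[d] for f in left) / len(left) for d in range(dims)]
        mean_r = [sum(f[d] for f in right) / len(right) for d in range(dims)]
        scores.append(math.dist(mean_l, mean_r))
    return scores

# Toy signal: spectral energy swaps bands at frame 5, so the EDML
# score should peak there.
feats = [[1.0, 4.0]] * 5 + [[4.0, 1.0]] * 5
scores = edml(feats)
print(scores.index(max(scores)))  # → 5
```

A boundary detector would then pick local maxima of this score, optionally weighted by the duration statistics the second method introduces.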
6

Kompiuterine vaizdų analize pagrįstos sistemos, skirtos galvos smegenų tyrimams, analizė ir algoritmų plėtra / Systems based on computer image analysis and used for human brain research, analysis and development of algorithms

Maknickas, Ramūnas 23 May 2005
One of the main problems in neurosurgery is knowledge about the human brain: it is very important to see the whole brain, with its critical neurostructures, in virtual reality. This document is about strategies for three-dimensional human brain visualization. It reviews the most widely used strategies for building three-dimensional objects from two-dimensional medical MRI images. The task was split into four significant problems: image segmentation, point-set correspondence, image registration, and the transformation functions and image-matching measures frequently used in registration. All of these problems are covered to show the reader the most widely used algorithms, with their advantages and disadvantages. A survey of atlas types, patterns and maps is presented, together with the most popular brain-model coordinate systems. In order to find a better correspondence between two point sets, a new robust and accurate Overhauser spline point-location optimization algorithm was modeled. Instead of deleting outlier points from the overloaded point set, this algorithm generates more points in the other set at optimized point locations. Determining accurate point locations and choosing the correct transformation function are the key steps in the registration process, and registration is a vital task in the precise brain visualization needed for neurosurgery at the preoperative and intraoperative stages.
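The point-set correspondence problem mentioned above starts from a basic building block: matching each point to its nearest neighbour in the other set, as in ICP-style registration. The sketch below shows only that generic step, not the thesis's Overhauser-spline optimization; the 2-D points are made-up examples.

```python
import math

def nearest_correspondence(source, target):
    """For each source point, the index of its nearest target point.

    Uses plain Euclidean distance; a registration loop would alternate this
    correspondence step with estimating a transformation of the source set.
    """
    return [min(range(len(target)), key=lambda j: math.dist(p, target[j]))
            for p in source]

# Two toy 2-D point sets: each source point sits near the opposite-index
# target point, so the correspondence crosses over.
src = [(0.0, 0.0), (1.0, 1.0)]
tgt = [(1.1, 0.9), (0.1, -0.1)]
print(nearest_correspondence(src, tgt))  # → [1, 0]
```

This greedy step is where outliers cause trouble; the generate-more-points strategy described in the abstract is one way to avoid simply deleting them.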
