Spelling suggestions: "subject:"image segmentation"" "subject:"lmage segmentation""
91 |
Color Invariant Skin SegmentationXu, Han 25 March 2022 (has links)
This work addresses the problem of automatically detecting human skin in images without reliance on color information.
Unlike previous methods, we present a new approach that performs well in the absence of such information.
A key aspect of the work is that color-space augmentation is applied strategically during the training, with the goal of reducing the influence of features that are based entirely on color and increasing more semantic understanding.
The resulting system exhibits a dramatic improvement in performance for images in which color details are diminished.
We have demonstrated the concept using the U-Net architecture, and experimental results show improvements in evaluations for all Fitzpatrick skin tones in the ECU dataset.
We further tested the system with RFW dataset to show that the proposed method is consistent across different ethnicities and reduces bias to any skin tones.
Therefore, this work has strong potential to aid in mitigating bias in automated systems that can be applied to many applications including surveillance and biometrics. / Master of Science / Skin segmentation deals with the classification of skin and non-skin pixels and regions in a image containing these information.
Although most previous skin-detection methods have used color cues almost exclusively, they are vulnerable to external factors (e.g., poor or unnatural illumination and skin tones).
In this work, we present a new approach based on U-Net that performs well in the absence of color information.
To be specific, we apply a new color space augmentation into the training stage to improve the performance of skin segmentation system over the illumination and skin tone diverse. The system was trained and tested with both original and color changed ECU dataset. We also test our system with RFW dataset, a larger dataset with four human races with different skin tones. The experimental results show improvements in evaluations for skin tones and complex illuminations.
|
92 |
Image Segmentation and Range Estimation Using a Moving-aperture LensSubramanian, Anbumani 07 May 2001 (has links)
Given 2D images, it still remains a big challenge in the field of computer vision to group the image points into logical objects (segmentation) and to determine the locations in the scene (range estimation). Despite the decades of research, a single solution is yet to be found. Through our research we have demonstrated that a possible solution is to use moving aperture lens. This lens has the effect of introducing small, repeating movements of the camera center so that objects appear to translate in the image, by an amount that depends on distance from the plane of focus. Our novel method employs optical flow techniques to an image sequence, captured using a video camera with a moving aperture lens. For a stationary scene, optical flow magnitudes and direction are directly related to the three-dimensional object distance and location from the observer. Exploiting this information, we have successfully extracted objects at different depths and estimated the locations of objects in the scene, with respect to the plane of focus. Our work therefore demonstrates an ability for passive range estimation, without emitting any energy in an environment. Other potential applications include video compression, 3D video broadcast, teleconferencing and autonomous vehicle navigation. / Master of Science
|
93 |
A Machine Learning Approach to Recognize Environmental Features Associated with Social FactorsDiaz-Ramos, Jonathan 11 June 2024 (has links)
In this thesis we aim to supplement the Climate and Economic Justice Screening Tool (CE JST), which assists federal agencies in identifying disadvantaged census tracts, by extracting five environmental features from Google Street View (GSV) images. The five environmental features are garbage bags, greenery, and three distinct road damage types (longitudinal, transverse, and alligator cracks), which were identified using image classification, object detection, and image segmentation. We evaluate three cities using this developed feature space in order to distinguish between disadvantaged and non-disadvantaged census tracts.
The results of the analysis reveal the significance of the feature space and demonstrate the time efficiency, detail, and cost-effectiveness of the proposed methodology. / Master of Science / In this thesis we aim to supplement the Climate and Economic Justice Screening Tool (CE JST), which assists federal agencies in identifying disadvantaged census tracts, by extracting five environmental features from Google Street View (GSV) images. The five environmental features are garbage bags, greenery, and three distinct road damage types (longitudinal, transverse, and alligator cracks), which were identified using image classification, object detection, and image segmentation. We evaluate three cities using this developed feature space in order to distinguish between disadvantaged and non-disadvantaged census tracts.
The results of the analysis reveal the significance of the feature space and demonstrate the time efficiency, detail, and cost-effectiveness of the proposed methodology.
|
94 |
Continual Learning for Deep Dense PredictionLokegaonkar, Sanket Avinash 11 June 2018 (has links)
Transferring a deep learning model from old tasks to a new one is known to suffer from the catastrophic forgetting effects. Such forgetting mechanism is problematic as it does not allow us to accumulate knowledge sequentially and requires retaining and retraining on all the training data. Existing techniques for mitigating the abrupt performance degradation on previously trained tasks are mainly studied in the context of image classification. In this work, we present a simple method to alleviate catastrophic forgetting for pixel-wise dense labeling problems. We build upon the regularization technique using knowledge distillation to minimize the discrepancy between the posterior distribution of pixel class labels for old tasks predicted from 1) the original and 2) the updated networks. This technique, however, might fail in circumstances where the source and target distribution differ significantly. To handle the above scenario, we further propose an improvement to the distillation based approach by adding adaptive l2-regularization depending upon the per-parameter importance to the older tasks. We train our model on FCN8s, but our training can be generalized to stronger models like DeepLab, PSPNet, etc. Through extensive evaluation and comparisons, we show that our technique can incrementally train dense prediction models for novel object classes, different visual domains, and different visual tasks. / Master of Science / Modern deep networks have been successful on many important problems in computer vision viz. object classification, object detection, pixel-wise dense labeling, etc. However, learning various tasks incrementally still remains a fundamental problem in computer vision. When trained incrementally on multiple tasks, deep networks have been known to suffer from catastrophic forgetting on the older tasks, which leads to significant decrease in accuracy. Such forgetting is problematic and fundamentally constraints the knowledge that can be accumulated within these networks.
In this work, we present a simple algorithm to alleviate catastrophic forgetting for pixel-wise dense labeling problems. To prevent the network from forgetting connections important to the older tasks, we record the predicted labels/outputs of the older tasks for the images of the new task. While training on the images of the new task, we enforce a constraint to “remember” the recorded predictions of the older tasks while learning a new task. Additionally, we identify which connections in the deep network are important to the older tasks and prevent these connections from changing significantly. We show that our proposed algorithm can incrementally learn dense prediction models for novel object classes, different visual domains, and different visual tasks.
|
95 |
Semi-Automated Segmentation of 3D Medical Ultrasound ImagesQuartararo, John David 05 February 2009 (has links)
A level set-based segmentation procedure has been implemented to identify target object boundaries from 3D medical ultrasound images. Several test images (simulated, scanned phantoms, clinical) were subjected to various preprocessing methods and segmented. Two metrics of segmentation accuracy were used to compare the segmentation results to ground truth models and determine which preprocessing methods resulted in the best segmentations. It was found that by using an anisotropic diffusion filtering method to reduce speckle type noise with a 3D active contour segmentation routine using the level set method resulted in semi-automated segmentation on par with medical doctors hand-outlining the same images.
|
96 |
Efficient hierarchical layered graph approach for multi-region segmentation / Abordagem eficiente baseada em grafo hierárquico em camadas para a segmentação de múltiplas regiõesLeon, Leissi Margarita Castaneda 15 March 2019 (has links)
Image segmentation refers to the process of partitioning an image into meaningful regions of interest (objects) by assigning distinct labels to their composing pixels. Images are usually composed of multiple objects with distinctive features, thus requiring distinct high-level priors for their appropriate modeling. In order to obtain a good segmentation result, the segmentation method must attend all the individual priors of each object, as well as capture their inclusion/exclusion relations. However, many existing classical approaches do not include any form of structural information together with different high-level priors for each object into a single energy optimization. Consequently, they may be inappropriate in this context. We propose a novel efficient seed-based method for the multiple object segmentation of images based on graphs, named Hierarchical Layered Oriented Image Foresting Transform (HLOIFT). It uses a tree of the relations between the image objects, being each object represented by a node. Each tree node may contain different individual high-level priors and defines a weighted digraph, named as layer. The layer graphs are then integrated into a hierarchical graph, considering the hierarchical relations of inclusion and exclusion. A single energy optimization is performed in the hierarchical layered weighted digraph leading to globally optimal results satisfying all the high-level priors. The experimental evaluations of HLOIFT and its extensions, on medical, natural and synthetic images, indicate promising results comparable to the state-of-the-art methods, but with lower computational complexity. Compared to hierarchical segmentation by the min cut/max-flow algorithm, our approach is less restrictive, leading to globally optimal results in more general scenarios, and has a better running time. / A segmentação de imagem refere-se ao processo de particionar uma imagem em regiões significativas de interesse (objetos), atribuindo rótulos distintos aos seus pixels de composição. As imagens geralmente são compostas de vários objetos com características distintas, exigindo, assim, restrições de alto nível distintas para a sua modelagem apropriada. Para obter um bom resultado de segmentação, o método de segmentação deve atender a todas as restrições individuais de cada objeto, bem como capturar suas relações de inclusão/ exclusão. No entanto, muitas abordagens clássicas existentes não incluem nenhuma forma de informação estrutural, juntamente com diferentes restrições de alto nível para cada objeto em uma única otimização de energia. Consequentemente, elas podem ser inapropriadas nesse contexto. Estamos propondo um novo método eficiente baseado em sementes para a segmentação de múltiplos objetos em imagens baseado em grafos, chamado Hierarchical Layered Oriented Image Foresting Transform (HLOIFT). Ele usa uma árvore das relações entre os objetos de imagem, sendo cada objeto representado por um nó. Cada nó da árvore pode conter diferentes restrições individuais de alto nível, que são usadas para definir um dígrafo ponderado, nomeado como camada. Os grafos das camadas são então integrados em um grafo hierárquico, considerando as relações hierárquicas de inclusão e exclusão. Uma otimização de energia única é realizada no dígrafo hierárquico em camadas, levando a resultados globalmente ótimos, satisfazendo todas as restrições de alto nível. As avaliações experimentais do HLOIFT e de suas extensões, em imagens médicas, naturais e sintéticas,indicam resultados promissores comparáveis aos métodos do estado-da-arte, mas com menor complexidade computacional. Comparada à segmentação hierárquica pelo algoritmo min-cut/max-flow, nossa abordagem é menos restritiva, levando a resultados globalmente ótimo sem cenários mais gerais e com melhor tempo de execução.
|
97 |
Graph Laplacian for spectral clustering and seeded image segmentation / Estudo do Laplaciano do grafo para o problema de clusterização espectral e segmentação interativa de imagensCasaca, Wallace Correa de Oliveira 05 December 2014 (has links)
Image segmentation is an essential tool to enhance the ability of computer systems to efficiently perform elementary cognitive tasks such as detection, recognition and tracking. In this thesis we concentrate on the investigation of two fundamental topics in the context of image segmentation: spectral clustering and seeded image segmentation. We introduce two new algorithms for those topics that, in summary, rely on Laplacian-based operators, spectral graph theory, and minimization of energy functionals. The effectiveness of both segmentation algorithms is verified by visually evaluating the resulting partitions against state-of-the-art methods as well as through a variety of quantitative measures typically employed as benchmark by the image segmentation community. Our spectral-based segmentation algorithm combines image decomposition, similarity metrics, and spectral graph theory into a concise and powerful framework. An image decomposition is performed to split the input image into texture and cartoon components. Then, an affinity graph is generated and weights are assigned to the edges of the graph according to a gradient-based inner-product function. From the eigenstructure of the affinity graph, the image is partitioned through the spectral cut of the underlying graph. Moreover, the image partitioning can be improved by changing the graph weights by sketching interactively. Visual and numerical evaluation were conducted against representative spectral-based segmentation techniques using boundary and partition quality measures in the well-known BSDS dataset. Unlike most existing seed-based methods that rely on complex mathematical formulations that typically do not guarantee unique solution for the segmentation problem while still being prone to be trapped in local minima, our segmentation approach is mathematically simple to formulate, easy-to-implement, and it guarantees to produce a unique solution. Moreover, the formulation holds an anisotropic behavior, that is, pixels sharing similar attributes are preserved closer to each other while big discontinuities are naturally imposed on the boundary between image regions, thus ensuring better fitting on object boundaries. We show that the proposed approach significantly outperforms competing techniques both quantitatively as well as qualitatively, using the classical GrabCut dataset from Microsoft as a benchmark. While most of this research concentrates on the particular problem of segmenting an image, we also develop two new techniques to address the problem of image inpainting and photo colorization. Both methods couple the developed segmentation tools with other computer vision approaches in order to operate properly. / Segmentar uma image é visto nos dias de hoje como uma prerrogativa para melhorar a capacidade de sistemas de computador para realizar tarefas complexas de natureza cognitiva tais como detecção de objetos, reconhecimento de padrões e monitoramento de alvos. Esta pesquisa de doutorado visa estudar dois temas de fundamental importância no contexto de segmentação de imagens: clusterização espectral e segmentação interativa de imagens. Foram propostos dois novos algoritmos de segmentação dentro das linhas supracitadas, os quais se baseiam em operadores do Laplaciano, teoria espectral de grafos e na minimização de funcionais de energia. A eficácia de ambos os algoritmos pode ser constatada através de avaliações visuais das segmentações originadas, como também através de medidas quantitativas computadas com base nos resultados obtidos por técnicas do estado-da-arte em segmentação de imagens. Nosso primeiro algoritmo de segmentação, o qual ´e baseado na teoria espectral de grafos, combina técnicas de decomposição de imagens e medidas de similaridade em grafos em uma única e robusta ferramenta computacional. Primeiramente, um método de decomposição de imagens é aplicado para dividir a imagem alvo em duas componentes: textura e cartoon. Em seguida, um grafo de afinidade é gerado e pesos são atribuídos às suas arestas de acordo com uma função escalar proveniente de um operador de produto interno. Com base no grafo de afinidade, a imagem é então subdividida por meio do processo de corte espectral. Além disso, o resultado da segmentação pode ser refinado de forma interativa, mudando-se, desta forma, os pesos do grafo base. Experimentos visuais e numéricos foram conduzidos tomando-se por base métodos representativos do estado-da-arte e a clássica base de dados BSDS a fim de averiguar a eficiência da metodologia proposta. Ao contrário de grande parte dos métodos existentes de segmentação interativa, os quais são modelados por formulações matemáticas complexas que normalmente não garantem solução única para o problema de segmentação, nossa segunda metodologia aqui proposta é matematicamente simples de ser interpretada, fácil de implementar e ainda garante unicidade de solução. Além disso, o método proposto possui um comportamento anisotrópico, ou seja, pixels semelhantes são preservados mais próximos uns dos outros enquanto descontinuidades bruscas são impostas entre regiões da imagem onde as bordas são mais salientes. Como no caso anterior, foram realizadas diversas avaliações qualitativas e quantitativas envolvendo nossa técnica e métodos do estado-da-arte, tomando-se como referência a base de dados GrabCut da Microsoft. Enquanto a maior parte desta pesquisa de doutorado concentra-se no problema específico de segmentar imagens, como conteúdo complementar de pesquisa foram propostas duas novas técnicas para tratar o problema de retoque digital e colorização de imagens.
|
98 |
Graph Laplacian for spectral clustering and seeded image segmentation / Estudo do Laplaciano do grafo para o problema de clusterização espectral e segmentação interativa de imagensWallace Correa de Oliveira Casaca 05 December 2014 (has links)
Image segmentation is an essential tool to enhance the ability of computer systems to efficiently perform elementary cognitive tasks such as detection, recognition and tracking. In this thesis we concentrate on the investigation of two fundamental topics in the context of image segmentation: spectral clustering and seeded image segmentation. We introduce two new algorithms for those topics that, in summary, rely on Laplacian-based operators, spectral graph theory, and minimization of energy functionals. The effectiveness of both segmentation algorithms is verified by visually evaluating the resulting partitions against state-of-the-art methods as well as through a variety of quantitative measures typically employed as benchmark by the image segmentation community. Our spectral-based segmentation algorithm combines image decomposition, similarity metrics, and spectral graph theory into a concise and powerful framework. An image decomposition is performed to split the input image into texture and cartoon components. Then, an affinity graph is generated and weights are assigned to the edges of the graph according to a gradient-based inner-product function. From the eigenstructure of the affinity graph, the image is partitioned through the spectral cut of the underlying graph. Moreover, the image partitioning can be improved by changing the graph weights by sketching interactively. Visual and numerical evaluation were conducted against representative spectral-based segmentation techniques using boundary and partition quality measures in the well-known BSDS dataset. Unlike most existing seed-based methods that rely on complex mathematical formulations that typically do not guarantee unique solution for the segmentation problem while still being prone to be trapped in local minima, our segmentation approach is mathematically simple to formulate, easy-to-implement, and it guarantees to produce a unique solution. Moreover, the formulation holds an anisotropic behavior, that is, pixels sharing similar attributes are preserved closer to each other while big discontinuities are naturally imposed on the boundary between image regions, thus ensuring better fitting on object boundaries. We show that the proposed approach significantly outperforms competing techniques both quantitatively as well as qualitatively, using the classical GrabCut dataset from Microsoft as a benchmark. While most of this research concentrates on the particular problem of segmenting an image, we also develop two new techniques to address the problem of image inpainting and photo colorization. Both methods couple the developed segmentation tools with other computer vision approaches in order to operate properly. / Segmentar uma image é visto nos dias de hoje como uma prerrogativa para melhorar a capacidade de sistemas de computador para realizar tarefas complexas de natureza cognitiva tais como detecção de objetos, reconhecimento de padrões e monitoramento de alvos. Esta pesquisa de doutorado visa estudar dois temas de fundamental importância no contexto de segmentação de imagens: clusterização espectral e segmentação interativa de imagens. Foram propostos dois novos algoritmos de segmentação dentro das linhas supracitadas, os quais se baseiam em operadores do Laplaciano, teoria espectral de grafos e na minimização de funcionais de energia. A eficácia de ambos os algoritmos pode ser constatada através de avaliações visuais das segmentações originadas, como também através de medidas quantitativas computadas com base nos resultados obtidos por técnicas do estado-da-arte em segmentação de imagens. Nosso primeiro algoritmo de segmentação, o qual ´e baseado na teoria espectral de grafos, combina técnicas de decomposição de imagens e medidas de similaridade em grafos em uma única e robusta ferramenta computacional. Primeiramente, um método de decomposição de imagens é aplicado para dividir a imagem alvo em duas componentes: textura e cartoon. Em seguida, um grafo de afinidade é gerado e pesos são atribuídos às suas arestas de acordo com uma função escalar proveniente de um operador de produto interno. Com base no grafo de afinidade, a imagem é então subdividida por meio do processo de corte espectral. Além disso, o resultado da segmentação pode ser refinado de forma interativa, mudando-se, desta forma, os pesos do grafo base. Experimentos visuais e numéricos foram conduzidos tomando-se por base métodos representativos do estado-da-arte e a clássica base de dados BSDS a fim de averiguar a eficiência da metodologia proposta. Ao contrário de grande parte dos métodos existentes de segmentação interativa, os quais são modelados por formulações matemáticas complexas que normalmente não garantem solução única para o problema de segmentação, nossa segunda metodologia aqui proposta é matematicamente simples de ser interpretada, fácil de implementar e ainda garante unicidade de solução. Além disso, o método proposto possui um comportamento anisotrópico, ou seja, pixels semelhantes são preservados mais próximos uns dos outros enquanto descontinuidades bruscas são impostas entre regiões da imagem onde as bordas são mais salientes. Como no caso anterior, foram realizadas diversas avaliações qualitativas e quantitativas envolvendo nossa técnica e métodos do estado-da-arte, tomando-se como referência a base de dados GrabCut da Microsoft. Enquanto a maior parte desta pesquisa de doutorado concentra-se no problema específico de segmentar imagens, como conteúdo complementar de pesquisa foram propostas duas novas técnicas para tratar o problema de retoque digital e colorização de imagens.
|
99 |
Interactive segmentation of multiple 3D objects in medical images by optimum graph cuts = Segmentação interativa de múltiplos objetos 3D em imagens médicas por cortes ótimos em grafo / Segmentação interativa de múltiplos objetos 3D em imagens médicas por cortes ótimos em grafoMoya, Nikolas, 1991- 03 December 2015 (has links)
Orientador: Alexandre Xavier Falcão / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-27T14:45:13Z (GMT). No. of bitstreams: 1
Moya_Nikolas_M.pdf: 5706960 bytes, checksum: 9304544bfe8a78039de8b62562531865 (MD5)
Previous issue date: 2015 / Resumo: Segmentação de imagens médicas é crucial para extrair medidas de objetos 3D (estruturas anatômicas) que são úteis no diagnóstico e tratamento de doenças. Nestas aplicações, segmentação interativa é necessária quando métodos automáticos falham ou não são factíveis. Métodos por corte em grafo são considerados o estado da arte em segmentação interativa, mas diversas abordagens utilizam o algoritmo min-cut/max-flow, que é limitado à segmentação binária, sendo que segmentação de múltiplos objetos pode economizar tempo e esforço do usuário. Este trabalho revisita a transformada imagem floresta diferencial (DIFT, em inglês) -- uma abordagem por corte em grafo adequada para segmentação de múltiplos objetos -- resolvendo problemas relacionados a ela. O algoritmo da DIFT executa em tempo proporcional ao número de voxels nas regiões modificadas em cada execução da segmentação (sublinear). Tal característica é altamente desejável em segmentação interativa de imagens 3D para responder as ações do usuário em tempo real. O algoritmo da DIFT funciona da seguinte forma: o usuário desenha marcadores (traço com voxels de semente) rotulados dentro de cada objeto e o fundo, enquanto o computador interpreta a imagem como um grafo, cujos nós são os voxels e os arcos são definidos por pixels vizinhos, produzindo como resultado uma floresta de caminhos ótimos (partição na imagem) enraizada nos nós sementes do grafo. Nesta floresta, cada objeto é representado pela floresta de caminhos ótimos enraizado em suas sementes internas. Tais árvores são pintadas com a mesmo cor associada ao rótulo do marcador correspondente. Ao adicionar ou remover marcadores, é possível corrigir a segmentação até o mapa de rótulo de objeto representar o resultado desejado. Para garantir consistência na segmentação, métodos baseados em semente sempre devem manter a conectividade entre os voxels e suas sementes. Entretanto, isto não é mantido em algumas abordagens, como Random Walkers ou quando o mapa de rótulos é filtrado para suavizar a fronteira dos objetos. Esta conectividade é primordial para realizar correções sem recomeçar o processo depois de cada intervenção do usuário. Todavia, foi observado que a DIFT falha em manter consistência da segmentação em alguns casos. Consertamos este problema tanto no algoritmo da DIFT, quanto após a suavização dos objetos. Estes resultados comparam diversas estruturas anatômicas 3D de imagens de ressonância magnética e tomografia computadorizada / Abstract: Medical image segmentation is crucial to extract measures from 3D objects (body anatomical structures) that are useful for diagnosis and treatment of diseases. In such applications, interactive segmentation is necessary whenever automated methods fail or are not feasible. Graph-cut methods are considered the state of the art in interactive segmentation, but most approaches rely on the min-cut/max-flow algorithm, which is limited to binary segmentation while multi-object segmentation can considerably save user time and effort. This work revisits the differential image foresting transform (DIFT) ¿ a graph-cut approach suitable for multi-object segmentation in linear time ¿ and solves several problems related to it. Indeed, the DIFT algorithm can take time proportional to the number of voxels in the regions modified at each segmentation execution (sublinear time). Such a characteristic is highly desirable in 3D interactive segmentation to respond the user's actions as close as possible to real time. Segmentation using the DIFT works as follows: the user draws labeled markers (strokes of connected seed voxels) inside each object and background, while the computer interprets the image as a graph, whose nodes are the voxels and arcs are defined by neighboring voxels, and outputs an optimum-path forest (image partition) rooted at the seed nodes in the graph. In the forest, each object is represented by the optimum-path trees rooted at its internal seeds. Such trees are painted with same color associated to the label of the corresponding marker. By adding/removing markers, the user can correct segmentation until the forest (its object label map) represents the desired result. For the sake of consistency in segmentation, similar seed-based methods should always maintain the connectivity between voxels and seeds that have labeled them. However, this does not hold in some approaches, such as random walkers, or when the segmentation is filtered to smooth object boundaries. That connectivity is also paramount to make corrections without starting over the process at each user intervention. However, we observed that the DIFT algorithm fails in maintaining segmentation consistency in some cases. We have fixed this problem in the DIFT algorithm and when the obtained object boundaries are smoothed. These results are presented and evaluated on several 3D body anatomical structures from MR and CT images / Mestrado / Ciência da Computação / Mestre em Ciência da Computação
|
100 |
ANALYSIS OF VOCAL FOLD KINEMATICS USING HIGH SPEED VIDEOUnnikrishnan, Harikrishnan 01 January 2016 (has links)
Vocal folds are the twin in-folding of the mucous membrane stretched horizontally across the larynx. They vibrate modulating the constant air flow initiated from the lungs. The pulsating pressure wave blowing through the glottis is thus the source for voiced speech production. Study of vocal fold dynamics during voicing are critical for the treatment of voice pathologies. Since the vocal folds move at 100 - 350 cycles per second, their visual inspection is currently done by strobosocopy which merges information from multiple cycles to present an apparent motion. High Speed Digital Laryngeal Imaging(HSDLI) with a temporal resolution of up to 10,000 frames per second has been established as better suited for assessing the vocal fold vibratory function through direct recording. But the widespread use of HSDLI is limited due to lack of consensus on the modalities like features to be examined. Development of the image processing techniques which circumvents the need for the tedious and time consuming effort of examining large volumes of recording has room for improvement. Fundamental questions like the required frame rate or resolution for the recordings is still not adequately answered. HSDLI cannot get the absolute physical measurement of the anatomical features and vocal fold displacement. This work addresses these challenges through improved signal processing. A vocal fold edge extraction technique with subpixel accuracy, suited even for hard to record pediatric population is developed first. The algorithm which is equally applicable for pediatric and adult subjects, is implemented to facilitate user inspection and intervention. Objective features describing the fold dynamics, which are extracted from the edge displacement waveform are proposed and analyzed on a diverse dataset of healthy males, females and children. The sampling and quantization noise present in the recordings are analyzed and methods to mitigate them are investigated. A customized Kalman smoothing and spline interpolation on the displacement waveform is found to improve the feature estimation stability. The relationship between frame rate, spatial resolution and vibration for efficient capturing of information is derived. Finally, to address the inability to measure physical measurement, a structured light projection calibrated with respect to the endoscope is prototyped.
|
Page generated in 0.1352 seconds