  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Evolving spiking neural networks for adaptive audiovisual pattern recognition

Wysoski, Simei Gomes Unknown Date
This dissertation presents new modular and integrative information methods and systems inspired by the way the brain performs information processing, in particular, pattern recognition. The proposed artificial systems use spiking neurons as basic elements, which are the key components of spiking neural networks. Of particular interest to this research are various spiking neural network architectures and learning procedures that permit different pattern recognition problems to be solved in an evolvable and adaptive way. Spiking neural networks are used to model human visual and auditory pathways and are trained to perform the specific task of person authentication. The systems are individually tuned and trained to recognize facial information and to analyze sound signals from spoken sentences. The modelling of the integration of different sources of information (multisensory integration) using spiking neural networks is also a subject of investigation. A network architecture is proposed and a model for audiovisual pattern recognition is designed as an example. The main original contributions of this thesis are: a) Evaluation and further extension of adaptive learning procedures to perform visual pattern recognition. A new learning procedure that enables the system to change its structure, creating/merging neuronal maps of spiking neurons is presented and evaluated on a face recognition problem. b) Design of two new spiking neural network architectures to perform person authentication through the processing of speech signals. c) Design and evaluation of a new architecture that integrates sensory modalities based on spiking neurons. The integrative architecture combines opinions from individual modalities within a supramodal layer, which contains neurons sensitive to multiple sensory information. 
An additional feature that increases biological relevance is the crossmodal coupling of modalities, which enables a given sensory modality to exert direct influence on the processing areas typically associated with other modalities. The contributions were published in one journal paper and in four refereed international conference proceedings. The proposed system designs were implemented and, in computer simulations, demonstrated performance comparable to that of traditional benchmark methods. The systems have some promising features: they can be naturally optimized with respect to different criteria, namely accuracy (when very accurate results are expected), energy efficiency (when resource management plays an important role), and speed (when a decision needs to be made within a limited time). In this thesis, most of the parameters were exhaustively optimized by hand or with simple heuristics. As a direction for future work, there is an opportunity to include automated, specially tailored parameter optimization procedures or even general-purpose optimization algorithms, e.g., Genetic Algorithms and Particle Swarm Optimization. Overall, the results obtained in this thesis indicate that fast and accurate adaptive pattern recognition systems, scalable to multiple modalities, can be built with simple models of spiking neurons. However, it is important to advance the theory of spiking neurons in order to exploit their biological relevance and reach performance similar to or better than that of the human brain, for instance by exploring new neuron models, information coding schemes and network connectivity.
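As a rough illustration of what a "simple model of a spiking neuron" can look like, the sketch below simulates a leaky integrate-and-fire unit. The abstract does not say which neuron model the thesis uses, so the model choice, the function name `simulate_lif` and all parameter values here are assumptions made only for illustration.

```python
def simulate_lif(input_current, dt=1.0, tau=10.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0):
    """Return the step indices at which a leaky integrate-and-fire neuron spikes.

    Hypothetical illustration only: the parameters are arbitrary and are not
    taken from the thesis.
    """
    v = v_rest
    spikes = []
    for step, current in enumerate(input_current):
        # Forward-Euler update: the potential leaks toward rest and is
        # driven upward by the input current.
        v += dt / tau * (-(v - v_rest) + current)
        if v >= v_thresh:
            spikes.append(step)  # the neuron fires ...
            v = v_reset          # ... and its potential is reset
    return spikes


if __name__ == "__main__":
    print(simulate_lif([2.0] * 20))   # strong input: the neuron fires repeatedly
    print(simulate_lif([0.5] * 100))  # weak input: the threshold is never reached
```

A stronger input reaches the threshold sooner, which is one simple way to see how such a unit can trade decision speed against the amount of evidence it integrates.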
2

植基於質感圖樣之自動化人機區分機制 / A CAPTCHA Mechanism Based on Textured Patterns

張繼志, Chi-Chih Chang Unknown Date
With advances in technology and the development of information science, automated information processing has gradually replaced traditional manual techniques; inappropriate use of automation, however, may harm human interests. To prevent the damage caused by abuse of machine automation, this study proposes human-machine discrimination methods based on static and on dynamic images for different application scenarios, using simple image-processing techniques to generate test images that are hard for machines to analyze but easy for humans to recognize. From a cognitive perspective, experiments are designed to further examine the advantages of human vision and user acceptance, which serve as criteria for image generation. Finally, methods for integrating and implementing the human-machine discrimination technique in application scenarios are proposed to observe its practical effectiveness. / The idea of using a computer program to distinguish humans from machines, sometimes referred to as the "Reverse Turing Test", has emerged only quite recently. The term CAPTCHA, which stands for "Completely Automated Public Turing Test to Tell Computers and Humans Apart", is defined as "a program that can generate and grade tests that most humans can pass but current computer programs cannot pass". In this thesis, a texture-image-based approach is developed to encode text information in such a way that machine vision algorithms will experience significant difficulties while humans can extract the embedded text effortlessly. Both static images and dynamic sequences will be explored. It is anticipated that the cost of storing, and subsequently decoding information from, such visual patterns will be prohibitively high in terms of both time and space complexity. To validate this postulate, fundamental principles of the human cognitive process will be examined. Experiments will also be carried out to gather user feedback and investigate the limitations of the human visual system. Finally, several application scenarios that call for the integration of a CAPTCHA will be identified and discussed.
4

Effet de l'organisation des informations visuelles et de l'expertise sur les stratégies d'exploration visuelle dans un paradigme multitâches / Impact of visual elements organisation and the expertise on visual's exploration strategy on multitask paradigm

Dusaucy, Valériane 16 December 2016
The aim of this work is to show that expertise can be a solution to the drop in performance observed in a multitasking paradigm. Many studies of expertise report that experts draw on previously acquired knowledge to carry out a task (top-down strategy), whereas novices explore the material more exhaustively, starting only from the objective of the task to be achieved (bottom-up strategy). We carried out three studies. The first examined the visual patterns of expert players of the game WoW, who were asked to memorize elements of a video (simple vs. complex) while listening to a story. When the material was simple, the experts used a top-down strategy; as the task became more complex, they gradually returned to a bottom-up strategy. The novices followed a bottom-up strategy throughout the experiment. The analysis of the saliency map suggests that salience captures the gaze of novices, whereas experts manage to inhibit it and explore the information most important for the task. From these results, we updated a grid of usability heuristics. Finally, we studied visual patterns in a more ecological setting, the online booking of airline tickets. The results, like those of the first experiment, show the same type of visual pattern found in research involving expertise alone, which suggests that experts in a domain are also experts at multitasking in that domain. Moreover, the last experiment shows that experts use top-down strategies whatever the workload, even in the most complex condition.
5

Gravobraduras: processos de impressão e objetos de estrutura dobrada / Gravobraduras: printmaking processes and folding objects

Ávila, Eduardo Araújo de 27 March 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / This work presents the outcomes (unfoldings) of a master's research project in Art and Visual Culture, consisting of experiments in printmaking and studies on the use of folded paper as printing matrices, a poetic concept defined as "gravobraduras". The study takes as its reference the wrapping culture (tsutsumu bunka), which involves not only origami but also other practices, such as furoshiki, that encompass techniques for producing coverings and wrappings. This reference alludes to the "Japanese practice of gift-giving", and the artistic production of this work to the "wanderings of Japanese immigrants in Brazil". The primary objective of the research is therefore to analyze the paths, sometimes errant and sometimes constant, that the image generated by the "folded matrix" travels until it becomes a visual pattern, and to evaluate the development of the artistic production in three printing methods: high relief (folded matrix), permeation (silkscreen) and digital means (sublimation). The research also involves reflections on artistic creation as a network of creation, following Cecília Salles, who stresses the importance of concomitant actions and of establishing links between subjects relevant to the research in order to generate its own methodology and poetics.
6

Modelação de fenômenos de plasticidade rápida no sistema visual de mamíferos / Modeling Fast Plasticity Phenomena in the Mammalian Primary Visual Cortex

Oliveira, Rodrigo Freire 09 October 2006
Neurons in the primary visual cortex (V1) are selective for the orientation, direction and spatial frequency of stimuli presented in their receptive fields. The last 40 years have seen the accumulation of a considerable amount of theory and data about the cortical processing of feature selectivity. Yet a consensus on the mechanisms that underlie orientation preference, one of the most conspicuous features of early visual cortical processing, is still far from being reached. The picture becomes richer still with recent evidence of plasticity operating on different time scales as early as V1, resulting in a dynamic organization of orientation selectivity previously thought to be rigid and unmodifiable in the adult cortex. In this work we present a spiking neuron model of the primate primary visual cortex composed of 6 cortical layers, representing the M channel of visual processing. The physiological and neuroanatomical properties of the model were derived from experimental data for the primate visual pathway. In the first part we present the orientation selectivity profile of the model and compare it with experimental results. The model neurons show a diversity of orientation selectivity patterns consistent with experimental data (measured with OSI, CV, HWB). This diversity reflects the heterogeneity of electrophysiological classes in the model and the different patterns of laminar circuitry. In the second part we examine the role of short-term plasticity of the intracortical circuitry in the dynamic modification of orientation selectivity profiles. Depression and a shift of the response around the preferred orientation were observed, but not enhancement at the far flanks of the tuning curves. The simulated neurons also showed some diversity in their short-term plasticity profiles, restricted to layers with a high density of bursting cells.
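The abstract names OSI and CV as selectivity measures without defining them. The sketch below uses two common conventions, which are assumptions here and may differ from the thesis: circular variance as one minus the normalized length of the response vector summed over doubled orientation angles, and an orientation selectivity index that compares the preferred response with the response at the orthogonal orientation. The function names are hypothetical.

```python
import cmath
import math


def circular_variance(orientations_deg, responses):
    """CV = 1 - |sum_k r_k * exp(2i * theta_k)| / sum_k r_k (assumed definition).

    0 means a perfectly sharp tuning curve, 1 means no orientation preference.
    """
    total = sum(responses)
    if total <= 0:
        raise ValueError("responses must have a positive sum")
    # Angles are doubled because orientation is periodic over 180 degrees.
    resultant = sum(
        r * cmath.exp(2j * math.radians(theta))
        for theta, r in zip(orientations_deg, responses)
    )
    return 1.0 - abs(resultant) / total


def orientation_selectivity_index(orientations_deg, responses):
    """OSI = (R_pref - R_orth) / (R_pref + R_orth) (assumed definition).

    Raises ValueError if the orientation orthogonal to the preferred one
    was not sampled.
    """
    orientations = [theta % 180 for theta in orientations_deg]
    i_pref = max(range(len(responses)), key=lambda i: responses[i])
    theta_orth = (orientations[i_pref] + 90) % 180
    i_orth = orientations.index(theta_orth)
    r_pref, r_orth = responses[i_pref], responses[i_orth]
    if r_pref + r_orth == 0:
        raise ValueError("preferred and orthogonal responses are both zero")
    return (r_pref - r_orth) / (r_pref + r_orth)


if __name__ == "__main__":
    angles = [0, 45, 90, 135]   # stimulus orientations in degrees
    rates = [10, 5, 0, 5]       # made-up firing rates, for illustration only
    print(circular_variance(angles, rates))
    print(orientation_selectivity_index(angles, rates))
```

Circular variance uses every sampled orientation, while this form of OSI uses only two points of the tuning curve, which is one reason studies often report both.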
7

Análise de sinais de voz por padrões visuais de dinâmica vocal / Voice signal analysis using vocal dynamic visual patterns

Dajer, Maria Eugenia 30 July 2010
The aim of this research was to evaluate healthy and pathological voices using Vocal Dynamic Visual Patterns (VDVP) analysis in combination with acoustic analysis and auditory-perceptual analysis. Ninety-one voice signals of the sustained vowel /a/ in Brazilian Portuguese, from subjects of both genders aged 21 to 88 years, were evaluated. The voices were recorded in mono-channel WAV format with a sampling rate of 22,050 Hz and 16-bit amplitude quantization. Acoustic values for jitter, shimmer and fundamental frequency were obtained. For the auditory-perceptual analysis, roughness, breathiness, strain and instability were assessed. Phase space reconstruction was used to describe the dynamics of the voice signals as VDVP, and the parameters of loops, regularity and convergence of the trajectories were analyzed qualitatively. Parametric and non-parametric statistical tests were applied. The results show that jitter is negatively correlated with loops, regularity and trajectory convergence, and that shimmer is negatively correlated with convergence and loops. Roughness and breathiness are negatively correlated with the three dynamic parameters. Qualitative VDVP analysis is a promising technique for voice evaluation because it takes into account the chaotic and deterministic components of the voice. It is suggested that it does not replace existing voice analysis techniques, although it can improve and complement the methods used by speech therapists and otorhinolaryngologists.
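To make the techniques named in this abstract more concrete, the following sketch shows a time-delay embedding, a common way to reconstruct a phase space from a one-dimensional signal, and a local perturbation ratio of the kind usually used for jitter (on cycle periods) and shimmer (on cycle peak amplitudes). The abstract does not give the embedding dimension, the delay or the exact jitter and shimmer formulas, so the function names, defaults and definitions here are assumptions.

```python
def delay_embed(signal, dim=2, delay=1):
    """Time-delay embedding: map x[i] to (x[i], x[i+delay], ..., x[i+(dim-1)*delay]).

    Plotting the returned points for a voice signal gives the kind of
    trajectory whose loops and regularity can be inspected visually.
    """
    n_points = len(signal) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal is too short for this dim and delay")
    return [
        tuple(signal[i + k * delay] for k in range(dim))
        for i in range(n_points)
    ]


def local_perturbation(values):
    """Mean absolute difference of consecutive values divided by the mean value.

    Applied to cycle periods this is a common 'local jitter' ratio; applied
    to cycle peak amplitudes, a common 'local shimmer' ratio (assumed
    definitions, not taken from the thesis).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


if __name__ == "__main__":
    print(delay_embed([0, 1, 2, 3, 4], dim=2, delay=2))
    periods_ms = [9.0, 11.0, 9.0, 11.0]  # made-up cycle periods in milliseconds
    print(local_perturbation(periods_ms))
```

A perfectly periodic signal traces a closed, repeating loop in the embedded space and has a perturbation ratio of zero, which is consistent with the negative correlations between jitter or shimmer and the regularity of the trajectories reported above.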
10

VISUAL SEMANTIC SEGMENTATION AND ITS APPLICATIONS

Gao, Jizhou 01 January 2013
This dissertation addresses the difficulties of semantic segmentation when dealing with extensive collections of images and 3D point clouds. Owing to the ubiquity of digital cameras that capture the world around us, and to advanced scanning techniques that can record 3D replicas of real cities, the sheer amount of visual data available presents many opportunities for both academic research and industrial applications. But this quantity of data also poses a tremendous challenge. In particular, the problem of distilling useful information from such a large repository of visual data has attracted ongoing interest in the fields of computer vision and data mining. Structural semantics are fundamental to understanding both natural and man-made objects. Buildings, for example, are like languages in that they are made up of repeated structures or patterns that can be captured in images. To find these recurring patterns in images, I present an unsupervised frequent visual pattern mining approach that goes beyond co-location to identify spatially coherent visual patterns, regardless of their shape, size, location and orientation. First, my approach categorizes visual items from scale-invariant image primitives with similar appearance, using a suite of polynomial-time algorithms designed to identify consistent structural associations among visual items, which represent frequent visual patterns. After detecting repetitive image patterns, I use unsupervised and automatic segmentation of the identified patterns to generate more semantically meaningful representations. The underlying assumption is that pixels capturing the same portion of an image pattern are visually consistent, while pixels that come from different backdrops are usually inconsistent. I further extend this approach to perform automatic segmentation of foreground objects from an Internet photo collection of landmark locations.
New scanning technologies have successfully advanced the digital acquisition of large-scale urban landscapes. To address semantic segmentation and reconstruction of such data, using LiDAR point clouds and geo-registered images of large-scale residential areas, I develop a complete system that uses classification and segmentation methods together to first identify different object categories and then applies category-specific reconstruction techniques to create visually pleasing and complete scene models.
