Global ETD Search

191	Bayes Optimal Feature Selection for Supervised Learning Saneem Ahmed, C G January 2014 (has links) (PDF) The problem of feature selection is critical in several areas of machine learning and data analysis, such as cancer classification using gene expression data and text categorization. In this work, we consider feature selection for supervised learning problems, where one wishes to select a small set of features that facilitate learning a good prediction model in the reduced feature space. Our interest is primarily in filter methods, which select features independently of the learning algorithm to be used and are generally faster to implement than other types of feature selection algorithms. Many common filter methods for feature selection make use of information-theoretic criteria, such as those based on mutual information, to guide their search process. However, even in simple binary classification problems, mutual information based methods do not always select the best set of features in terms of the Bayes error. In this thesis, we develop a general approach for selecting a set of features that directly aims to minimize the Bayes error in the reduced feature space with respect to the loss or performance measure of interest. We show that the mutual information based criterion is a special case of our setting when the loss function of interest is the logarithmic loss for class probability estimation. We give a greedy forward algorithm for approximately optimizing this criterion and demonstrate its application to several supervised learning problems including binary classification (with 0-1 error, cost-sensitive error, and F-measure), binary class probability estimation (with logarithmic loss), bipartite ranking (with pairwise disagreement loss), and multiclass classification (with multiclass 0-1 error). Our experiments suggest that the proposed approach is competitive with several state-of-the-art methods. 
Data Analysis Logarithms Supervised Learning Bayes Optimality Binary Classification Bipartite Ranking Multiclass Classification Bayes Optimal Feature Selection Optimal Feature Selection Bayes Error Binary Class Probability Estimation Supervised Learning Problems Computer Science
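The greedy forward idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes discrete features, uses a plug-in (training-set) estimate of the 0-1 Bayes error, and the function names `plug_in_bayes_error` and `greedy_forward_selection` are hypothetical.

```python
from collections import Counter, defaultdict


def plug_in_bayes_error(rows, labels, subset):
    """Plug-in estimate of the 0-1 Bayes error using only features in `subset`.

    Assumption: features are discrete, so samples can be grouped into cells
    by their values on the selected features.
    """
    cells = defaultdict(Counter)
    for row, y in zip(rows, labels):
        key = tuple(row[j] for j in subset)
        cells[key][y] += 1
    # Within each cell, the best possible prediction is the majority label.
    correct = sum(max(counts.values()) for counts in cells.values())
    return 1.0 - correct / len(labels)


def greedy_forward_selection(rows, labels, k):
    """Add, one at a time, the feature that most reduces the estimated error."""
    selected = []
    remaining = list(range(len(rows[0])))
    for _ in range(k):
        best = min(
            remaining,
            key=lambda j: plug_in_bayes_error(rows, labels, selected + [j]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 0 determines the label, feature 1 is noise, feature 2 is constant.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
labels = [0, 0, 1, 1]
print(greedy_forward_selection(rows, labels, 1))
```

Swapping the error estimate for a different loss (for example, a cost-sensitive error) changes the selection criterion without changing the greedy loop, which is the sense in which the criterion depends on the loss of interest.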
192	Spike-Based Bayesian-Hebbian Learning in Cortical and Subcortical Microcircuits Tully, Philip January 2017 (has links) Cortical and subcortical microcircuits are continuously modified throughout life. Despite ongoing changes these networks stubbornly maintain their functions, which persist although destabilizing synaptic and nonsynaptic mechanisms should ostensibly propel them towards runaway excitation or quiescence. What dynamical phenomena exist to act together to balance such learning with information processing? What types of activity patterns do they underpin, and how do these patterns relate to our perceptual experiences? What enables learning and memory operations to occur despite such massive and constant neural reorganization? Progress towards answering many of these questions can be pursued through large-scale neuronal simulations. In this thesis, a Hebbian learning rule for spiking neurons inspired by statistical inference is introduced. The spike-based version of the Bayesian Confidence Propagation Neural Network (BCPNN) learning rule involves changes in both synaptic strengths and intrinsic neuronal currents. The model is motivated by molecular cascades whose functional outcomes are mapped onto biological mechanisms such as Hebbian and homeostatic plasticity, neuromodulation, and intrinsic excitability. Temporally interacting memory traces enable spike-timing dependence, a stable learning regime that remains competitive, postsynaptic activity regulation, spike-based reinforcement learning and intrinsic graded persistent firing levels. The thesis seeks to demonstrate how multiple interacting plasticity mechanisms can coordinate reinforcement, auto- and hetero-associative learning within large-scale, spiking, plastic neuronal networks. Spiking neural networks can represent information in the form of probability distributions, and a biophysical realization of Bayesian computation can help reconcile disparate experimental observations. 
Bayes' rule synaptic plasticity and memory modeling intrinsic excitability naïve Bayes classifier spiking neural networks Hebbian learning neuromorphic engineering reinforcement learning temporal sequence learning attractor network Computer Systems Datorsystem
193	[en] A STUDY OF MULTILABEL TEXT CLASSIFICATION ALGORITHMS USING NAIVE-BAYES / [pt] UM ESTUDO DE ALGORITMOS PARA CLASSIFICAÇÃO AUTOMÁTICA DE TEXTOS UTILIZANDO NAIVE-BAYES DAVID STEINBRUCH 12 March 2007 (has links) [pt] A quantidade de informação eletrônica vem crescendo de forma acelerada, motivada principalmente pela facilidade de publicação e divulgação que a Internet proporciona. Desta forma, é necessária a organização da informação de forma a facilitar a sua aquisição. Muitos trabalhos propuseram resolver este problema através da classificação automática de textos associando a eles vários rótulos (classificação multirótulo). No entanto, estes trabalhos transformam este problema em subproblemas de classificação binária, considerando que existe independência entre as categorias. Além disso, utilizam limiares (thresholds), que são muito específicos para o conjunto de treinamento utilizado, não possuindo grande capacidade de generalização na aprendizagem. Esta dissertação propõe dois algoritmos de classificação automática de textos baseados no algoritmo multinomial naive Bayes e sua utilização em um ambiente on-line de classificação automática de textos com realimentação de relevância pelo usuário. Para testar a eficiência dos algoritmos propostos, foram realizados experimentos na base de notícias Reuters 21578 e na base de documentos médicos Ohsumed. / [en] The amount of electronic information has been growing rapidly, driven mainly by the ease of publication and dissemination that the Internet provides. It is therefore necessary to organize information so as to facilitate its retrieval. Many works have proposed solving this problem through automatic text classification, assigning several labels to each text (multilabel classification). However, those works transform the problem into binary classification subproblems, assuming independence among the categories. 
Moreover, they use thresholds that are very specific to the training set used and therefore generalize poorly. This thesis proposes two automatic text classification algorithms based on the multinomial naive Bayes algorithm, and their use in an online text classification environment with user relevance feedback. To test the efficiency of the proposed algorithms, experiments were performed on the Reuters 21578 news collection and on the Ohsumed medical document collection. [pt] APRENDIZADO DE MAQUINA [en] MACHINE LEARNING [pt] INTERNET [en] INTERNET [pt] CATEGORIZACAO DE TEXTOS [en] TEXT CATEGORIZATION [pt] CLASSIFICACAO DE TEXTOS [en] TEXT CLASSIFICATION [pt] MULTIROTULO [en] MULTILABEL [pt] NAIVE-BAYES [en] NAIVE-BAYES
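For readers unfamiliar with the multinomial naive Bayes model on which the abstract above builds, a minimal single-label sketch follows. It is not either of the two algorithms proposed in the thesis (which are multilabel and are not described in the abstract); the Laplace smoothing parameter `alpha`, the function names, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log word likelihoods per class."""
    vocab = {word for doc in docs for word in doc}
    docs_per_class = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        word_counts[y].update(doc)
    model = {}
    for y, n_docs in docs_per_class.items():
        denom = sum(word_counts[y].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(docs))
        log_lik = {w: math.log((word_counts[y][w] + alpha) / denom) for w in vocab}
        model[y] = (log_prior, log_lik)
    return model


def predict(model, doc):
    """Return the class with the highest posterior log score (unknown words are skipped)."""
    def score(y):
        log_prior, log_lik = model[y]
        return log_prior + sum(log_lik[w] for w in doc if w in log_lik)
    return max(model, key=score)


# Hypothetical toy corpus: each document is a list of tokens.
docs = [["oil", "price"], ["oil", "barrel"], ["wheat", "crop"], ["wheat", "harvest"]]
labels = ["energy", "energy", "grain", "grain"]
model = train_multinomial_nb(docs, labels)
print(predict(model, ["oil", "crop", "oil"]))
```

A multilabel variant would need some rule for emitting several classes per document; the abstract criticizes fixed thresholds for that purpose, so none is shown here.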
194	Estimativa do valor da taxa de penetrância em doenças autossômicas dominantes: estudo teórico de modelos e desenvolvimento de um programa computacional / Penetrance rate estimation for autosomal dominant diseases: study of models and development of a computer program Horimoto, Andréa Roselí Vançan Russo 17 September 2009 (has links) O objetivo principal do trabalho foi o desenvolvimento de um programa computacional, em linguagem Microsoft Visual Basic 6.0 (versão executável), para estimativa da taxa de penetrância a partir da análise de genealogias com casos de doenças com herança autossômica dominante. Embora muitos dos algoritmos empregados no programa tenham se baseado em idéias já publicadas na literatura (em sua maioria por pesquisadores e pós-graduandos do Laboratório de Genética Humana do Instituto de Biociências da Universidade de São Paulo), desenvolvemos alguns métodos inéditos para lidar com situações encontradas com certa frequência nos heredogramas publicados na literatura, como: a) ausência de informações sobre o fenótipo do indivíduo gerador da genealogia; b) agrupamento de árvores de indivíduos normais sem a descrição da distribuição de filhos entre os progenitores; c) análise de estruturas da genealogia contendo uniões consanguíneas, utilizando um método alternativo ao descrito na literatura; d) determinação de soluções gerais para as funções de verossimilhança de árvores de indivíduos normais com ramificação regular e para as probabilidades de heterozigose de qualquer indivíduo pertencente a essas árvores. Além da versão executável, o programa, denominado PenCalc, é apresentado também numa versão para Internet (PenCalc Web), a qual fornece adicionalmente as probabilidades de heterozigose e o cálculo de afecção na prole de todos os indivíduos da genealogia. Essa versão pode ser acessada livre e gratuitamente no endereço http://www.ib.usp.br/~otto/pencalcweb. 
Desenvolvemos também um modelo com taxa de penetrância variável dependente da geração, uma vez que a inspeção de famílias com doenças autossômicas dominantes, como é o caso da síndrome da ectrodactilia associada à hemimelia tibial (EHT), sugere a existência de um fenômeno similar à antecipação, em relação à taxa de penetrância. Os modelos com taxa de penetrância constante e variável, e os métodos desenvolvidos neste trabalho foram aplicados a 21 heredogramas de famílias com afetados pela EHT e ao conjunto das informações de todas essas genealogias (meta-análise), obtendo-se em todos os casos estimativas da taxa de penetrância. / The main objective of this dissertation was the development of a computer program, in Microsoft® Visual Basic® 6.0, for estimating the penetrance rate of autosomal dominant diseases by means of the information contained on genealogies. Some of the algorithms we used in the program were based on ideas already published in the literature by researchers and (post-) graduate students of the Laboratory of Human Genetics, Department of Genetics and Evolutionary Biology, Institute of Biosciences, University of São Paulo. We developed several other methods to deal with particular structures found frequently in the genealogies published in the literature, such as: a) the absence of information on the phenotype of the individual generating of the genealogy; b) the grouping of trees of normal individuals without the separate description of the offspring number per individual; c) the analysis of structures containing consanguineous unions; d) the determination of general solutions in simple analytic form for the likelihood functions of trees of normal individuals with regular branching and for the heterozygosis probabilities of any individual belonging to these trees. In addition to the executable version of the program summarized above, we also prepared, in collaboration with the dissertation supervisor and the undergraduate student Marcio T. 
Onodera (main author of this particular version), another program, represented by a web version (PenCalc Web). It enables the calculation of heterozygosis probabilities and the offspring risk for all individuals of the genealogy, two details we did not include in the present version of our program. The program PenCalc Web can be accessed freely at the home-page address http://www.ib.usp.br/~otto/pencalcweb. Another important contribution of this dissertation was the development of an estimation model with a generation-dependent penetrance rate, as suggested by the inspection of families with some autosomal dominant diseases, such as the ectrodactyly-tibial hemimelia syndrome (ETH), a condition which exhibits a phenomenon similar to anticipation in relation to the penetrance rate. The models with constant and variable penetrance rates, as well as practically all the methods developed in this dissertation, were applied to 21 individual genealogies from the literature with cases of ETH and to the set of all these genealogies (meta-analysis). The corresponding results of all these analyses are comprehensively presented. Bayes method Computer program Maximum likelihood method Método de Bayes Método de máxima verossimilhança. Modelos matemáticos Models Penetrance rate Programa computacional Taxa de penetrância
195	SELEÇÃO DE ATRIBUTOS EM IMAGENS COLETADAS SOB CONDIÇÕES DE ILUMINAÇÃO NÃO CONTROLADA E SUA INFLUÊNCIA NO DESEMPENHO DE CLASSIFICADORES NAIVE BAYES PARA IDENTIFICAÇÃO DE OBJETOS EM ESTUFAS AGRÍCOLAS Gaspareto, Marinaldo José 10 September 2013 (has links) A problem in implementing navigation systems for autonomous mobile robots is detecting the objects of interest and the obstacles in the environment. This study considers the detection of walls and low walls of agricultural greenhouses in digital images acquired without illumination control. The proposed approach employs digital image processing and digital classification techniques to detect the object of interest. The classifier developed was of the Naive Bayes type. Two important issues when employing classification methods in computer vision are the accuracy of the classifier and its computation time complexity. The selection of the descriptor attributes that make up a classifier has a great impact on these two factors; in general, the fewer attributes required, the lower the computational cost. Accordingly, this study compared the performance of two feature selection methods based on principal component analysis, named B2 and B4, in two scenarios. In the first scenario, feature selection was conducted on the data extracted from all images. In the second, selection was performed on images grouped by similarity. After selection, the attributes selected in each approach were used to build Naive Bayes classifiers with 12, 17, 22 and 27 input variables. 
The results indicate that grouping the images is useful when: (a) the distance from the center of the group to the center of the original database exceeds a threshold and (b) the correlation between the descriptor variables and the target variable is greater in the group than in the complete data set. Keywords: Greenhouses, Autonomous navigation, Selection attributes, Naive Bayes classifiers. / Um problema relativo à implementação de sistemas de navegação para robôs autônomos móveis é a detecção dos objetos de interesse e dos obstáculos que estão no ambiente. Este trabalho considera a detecção das paredes/muretas de estufas agrícolas em imagens digitais adquiridas sem controle de iluminação. A abordagem proposta emprega técnicas de processamento digital de imagens e classificação digital para detectar o objeto de interesse. O classificador digital desenvolvido foi do tipo Naive Bayes. Duas questões importantes quando do emprego de métodos de classificação em visão computacional são a acurácia do classificador e a complexidade de tempo de computação. A seleção dos atributos descritores que compõem um classificador tem grande impacto sobre estes dois fatores, de um modo geral, quanto menos atributos forem necessários, menor o custo computacional. Considerando isso, este trabalho comparou o desempenho de dois métodos de seleção de atributos baseados na análise de componentes principais, chamados B2 e B4 em duas situações. Na primeira situação, a seleção de atributos foi realizada sobre o conjunto dos dados extraídos de todas as imagens. Na segunda, a seleção foi realizada para imagens agrupadas por similaridade. Após a seleção, os atributos selecionados em cada uma das abordagens foram usados para construir classificadores do tipo Naive Bayes com 12, 17, 22 e 27 variáveis de entrada. 
Os resultados indicam que o agrupamento de imagens é útil quando: (a) a distância do centro do grupo ao centro da base original ultrapassa um limiar e (b) a correlação entre as variáveis descritoras e a variável meta é maior no grupo do que no conjunto completo de dados. seleção de atributos classificadores Naive Bayes greenhouses autonomous navigation selection attributes Naive Bayes classifiers
196	Utilidade para testes de significância / Utility for significance tests Moura, Nathália Demetrio Vasconcelos 16 May 2014 (has links) Neste trabalho discutimos os principais argumentos da inferência bayesiana subjetivista. Posteriormente, a partir de uma revisão da literatura dos testes de hipóteses, os principais testes são analisados sob a ótica da teoria da decisão, particularmente no que tange às hipóteses precisas. Adicionalmente, funções de perda para testes de significância, seguindo a proposta de Fisher e do FBST, são analisadas e comparadas. / This work discusses the main arguments of subjectivist Bayesian inference. Subsequently, based on a literature review of hypothesis testing, the main tests are analyzed from a decision-theoretic viewpoint, particularly with regard to precise hypotheses. Additionally, loss functions for significance tests, following the proposals of Fisher and the FBST, are analyzed and compared. Bayes Bayes decision theory FBST FBST Fisher Fisher função de perda hipótese precisa Jeffreys Jeffreys loss function Popper Popper precise null hypothesis significance tests teoria da decisão testes de significância
197	Seleção entre estratégias de geração automática de dados de teste por meio de métricas estáticas de softwares orientados a objetos / Selection between whole test generation strategies by analysing object oriented software static metrics Ramos, Gustavo da Mota 09 October 2018 (has links) Produtos de software com diferentes complexidades são criados diariamente através da elicitação de demandas complexas e variadas juntamente a prazos restritos. Enquanto estes surgem, altos níveis de qualidade são esperados para tais, ou seja, enquanto os produtos tornam-se mais complexos, o nível de qualidade pode não ser aceitável enquanto o tempo hábil para testes não acompanha a complexidade. Desta maneira, o teste de software e a geração automática de dados de testes surgem com o intuito de entregar produtos contendo altos níveis de qualidade mediante baixos custos e rápidas atividades de teste. Porém, neste contexto, os profissionais de desenvolvimento dependem das estratégias de geração automáticas de testes e principalmente da seleção da técnica mais adequada para conseguir maior cobertura de código possível, este é um fator importante dados que cada técnica de geração de dados de teste possui particularidades e problemas que fazem seu uso melhor em determinados tipos de software. A partir desde cenário, o presente trabalho propõe a seleção da técnica adequada para cada classe de um software com base em suas características, expressas por meio de métricas de softwares orientados a objetos a partir do algoritmo de classificação Naive Bayes. Foi realizada uma revisão bibliográfica de dois algoritmos de geração, algoritmo de busca aleatório e algoritmo de busca genético, compreendendo assim suas vantagens e desvantagens tanto de implementação como de execução. As métricas CK também foram estudadas com o intuito de compreender como estas podem descrever melhor as características de uma classe. 
O conhecimento adquirido possibilitou coletar os dados de geração de testes de cada classe como cobertura de código e tempo de geração a partir de cada técnica e também as métricas CK, permitindo assim a análise destes dados em conjunto e por fim execução do algoritmo de classificação. Os resultados desta análise demonstraram que um conjunto reduzido e selecionado das métricas CK é mais eficiente e descreve melhor as características de uma classe se comparado ao uso do conjunto por completo. Os resultados apontam também que as métricas CK não influenciam o tempo de geração dos dados de teste, entretanto, as métricas CK demonstraram correlação moderada e influência na seleção do algoritmo genético, participando assim na sua seleção pelo algoritmo Naive Bayes / Software products of varying complexity are created daily by eliciting complex and varied demands under tight deadlines. High levels of quality are expected of them, yet as products become more complex, the quality level may not be acceptable because the time available for testing does not keep up with the complexity. Software testing and automatic test data generation thus arise with the aim of delivering high-quality products through low-cost, rapid testing activities. In this context, however, developers depend on automatic test generation strategies and especially on selecting the most suitable technique to obtain the greatest possible code coverage; this is an important factor, given that each test data generation technique has particularities and problems that make it better suited to certain types of software. Given this scenario, the present work proposes selecting the appropriate technique for each class of a software system based on its characteristics, expressed through object-oriented software metrics, using the Naive Bayes classification algorithm. 
Initially, a literature review of two generation algorithms, a random search algorithm and a genetic search algorithm, was carried out in order to understand their advantages and disadvantages in both implementation and execution. The CK metrics were also studied in order to understand how they can better describe the characteristics of a class. The knowledge acquired made it possible to collect, for each class, test generation data such as code coverage and generation time for each technique, together with the CK metrics, allowing these data to be analyzed jointly and the classification algorithm finally to be run. The results of this analysis showed that a reduced, selected set of metrics is more efficient and better describes the characteristics of a class, and that the CK metrics have little or no influence on the generation time of the test data and on the random search algorithm. However, the CK metrics showed a moderate correlation with, and influence on, the selection of the genetic algorithm, thus contributing to its selection by the Naive Bayes algorithm. Algoritmo genético Cobertura de testes Code coverages Genetic algorithm Geração de testes Métricas CK Naive bayes Naive bayes Software testing Test data generation Teste de software
198	Uma comparação da aplicação de métodos computacionais de classificação de dados aplicados ao consumo de cinema no Brasil / A comparison of the application of data classification computational methods to the consumption of film at theaters in Brazil Nieuwenhoff, Nathalia 13 April 2017 (has links) As técnicas computacionais de aprendizagem de máquina para classificação ou categorização de dados estão sendo cada vez mais utilizadas no contexto de extração de informações ou padrões em bases de dados volumosas em variadas áreas de aplicação. Em paralelo, a aplicação destes métodos computacionais para identificação de padrões, bem como a classificação de dados relacionados ao consumo dos bens de informação é considerada uma tarefa complexa, visto que tais padrões de decisão do consumo estão relacionados com as preferências dos indivíduos e dependem de uma composição de características individuais, variáveis culturais, econômicas e sociais segregadas e agrupadas, além de ser um tópico pouco explorado no mercado brasileiro. Neste contexto, este trabalho realizou o estudo experimental a partir da aplicação do processo de Descoberta do conhecimento (KDD), o que inclui as etapas de seleção e Mineração de Dados, para um problema de classificação binária, indivíduos brasileiros que consomem e não consomem um bem de informação, filmes em salas de cinema, a partir dos dados obtidos na Pesquisa de Orçamento Familiar (POF) 2008-2009, pelo Instituto Brasileiro de Geografia e Estatística (IBGE). O estudo experimental resultou em uma análise comparativa da aplicação de duas técnicas de aprendizagem de máquina para classificação de dados, baseadas em aprendizado supervisionado, sendo estas Naïve Bayes (NB) e Support Vector Machine (SVM). 
Inicialmente, a revisão sistemática realizada com o objetivo de identificar estudos relacionados a aplicação de técnicas computacionais de aprendizado de máquina para classificação e identificação de padrões de consumo indica que a utilização destas técnicas neste contexto não é um tópico de pesquisa maduro e desenvolvido, visto que não foi abordado em nenhum dos trabalhos estudados. Os resultados obtidos a partir da análise comparativa realizada entre os algoritmos sugerem que a escolha dos algoritmos de aprendizagem de máquina para Classificação de Dados está diretamente relacionada a fatores como: (i) importância das classes para o problema a ser estudado; (ii) balanceamento entre as classes; (iii) universo de atributos a serem considerados em relação a quantidade e grau de importância destes para o classificador. Adicionalmente, os atributos selecionados pelo algoritmo de seleção de variáveis Information Gain sugerem que a decisão de consumo de cultura, mais especificamente do bem de informação, filmes em cinema, está fortemente relacionada a aspectos dos indivíduos relacionados a renda, nível de educação, bem como suas preferências por bens culturais / Machine learning techniques for data classification or categorization are increasingly being used to extract information or patterns from voluminous databases in various application areas. At the same time, applying these computational methods to identify patterns and to classify data related to the consumption of information goods is considered a complex task, since such consumption decision patterns are related to the preferences of individuals and depend on a combination of individual characteristics and cultural, economic and social variables, segregated and grouped; it is also a little-explored topic in the Brazilian market. 
In this context, this work carried out an experimental study applying the Knowledge Discovery (KDD) process, which includes the data selection and data mining steps, to a binary classification problem: Brazilian individuals who do and do not consume an information good, films at movie theaters, using microdata from the Brazilian Household Budget Survey (POF) 2008-2009, conducted by the Brazilian Institute of Geography and Statistics (IBGE). The experimental study resulted in a comparative analysis of two supervised machine learning techniques for data classification, Naïve Bayes (NB) and Support Vector Machine (SVM). Initially, a systematic review aimed at identifying studies on the application of machine learning techniques to the classification and identification of consumption patterns indicates that the use of these techniques in this context is not a mature and developed research topic, since it was not addressed in any of the works analyzed. The results of the comparative analysis between the algorithms suggest that the choice of machine learning algorithm for data classification is directly related to factors such as: (i) the importance of the classes for the problem under study; (ii) the balance between classes; (iii) the universe of attributes to be considered, in terms of their number and degree of importance to the classifier. 
In addition, the attributes selected by the Information Gain variable selection algorithm suggest that the decision to consume culture, more specifically information good, film at theaters, is directly related to aspects of individuals regarding income, educational level, as well as preferences for cultural goods Algoritmos de classificação Bens de informação Classification algorithm Consumo Consumption Information goods Naïve Bayes Naïve Bayes Pattern recognition Reconhecimento de padrões Support Vector Machine Support Vector Machine SVM SVM
199	Utilidade para testes de significância / Utility for significance tests Nathália Demetrio Vasconcelos Moura 16 May 2014 (has links) Neste trabalho discutimos os principais argumentos da inferência bayesiana subjetivista. Posteriormente, a partir de uma revisão da literatura dos testes de hipóteses, os principais testes são analisados sob a ótica da teoria da decisão, particularmente no que tange às hipóteses precisas. Adicionalmente, funções de perda para testes de significância, seguindo a proposta de Fisher e do FBST, são analisadas e comparadas. / This work discusses the main arguments of subjectivist Bayesian inference. Subsequently, based on a literature review of hypothesis testing, the main tests are analyzed from a decision-theoretic viewpoint, particularly with regard to precise hypotheses. Additionally, loss functions for significance tests, following the proposals of Fisher and the FBST, are analyzed and compared. Bayes FBST Fisher função de perda hipótese precisa Jeffreys Popper teoria da decisão testes de significância Bayes decision theory FBST Fisher Jeffreys loss function Popper precise null hypothesis significance tests
200	Método de entrada de texto baseada em gestos para dispositivos com telas sensíveis ao toque / Text input method based gestures for devices with touch screens Nascimento, Thamer Horbylon 13 October 2015 (has links) Fundação de Amparo à Pesquisa do Estado de Goiás - FAPEG / This work proposes a gesture-based text input method for devices with touch screens. The steps required are: recognizing gestures, identifying the gestures needed to enter letters, and recognizing letters with up to two interactions. For gesture recognition, we used the incremental recognition algorithm, which does not require a gesture to be finished before it is recognized; that is, it performs continuous gesture recognition. Furthermore, a template is used as a reference for gesture recognition, so the algorithm identifies the probability that the gesture being performed is one of those in the template. 
Because the incremental recognition algorithm had no curves in its templates, a template with curves was created using the reduced equation of the circle, and straight lines were also added to it. The template was used to build a database for entering the letters of the alphabet from A to Z using user gestures. This database was used to train a Naïve Bayes classifier, which identifies the probability of a letter being entered based on the gestures made by the user. Three experiments were conducted to test the method. Most users entered a letter with up to two interactions when entering the five vowels and the five most frequent consonants. When entering the five vowels and the five least frequent consonants, users were also able to enter the letters with up to two interactions. Thus, there is evidence that the method solves the problem for which it was proposed. / Este trabalho propõe um método para entrada de texto baseado em gestos para ser usado em dispositivos com telas sensíveis ao toque. Os passos necessários para isso são: reconhecer gestos, identificar gestos necessários para inserir letras e reconhecer letras com até duas interações. Para fazer o reconhecimento dos gestos, foi utilizado o algoritmo de reconhecimento incremental, o qual trabalha de forma que não seja necessário terminar um gesto para que este seja reconhecido, ou seja, trabalha com reconhecimento contínuo de gestos. Além disso, um template é utilizado como referência para o reconhecimento dos gestos, assim, o algoritmo identifica a probabilidade de o gesto em execução ser um do template. Como o algoritmo de reconhecimento incremental não possuía curvas em seus templates, utilizando a equação reduzida da circunferência, foi criado um template com curvas e foram também adicionadas retas nele. O template criado serviu para a criação de uma base de dados para a inserção das letras do alfabeto de A a Z, utilizando gestos de usuários. 
Essa base foi utilizada para treinamento de um classificador Naïve Bayes, que identifica a probabilidade de inserção de uma letra baseado nos gestos inseridos pelo usuário. Foram realizados três experimentos para testar o método desenvolvido. Verificou-se que a maioria dos usuários inseriu uma letra utilizando até duas interações, quando inseriram as cinco vogais e as cinco consoantes mais frequentes. Ao inserir as cinco vogais e cinco consoantes menos frequentes, os usuários também conseguiram inserir as letras com até duas interações. Assim, há evidências de que o método resolva o problema para qual foi proposto como solução. Entrada de texto Telas sensíveis ao toque Naïve Bayes Interação homem computador Text entry Touch screens Naïve Bayes Human computer interaction
