Global ETD Search

41	Síťový interface k detektoru klíčových slov / Network Interface for Keyword Spotting System Skotnica, Martin Unknown Date (has links) A considerable part of computer science research is dedicated to speech recognition, as speech-controlled systems are becoming useful in many applications. One such application is keyword spotting, which makes it possible to find words in audio data. Such a detector is being developed at the BUT Faculty of Information Technology. The goal of this work is to propose a network interface to this keyword detector based on a client/server architecture. The client connects to the server and sends audio data. The server runs the keyword detector on the received data and sends the keyword spotting result back to the client. Finally, the client visualizes the result and interacts with the user.
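The client/server exchange described in entry 41 could be sketched roughly as follows. This is only an illustration: the abstract does not describe the actual wire protocol, so the 4-byte length prefix, the JSON reply, and the names `send_msg`, `recv_msg`, and `handle_request` are all assumptions, and `detect` stands in for the real keyword detector.

```python
import json
import socket
import struct


def send_msg(sock, payload: bytes) -> None:
    # Assumed framing: 4-byte big-endian length, then the payload bytes.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until n bytes arrive.
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock) -> bytes:
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def handle_request(sock, detect) -> None:
    # Server side: read one audio message, run the (hypothetical) detector
    # callback, and send its result back as JSON.
    audio = recv_msg(sock)
    send_msg(sock, json.dumps(detect(audio)).encode("utf-8"))
```

A client would call `send_msg` with the raw audio bytes and then `recv_msg` to obtain the detections for visualization. Length-prefixed framing is chosen here only because a TCP stream has no built-in message boundaries.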
42	Wordspotting from multilingual and stylistic documents / Repérage de mots dans les images de documents multilingues et graphiques Tarafdar, Arundhati 12 July 2017 (has links) Les outils et méthodes d’analyse d’images de documents (DIA) donnent aujourd’hui la possibilité de faire des recherches par mots-clés dans des bases d’images de documents alors même qu’aucune transcription n’est disponible. Dans ce contexte, beaucoup de travaux ont déjà été réalisés sur les OCR ainsi que sur des systèmes de repérage de mots (spotting) dédiés à des documents textuels avec une mise en page simple. En revanche, très peu d’approches ont été étudiées pour faire de la recherche dans des documents contenant du texte multi-orienté et multi-échelle, comme dans les documents graphiques. Par exemple, les images de cartes géographiques peuvent contenir des symboles, des graphiques et du texte ayant des orientations et des tailles différentes. Dans ces documents, les caractères peuvent aussi être connectés entre eux ou bien à des éléments graphiques. Par conséquent, le repérage de mots dans ces documents se révèle être une tâche difficile. Dans cette thèse nous proposons un ensemble d’outils et méthodes dédiés au repérage de mots écrits en caractères bengali ou anglais (script Roman) dans des images de documents géographiques. L’approche proposée repose sur plusieurs originalités. / Word spotting in graphical documents is a very challenging task. To address such scenarios this thesis deals with developing a word spotting system dedicated to geographical documents with Bangla and English (Roman) scripts. In the proposed system, at first, text-graphics layers are separated using filtering, clustering and self-reinforcement through classifier. Additionally, instead of using binary decision we have used probabilistic measurement to represent the text components. 
Subsequently, in the text layer, a character segmentation approach based on the water-reservoir method is applied to extract individual characters from the document. These isolated characters are then recognized using a rotation-invariant feature coupled with an SVM classifier. Well-recognized characters are then grouped based on their sizes. Initial spotting then searches for a query word among those groups of characters. If the system spots a word only partially because of noise, SIFT is applied to identify the missing portion of that partial spotting. Experimental results on Roman and Bangla script document images show that the method can feasibly spot a location in text-labeled graphical documents. Experiments are done on an annotated dataset which was developed for this work. We have made this annotated dataset publicly available for other researchers. Analyse d’images de documents Repérage de mots (word spotting) Documents graphiques Recherche d’information Séparation texte-graphique Filtrage Cartes de probabilité Points d’intérêts (SIFT) Bengla Document Image Analysis Word Spotting Graphical documents Information Retrieval Probability matrix information 2-D Filter Water Reservoir Principle Clustering SIFT
43	Hardware/Software Co-Design for Keyword Spotting on Edge Devices Jacob Irenaeus M Bushur (15360553) 29 April 2023 (has links) The introduction of artificial neural networks (ANNs) to speech recognition applications has sparked the rapid development and popularization of digital assistants. These digital assistants perform keyword spotting (KWS), constantly monitoring the audio captured by a microphone for a small set of words or phrases known as keywords. Upon recognizing a keyword, a larger audio recording is saved and processed by a separate, more complex neural network. More broadly, neural networks in speech recognition have popularized voice as a means of interacting with electronic devices, sparking an interest in individuals using speech recognition in their own projects. However, while large companies have the means to develop custom neural network architectures alongside proprietary hardware platforms, such development precludes those lacking similar resources from developing efficient and effective neural networks for embedded systems. While small, low-power embedded systems are widely available in the hobbyist space, a clear process is needed for developing a neural network that accounts for the limitations of these resource-constrained systems. In contrast, a wide variety of neural network architectures exists, but often little thought is given to deploying these architectures on edge devices. This thesis first presents an overview of audio processing techniques, artificial neural network fundamentals, and machine learning tools. A set of specific neural network architectures is also summarized. Finally, the process of implementing and modifying these existing neural network architectures and training specific models in Python using TensorFlow is demonstrated. The trained models are also subjected to post-training quantization to evaluate the effect on model performance. 
The models are evaluated using metrics relevant to deployment on resource-constrained systems, such as memory consumption, latency, and model size, in addition to the standard comparisons of accuracy and parameter count. After evaluating the models and architectures, the process of deploying one of the trained and quantized models is explored on an Arduino Nano 33 BLE using TensorFlow Lite for Microcontrollers and on a Digilent Nexys 4 FPGA board using CFU Playground. keyword classification keyword spotting keyword spotting (KWS) machine learning artificial intelligence hardware software co-design hardware software codesign edge devices speech recognition system neural network architecture
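The idea behind the post-training quantization mentioned in entry 43 can be illustrated with a small sketch. It is not the thesis's code or TensorFlow's implementation: it is a plain-Python affine int8 scheme (real value ≈ (q − zero_point) × scale), and the function names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(weights):
    """Map a list of floats to int8 values with an affine (scale, zero_point) scheme."""
    lo, hi = min(weights), max(weights)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # keep 0.0 inside the representable range
    scale = (hi - lo) / 255.0 or 1.0      # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    # Round each weight to the nearest step and clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [(v - zero_point) * scale for v in q]
```

Comparing `dequantize(*quantize_int8(w))` with the original `w` gives a feel for the rounding error that quantization introduces, which is the kind of effect on model performance the abstract says was evaluated.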
44	Sparse representations over learned dictionary for document analysis / Présentations parcimonieuses sur dictionnaire d'apprentissage pour l'analyse de documents Do, Thanh Ha 04 April 2014 (has links) Dans cette thèse, nous nous concentrons sur comment les représentations parcimonieuses peuvent aider à augmenter les performances pour réduire le bruit, extraire des régions de texte, reconnaissance des formes et localiser des symboles dans des documents graphiques. Pour ce faire, tout d'abord, nous donnons une synthèse des représentations parcimonieuses et ses applications en traitement d'images. Ensuite, nous présentons notre motivation pour l'utilisation de dictionnaires d'apprentissage avec des algorithmes efficaces pour les construire. Après avoir décrit l'idée générale des représentations parcimonieuses et du dictionnaire d'apprentissage, nous présentons nos contributions dans le domaine de la reconnaissance de symboles et du traitement des documents en les comparants aux travaux de l'état de l'art. Ces contributions s'emploient à répondre aux questions suivantes: La première question est comment nous pouvons supprimer le bruit des images où il n'existe aucune hypothèse sur le modèle de bruit sous-jacent à ces images ? La deuxième question est comment les représentations parcimonieuses sur le dictionnaire d'apprentissage peuvent être adaptées pour séparer le texte du graphique dans des documents? La troisième question est comment nous pouvons appliquer la représentation parcimonieuse à reconnaissance de symboles? Nous complétons cette thèse en proposant une approche de localisation de symboles dans les documents graphiques qui utilise les représentations parcimonieuses pour coder un vocabulaire visuel / In this thesis, we focus on how sparse representations can help to increase the performance of noise removal, text region extraction, pattern recognition and spotting symbols in graphical documents. 
To do that, first of all, we give a survey of sparse representations and their applications in image processing. Then, we present the motivation for building a learned dictionary and efficient algorithms for constructing it. After describing the general idea of sparse representations and learned dictionaries, we present some contributions in the field of symbol recognition and document processing that achieve better performance than the state of the art. These contributions begin by finding the answers to the following questions. The first question is how we can remove the noise of a document when we have no assumptions about the model of noise found in these images? The second question is how sparse representations over a learned dictionary can separate the text/graphic parts in a graphical document? The third question is how we can apply sparse representations to symbol recognition? We complete this thesis by proposing an approach to symbol spotting that uses sparse representations for the coding of a visual vocabulary. Représentations parcimonieuses Dictionnaire d'apprentissage Algorithme apprentissage Réduction du bruit Séparation texte/graphique Reconnaissance de symboles Localisation de symboles Mots visuels Sparse representations Learned dictionary Learning algorithms Removal noise Separation text/graphic Symbol recognition Symbol spotting Visual words 006.42
45	Fuzzy multilevel graph embedding for recognition, indexing and retrieval of graphic document images / Apport des modèles graphiques à l'analyse et à l'indexation d'images de documents Luqman, Muhammad Muzzamil 02 March 2012 (has links) Cette thèse aborde le problème du manque de performance des outils exploitant des représentations à base de graphes en reconnaissance des formes. Nous proposons de contribuer aux nouvelles méthodes proposant de tirer partie, à la fois, de la richesse des méthodes structurelles et de la rapidité des méthodes de reconnaissance de formes statistiques. Deux principales contributions sont présentées dans ce manuscrit. La première correspond à la proposition d'une nouvelle méthode de projection explicite de graphes procédant par analyse multi-facettes des graphes. Cette méthode effectue une caractérisation des graphes suivant différents niveaux qui correspondent, selon nous, aux point-clés des représentations à base de graphes. Il s'agit de capturer l'information portée par un graphe au niveau global, au niveau structure et au niveau local ou élémentaire. Ces informations capturées sont encapsulés dans un vecteur de caractéristiques numériques employant des histogrammes flous. La méthode proposée utilise, de plus, un mécanisme d'apprentissage non supervisée pour adapter automatiquement ses paramètres en fonction de la base de graphes à traiter sans nécessité de phase d'apprentissage préalable. La deuxième contribution correspond à la mise en place d'une architecture pour l'indexation de masses de graphes afin de permettre, par la suite, la recherche de sous-graphes présents dans cette base. 
Cette architecture utilise la méthode précédente de projection explicite de graphes appliquée sur toutes les cliques d'ordre 2 pouvant être extraites des graphes présents dans la base à indexer afin de pouvoir les classifier. Cette classification permet de constituer l'index qui sert de base à la description des graphes et donc à leur indexation en ne nécessitant aucune base d'apprentissage pré-étiquetées. La méthode proposée est applicable à de nombreux domaines, apportant la souplesse d'un système de requête par l'exemple et la granularité des techniques d'extraction ciblée (focused retrieval). / This thesis addresses the problem of lack of efficient computational tools for graph based structural pattern recognition approaches and proposes to exploit computational strength of statistical pattern recognition. It has two fold contributions. The first contribution is a new method of explicit graph embedding. The proposed graph embedding method exploits multilevel analysis of graph for extracting graph level information, structural level information and elementary level information from graphs. It embeds this information into a numeric feature vector. The method employs fuzzy overlapping trapezoidal intervals for addressing the noise sensitivity of graph representations and for minimizing the information loss while mapping from continuous graph space to discrete vector space. The method has unsupervised learning abilities and is capable of automatically adapting its parameters to underlying graph dataset. The second contribution is a framework for automatic indexing of graph repositories for graph retrieval and subgraph spotting. This framework exploits explicit graph embedding for representing the cliques of order 2 by numeric feature vectors, together with classification and clustering tools for automatically indexing a graph repository. 
It does not require a labeled learning set and can be easily deployed to a range of application domains, offering ease of query by example (QBE) and granularity of focused retrieval. Partitionnement de graphes Projection de graphes Repérage de sous-graphes Reconnaissance des formes Classification de graphes Logique floue Reconnaissance de graphiques Pattern recognition Graph clustering Graph classification Graph embedding Subgraph spotting Fuzzy logic Graphics recognition
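The "fuzzy overlapping trapezoidal intervals" mentioned in entry 45 can be illustrated with a short sketch. It is not the thesis's embedding method: the interval boundaries and the names `trapezoid` and `fuzzy_histogram` are assumptions chosen only to show how a value near a bin boundary contributes partially to two neighbouring bins instead of jumping abruptly from one to the other.

```python
def trapezoid(x, a, b, c, d):
    """Membership of x in a trapezoid: rises on (a, b), is 1 on [b, c], falls on (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def fuzzy_histogram(values, intervals):
    """Sum the membership of every value in each trapezoidal interval (a, b, c, d)."""
    return [sum(trapezoid(v, *iv) for v in values) for iv in intervals]


# Hypothetical overlapping intervals covering the range 0..10.
INTERVALS = [(0, 0, 2, 4), (2, 4, 6, 8), (6, 8, 10, 10)]
```

With these intervals, a value of 3 contributes 0.5 to the first bin and 0.5 to the second, so small perturbations of an attribute value change the resulting feature vector gradually, which is the noise-robustness argument the abstract makes.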
46	Query-by-Example Keyword Spotting / Query-by-Example Keyword Spotting Skácel, Miroslav January 2015 (has links) This master's thesis deals with modern approaches to keyword spotting and phrase detection in speech data. The introductory part presents the problem and gives a theoretical description of the detection methods. This is followed by a description of the representation of the input datasets used in the experiments and evaluation. Next, methods for detecting keywords defined by example are presented. The evaluation methods and the techniques used for scoring are then described. After experiments on the datasets and their evaluation, the results are discussed. In the next step, modern approaches that improve the detection system are proposed and then implemented, and the achieved results are again evaluated and discussed. The final part assesses the thesis and proposes further directions for the development of our system. The appendix contains a manual for using the implemented scripts.
47	Vizualizace výstupu z řečových technologií pro potřeby kontaktních center / Vizualization of Outputs from Speech Technologies for Contact Centers Zhezhela, Oleksandr January 2014 (has links) The thesis focuses on the visualisation of data mined by speech processing technologies. Methods of speech data extraction were studied, and technologies for this task were analysed. The variety of metadata that can be mined from speech was defined. Existing standards and processes of call centres were also examined. Requirements for the user interface were gathered and analysed. On that basis, and after communication with call centre employees, a concept for speech data visualisation was defined and implemented. The resulting solutions were integrated into Speech Analytics Server (SPAS).
48	Analysis of Micro-Expressions based on the Riesz Pyramid : Application to Spotting and Recognition / Analyse des micro-expressions exploitant la pyramide de Riesz : application à la détection et à la reconnaissance Arango Duque, Carlos 06 December 2018 (has links) Les micro-expressions sont des expressions faciales brèves et subtiles qui apparaissent et disparaissent en une fraction de seconde. Ce type d'expressions reflèterait "l'intention réelle" de l'être humain. Elles ont été étudiées pour mieux comprendre les communications non verbales et dans un contexte médicale lorsqu'il devient presque impossible d'engager une conversation ou d'essayer de traduire les émotions du visage ou le langage corporel d'un patient. Cependant, détecter et reconnaître les micro-expressions est une tâche difficile pour l'homme. Il peut donc être pertinent de développer des systèmes d'aide à la communication exploitant les micro-expressions. De nombreux travaux ont été réalisés dans les domaines de l'informatique affective et de la vision par ordinateur pour analyser les micro-expressions, mais une grande majorité de ces méthodes repose essentiellement sur des méthodes de vision par ordinateur classiques telles que les motifs binaires locaux, les histogrammes de gradients orientés et le flux optique. Étant donné que ce domaine de recherche est relativement nouveau, d'autres pistes restent à explorer. Dans cette thèse, nous présentons une nouvelle méthodologie pour l'analyse des petits mouvements (que nous appellerons par la suite mouvements subtils) et des micro-expressions. Nous proposons d'utiliser la pyramide de Riesz, une approximation multi-échelle et directionnelle de la transformation de Riesz qui a été utilisée pour l'amplification du mouvement dans les vidéos à l'aide de l'estimation de la phase 2D locale. 
Pour l'étape générale d'analyse de mouvements subtils, nous transformons une séquence d'images avec la pyramide de Riesz, extrayons et filtrons les variations de phase de l'image. Ces variations de phase sont en lien avec le mouvement. De plus, nous isolons les régions d'intérêt où des mouvements subtils pourraient avoir lieu en masquant les zones de bruit à l'aide de l'amplitude locale. La séquence d'image est transformée en un signal 1D utilisé pour l'analyse temporelle et la détection de mouvement subtils. Nous avons créé notre propre base de données de séquences de mouvements subtils pour tester notre méthode. Pour l'étape de détection de micro-expressions, nous adaptons la méthode précédente au traitement de certaines régions d'intérêt du visage. Nous développons également une méthode heuristique pour détecter les micro-événements faciaux qui sépare les micro-expressions réelles des clignotements et des mouvements subtils des yeux. Pour la classification des micro-expressions, nous exploitons l'invariance, sur de courtes durées, de l'orientation dominante issue de la transformation de Riesz afin de moyenner la séquence d'une micro-expression en une paire d'images. A partir de ces images, nous définissons le descripteur MORF (Mean Oriented Riesz Feature) constitué d'histogrammes d'orientation. Les performances de nos méthodes sont évaluées à l'aide de deux bases de données de micro-expressions spontanées. / Micro-expressions are brief and subtle facial expressions that go on and off the face in a fraction of a second. This kind of facial expression usually occurs in high-stakes situations and is considered to reflect a human's real intent. They have been studied to better understand non-verbal communication and in medical applications where it is almost impossible to engage in a conversation or try to read the facial emotions or body language of a patient. 
There has been some interesting work in micro-expression analysis; however, a great majority of these methods are based on classically established computer vision methods such as local binary patterns, histograms of gradients and optical flow. Considering that this area of research is relatively new, many contributions remain to be made. In this thesis, we present a novel methodology for subtle motion and micro-expression analysis. We propose to use the Riesz pyramid, a multi-scale steerable Hilbert transformer which has been used for 2-D phase representation and video amplification, as the basis for our methodology. For the general subtle motion analysis step, we transform an image sequence with the Riesz pyramid, then extract and filter the image phase variations as proxies for motion. Furthermore, we isolate regions of interest where subtle motion might take place and mask noisy areas by thresholding the local amplitude. The total sequence is transformed into a 1D signal which is used for temporal analysis and subtle motion spotting. We create our own database of subtle motion sequences to test our method. For the micro-expression spotting step, we adapt the previous method to process some facial regions of interest. We also develop a heuristic method to detect facial micro-events that separates real micro-expressions from eye blinks and subtle eye movements. For the micro-expression classification step, we exploit the dominant orientation constancy from the Riesz transform to average the micro-expression sequence into an image pair. Based on that, we introduce the Mean Oriented Riesz Feature descriptor. The accuracy of our methods is tested on two spontaneous micro-expression databases. Furthermore, we analyse the parameter variations and their effect on our results. 
Micro-expressions Pyramide Riesz Classification des expressions faciales Repérage Mouvements subtiles Analyse des petits mouvements Descripteur MORF Micro-expressions Riesz Pyramid Spotting Facial Expression Classification Subtle Motion Amplitude Masking Feature Extraction Mean Oriented Riesz Feature
49	Electronic Flight Bag / Electronic Flight Bag Kúšik, Lukáš January 2021 (has links) The aim of this master's thesis is to create an Electronic Flight Bag (EFB) application for mobile phones running the Android operating system. To accomplish this, current legislation regarding EFB applications was examined, along with the state-of-the-art EFB applications available on the app market. Based on this information, an EFB application intended for general aviation pilots is designed and implemented. The resulting product includes features for flight planning, a custom aeronautical map, a pilot logbook, an airport catalogue with worldwide data, and more. Offline support ensures functionality in real flight conditions. The final product also strives to innovate beyond existing EFB applications by including functionality such as automatic checklists and an augmented reality view.
50	Tušírovací lis s pohybovými šrouby / Try-out press with motion screws Švoma, Jan Unknown Date (has links) The aim of this thesis is the complete design of a spotting press with a nominal force of 500 kN, intended for mating the two halves of a pressing tool for the automotive industry. The press ram is fitted with a hydraulic mechanism which allows the upper clamping board to be tilted in the range of 0° to 180° and removed from the working space of the press along a profile track. The lower clamping board is part of a moving bolster, which is equipped with a mechanism for lifting and centring. Motion screws are used to drive the ram. They are mounted in a multi-part frame and driven by servomotors. The thesis contains background research on the topic, solutions for the main design nodes of the press including calculations, a detailed 3D model of the device and partial drawing documentation.
