  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

[en] AN ASSESSMENT OF PRESENTATION ATTACK DETECTION METHODS FOR FACE RECOGNITION SYSTEMS / [pt] AVALIAÇÃO DE MÉTODOS DE DETECÇÃO DE FRAUDE EM SISTEMAS DE RECONHECIMIENTO FACIAL

GUILLERMO ESTRADA DOMECH 07 November 2018
The vulnerabilities of Face Recognition Systems (FRS) to Presentation Attacks (PA) have recently been recognized by the biometric community, but there is still a lack of generalized software-based facial Presentation Attack Detection (PAD) techniques that perform robustly in realistic authentication scenarios. The main objective of this dissertation is to analyze, evaluate and compare some of the most relevant state-of-the-art feature-based methods for facial PAD in a variety of conditions, considering three publicly available facial spoofing databases: 3DMAD, REPLAY-MOBILE and OULU-NPU. PAD methods based on the hand-crafted texture descriptors LBP-RGB, BSIF-RGB and IQM were investigated. Additionally, a Convolutional Autoencoder (CAE), a learned feature descriptor, was implemented and evaluated. Furthermore, one-class and two-class classification approaches were implemented and evaluated. The experiments were designed to measure the performance of different PAD schemes in two conditions: (i) intra-database and (ii) inter-database (or cross-database). The results revealed that the features learned by the CAE, used in two-class classification PAD schemes, provide, in general, the best performance in intra-database evaluation protocols. The results also indicate that PAD schemes based on the one-class classification approach are not inferior to their two-class counterparts in the inter-database evaluations.
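As an illustration of the kind of hand-crafted texture descriptor mentioned above, the following is a minimal sketch of the basic 8-neighbour Local Binary Pattern (LBP) code for a single-channel 3x3 patch. It is not the dissertation's implementation: the function name `lbp_code`, the clockwise neighbour ordering and the `>=` threshold are assumptions chosen for illustration, and an LBP-RGB descriptor would presumably apply such a code to each colour channel and build histograms from the results.

```python
def lbp_code(patch):
    """Return the basic LBP code (0..255) of a 3x3 patch given as nested lists.

    Each of the 8 neighbours contributes one bit, set when the neighbour
    is greater than or equal to the centre pixel. The neighbour order
    (clockwise from the top-left corner) is an illustrative assumption.
    """
    center = patch[1][1]
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    return sum(1 << i for i, value in enumerate(neighbours) if value >= center)
```

A texture feature vector is typically the histogram of these codes over all pixel neighbourhoods in a face region.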
32

Metody potlačení šumu pro rozpoznávače řeči / Methods of noise suppression for speech recognition systems

Moldříková, Zuzana January 2014
This diploma thesis deals with methods of noise suppression for speech recognition systems. The theoretical part discusses the basic terms of the topic and the main noise suppression methods: spectral subtraction, Wiener filtering, RASTA, spectrogram mapping, and algorithms based on noise estimation. The second part analyzes types of noise, then proposes and implements a spectral subtraction method of noise suppression for a speech recognition system. Spectral subtraction algorithms are also tested extensively in comparison with the Wiener filter. The tests are assessed with defined metrics: recognition success rate, recognition system score, and signal-to-noise ratio.
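The spectral subtraction idea named in this abstract can be sketched as follows. This is a minimal, standard-library-only illustration under stated assumptions, not the thesis's algorithm: it processes a single short frame with a naive DFT, assumes a noise magnitude estimate `noise_mag` is already available (for example from a speech-free segment), reuses the noisy phase, and applies a hypothetical spectral `floor` parameter to avoid negative magnitudes.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(frame, noise_mag, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from one frame.

    The noisy phase is kept, and each bin is clamped to a small fraction
    of its original magnitude so that no magnitude becomes negative.
    """
    cleaned = []
    for bin_value, noise in zip(dft(frame), noise_mag):
        mag = abs(bin_value)
        new_mag = max(mag - noise, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(bin_value)))
    return idft(cleaned)
```

In practice the signal would be split into overlapping windowed frames and recombined by overlap-add, with an FFT in place of the naive transform.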
33

A framework for correlation and aggregation of security alerts in communication networks. A reasoning correlation and aggregation approach to detect multi-stage attack scenarios using elementary alerts generated by Network Intrusion Detection Systems (NIDS) for a global security perspective.

Alserhani, Faeiz January 2011
The tremendous increase in the usage and complexity of modern communication and network systems connected to the Internet places demands upon security management to protect organisations' sensitive data and resources from malicious intrusion. Malicious attacks by intruders and hackers exploit flaws and weak points in deployed systems through sophisticated techniques that cannot be prevented by traditional measures such as user authentication, access controls and firewalls. Consequently, automated detection and timely response systems are urgently needed to detect abnormal activities by monitoring network traffic and system events. Network Intrusion Detection Systems (NIDS) and Network Intrusion Prevention Systems (NIPS) are technologies that inspect traffic and diagnose system behaviour to provide improved attack protection. Current implementations of intrusion detection systems (commercial and open-source) lack the scalability to support the massive increase in network speed and the emergence of new protocols and services. Multi-giga networks have become a standard installation, leaving the NIDS susceptible to resource exhaustion attacks. The research focuses on two distinct problems for the NIDS: missing alerts due to packet loss resulting from NIDS performance limitations, and the huge volume of alerts generated by the NIDS, which overwhelms the security analyst and makes event observation tedious. A methodology for analysing alerts using a proposed framework for alert correlation is presented to provide the security operator with a global view of the security perspective. Missed alerts are recovered implicitly using a contextual technique to detect multi-stage attack scenarios. This is based on the assumption that the most serious intrusions consist of relevant steps that are temporally ordered. The pre- and post-condition approach is used to identify the logical relations among low-level alerts.
The alerts are aggregated, verified using vulnerability modelling, and correlated to construct multi-stage attacks. A number of algorithms are proposed in this research to support the functionality of the framework, including alert correlation, alert aggregation and graph reduction. These algorithms have been implemented in a tool called Multi-stage Attack Recognition System (MARS), consisting of a collection of integrated components. The system has been evaluated in a series of experiments using different data sets, i.e. publicly available data sets and data sets collected in real-life experiments. The results show that the approach can effectively detect multi-stage attacks. The false positive rates are reduced through the use of vulnerability and target host information.
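The pre- and post-condition approach described above can be sketched as follows. This is an illustrative toy, not the MARS implementation: the `Alert` class, its field names and the condition labels are hypothetical, and the rule used here (link an earlier alert to a later one when the earlier alert's post-conditions meet at least one of the later alert's pre-conditions) is one simple reading of the approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    name: str
    time: float
    pre: frozenset   # conditions that must hold before this attack step
    post: frozenset  # conditions this attack step establishes


def correlate(alerts):
    """Return (earlier, later) name pairs forming a candidate attack graph.

    An edge is added when the earlier alert strictly precedes the later
    one and shares at least one condition between its post-conditions
    and the later alert's pre-conditions.
    """
    ordered = sorted(alerts, key=lambda a: a.time)
    edges = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            if earlier.time < later.time and earlier.post & later.pre:
                edges.append((earlier.name, later.name))
    return edges
```

For example, a scan that establishes `host_known`, followed by an exploit that requires `host_known`, would be linked into one multi-stage scenario.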
34

Methods for face detection and adaptive face recognition

Pavani, Sri-Kaushik 21 July 2010
The focus of this thesis is facial biometrics, specifically the problems of face detection and face recognition. Despite intensive research over the last 20 years, the technology is not foolproof, which is why face recognition systems are not used in critical sectors such as banking. In this thesis, we focus on three sub-problems in these two areas of research. Firstly, we propose methods to improve the speed-accuracy trade-off of the state-of-the-art face detector. Secondly, we consider a problem that is often ignored in the literature: decreasing the training time of the detectors. We propose two techniques to this end. Thirdly, we present a detailed large-scale study on self-updating face recognition systems in an attempt to answer whether continuously changing facial appearance can be learnt automatically.
35

臺灣大學生透過電腦輔助軟體學習英語發音的研究 / A Passage to being understood and understanding others:

蔡碧華, Tsai, Pi Hua Unknown Date
The present study investigated the impact of computer-assisted pronunciation training (CAPT) software, namely MyET, on students’ learning of English pronunciation. The investigation also covered how far the effect of practice with the CAPT system generalized. Also examined were the difficulties and challenges reported by the students who used the CAPT system and the strategies they developed from their interaction with it. The study aimed to position the role of the CAPT system in the arena of English pronunciation instruction and to investigate how other kinds of mediation, such as peer support, could reinforce its efficacy. The study involved 90 Taiwanese college students, divided into two experimental groups and one control group of 30 students each. The two experimental groups practiced English pronunciation with a CAPT program, either independently or with peers, while the control group only had access to MP3 files in their practice.
All the groups practiced, for ten weeks, texts adapted from the play Cinderella, provided online free of charge by MyET. They all received a pretest and a posttest on the texts they had practiced and on a novel text. Each week after practicing the texts, the participants wrote their reflections on the learning process, in Chinese, in their learning logs. The instructor in turn gave feedback on the students’ reflections in the logs every week. The results showed that the ten-week practice with the CAPT system produced significant and positive changes in the learning of English pronunciation in the CAPT groups (i.e., the Self-Access CAPT Group and the Collaborative CAPT Group). The participants’ progress in intonation and timing was consistently greater than in segmental pronunciation. Moreover, the ten-week practice with the CAPT system was found to generalize (though only weakly) to the participants’ production of segmental pronunciation and intonation, but not to the timing component, when reading the novel text. However, the improvement of the CAPT groups was not large enough to distinguish them from the MP3 Group. Although the quantitative investigation did not reveal significant group differences, the qualitative analysis of the students’ reflections showed that the three groups went through different learning processes. The Self-Access CAPT Group outperformed the other two groups in developing self-monitoring of language learning and production, and in enjoying working with the CAPT system and texts. The Collaborative CAPT Group outscored the other two groups in reported gains and improvement in fluency, intonation and segmental pronunciation, as well as in developing strategies to deal with learning difficulties.
Although the students in the MP3 Group also made significant progress after the practice, without peers’ scaffolding and the feedback provided by MyET they reported the highest frequency of difficulties and the lowest frequency of gains and strategies during the practice. The participants also considered that the feedback design of the CAPT system needed improvement. Theoretical and pedagogical implications, as well as research limitations, are presented at the end of the study. Key words: Computer-Assisted Language Learning (CALL), Automatic Speech Recognition System (ASRS), segmental pronunciation, prosody, intonation, timing, learning strategies, mediation
36

Hardware/Software Co-Design for Keyword Spotting on Edge Devices

Jacob Irenaeus M Bushur (15360553) 29 April 2023
The introduction of artificial neural networks (ANNs) to speech recognition applications has sparked the rapid development and popularization of digital assistants. These digital assistants perform keyword spotting (KWS), constantly monitoring the audio captured by a microphone for a small set of words or phrases known as keywords. Upon recognizing a keyword, a larger audio recording is saved and processed by a separate, more complex neural network. More broadly, neural networks in speech recognition have popularized voice as a means of interacting with electronic devices, sparking an interest among individuals in using speech recognition in their own projects. However, while large companies have the means to develop custom neural network architectures alongside proprietary hardware platforms, such development precludes those lacking similar resources from developing efficient and effective neural networks for embedded systems. While small, low-power embedded systems are widely available in the hobbyist space, a clear process is needed for developing a neural network that accounts for the limitations of these resource-constrained systems. Conversely, a wide variety of neural network architectures exists, but often little thought is given to deploying these architectures on edge devices.

This thesis first presents an overview of audio processing techniques, artificial neural network fundamentals, and machine learning tools. A summary of a set of specific neural network architectures is also given. Finally, the process of implementing and modifying these existing neural network architectures and training specific models in Python using TensorFlow is demonstrated. The trained models are also subjected to post-training quantization to evaluate the effect on model performance. The models are evaluated using metrics relevant to deployment on resource-constrained systems, such as memory consumption, latency, and model size, in addition to the standard comparisons of accuracy and parameter count. After evaluating the models and architectures, the process of deploying one of the trained and quantized models is explored on an Arduino Nano 33 BLE using TensorFlow Lite for Microcontrollers and on a Digilent Nexys 4 FPGA board using CFU Playground.
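To make the post-training quantization step mentioned above more concrete, here is a minimal sketch of generic affine (scale and zero-point) quantization of a list of weights to 8-bit integers. It is an assumption-laden illustration in plain Python, not TensorFlow's or TensorFlow Lite's implementation; the function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(weights, num_bits=8):
    """Map floats to signed integers using one scale and zero point.

    The represented range is widened to include 0.0 so that zero is
    reproduced exactly after dequantization.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = round(qmin - lo / scale)
    quantized = [max(qmin, min(qmax, round(w / scale) + zero_point))
                 for w in weights]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [(q - zero_point) * scale for q in quantized]
```

Comparing the original weights with the dequantized ones gives a rough view of the rounding error that quantization introduces, which is one reason accuracy is re-measured after quantizing a model.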
37

Evaluation of system design strategies and supervised classification methods for fruit recognition in harvesting robots / Undersökning av Systemdesignstrategier och Klassifikationsmetoder för Identifiering av Frukt i Skörderobotar

Björk, Gabriella January 2017
This master's thesis project was carried out by one student at the Royal Institute of Technology in collaboration with Cybercom Group. The aim was to evaluate and compare system design strategies for fruit recognition in harvesting robots, and the performance of supervised machine learning classification methods when applied to this specific task. The thesis covers the basics of these systems, for which parameters, constraints, requirements, and design decisions have been investigated. The resulting framework is used as a foundation for the implementation of both the sensing system and the processing and classification algorithms. A plastic tomato plant with fruit of varying maturity was used as a basis for training and testing, and a Kinect v2 for Windows, with sensors for high-resolution color, depth, and IR data, was used for image acquisition. The obtained data were processed, and features of objects of interest extracted, using MATLAB and an SDK for Kinect provided by Microsoft. Multiple views of the plant were acquired by rotating the plant on a platform controlled by a stepper motor and an Arduino Uno. The algorithms tested were binary classifiers: Support Vector Machine, Decision Tree, and k-Nearest Neighbor. The models were trained and validated using five-fold cross-validation in MATLAB's Classification Learner application. Performance metrics such as precision, recall, and the F1-score, used for accuracy comparison, were calculated. The statistical models k-NN and SVM achieved the best scores. The method considered most promising for fruit recognition purposes was the SVM.
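The precision, recall and F1-score used to compare the classifiers follow the standard definitions for a binary task. The sketch below computes them in plain Python; the function name and the label encoding (1 for the positive class) are illustrative assumptions, and the thesis itself computed its metrics in MATLAB.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Precision penalizes false detections, recall penalizes missed fruit, and F1 is their harmonic mean.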
38

Channel Modeling Applied to Robust Automatic Speech Recognition

Sklar, Alexander Gabriel 01 January 2007
In automatic speech recognition systems (ASRs), training is a phase critical to the system's success. Communication media, whether analog (such as analog landline phones) or digital (VoIP), distort the speaker's speech signal, often in very complex ways: linear distortion occurs in all channels, in either the magnitude or the phase spectrum. Non-linear but time-invariant distortion will always appear in real systems. In digital systems there are also network effects, which produce packet losses, delays and repeated packets. Finally, one cannot really know what path a signal will take, so error or distortion along the way is almost a certainty. The channel introduces an acoustical mismatch between the speaker's signal and the trained data in the ASR, which results in poor recognition performance. The approach so far has been to try to undo the havoc produced by the channels, i.e. to compensate for the channel's behavior. In this thesis, we try to characterize the effects of different transmission media and use that characterization as an inexpensive and repeatable way to train ASR systems.
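Two of the channel effects described above, linear distortion and packet loss, can be simulated in a few lines to produce distorted training signals. The sketch below is a simplified assumption, not the thesis's channel model: linear distortion is represented by a short FIR filter with hypothetical `taps`, and a lost packet is represented by zeroing a fixed-size block of samples.

```python
import random


def fir_filter(signal, taps):
    """Apply a causal FIR filter, a simple model of linear channel distortion."""
    return [sum(taps[k] * signal[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(signal))]


def drop_packets(signal, packet_size, loss_rate, rng):
    """Zero out whole packets at random to mimic network packet loss.

    `rng` is a random.Random instance so that runs are repeatable.
    """
    out = list(signal)
    for start in range(0, len(out), packet_size):
        if rng.random() < loss_rate:
            end = min(start + packet_size, len(out))
            out[start:end] = [0.0] * (end - start)
    return out
```

Passing clean recordings through such a simulated channel, for example `drop_packets(fir_filter(x, [0.9, 0.1]), 160, 0.05, random.Random(0))`, yields channel-matched data without recording over a real network.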
