Global ETD Search

1	Neuronové jazykové modely zohledňující morfologii pro strojový překlad / Neural Language Models with Morphology for Machine Translation Musil, Tomáš January 2017 (has links) Language models play an important role in many natural language processing tasks. In this thesis, we focus on language models built on artificial neural networks. We examine the possibilities of using morphological annotations in these models. We propose a neural network architecture for a language model that explicitly makes use of morphological annotation of the input sentence: instead of word forms it processes lemmata and morphological tags. Both the baseline and the proposed method are evaluated on their own by perplexity, and also in the context of machine translation by means of automatic translation quality evaluation. While in isolation the proposed model significantly outperforms the baseline, there is no apparent gain in machine translation.
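As an illustration of the perplexity measure mentioned in the abstract above, here is a minimal sketch in Python. It assumes the language model has already assigned a probability to each token of a test sequence; the function name and the example probabilities are hypothetical and are not taken from the thesis.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the model probability of each token.

    Computed as exp of the average negative log-likelihood per token.
    Lower values mean the model found the sequence less surprising.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# Hypothetical per-token probabilities from two models on the same sentence.
baseline_ppl = perplexity([0.10, 0.05, 0.20, 0.10])
proposed_ppl = perplexity([0.20, 0.10, 0.25, 0.15])
print(f"baseline: {baseline_ppl:.2f}, proposed: {proposed_ppl:.2f}")
```

A model that assigns a uniform probability of 1/k to every token has perplexity k, which is why perplexity is often read as an effective branching factor.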
2	Autopoietic approach to cultural transmission Papadopoulos-Korfiatis, Alexandros January 2017 (has links) Non-representational cognitive science is a promising research field that provides an alternative to the view of the brain as a “computer” filled with symbolic representations of the world and cognition as “calculations” performed on those symbols. Autopoiesis is a biological, bottom-up, non-representational theory of cognition, in which representations and meaning are framed as explanatory concepts that are constituted in an observer’s description of a cognitive system, not operational concepts in the system itself. One of the problems of autopoiesis, and all non-representational theories, is that they struggle with scaling up to high-level cognitive behaviour such as language. The Iterated Learning Model is a theory of language evolution that shows that certain features of language are explained not because of something happening in the linguistic agent’s brain, but as the product of the evolution of the linguistic system itself under the pressures of learnability and expressivity. Our goal in this work is to combine an autopoietic approach with the cultural transmission chains that the ILM uses, in order to provide the first step in an autopoietic explanation of the evolution of language. In order to do that, we introduce a simple, joint action physical task in which agents are rewarded for dancing around each other in either of two directions, left or right. The agents are simulated e-pucks, with continuous-time recurrent neural networks as nervous systems. First, we adapt a biologically plausible reinforcement learning algorithm based on spike-timing dependent plasticity tagging and dopamine reward signals. We show that, using this algorithm, our agents can successfully learn the left/right dancing task and examine how learning time influences the agents’ task success rates. 
Following that, we link individual learning episodes in cultural transmission chains and show that an expert agent’s initial behaviour is successfully transmitted in long chains. We investigate the conditions under which these transmission chains break down, as well as the emergence of behaviour in the absence of expert agents. By using long transmission chains, we look at the boundary conditions for the re-establishment of transmitted behaviour after chain breakdowns. Bringing all the above experiments together, we discuss their significance for non-representational cognitive science and draw some interesting parallels to existing Iterated Learning research; finally, we close by putting forward a number of ideas for additions and future research directions.
3	Adaptace rozpoznávače řeči na datech bez přepisu / Unsupervised Adaptation of Speech Recognizer Švec, Ján January 2015 (has links) The goal of this thesis is to design and test techniques for unsupervised adaptation of speech recognizers on audio data without any textual transcripts. First, a training set is prepared and a baseline speech recognition system is trained. This system is used to transcribe unseen data. We experiment with an adaptation data selection process based on a measure of transcript quality. The system is then re-trained on the selected data, and its accuracy is evaluated. Finally, we experiment with the amount of adaptation data.
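The data selection step described above can be sketched as follows. This is only an illustration: the abstract does not say which transcript quality measure is used, so the per-utterance confidence score in [0, 1], the threshold value and all names below are assumptions.

```python
from typing import List, NamedTuple


class Hypothesis(NamedTuple):
    """An automatic transcript produced by the baseline recognizer."""
    utterance_id: str
    transcript: str
    confidence: float  # assumed quality score in [0, 1], not from the thesis


def select_adaptation_data(hyps: List[Hypothesis],
                           threshold: float = 0.9) -> List[Hypothesis]:
    """Keep only automatic transcripts whose quality score meets the threshold."""
    return [h for h in hyps if h.confidence >= threshold]


# Hypothetical output of the baseline system on untranscribed audio.
hypotheses = [
    Hypothesis("utt1", "dobry den", 0.97),
    Hypothesis("utt2", "jak se mate", 0.62),
    Hypothesis("utt3", "na shledanou", 0.91),
]
selected = select_adaptation_data(hypotheses, threshold=0.9)
print([h.utterance_id for h in selected])
```

Raising the threshold trades the amount of adaptation data against the expected accuracy of the automatic transcripts, which is the trade-off the abstract says the experiments explore.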
4	Využití neanotovaných dat pro trénování OCR / OCR Trained with Unanotated Data Buchal, Petr January 2021 (has links) Creating a high-quality optical character recognition (OCR) system requires a large amount of labeled data, and producing that much labeled data is costly. This thesis focuses on several methods that efficiently use unlabeled data to train an OCR neural network. The proposed methods fall into the category of self-training algorithms. Their general approach can be summarized as follows. First, a seed model is trained on a limited amount of labeled data. Then, the seed model, in combination with a language model, is used to produce pseudo-labels for the unlabeled data. The machine-labeled data are then combined with the training data used to create the seed model, and the combined set is used to train the target model. The success of the individual methods is measured on the handwritten ICFHR 2014 Bentham dataset. Experiments were conducted on two datasets representing different degrees of labeled data availability. The best model trained on the smaller dataset achieved a character error rate (CER) of 3.70 %, a relative improvement of 42 % over the seed model, and the best model trained on the bigger dataset achieved a CER of 1.90 %, a relative improvement of 26 % over the seed model. This thesis shows that the proposed methods can be used efficiently to improve the OCR error rate by means of unlabeled data.
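To make the reported figures concrete, the following sketch shows how a character error rate and a relative improvement are commonly computed. The abstract does not describe the evaluation code, so the Levenshtein-based CER, the function names and the back-computed seed error rates are assumptions. The seed values are only approximate because the reported percentages are rounded.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def relative_improvement(seed_cer: float, target_cer: float) -> float:
    """Relative error reduction of the target model over the seed model, in percent."""
    return 100.0 * (seed_cer - target_cer) / seed_cer


def implied_seed_cer(target_cer: float, rel_improvement_pct: float) -> float:
    """Seed CER implied by a target CER and a relative improvement in percent."""
    return target_cer / (1.0 - rel_improvement_pct / 100.0)


# Back-compute the approximate seed-model CERs from the figures in the abstract.
print(f"smaller dataset seed CER ~ {implied_seed_cer(3.70, 42):.2f} %")
print(f"bigger dataset seed CER ~ {implied_seed_cer(1.90, 26):.2f} %")
```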
5	A secure client / server interface protocol for the electricity prepayment vending industry Subramoney, Kennedy Pregarsen 24 August 2010 (has links) Electricity prepayment systems have been successfully implemented by South Africa’s national electricity utility (Eskom) and local municipalities for more than 17 years. The prepayment vending sub-system is a critical component of prepayment systems. It provides convenient locations for customers to purchase electricity. It predominantly operates in an “offline” mode; however, electricity utilities are now opting for systems that operate in an “online” mode. “Online” mode of operation, or online vending, is when a prepayment token is requested from a centralised server that is remote from the client at the actual point of sale (POS). The token is only generated by the server and transferred to the POS client once the transaction, the POS client and the payment mechanism have been authenticated and authorised. The connection between the POS client and the server is a standard computer network channel (such as the Internet, a direct dial-up link, X.25 or GPRS). The lack of online vending system standardisation was a concern and a significant risk for utilities, as they faced the problem of being locked into proprietary online vending systems. Thus the South African prepayment industry, led by Eskom, initiated a project to develop an industry specification for online vending systems. The first critical project task was a current state analysis of the South African prepayment industry, technology and specifications. The prepayment industry is built around the Standard Transfer Specification (STS). STS has become the de facto industry standard to securely transfer electricity credit from a Point of Sale (POS) to the prepaid meter. STS is supported by several “offline” vending system specifications. The current state analysis was followed by the requirements analysis phase. 
The requirements analysis confirmed the need for a standard interface protocol specification rather than a full systems specification. The interface specification focuses on the protocol between a vending client and a vending server and does not specify the client and server application layer functionality and performance requirements. This approach encourages innovation and competitiveness amongst client and server suppliers while ensuring interoperability between these systems. The online vending protocol design was implemented using the web services framework and was therefore named XMLVend. The protocol development phase was an iterative process with two major releases, XMLVend 1.22 and XMLVend 2.1. XMLVend 2.1 is the current version of the protocol. XMLVend 2.1 addressed the shortcomings identified in XMLVend 1.22, updated the existing use cases and added several new use cases. It was also modelled as a unified modelling language (UML) interface, or contract, for prepayment vending services. Therefore, clients using the XMLVend interface are able to request services from any service provider (server) that implements the XMLVend interface. The UML modelled interface and use case message pairs were mapped to Web Service Definition Language (WSDL) and schema (XSD) definitions respectively. XMLVend 2.1 is a secure and open web service based protocol that facilitates prepayment vending functionality between a single logical vending server and n clients. It has become a key enabler for utilities to implement standardised, secure, interoperable and flexible online vending systems. Copyright / Dissertation (MSc)--University of Pretoria, 2010. / Electrical, Electronic and Computer Engineering / unrestricted / Web services, Specification, Protocol, Electricity, STS, Standard transfer specification, Vending, Prepayment, UML, Unified model language, UCTD
6	Inteligentní nákupní lístek / Intelligent Shopping List Doubek, Milan January 2012 (has links) This thesis deals with the creation of a unique application for managing shopping lists, developed using the latest startup techniques and principles. All our hypotheses were tested by early adopters, and new directions of development were based on their feedback. The result of this thesis is a mobile application for the Android operating system, published on Google Play, and two components that will extend the published application. The main component is intelligent sorting of the items on a shopping list according to a model of the supermarket, which is built from previous purchases in that supermarket. The second is a web application that lets users send new shopping lists to the mobile device.
7	Dynamický dekodér pro rozpoznávání řeči / Dynamic Decoder for Speech Recognition Veselý, Michal January 2017 (has links) The result of this work is a fully working and significantly optimized implementation of a dynamic decoder. The decoder is based on dynamic generation of the recognition network and on decoding with a modified version of the Token Passing algorithm. The implemented solution provides results very similar to those of the original static decoder from BSCORE (an API from the company Phonexia). Compared to BSCORE, this implementation significantly reduces memory usage, which makes it possible to use more complex language models. It also makes it easier to integrate speech recognition into mobile devices and to add new words to the system dynamically.
8	K interferenci češtiny, ruštiny a angličtiny v jazykové výuce / On interference between Czech, Russian and English in language learning Dvořáková, Jana January 2011 (has links) The thesis deals with the second language acquisition (SLA) of Czech by Russian- and English-speaking students. It presents the main theories of SLA (generative and cognitive approaches) and compares them with the results of the author's research into the L2 acquisition of Czech morphology and syntax by speakers of two typologically and structurally different mother tongues. It shows that language transfer plays an important role in SLA and that some of the generative assumptions about SLA that are claimed to apply universally cannot be proven for Czech.
9	Překlad z češtiny do angličtiny / Czech-English Translation Petrželka, Jiří January 2010 (has links) This master's thesis describes the principles of statistical machine translation and demonstrates how to build the Moses statistical machine translation system. In the preparatory phase, freely available bilingual Czech-English corpora are surveyed. An empirical analysis of the running time of multithreaded word alignment tools demonstrates that MGIZA++ can achieve up to a fivefold speedup and PGIZA++ up to an eightfold speedup (compared with GIZA++). Three methods of morphological pre-processing of the Czech training data are tested using simple non-factored models. While simple lemmatization can lower BLEU, more sophisticated approaches mostly raise it. The positive effects of morphological pre-processing fade as the corpus grows. The relationship between other corpus characteristics (size, genre, additional data) and the resulting BLEU is measured empirically. The final system is trained on the CzEng 0.9 corpus and evaluated on the test set from the WMT 2010 workshop.
