11 |
Estimativa da radiação solar global pelos modelos de Hargreaves e aprendizado de máquina em 11 regiões de São Paulo /Brasil /Zamora Ortega, Lisett Rocio January 2020 (has links)
Advisor: João Francisco Escobedo / Abstract: This work describes a comparative study of methods for estimating daily global solar irradiation (HG) using the Hargreaves-Samani (H-S) model HG/HO = a ΔT^0.5 and two Machine Learning (ML) techniques, Support Vector Machines (SVM) and Artificial Neural Networks (ANN). 
The database was obtained in 11 cities of the state of São Paulo, covering different climatic classifications, over the period 2013-2017. Through regression between the atmospheric transmissivity (HG/HO) and the square root of the temperature difference (ΔT^0.5), the H-S statistical model was calibrated, yielding values of the constant (a) and equations that allow estimating HG, albeit with low determination coefficients, for two conditions: the 11 cities individually and the pooled total. The H-S models were validated by correlating estimated and measured values using the correlation coefficient (r) and rRMSE, whose values indicated that the models can estimate HG with reasonable precision and accuracy. The computational techniques, SVM and ANN, were first trained with 70% of the data on the same variables used in the H-S model, and later trained with inputs of 4 additional meteorological variables, totalling 5 combinations. The training was validated using an independent subset holding the remaining 30% of the data. The statistical indicators (r) of the correlations showed that the H-S model estimates HG with low determination coefficients. The statistical indicators rMBE, MBE, rRMSE, RMSE indicate that the H-S model can be used to estimate HG with r... (Complete abstract: click electronic access below) / Master's
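The H-S calibration described above amounts to a one-parameter least-squares fit of measured transmissivity against ΔT^0.5; a minimal sketch, with invented numbers rather than the thesis data:

```python
import numpy as np

# Hargreaves-Samani: HG/HO = a * dT**0.5, so the constant `a` can be
# calibrated by least squares on measured transmissivity vs. sqrt(dT).
# The numbers below are illustrative, not the thesis database.
dT = np.array([8.0, 10.5, 12.0, 9.3, 11.1])    # daily temperature range (degC)
kt = np.array([0.45, 0.52, 0.55, 0.48, 0.53])  # measured HG/HO

x = np.sqrt(dT)
a = float(np.sum(x * kt) / np.sum(x * x))      # least squares through the origin

def estimate_hg(ho, dt, a):
    """Estimate daily global irradiation from extraterrestrial HO and dT."""
    return a * np.sqrt(dt) * ho

print(round(a, 4))  # ~0.159 for these made-up points
```

The calibrated `a` then feeds `estimate_hg`, which is the form validated against measurements via r and rRMSE in the abstract.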
|
12 |
Transparency of transitivity in pantomime, sign language / Charles Roger Bradley (6410666) 02 May 2020 (has links)
This dissertation investigates whether transitivity distinctions are manifest in the
phonetics of linguistic and paralinguistic manual actions, namely lexical verbs and
classifier constructions in American Sign Language (ASL) and gestures produced
by hearing non-signers without speech (i.e., pantomime). A positive result would
indicate that grammatical features are (a) transparent and (b) may thus arise from
non-linguistic sources, here the visual-praxic domain. Given previous literature, we
predict that transitivity is transparent in pantomime and classifier constructions, but
opaque in lexical verbs.

We first collected judgments from hearing non-signers who classed pantomimes, classifier constructions, and ASL lexical verbs as unergative, unaccusative, transitive, or ditransitive. We found that non-signers consistently judged items across all three stimulus types, suggesting that there is transitivity-related information in the signed signal.

We then asked whether non-signers' judging ability has its roots in a top-down or bottom-up strategy. A top-down strategy might entail guessing the meaning of the sign or pantomime and then using the guessed meaning to assess its transitivity. A bottom-up strategy entails using one or more meaningful phonetic features available in the formation of the signal to judge an item. We predicted that both strategies would be available for classing pantomimes and classifier constructions, but that transitivity information would only be available top-down in lexical verbs, given that the former are argued to be more imagistic generally than lexical verbs. Further, each strategy makes a different prediction with respect to the internal representation of signs and pantomimes. The top-down strategy would suggest signs and pantomimes are unanalyzable wholes, whereas the bottom-up strategy would suggest the same are compositional.

For the top-down analysis, we correlated lexical iconicity score and a measure of the degree to which non-signers 'agreed' on the transitivity of an item. We found that lexical iconicity only weakly predicts non-signer judgments of transitivity, on average explaining 10-20% of the variance for each stimulus class. However, we note that this is the only strategy available for lexical verbs.

For the bottom-up analysis, we annotate our stimuli for phonetic and phonological features known to be relevant to transitivity and/or event semantics in sign languages. We then apply a text classification model to try to predict transitivity from these features. As expected, our classifiers achieved consistently high accuracy for pantomimes and classifier constructions, but only chance accuracy for lexical verbs.

Taken together, the top-down and bottom-up analyses were able to predict non-signer transitivity judgments for the pantomimes and classifier constructions, with the bottom-up analysis providing a stronger, more convincing result. For lexical verbs, only the top-down analysis was relevant and it performed weakly, providing little explanatory power. When interpreting these results, we look to the semantics of the stimuli to explain the observed differences between classes: pantomimes and classifier constructions both encode events of motion and manipulation (by human hands), the transitivity of which may be encoded using a limited set of strategies. By contrast, lexical verbs denote a multitude of event types, with properties of those events (and not necessarily their transitivity) being preferentially encoded. That is, the resolution of transitivity is a much more difficult problem when looking at lexical verbs.

This dissertation contributes to the growing body of literature that appreciates how linguistic and paralinguistic forms may be both (para)linguistic and iconic at the same time. It further helps disentangle at least two different types of iconicity (lexical vs. structural), which may be selectively active in some signs or constructions but not others. We also argue from our results that pantomimes are not holistic units, but instead combine elements of form and meaning in a way analogous to classifier constructions. Finally, this work also contributes to the discussion of how Language could have evolved in the species from a gesture-first perspective: the 'understanding' of others' object-directed (i.e. transitive) manual actions becomes communicative.
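The bottom-up analysis described above can be caricatured in a few lines: a toy classifier predicting a transitivity class from binary phonetic features. The feature names, vectors and labels are invented for illustration, and the dissertation's actual text classification model is not this one:

```python
import numpy as np

# Toy bottom-up classifier: predict transitive vs. intransitive from
# binary phonetic features (e.g. handling handshape, movement toward a
# locus, two-handed articulation). All data here are invented.
FEATURES = ["handling_handshape", "movement_to_locus", "two_handed"]

train_X = np.array([[1, 1, 0],    # transitive
                    [1, 1, 1],    # transitive
                    [0, 0, 0],    # intransitive
                    [0, 1, 0]])   # intransitive
train_y = np.array([1, 1, 0, 0])  # 1 = transitive, 0 = intransitive

def predict(x):
    """Nearest-centroid prediction over the binary feature vectors."""
    c1 = train_X[train_y == 1].mean(axis=0)
    c0 = train_X[train_y == 0].mean(axis=0)
    x = np.asarray(x, dtype=float)
    return int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))

print(predict([1, 1, 1]), predict([0, 0, 0]))  # 1 0
```

If such a trivially feature-driven model beats chance, the signal is compositional in the relevant sense; the chance-level result for lexical verbs is exactly the failure mode this sketch would show when features carry no transitivity information.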
|
13 |
Bee Shadow Recognition in Video Analysis of Omnidirectional Bee Traffic / Alavala, Laasya 01 August 2019 (has links)
Over a decade ago, beekeepers noticed that bees were dying or disappearing without any prior sign of ill health. Colony Collapse Disorder (CCD) has been a major threat to bee colonies around the world, affecting vital human crop pollination. Possible instigators of CCD include viral and fungal diseases, decreased genetic diversity, pesticides and a variety of other factors. Interactions among any of these potential facets may result in immunity loss for honey bees and an increased likelihood of collapse. It is essential to rescue honey bees and improve the health of bee colonies.
Monitoring bee traffic helps track the status of a hive remotely. An electronic beehive monitoring system extracts video, audio and temperature data without causing any interruption to the hive. These data can provide vital information on colony behavior and health. This research uses Artificial Intelligence and Computer Vision methodologies to develop and analyze technologies for monitoring omnidirectional bee traffic without disrupting the colony. Bee traffic is the number of bees moving through a given area in front of the hive over a given period of time. Forager traffic is the number of bees entering and/or leaving the hive over a period of time; it is a significant component in monitoring food availability and demand, colony age structure, and the impacts of pests and diseases on hives. The goal of this research is to estimate and track bee traffic by eliminating unnecessary information from video samples.
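A crude illustration of the video side of such monitoring: frame differencing over synthetic grayscale frames, with the moving-pixel area as a proxy for traffic in the region in front of the hive. This is an assumption-laden sketch, not the system developed in the research:

```python
import numpy as np

# Pixels that change by more than a threshold between consecutive frames
# are counted as "moving"; the moving area is a crude traffic proxy.
# Frames are assumed to be 2-D uint8 grayscale arrays.
def motion_level(prev_frame, frame, thresh=25):
    diff = np.abs(frame.astype(int) - prev_frame.astype(int))
    return int(np.count_nonzero(diff > thresh))

prev_frame = np.zeros((4, 4), dtype=np.uint8)
frame = prev_frame.copy()
frame[1:3, 1:3] = 200          # a synthetic "bee" moved into this 2x2 patch

print(motion_level(prev_frame, frame))  # 4 changed pixels
```

A real pipeline would localize and count individual bees (and, per the title, separate bees from their shadows); thresholded differencing is only the first, noisiest step in that direction.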
|
14 |
Engineering system design for automated space weather forecast : designing automatic software systems for the large-scale analysis of solar data, knowledge extraction and the prediction of solar activities using machine learning techniques / Alomari, Mohammad Hani January 2009 (has links)
Coronal Mass Ejections (CMEs) and solar flares are energetic events taking place at the Sun that can affect space weather, or the near-Earth environment, through the release of vast quantities of electromagnetic radiation and charged particles. Solar active regions are the areas where most flares and CMEs originate. Studying the associations among sunspot groups, flares, filaments, and CMEs is helpful in understanding the possible cause-and-effect relationships between these events and features. Forecasting space weather in a timely manner is important for protecting technological systems and human life on Earth and in space. The research presented in this thesis introduces novel, fully computerised, machine learning-based decision rules and models that can be used within a system design for automated space weather forecasting. The system design in this work consists of three stages: (1) designing computer tools to find the associations among sunspot groups, flares, filaments, and CMEs; (2) applying machine learning algorithms to the association datasets; and (3) studying the evolution patterns of sunspot groups using time-series methods. Machine learning algorithms are used to provide computerised learning rules and models that enable the system to deliver automated prediction of CMEs, flares, and evolution patterns of sunspot groups. These numerical rules are extracted from the characteristics, associations, and time-series analysis of the available historical solar data. The training of the machine learning algorithms is based on datasets created by investigating the associations among sunspots, filaments, flares, and CMEs. Evolution patterns of sunspot areas and McIntosh classifications are analysed using a statistical machine learning method, namely the Hidden Markov Model (HMM).
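As a sketch of the HMM machinery mentioned last, here is a minimal Viterbi decoder over a two-state "growing/decaying" sunspot model; the states, probabilities and observation coding below are invented for illustration, not the trained model of the thesis:

```python
import numpy as np

# Two hidden states for a sunspot group's evolution, observed only through
# daily area changes (0 = area up, 1 = area down). All log-probabilities
# are made up; a real model would be trained on historical solar data.
states = ["growing", "decaying"]
start = np.log([0.5, 0.5])
trans = np.log([[0.8, 0.2],    # growing -> growing/decaying
                [0.3, 0.7]])   # decaying -> growing/decaying
emit = np.log([[0.7, 0.3],     # P(obs | growing)
               [0.2, 0.8]])    # P(obs | decaying)

def viterbi(obs):
    """Most likely hidden evolution pattern for an observation sequence."""
    v = start + emit[:, obs[0]]
    back = []
    for o in obs[1:]:
        scores = v[:, None] + trans        # scores[i, j]: from state i to j
        back.append(scores.argmax(axis=0))
        v = scores.max(axis=0) + emit[:, o]
    path = [int(v.argmax())]
    for b in reversed(back):
        path.append(int(b[path[-1]]))
    return [states[s] for s in reversed(path)]

print(viterbi([0, 0, 1, 1]))  # ['growing', 'growing', 'decaying', 'decaying']
```

The same decoder generalizes to richer observation alphabets such as McIntosh classes; only the sizes of `trans` and `emit` change.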
|
15 |
Machine learning for epigenetics : algorithms for next generation sequencing data / Mayo, Thomas Richard January 2018 (has links)
The advent of Next Generation Sequencing (NGS), a little over a decade ago, has led to a vast and rapid increase in the generation of genomic data. The drastically reduced cost has in turn enabled powerful modifications of the technology that can be used to investigate not just genetic but also epigenetic phenomena. Epigenetics refers to the study of mechanisms affecting gene expression other than the genetic code itself and thus, at the transcription level, incorporates DNA methylation, transcription factor binding and histone modifications, amongst others. This thesis outlines and tackles two major challenges in the computational analysis of such data using techniques from machine learning. Firstly, I address the problem of testing for differential methylation between groups of bisulfite sequencing data sets. DNA methylation plays an important role in genomic imprinting, X-chromosome inactivation and the repression of repetitive elements, as well as being implicated in numerous diseases, such as cancer. Bisulfite sequencing provides single nucleotide resolution methylation data at the whole genome scale, but a sensitive analysis of such data is difficult. I propose a solution that uses a powerful kernel-based machine learning technique, the Maximum Mean Discrepancy, to leverage well-characterised spatial correlations in DNA methylation, and adapt the method for this particular use. I use this tailored method to analyse a novel data set from a study of ageing in three different tissues in the mouse. This study motivates further modifications to the method and highlights the utility of the underlying measure as an exploratory tool for methylation analysis. Secondly, I address the problem of predictive and explanatory modelling of chromatin immunoprecipitation sequencing (ChIP-Seq) data. ChIP-Seq is typically used to assay the binding of a protein of interest, such as a transcription factor or histone, to the DNA, and as such is one of the most widely used sequencing assays. 
While peak callers are a powerful tool for identifying binding sites in sparse and clean ChIP-Seq profiles, broader signals defy analysis in this framework. Instead, generative models that explain the data in terms of the underlying sequence can help uncover the mechanisms that predict binding or the lack thereof. I explore current problems with ChIP-Seq analysis, such as zero-inflation and the use of the control experiment, known as the input. I then devise a method for representing k-mers that enables the use of longer DNA sub-sequences within a flexible model development framework, such as generalised linear models, without heavy programming requirements. Finally, I use these insights to develop an appropriate Bayesian generative model that predicts ChIP-Seq count data in terms of the underlying DNA sequence, incorporating DNA methylation information where available and fitting the model with the Expectation-Maximization algorithm. The model is tested on simulated data and real data pertaining to the histone mark H3K27me3. This thesis therefore straddles the fields of bioinformatics and machine learning. Bioinformatics is both plagued and blessed by the plethora of different techniques available for gathering data and their continual innovations. Each technique presents a unique challenge, and hence out-of-the-box machine learning techniques have had little success in solving biological problems. While I have focused on NGS data, the methods developed in this thesis are likely to be applicable to future technologies, such as Third Generation Sequencing methods, and the lessons learned in their adaptation will be informative for the next wave of computational challenges.
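The Maximum Mean Discrepancy at the heart of the differential-methylation test can be sketched in a few lines. The RBF kernel, bandwidth and Gaussian toy data below are illustrative assumptions; the thesis adapts MMD to methylation-specific spatial correlations, which this sketch does not do:

```python
import numpy as np

# Biased estimate of the squared Maximum Mean Discrepancy with an RBF
# kernel: large values indicate the two samples come from different
# distributions. Toy 1-D Gaussian samples stand in for methylation data.
def rbf_kernel(a, b, gamma=1.0):
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * sq)

def mmd2(x, y, gamma=1.0):
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (50, 1)), rng.normal(0, 1, (50, 1)))
diff = mmd2(rng.normal(0, 1, (50, 1)), rng.normal(3, 1, (50, 1)))
print(same < diff)  # True: shifted samples give a larger discrepancy
```

In the differential-methylation setting, the "samples" are per-site methylation summaries from two groups, and significance is usually assessed by permutation of the group labels.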
|
16 |
An investigation on automatic systems for fault diagnosis in chemical processes / Monroy Chora, Isaac 03 February 2012 (has links)
Plant safety is the most important concern of the chemical industry. Process faults can cause economic losses as well as human and environmental damage. Most operational faults are normally considered in the process design phase by applying methodologies such as Hazard and Operability Analysis (HAZOP). However, failures must still be expected in an operating plant. For this reason, it is of paramount importance that plant operators can promptly detect and diagnose such faults in order to take the appropriate corrective actions. In addition, preventive maintenance needs to be considered in order to increase plant safety.
Fault diagnosis has been addressed with both analytical and data-based models, using a variety of techniques and algorithms. However, there is as yet no general fault diagnosis framework that joins detection and diagnosis of faults, whether or not previously registered in records. Moreover, few efforts have focused on automating and implementing the reported approaches in real practice.
Against this background, this thesis proposes a general framework for data-driven Fault Detection and Diagnosis (FDD), applicable, and amenable to automation, in any industrial scenario in order to maintain plant safety. The main requirement for constructing this system is the existence of historical process data. Promising methods imported from the Machine Learning field are introduced as fault diagnosis methods. The learning algorithms used for diagnosis have proved capable of diagnosing not only the modeled faults, but also novel faults. Furthermore, Risk-Based Maintenance (RBM) techniques, widely used in the petrochemical industry, are proposed for application as part of preventive maintenance in all industry sectors. The proposed FDD system, together with an appropriate preventive maintenance program, would represent a potential plant safety program to be implemented.
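The novel-fault side of such a system reduces, in its simplest form, to learning what "normal operation" looks like from historical data and flagging departures from it. A minimal sketch, with invented sensors, data and threshold, not the anomaly detection methodology of the thesis:

```python
import numpy as np

# Learn per-sensor mean/std under normal operation from historical data,
# then flag samples whose standardized deviation exceeds a threshold.
class AnomalyDetector:
    def fit(self, X):
        self.mu = X.mean(axis=0)
        self.sd = X.std(axis=0) + 1e-9   # avoid division by zero
        return self

    def is_anomaly(self, x, thresh=4.0):
        z = np.abs((np.asarray(x) - self.mu) / self.sd)
        return bool(z.max() > thresh)

rng = np.random.default_rng(1)
# Synthetic normal operation: temperature ~N(50, 2), pressure ~N(1.2, 0.05).
normal_ops = rng.normal([50.0, 1.2], [2.0, 0.05], size=(500, 2))
det = AnomalyDetector().fit(normal_ops)

print(det.is_anomaly([51.0, 1.21]), det.is_anomaly([75.0, 1.2]))  # False True
```

Once a sample is flagged as anomalous but matches no modeled fault class, it can be routed to operators as a candidate novel fault, which is the role the AD module plays in the general scheme.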
Chapter one presents a general introduction to the thesis topic, as well as its motivation and scope. Chapter two reviews the state of the art of the related fields: fault detection and diagnosis methods found in the literature are surveyed, and a taxonomy is proposed that joins the classifications used in Artificial Intelligence (AI) and Process Systems Engineering (PSE). The assessment of fault diagnosis with performance indices is also reviewed, together with the state of the art of Risk Analysis (RA) as a tool for corrective actions and of Maintenance Management for preventive actions. Finally, the benchmark case studies against which FDD research is commonly validated are examined in this chapter.
The second part of the thesis, comprising chapters three to six, addresses the methods applied during the research work. Chapter three deals with data pre-processing, chapter four with the feature processing stage and chapter five with the diagnosis algorithms, while chapter six introduces the Risk-Based Maintenance techniques for addressing plant preventive maintenance. The third part includes chapter seven, which constitutes the core of the thesis. In this chapter the proposed general FDD system is outlined, divided into three steps: diagnosis model construction, model validation and on-line application. This scheme includes a fault detection module and an Anomaly Detection (AD) methodology for the detection of novel faults. Furthermore, several approaches are derived from this general scheme for continuous and batch processes. The fourth part of the thesis presents the validation of the approaches: chapter eight presents the validation in continuous processes and chapter nine the validation of the batch process approaches. Chapter ten takes the AD methodology to real-scale batch processes: the methodology is first applied to a lab heat exchanger and then to a Photo-Fenton pilot plant, which corroborates its potential and success in real practice. Finally, the fifth part, comprising chapter eleven, presents the final conclusions and the main contributions of the thesis; the scientific production achieved during the research period is listed and prospects for further work are outlined.
|
17 |
Understanding the relationships between aesthetic properties of shapes and geometric quantities of free-form curves and surfaces using Machine Learning Techniques / Exploitation de techniques d’apprentissage artificiel pour la compréhension des liens entre les propriétés esthétiques des formes et les grandeurs géométriques de courbes et surfaces gauchesPetrov, Aleksandar 25 January 2016 (has links)
Today the market offers a large variety of different products and different shapes of the same product, and this abundance of choice overwhelms customers. It is evident that the aesthetic appearance of a product's shape and the emotional affection it evokes lead customers to the decision to buy the product. 
Therefore, it is very important to understand aesthetic properties and to adopt them in the early product design phases. The objective of this thesis is to propose a generic framework for mapping aesthetic properties to 3D free-form shapes, so as to be able to extract aesthetic classification rules and associated geometric properties. The key element of the proposed framework is the application of the Data Mining (DM) methodology and Machine Learning Techniques (MLTs) to the mapping of aesthetic properties to shapes. The framework is applied to investigate whether there is a common judgment of the flatness perceived by non-professional designers. The aim of the framework is not only to establish a structure for mapping aesthetic properties to free-form shapes, but also to serve as a guided path for identifying a mapping between different semantics and free-form shapes. The long-term objective of this work is to define a methodology to efficiently integrate the concept of Affective Engineering into Industrial Design.
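One way to picture "extracting aesthetic classification rules with associated geometric properties" is the simplest possible rule: a single threshold on one geometric quantity. The curvature values and flatness judgments below are invented for illustration and have nothing to do with the thesis data:

```python
# Toy rule extraction: find the single threshold on a geometric quantity
# (here a made-up mean-curvature measure) that best separates shapes
# judged "flat" from "not flat" by observers.
def best_stump(values, labels):
    """Return (threshold, accuracy) for the rule: flat if value <= t."""
    best = (None, 0.0)
    for t in sorted(set(values)):
        acc = sum((v <= t) == bool(lab)
                  for v, lab in zip(values, labels)) / len(values)
        if acc > best[1]:
            best = (t, acc)
    return best

curvature = [0.01, 0.02, 0.05, 0.30, 0.40, 0.55]   # illustrative values
judged_flat = [1, 1, 1, 0, 0, 0]                   # 1 = judged flat

print(best_stump(curvature, judged_flat))  # (0.05, 1.0)
```

Real rule-induction over many geometric quantities, as the DM/MLT framework envisages, combines many such tests, but each leaf of a decision tree is exactly a rule of this threshold form.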
|
18 |
An Empirical Study of Machine Learning Techniques for Classifying Emotional States from EEG Data / Sohaib, Ahmad Tauseef, Qureshi, Shahnawaz January 2012 (has links)
With the great advancement in robot technology, smart human-robot interaction is considered one of the most sought-after achievements among researchers these days. If a robot can identify the emotions and intentions of a human interacting with it, robots become considerably more useful. Electroencephalography (EEG) is considered an effective way of recording a human's emotions and motivations from brain activity. Various machine learning techniques have been used successfully to classify EEG data accurately; K-Nearest Neighbor, Bayesian Networks, Artificial Neural Networks and Support Vector Machines are among the techniques suited to the task. The aim of this thesis is to evaluate different machine learning techniques for classifying EEG data associated with specific affective/emotional states. Different methods based on different signal processing techniques are studied to find a suitable way to process the EEG data, varying numbers of EEG features are examined to identify those that give the best results for different classification techniques, and several methods are designed to format the EEG dataset. The research method includes an experiment whose aim was to elicit various emotional states in subjects as they looked at different pictures while their EEG data were recorded. The recorded EEG data were then processed, formatted and evaluated with the various machine learning techniques to find out which can most accurately classify EEG data according to the associated affective/emotional states. The experiment confirms the choice of a technique for improving the accuracy of results. 
According to the results, Support Vector Machine is the best and Regression Tree the second best at classifying EEG data associated with specific affective/emotional states, with accuracies up to 70.00% and 60.00% respectively. SVM outperforms RT overall; RT, however, is known for providing better accuracies on diverse EEG data.
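Of the techniques compared above, K-Nearest Neighbor is the simplest to sketch end to end. The two "EEG" features (say, alpha/beta band power) and the state labels below are invented, not the thesis data:

```python
import numpy as np

# Minimal k-NN: classify a feature vector by majority vote among the k
# closest training vectors. Labels: 0 = "calm", 1 = "excited" (invented).
def knn_predict(train_X, train_y, x, k=3):
    d = np.linalg.norm(train_X - np.asarray(x), axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    return int(np.bincount(nearest).argmax())   # majority vote

train_X = np.array([[0.9, 0.2], [0.8, 0.3], [1.0, 0.1],   # state 0
                    [0.2, 0.9], [0.3, 0.8], [0.1, 1.0]])  # state 1
train_y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(train_X, train_y, [0.85, 0.25]),
      knn_predict(train_X, train_y, [0.15, 0.95]))  # 0 1
```

The thesis compares this family of classifiers against Bayesian Networks, ANNs and SVMs on the same formatted feature sets; only the classifier behind `knn_predict` changes.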
|
19 |
Algoritmos OPWI e LDM-GA para sistemas de conversão texto-fala de alta qualidade empregando a tecnologia SCAUS / Algorithm OPWI and LDM-GA for high quality text-to-speech synthesis based on automatic unit selection / Morais, Edmilson da Silva 20 April 2006 (has links)
Advisor: Fabio Violaro / Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação
Abstract: This Thesis presents two new algorithms for Unit Selection Based Text-to-Speech (USB-TTS) systems. The first algorithm is the OPWI (Optimized Prototype Waveform Interpolation), designed to be used as a Back-End module for USB-TTS, allowing high-quality prosodic modifications and spectral smoothing. The second algorithm is the LDM-GA (Linguistic Data Mining using Genetic Algorithm), designed to minimize training problems related to LNRE (Large Number of Rare Events) distributions. Experimental results and analysis of the OPWI and LDM-GA algorithms are presented in detail. The OPWI algorithm is evaluated under analysis/re-synthesis operations and prosodic modifications, TSM (Time Scale Modification) and PSM (Pitch Scale Modification). The LDM-GA is evaluated in the context of phoneme segmental duration prediction based on a linear regression model. 
In addition to these two new algorithms (OPWI and LDM-GA), this Thesis presents a broad review of the main modules of a USB-TTS system: the Front-End module (linguistic module), the prosodic module, the unit-selection module and the Back-End module (synthesis module) / Doctorate / Telecommunications and Telematics / Doctor of Electrical Engineering
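The unit-selection module whose output the Back-End consumes is, at its core, a dynamic-programming search over target and concatenation costs. The candidate units and cost functions below are invented placeholders for real acoustic distances, not the thesis implementation:

```python
# Skeletal unit selection: pick one recorded unit per target phone,
# minimizing target cost (unit vs. desired prosody) plus concatenation
# cost (join smoothness), via Viterbi-style dynamic programming.
def select_units(candidates, target_cost, concat_cost):
    # candidates: list (one entry per target) of lists of unit ids
    best = {u: (target_cost(0, u), [u]) for u in candidates[0]}
    for t in range(1, len(candidates)):
        new = {}
        for u in candidates[t]:
            prev, path = min(
                ((best[p][0] + concat_cost(p, u), best[p][1]) for p in best),
                key=lambda s: s[0])
            new[u] = (prev + target_cost(t, u), path + [u])
        best = new
    return min(best.values(), key=lambda s: s[0])[1]

cands = [["a1", "a2"], ["b1", "b2"]]
tcost = lambda t, u: {"a1": 1.0, "a2": 0.2, "b1": 0.5, "b2": 0.5}[u]
ccost = lambda p, u: 0.0 if (p, u) == ("a2", "b1") else 1.0

print(select_units(cands, tcost, ccost))  # ['a2', 'b1']
```

When the selected joins are imperfect, a Back-End such as OPWI smooths the concatenation points and applies TSM/PSM prosodic modifications to the chosen units.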
|
20 |
Hypervisor-based cloud anomaly detection using supervised learning techniques / Nwamuo, Onyekachi 23 January 2020 (has links)
Although cloud network flows are similar to conventional network flows in many ways, there are major differences in their statistical characteristics. However, due to the lack of adequate public datasets, the proponents of many existing cloud intrusion detection systems (IDS) have relied on the DARPA dataset, which was obtained by simulating a conventional network environment. In this thesis, we show empirically that the DARPA dataset, by failing to match important statistical characteristics of real-world cloud data center traffic, is inadequate for evaluating cloud IDS. As an alternative, we analyze a new public dataset, collected through cooperation between our lab and a non-profit cloud service provider, which contains benign data and a wide variety of attack data. Furthermore, we present a new hypervisor-based cloud IDS using an instance-oriented feature model and supervised machine learning techniques. We investigate three different classifiers: Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM). Experimental evaluation on a diversified dataset yields a detection rate of 92.08% and a false-positive rate of 1.49% for Random Forest, the best performing of the three classifiers. / Graduate
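For reference, the two headline numbers above follow directly from confusion-matrix counts: detection rate is recall on attack flows, and false-positive rate is the fraction of benign flows wrongly flagged. The counts below are invented to reproduce those percentages, not the experiment's actual matrix:

```python
# Detection rate = TP / (TP + FN); false-positive rate = FP / (FP + TN).
def detection_rate(tp, fn):
    return tp / (tp + fn)

def false_positive_rate(fp, tn):
    return fp / (fp + tn)

# Illustrative counts chosen to match the reported 92.08% / 1.49%.
tp, fn, fp, tn = 9208, 792, 149, 9851

print(round(100 * detection_rate(tp, fn), 2),
      round(100 * false_positive_rate(fp, tn), 2))  # 92.08 1.49
```

Reporting both numbers matters for IDS evaluation: a detector can trivially raise its detection rate by flagging everything, at the cost of an unusable false-positive rate.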
|