171
Reservoir screening criteria for deep slurry injection. Nadeem, Muhammad. January 2005.
Deep slurry injection is a process of solid waste disposal that involves grinding the solid waste to a relatively fine-grained consistency, mixing the ground waste with water and/or other liquids to form a slurry, and disposing of the slurry by pumping it down a well at a pressure high enough to create fractures within the target formation.
This thesis describes the site assessment criteria involved in selecting a suitable target reservoir for deep slurry injection. The main goals of this study are as follows:

- Identify the geological parameters important for a prospective injection site
- Recognize the role of each parameter
- Determine the relationships among different parameters
- Design and develop a model that can assemble all the parameters into a semi-quantitative evaluation process, allowing sites to be ranked and unsuitable sites to be eliminated
- Evaluate the model against several real slurry injection cases and several prospective cases where slurry injection may take place in the future

The quantitative and qualitative parameters recognized as important for making a decision regarding a target reservoir for deep slurry injection operations are the permeability, porosity, depth, areal extent, thickness, mechanical strength, and compressibility of the reservoir; the thickness and flow properties of the cap rock; the geographical distance between an injection well and a waste source or collection centre; and the regional and detailed structural and tectonic setting of the area. Additional factors affecting the security level of a site include the details of the lithostratigraphic column overlying the target reservoir and the presence of overlying fracture-blunting horizons. Each parameter is discussed in detail to determine its role in site assessment and its relationship with other parameters. A geological assessment model is developed, divided into two components: a decision tree and a numerical calculation system. The decision tree deals with the most critical parameters, those that render a site either unsuitable or suitable but of unspecified quality. The numerical calculation gives a score to a prospective injection site based on rank numbers and weighting factors for the various parameters. The score for a particular site shows its favourability for the injection operation and allows a direct comparison with other available sites. Three categories have been defined for this purpose: average, below average, and above average. A score of 85 to 99 out of 125 places a site in the "average" category; a site is unsuitable for injection if it belongs to the "below average" category, i.e. if its total score is less than 85; and the best sites generally have scores in the "above average" category, 100 or higher. Sites that fall in the "average" category will require more detailed tests and assessments. The geological assessment model is evaluated using original geological data from North America and Indonesia, both for sites that have already undergone deep slurry injection operations and for some prospective sites. The results obtained from the model are satisfactory, as they agree with the empirical observations. Areas for future work include writing a computer program for the geological model and further evaluating the model using original data from more areas representing more diverse geology around the world.
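To make the two-stage assessment concrete, here is a minimal Python sketch separating the pass/fail decision-tree gate from the weighted numerical score. The gate checks, weights, and rank functions are invented placeholders; only the category thresholds on the 125-point scale come from the abstract.

```python
# Minimal sketch of the two-stage site assessment described above.
# The gate checks and cutoff values are illustrative assumptions;
# only the score bands on the 125-point scale follow the abstract.

def decision_tree_gate(site):
    """Stage 1: critical parameters that can rule a site out entirely."""
    if site["cap_rock_thickness_m"] < 10:   # assumed cutoff, not from the thesis
        return False
    if site["tectonically_active"]:         # assumed critical criterion
        return False
    return True

def numerical_score(site, weights, ranks):
    """Stage 2: weighted sum of rank numbers over the remaining parameters."""
    return sum(weights[p] * ranks[p](site[p]) for p in weights)

def categorize(score):
    """Score bands from the abstract (maximum score of 125)."""
    if score < 85:
        return "below average: unsuitable"
    if score < 100:
        return "average: more detailed tests and assessments needed"
    return "above average"
```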
172
Hidden Markov Models and dynamic conditional correlation models: extensions and application to stock market time series. Charlot, Philippe. 25 November 2010.
The objective of this thesis is to study the modelling of regime change in dynamic conditional correlation models, focusing in particular on the Markov-switching approach. Unlike the standard approach based on the basic Hidden Markov Model (HMM), we use extensions of the HMM coming from probabilistic graphical model theory, a discipline that has proposed many derivations of the basic model for representing complex structures. This thesis therefore sits at the interface of two disciplines: financial econometrics and probabilistic graphical models. The first essay presents a model built from a hierarchical hidden Markov structure, which allows different levels of granularity to be defined for the regimes. It can be seen as a special case of the RSDC (Regime Switching for Dynamic Correlations) model. Based on the hierarchical HMM, our model can capture nuances of regimes that are ignored by the classical Markov-switching approach. The second contribution proposes a Markov-switching version of the DCC model built from the factorial HMM. While the classical Markov-switching approach assumes that all elements of the correlation matrix follow the same switching dynamic, our model allows each element of the correlation matrix to have its own switching dynamic. In the final contribution, we propose a DCC model constructed from a decision tree. The objective of this tree is to link the level of individual volatilities with the level of correlations. For this, we use a hidden Markov decision tree, which is an extension of the HMM.
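The core regime-switching idea shared by these models can be illustrated with a toy simulation: a hidden two-state Markov chain picks, at each date, whether a low- or a high-correlation matrix drives the returns. The transition probabilities and correlation levels below are arbitrary illustrative values, not estimates from the thesis.

```python
import numpy as np

# Toy simulation of a hidden Markov chain switching the correlation
# regime of a bivariate return series. All numbers are illustrative.
rng = np.random.default_rng(0)
P = np.array([[0.95, 0.05],        # persistence of the calm regime
              [0.10, 0.90]])       # persistence of the turbulent regime
corr = {0: 0.2, 1: 0.8}            # low- and high-correlation regimes

T, state, returns = 500, 0, []
for _ in range(T):
    state = rng.choice(2, p=P[state])      # hidden Markov transition
    rho = corr[state]
    cov = np.array([[1.0, rho], [rho, 1.0]])
    returns.append(rng.multivariate_normal([0.0, 0.0], cov))
returns = np.asarray(returns)              # shape (T, 2)
```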
173
Relationship Marketing: Cross-Selling on Mobile Telecom. Oliveira, Manoela Brandao de. 20 April 2015.
Due to its fast growth in recent years, the wireless market is becoming increasingly saturated. Since traditional communication through voice services is already widely used, wireless carriers are facing difficulties in finding and attracting new users for such services. Given this scenario, carriers are directing ever more effort to cross-selling campaigns to monetize their client base, offering and stimulating the use of new services. In this research, an existing data set from a Brazilian mobile telecom carrier was used to test a model that facilitates the identification of the customers most likely to acquire new services. The data were analyzed and modeled via data mining techniques and a decision tree. The results suggest that, with the proposed model, cross-selling campaigns can be optimized, achieving an increased rate of return, a reduction in the cost of contacts, and less wear on the client base from irrelevant offers.
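As a hedged sketch of the kind of propensity model tested here, the snippet below trains a scikit-learn decision tree and ranks customers by predicted adoption probability; the file and column names are hypothetical, since the carrier's data set is not public.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical customer table with an 'adopted_service' target column.
df = pd.read_csv("customers.csv")
X = pd.get_dummies(df.drop(columns=["adopted_service"]))
y = df["adopted_service"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)

# Contact only the customers most likely to respond; this targeting is
# where the reduction in campaign cost and customer wear comes from.
scores = tree.predict_proba(X_te)[:, 1]
top_decile = scores.argsort()[::-1][: len(scores) // 10]
```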
174
Anomaly Detection in Categorical Data with Interpretable Machine Learning: A random forest approach to classify imbalanced data. Yan, Ping. January 2019.
Metadata refers to "data about data", which contains the information needed to understand the process of data collection. In this thesis, we investigate whether metadata features can be used to detect broken data and how a tree-based interpretable machine learning algorithm can be used for effective classification. The goal of this thesis is two-fold. First, we apply a classification schema using metadata features to detect broken data. Second, we generate feature importance rates to understand the model's logic and reveal the key factors that lead to broken data. The given task, from the Swedish automotive company Veoneer, is a typical problem of learning from an extremely imbalanced data set: 97 percent of the data is healthy and only 3 percent is broken. Furthermore, the data set contains only categorical variables on nominal scales, which brings challenges to the learning algorithm. Handling the imbalance problem for continuous data is relatively well studied, but for categorical data the solution is not straightforward. In this thesis, we propose a combination of tree-based supervised learning and hyper-parameter tuning to identify the broken data in a large data set. Our method is composed of three phases: data cleaning, which eliminates ambiguous and redundant instances; supervised learning with a random forest; and, lastly, a random search for hyper-parameter optimization of the random forest model. Our results show empirically that the tree-based ensemble method, together with a random search for hyper-parameter optimization, improves random forest performance in terms of the area under the ROC curve. The model achieved an acceptable classification result and showed that metadata features are capable of detecting broken data and of providing an interpretable result by identifying the key features for the classification model.
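The pipeline described here maps naturally onto scikit-learn components: one-hot encoding of the nominal features, a class-weighted random forest for the 97/3 imbalance, and a random search scored by the area under the ROC curve. A sketch with illustrative search ranges:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# All features are nominal, so every column goes through the encoder.
pipe = Pipeline([
    ("encode", OneHotEncoder(handle_unknown="ignore")),
    ("forest", RandomForestClassifier(class_weight="balanced", random_state=0)),
])
search = RandomizedSearchCV(
    pipe,
    param_distributions={                      # illustrative ranges
        "forest__n_estimators": [100, 300, 500],
        "forest__max_depth": [None, 10, 20],
        "forest__min_samples_leaf": [1, 5, 10],
    },
    n_iter=10, scoring="roc_auc", cv=5, random_state=0,
)
# search.fit(X, y)   # X: categorical metadata features, y: 1 = broken data
# search.best_estimator_[-1].feature_importances_  -> importance rates
```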
175
Contracting Strategies in Energy Auctions for Distribution Companies under Demand Uncertainty. Guimaraes, Andre Resende. 16 October 2006.
The objective of this work is to analyze the new regulatory framework of the Brazilian electric sector and its impact on energy distribution companies. To this end, a computational tool was developed to elaborate strategies for the distribution companies (DISCOs) in the energy auctions instituted by the new regulation. The tool was used to simulate the contract acquisition process of the DISCOs, and the results were analyzed to measure the impact of the new rules and the allocation of risks to the distribution companies. The problem consists, given demand uncertainty and a set of available risk management instruments, in determining the contracting strategy of the DISCOs, i.e., the amount of energy to be bought in each auction that results from the best purchase given the candidate contracts. The solution methodology is based on a multi-stage stochastic optimization algorithm that takes into account the different prices and horizons of the energy contracts and minimizes a weighted combination of the tariff for consumers and the costs for the DISCO.
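Stripped to its essentials, the contracting problem is a stochastic program: commit to an amount before demand is known, then pay a higher penalty for any shortfall once it is revealed. The toy two-stage instance below is solved with scipy; the prices and demand scenarios are made-up values, and the thesis's actual model is multi-stage and far richer.

```python
import numpy as np
from scipy.optimize import linprog

price, penalty = 50.0, 120.0              # contract vs. shortfall cost (assumed)
demand = np.array([80.0, 100.0, 130.0])   # equiprobable demand scenarios
K = len(demand)

# Variables [x, s_1..s_K]: minimize price*x + mean(penalty * s_k)
c = np.concatenate([[price], np.full(K, penalty / K)])
# Shortfall constraints s_k >= d_k - x, rewritten as -x - s_k <= -d_k
A_ub = np.hstack([-np.ones((K, 1)), -np.eye(K)])
res = linprog(c, A_ub=A_ub, b_ub=-demand, bounds=[(0, None)] * (K + 1))
print(res.x[0])                           # amount to contract in this toy setting
```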
176
Methodology based on dispersed voltage measures and decision trees for fault location in modern distribution systems. Araújo, Marcel Ayres de. 6 October 2017.
In distribution systems, the dense branching, radial topology, heterogeneity, load dynamics, and other particularities impose difficulties on fault location, representing a permanent challenge in the search for better continuity and reliability indicators of the electrical energy supply. Strict regulation of the sector, the penetration of distributed generation, and the modernization trend brought by smart grids demand detailed studies to adapt electrical systems to the current context. In this context, this thesis proposes a fault location methodology for distribution systems that exploits the capability of smart meters to monitor and acquire voltage at different points in the electrical network. The proposed approach is based on the estimation, by machine learning tools, of the zero- and positive-sequence impedances between the locations of the smart meters and the point of fault occurrence, and of whether each meter is sensitized by the fault currents. By calculating the corresponding electrical distances from the estimated impedances and defining their directions with respect to the network topology, the point or area with the greatest overlap of electrical distances is identified as the most probable fault location or region relative to the smart meters. For this purpose, conventional and intelligent tools are combined, applying concepts of electrical system analysis, diagnosis of voltage deviations, and pattern classification by means of the machine learning technique known as the decision tree. The results obtained by applying this methodology demonstrate that the use of redundant information provided by the smart meters minimizes estimation errors. In addition, for most of the cases tested, the maximum absolute error of the fault location lies between 200 m and 1000 m, which narrows the search for the fault location by the maintenance teams of the electrical network.
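The distance step of the method reduces, in its simplest form, to dividing an estimated sequence impedance by a per-kilometre line impedance; the candidate fault region is then where the distances implied by several meters overlap along the feeder. A sketch under assumed line constants:

```python
# Assumed positive-sequence line impedance; real feeders vary by segment.
Z_LINE_OHM_PER_KM = 0.35

def electrical_distance_km(z_estimated_ohm, z_per_km=Z_LINE_OHM_PER_KM):
    """Distance to the fault implied by one meter's impedance estimate."""
    return z_estimated_ohm / z_per_km

# Hypothetical impedance estimates for two smart meters.
estimates = {"meter_A": 1.4, "meter_B": 0.7}
distances = {m: electrical_distance_km(z) for m, z in estimates.items()}
```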
177
Intelligent data analysis in an interventional cardiology procedures database. Campos Neto, Cantídio de Moura. 2 August 2016.
This study spans two areas of knowledge: Medicine and Computer Science. It consists of applying the Knowledge Discovery in Databases (KDD) process to a real medical database, the DESIRE Registry. The DESIRE Registry is the longest-running registry in interventional cardiology worldwide; it is single-centre and has followed, for more than 13 years, 5,614 patients revascularized solely by the implantation of drug-eluting stents. The goal is to use this technique to create a descriptive model that classifies patients according to the risk of major adverse cardiac events (MACE) and to evaluate its performance objectively, and then to present the rules extracted from the model to users in order to assess the degree of novelty of their content and its agreement with expert knowledge. Symbolic classification models were created with decision tree and classification rule techniques, using the C4.5, Ripper, and CN2 algorithms in the data mining step, with the class attribute being the presence or absence of a MACE. As the classification is binary, the models were evaluated objectively by metrics associated with the confusion matrix, such as accuracy, sensitivity, and the area under the ROC curve, among others. The mining algorithm automatically and exhaustively processes the attributes of each patient to identify those most strongly associated with the class attribute (MACE), on which the rules are based. The main rules of these models were extracted either indirectly, through the decision tree, or directly, through the classification rules, exposing the most influential and predictive variables according to the mining algorithm. The models allowed a better understanding of the application domain, relating the influence of routine details and situations associated with the medical procedure. Using the model, it was possible to analyse the probabilities of occurrence and non-occurrence of events in various situations. The induced models followed a logical interpretation of the data and facts with the participation of the domain expert. Thirty-two rules were generated, of which three were rejected, 20 were expected rules offering no novelty, and 9 were considered less expected rules that nevertheless had a degree of agreement of 50% or more, making them candidates for investigation of their eventual importance. Such models can be updated by reapplying the mining algorithm to the database with more recent data. The potential of symbolic, interpretable models in Medicine is great when combined with professional experience, contributing to evidence-based medicine.
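The confusion-matrix metrics used for the objective evaluation are simple to compute for the binary event/no-event classification; the counts below are placeholders, not Registry results.

```python
# Hypothetical confusion-matrix counts for the adverse-event class.
tp, fn, fp, tn = 42, 18, 55, 485

accuracy    = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)          # recall on the event class
specificity = tn / (tn + fp)
print(f"acc={accuracy:.3f}  sens={sensitivity:.3f}  spec={specificity:.3f}")
```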
178
Data Mining Techniques to acquire new customers for financing of Consumer Credit (CDC). Silva, Adriana Maria Marques da. 27 September 2012.
This work discusses the most widespread data mining techniques, logistic regression, decision tree, and neural network, and assesses whether these techniques provide financial gains for private institutions that run active customer acquisition processes. A company in the financial sector is used as the object of study, specifically its processes for acquiring new customers for consumer credit (CDC in Brazil). The results of the three techniques are presented in order to check whether the statistical models discriminate the potential customers most likely to adopt consumer credit from those least likely, and then whether this distinction yields financial gains. Such gains are expected to come from reduced marketing costs, by approaching only the customers with the highest probability of responding positively to the campaign. The work presents the operation of each technique theoretically and, as the results indicate, data mining is a great opportunity for financial gains in a company.
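A sketch of how the three techniques could be compared on one propensity task using scikit-learn stand-ins (the study's exact model configurations are not reproduced here, and the hyper-parameters shown are illustrative):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5),
    "neural network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500),
}
# X, y: customer features and the CDC adoption flag (data not public).
# for name, model in models.items():
#     auc = cross_val_score(model, X, y, scoring="roc_auc", cv=5).mean()
#     print(f"{name}: mean ROC AUC = {auc:.3f}")
```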
179
Hybrid Model Predictive Control and Machine Learning for the development of logical controllers in buildings. Le, Duc Minh Khang. 9 February 2016.
Efficient and coordinated use of the systems installed in a building should improve occupant comfort while consuming less energy. These objectives, however, are antagonistic, so the task can be formulated as a multi-criteria optimization problem. Moreover, to meet industrial requirements, it must be solved not only with a simple and inexpensive implementation in mind, notably with a reduced number of sensors, but also with portability in view, so that the resulting controller can be installed in buildings with different orientations and in varied geographic locations. The chosen approach is Model Predictive Control (MPC), whose effectiveness for building control has already been demonstrated in many works; it requires, however, too great a computing effort. This thesis proposes a methodology for synthesizing controllers that deliver satisfactory performance by mimicking the behaviour of the MPC while meeting industrial constraints. It is divided into two main steps:

1. In the first step, an optimal controller is developed with hybrid MPC techniques. Many challenges must be met, such as modelling, parameter tuning, and solving the optimization problem.
2. In the second step, different machine learning algorithms (decision tree, AdaBoost, and SVM) are tested on a database obtained from simulations with the MPC controller. The main points are the construction of the database, the choice of learning algorithm, and the development of the logical controller.

The methodology is first applied to a simple case, controlling a blind, and then validated on a more complex case: the development of a coordinated controller for a blind, natural ventilation, and mechanical ventilation.
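The two-step methodology, logging the MPC's decisions in simulation and then training an interpretable imitator, can be sketched as follows; the log file, feature names, and command labels are hypothetical.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical trace of states and commands produced by the MPC in simulation.
log = pd.read_csv("mpc_simulation_log.csv")
X = log[["outdoor_temp", "indoor_temp", "solar_irradiance", "hour"]]
y = log["blind_command"]               # e.g. "open" / "closed"

# A shallow tree imitates the MPC while remaining readable as logic rules.
clone = DecisionTreeClassifier(max_depth=4).fit(X, y)
print(export_text(clone, feature_names=list(X.columns)))
```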
180
Estimation of distribution algorithms for clustering and classification. Cagnini, Henry Emanuel Leal. 20 March 2017.
Extracting meaningful information from data is not an easy task. Data can come in batches or through a continuous stream, and can be incomplete, duplicated, or noisy. Moreover, there are several algorithms to perform data mining tasks, and the no-free-lunch theorem states that there is no single best algorithm for all problems. As a final obstacle, algorithms usually require hyper-parameters to be set in order to operate, which, not surprisingly, often demands a minimum knowledge of the application domain for fine-tuning. Since many traditional data mining algorithms employ a greedy local search strategy, fine-tuning these hyper-parameters is a crucial step towards achieving better predictive models. Estimation of distribution algorithms, on the other hand, perform a global search, which is often more efficient than an exhaustive search through the set of possible solutions. Guided by a quality function, estimation of distribution algorithms iteratively seek better solutions throughout their evolutionary process. Based on the benefits that estimation of distribution algorithms may offer to clustering and decision tree induction, two data mining tasks considered to be NP-hard and NP-hard/complete, respectively, this work aims at developing novel algorithms in order to obtain better results than traditional greedy algorithms and baseline evolutionary approaches.
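The estimate-and-sample loop that defines estimation of distribution algorithms can be shown with a minimal univariate model (UMDA-style) on binary strings; the toy OneMax fitness below stands in for the clustering and tree-induction objectives the thesis targets.

```python
import numpy as np

def umda(fitness, n_bits=20, pop=100, elite=30, gens=50, seed=0):
    """Univariate EDA: sample from marginals, keep the elite, re-estimate."""
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                      # initial bit marginals
    for _ in range(gens):
        X = rng.random((pop, n_bits)) < p         # sample the population
        elite_rows = X[np.argsort([fitness(x) for x in X])[-elite:]]
        p = elite_rows.mean(axis=0).clip(0.05, 0.95)   # update distribution
    return p

p = umda(fitness=lambda x: x.sum())               # OneMax: count of ones
print((p > 0.5).astype(int))                      # converges toward all ones
```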