About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Seleção e construção de features relevantes para o aprendizado de máquina. / Relevant feature selection and construction for machine learning.

Huei Diana Lee 27 April 2000 (has links)
In supervised Machine Learning (ML), an induction algorithm is presented with a set of training instances, where each instance is described by a vector of feature values and a class label. The task of the induction algorithm (inducer) is to induce a classifier that will be useful for classifying new cases. Conventional inductive-learning algorithms rely on the data provided by the user to build their concept descriptions. An inadequate representation space or description language, as well as errors in the training examples, can make learning problems difficult. One of the central problems in ML is Feature Subset Selection (FSS): the learning algorithm must select some subset of features on which to focus its attention while ignoring the rest. There are several reasons for performing FSS. First, most computationally feasible ML algorithms do not work well in the presence of a very large number of features, so FSS can improve the accuracy of the classifiers they generate. Second, FSS can improve comprehensibility, i.e. the human ability to understand the data and the rules generated by symbolic ML algorithms. Third, collecting data is costly in some domains, and FSS can also reduce the cost of processing large quantities of data. There are essentially three approaches to FSS: embedded, filter, and wrapper. On the other hand, if the features used to describe the training examples are inadequate, learning algorithms are likely to create excessively complex and inaccurate descriptions. Such individually inadequate features can sometimes be conveniently combined into new features that turn out to be highly representative of the concept being described. The process of constructing new features is called Feature Construction or Constructive Induction (CI). This work focuses on the filter and wrapper approaches to FSS, as well as on knowledge-driven CI. A series of FSS and CI experiments is described, performed on four natural datasets using several symbolic ML algorithms. For each dataset and each inducer, several measures are taken to compare the inducers' performance, such as accuracy, the inducer's running time, and the number of features selected by each evaluated induction algorithm. Several experiments on three real-world datasets are also described. The focus of these case studies is not only the comparison of induction-algorithm performance but also the evaluation of the extracted knowledge. During the knowledge-extraction step, results were presented to the domain specialists, who made suggestions for further experiments. Part of the knowledge extracted from these three case studies was considered very interesting by the specialists, showing that the interaction between different areas (in this case, medicine and computing) can produce valuable results. Thus, two groups of researchers need to be brought together if the application of ML is to bear fruit: those acquainted with existing ML methods, and those with expertise in the application domain, who provide the data and evaluate the acquired knowledge.
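The wrapper strategy mentioned in this abstract can be illustrated with a small example. The sketch below is a minimal, hypothetical greedy forward-selection wrapper written with scikit-learn; the dataset, the choice of a decision tree as the inducer, and the accuracy-based scoring are assumptions made for illustration, not details taken from the thesis.

```python
# Minimal sketch of a greedy forward-selection wrapper (hypothetical example).
# The wrapper repeatedly adds the feature whose inclusion most improves the
# cross-validated accuracy of the chosen inducer (here, a decision tree).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
inducer = DecisionTreeClassifier(random_state=0)

selected, remaining = [], list(range(X.shape[1]))
best_score = 0.0
while remaining:
    # Score each candidate feature when added to the current subset.
    scores = {
        f: cross_val_score(inducer, X[:, selected + [f]], y, cv=5).mean()
        for f in remaining
    }
    f_best, s_best = max(scores.items(), key=lambda kv: kv[1])
    if s_best <= best_score:          # stop when no candidate improves accuracy
        break
    selected.append(f_best)
    remaining.remove(f_best)
    best_score = s_best

print(f"selected features: {selected}, CV accuracy: {best_score:.3f}")
```

A filter approach would instead rank features with an inducer-independent criterion (for example, mutual information with the class) before any classifier is trained.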
42

Extração de conhecimento simbólico em técnicas de aprendizado de máquina caixa-preta por similaridade de rankings / Symbolic knowledge extraction from black-box machine learning techniques with ranking similarities

Rodrigo Elias Bianchi 26 September 2008 (has links)
Non-symbolic Machine Learning techniques such as Artificial Neural Networks, Support Vector Machines, and ensembles of classifiers have shown good performance when used for data analysis. Their main limitation is the lack of comprehensibility of the knowledge stored in their internal structures. This thesis presents an investigation of methods for extracting comprehensible representations of the knowledge acquired by these non-symbolic techniques, here called black boxes, during their learning process. The main contribution of this work is a new pedagogical method for extracting rules that explain the classification process followed by non-symbolic techniques. The method is based on optimizing (maximizing) the similarity between the classification rankings produced by a symbolic Machine Learning technique and by the non-symbolic technique from which the internal knowledge is being extracted. Experiments were performed on several datasets, and the results obtained suggest good potential for the proposed method.
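As a rough illustration of the pedagogical idea described above, the sketch below trains a black-box model, lets it relabel the data, and then fits a decision tree as the symbolic surrogate, comparing the two models' classification rankings with Spearman correlation. The specific models, the dataset, and the use of predicted probabilities as the ranking are assumptions for illustration; the thesis's actual optimization procedure may differ.

```python
# Sketch: pedagogical extraction of symbolic rules from a black-box classifier,
# checked by the similarity of the two models' classification rankings.
# Models, dataset and the probability-based ranking are illustrative assumptions.
from scipy.stats import spearmanr
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, _ = train_test_split(X, y, random_state=0)

black_box = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000, random_state=0),
)
black_box.fit(X_tr, y_tr)

# Pedagogical step: the surrogate learns the black box's outputs, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X_tr, black_box.predict(X_tr))

# Ranking similarity: compare how the two models order unseen cases by class score.
rank_bb = black_box.predict_proba(X_te)[:, 1]
rank_dt = surrogate.predict_proba(X_te)[:, 1]
rho, _ = spearmanr(rank_bb, rank_dt)

print(f"Spearman ranking similarity: {rho:.3f}")
print(export_text(surrogate))   # the extracted, human-readable rules
```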
43

Uma investigação sobre o processo migratório para a plataforma de computação em nuvem no Brasil / An investigation of the migration process to the cloud computing platform in Brazil

SILVA, Hilson Barbosa da 22 January 2016 (has links)
Context: Cloud computing introduces a new concept of outsourcing in the contracting of services. These advances have been seen as a new opportunity to reduce the volume of ICT investments, thanks to the greater flexibility of services offered on demand, with cost reduction as their strongest appeal. Even with the known benefits of investing in the cloud, some companies are presumed to be wary of contracting cloud-based ICT services and/or infrastructure. This is reflected in a survey by Tech Supply, a company specializing in technology intelligence for audit, corporate integrity, and IT, according to which 43% of Brazilian companies do not feel safe migrating their systems to the cloud. Objective: In this context, the study has two objectives: to investigate the indications of why some companies may or may not be inclined to contract cloud computing services in Brazil, and, additionally, to identify, for companies that already use the cloud, the reasons for their satisfaction or dissatisfaction with the cloud services contracted in Brazil. Method: The research was defined as exploratory, of a descriptive and explanatory nature, with an emphasis on a quantitative approach. As the technical procedure, a survey was applied using a questionnaire with 14 (fourteen) items, made available through an online web form. Finally, machine learning was used to analyze the results. This required some definitions regarding the learning methods applied: a classification task using a decision tree with the J48 classification algorithm, an induction-based learning method; non-incremental training; supervised learning in the learning hierarchy; and the symbolic learning paradigm. Classification variables were also defined for each line of investigation: "YES" (inclined to contract) or "NO" for companies that do not use the cloud, and "SATISFIED" or "DISSATISFIED" with the cloud for companies that already use it. Result: The study found that the characteristics of companies inclined to contract the cloud are delivery assurance and service quality. Conversely, companies not inclined to contract cloud services are characterized by low revenue and few employees, associated with concerns about reliability and information security. In the other line of investigation, regarding satisfaction, the reasons are the price of the cloud associated with the Infrastructure as a Service and Software as a Service models. On the other hand, for dissatisfied companies, the reasons are information security and service availability, associated with cost reduction.
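To make the classification setup concrete, the sketch below trains a decision tree on hypothetical survey-style answers and prints the induced rules. J48 is Weka's C4.5 implementation; scikit-learn's DecisionTreeClassifier (CART) is used here only as a stand-in, and the toy survey data, features, and labels are invented for illustration, not taken from the thesis's questionnaire.

```python
# Sketch: decision-tree classification of survey-style answers (illustrative only).
# J48 (C4.5) is approximated here by scikit-learn's CART implementation, and the
# tiny "survey" below is invented data, not the thesis's questionnaire.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

survey = pd.DataFrame({
    "low_revenue":         [1, 1, 0, 0, 1, 0, 0, 1],
    "few_employees":       [1, 0, 0, 0, 1, 0, 1, 1],
    "security_concerns":   [1, 1, 0, 0, 1, 0, 0, 1],
    "values_delivery_sla": [0, 0, 1, 1, 0, 1, 1, 0],
    "inclined_to_hire":    ["NO", "NO", "YES", "YES", "NO", "YES", "YES", "NO"],
})

X = survey.drop(columns="inclined_to_hire")
y = survey["inclined_to_hire"]

tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)

# The induced rules play the role of the symbolic, supervised model in the study.
print(export_text(tree, feature_names=list(X.columns)))
```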
44

Algorithms for knowledge discovery using relation identification methods

Tomczak, Jakub January 2009 (has links)
In this work, a coherent survey of problems connected with relational knowledge representation, and of methods for achieving it, is presented. The proposed approach is demonstrated on three applications: an economic case, a biomedical case, and a benchmark dataset. All crucial definitions are formulated and three main methods for the relation identification problem are described. Moreover, different identification methods are presented for specific relational models and observation types. / Double Diploma Programme, Polish supervisor: prof. Jerzy Świątek, Wrocław University of Technology
45

Agile Prototyping : A combination of different approaches into one main process

Abu Baker, Mohamed January 2009 (has links)
Software prototyping is considered one of the most important tools used by software engineers today to understand the customer's requirements and to develop software products that are efficient, reliable, and economically acceptable. Software engineers can choose any of the available prototyping approaches, based on the software they intend to develop and how fast they would like to go during development. Generally speaking, all prototyping approaches aim to help engineers understand the customer's true needs, examine different software solutions, quality aspects, verification activities, etc. that might affect the quality of the software under development, and avoid potential development risks. A combination of several prototyping approaches and brainstorming techniques, which fulfilled the aim of the knowledge extraction approach, resulted in a prototyping approach in which engineers develop one and only one throwaway prototype to extract more knowledge than expected, improving the quality of the software under development by spending more time studying it from different points of view. The knowledge extraction approach was then applied to the developed prototyping approach, treating the developed model itself as a software prototype in order to gain more knowledge from it. This activity resulted in several points of view and improvements that were implemented in the developed model, and as a result Agile Prototyping (AP) was developed. AP integrates additional development approaches into the first prototyping model, such as agile methods, documentation, software configuration management, and fractional factorial design. The main aim of developing one and only one prototype, helping engineers gain more knowledge while reducing development effort, time, and cost, was accomplished; however, developing software products of satisfying quality is still done by developing an evolutionary prototype and building throwaway prototypes on top of it.
46

Contribution de l'apprentissage par simulation à l'auto-adaptation des systèmes de production / Simulation-based machine learning for the self-adaptation of manufacturing systems

Belisário, Lorena Silva 12 November 2015 (has links)
Manufacturing systems must be able to continuously adapt their characteristics to cope with the changes that occur over their life, such as evolutions in customer demand, in order to remain efficient and competitive. It is essential for these systems to determine when and how to adapt (e.g., through changes in capacity). Unfortunately, such problems are known to be difficult. As manufacturing systems are complex, dynamic, and specific in nature, their managers do not always have the necessary expertise or sufficiently accurate forecasts of how their system will evolve. This thesis studies the possible contributions of machine learning to the self-adaptation of manufacturing systems. We first study how the literature addresses self-adaptation and propose a conceptual framework to facilitate the analysis and formalization of the associated problems. We then study a model-based learning strategy, which has the advantage of not requiring a training set. More precisely, we focus on a new approach based on linear genetic programming that iteratively extracts knowledge from a simulation model to determine when and what to adapt. Our approach is implemented using Arena and μGP. We show its benefits by applying it to adding/removing cards in a pull control system, to relocating machines, and to changing the inventory replenishment policy. The extracted knowledge proves relevant for continuously determining how each system can adapt to evolutions, and can therefore help endow a system with a form of intelligence. Moreover, this knowledge is expressed in the simple and understandable form of a decision tree, so it can easily be communicated to production managers for everyday use. Our results thus show the interest of the approach while opening many research directions.
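The core loop of such a simulation-based learning strategy can be sketched as follows: candidate decision rules are generated, each one is scored by running it through a simulation model, and the best-scoring rule is kept. The sketch below replaces Arena and μGP with a deliberately simple stand-in (a toy pull-system simulation and a random search over card-count thresholds); it only illustrates the evaluation loop, not the thesis's actual linear genetic programming setup.

```python
# Sketch: simulation-in-the-loop evaluation of candidate adaptation rules.
# The "simulation" is a toy stand-in for an Arena model, and random search
# stands in for the linear genetic programming used with muGP in the thesis.
import random

random.seed(0)

def simulate(rule, horizon=200):
    """Toy pull system: the rule decides how many kanban cards to release
    depending on observed demand; returns total cost (holding + backlog)."""
    stock, cost = 5, 0.0
    for _ in range(horizon):
        demand = random.randint(0, 4)
        cards = rule["high_cards"] if demand >= rule["demand_threshold"] else rule["low_cards"]
        stock = min(stock + cards, 20) - demand              # produce up to capacity, then serve demand
        cost += max(stock, 0) * 1.0 + max(-stock, 0) * 5.0   # holding cost vs. backlog penalty
        stock = max(stock, 0)
    return cost

def random_rule():
    return {"demand_threshold": random.randint(1, 4),
            "low_cards": random.randint(0, 4),
            "high_cards": random.randint(1, 6)}

# Evaluate many candidate rules against the simulation and keep the best one.
candidates = [random_rule() for _ in range(500)]
scored = [(simulate(r), r) for r in candidates]
best_cost, best_rule = min(scored, key=lambda t: t[0])
print("best rule found:", best_rule, "cost:", round(best_cost, 1))
```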
47

Information diffusion, information and knowledge extraction from social networks / Diffusion d'information, extraction d'information et de connaissance sans les réseaux sociaux

Hoang, Thi Bich Ngoc 28 September 2018 (has links)
The popularity of online social networks has rapidly increased over the last decade. According to Statista, approximately 2 billion users used social networks in January 2018, and this number is still expected to grow in the coming years. While serving their primary purpose of connecting people, social networks also play a major role in connecting marketers with customers, celebrities with their fans, and people who need help with people willing to help. The success of online social networks relies mainly on the information the messages carry and on how quickly they spread through the network. Our research aims at modeling message diffusion and at extracting and representing information and knowledge from messages on social networks. Our first contribution is a model to predict the diffusion of information on social networks; more precisely, we predict whether a tweet is going to be retweeted or not, and the level of its diffusion. Our model is based on three types of features: user-based, time-based, and content-based. Evaluated on various collections corresponding to over a dozen million tweets, our model significantly improves effectiveness (F-measure) compared to the state of the art, both when predicting whether a tweet will be retweeted and when predicting the level of retweet. The second contribution of this thesis is an approach to extract information from microblogs. While several pieces of important information are included in a message about an event, such as location, time, and related entities, we focus on location, which is vital for several applications, especially geospatial applications and applications linked to events. We propose different combinations of existing methods to extract locations in tweets, targeting either recall-oriented or precision-oriented applications. We also define a model to predict whether a tweet contains a location or not, and show that the precision of location-extraction tools is significantly improved when they focus on the tweets we predict to contain a location, compared to extracting from all tweets. Our last contribution is a knowledge base that better represents information from a set of tweets about events. We combine a tweet collection with other Internet resources to build a domain ontology. The knowledge base aims at giving users a complete picture of the events referenced in the tweet collection (we considered the CLEF 2016 festival tweet collection).
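A minimal version of the first contribution's setup, retweet prediction from user-, time-, and content-based features, might look like the sketch below. The feature names, the synthetic data, and the choice of a logistic-regression classifier are illustrative assumptions; the thesis's actual feature set and model are richer.

```python
# Sketch: predicting whether a tweet will be retweeted from user-, time- and
# content-based features. Feature names, synthetic data and the classifier
# choice are illustrative assumptions, not the thesis's actual configuration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000

# user-based, time-based and content-based features (synthetic)
followers   = rng.lognormal(mean=6, sigma=2, size=n)    # user-based
hour_posted = rng.integers(0, 24, size=n)               # time-based
has_hashtag = rng.integers(0, 2, size=n)                # content-based
has_url     = rng.integers(0, 2, size=n)                # content-based

X = np.column_stack([np.log1p(followers), hour_posted, has_hashtag, has_url])

# Synthetic ground truth: retweets are more likely for well-followed, hashtagged tweets.
logit = 0.8 * (np.log1p(followers) - 6) + 0.7 * has_hashtag + 0.3 * has_url - 0.5
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

clf = LogisticRegression(max_iter=1000)
print("CV F1 score:", cross_val_score(clf, X, y, cv=5, scoring="f1").mean().round(3))
```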
48

Production optimization using Discrete Event Simulation : Case study of Volvo Penta engine production line

Ramos Calderón, Antonio José January 2022 (has links)
Simulation, in combination with optimization and Lean, allows decision-makers in modern industries to analyse the outputs of specific systems and to predict future ones based on certain decision variables. This project studies the boat-engine production line at Volvo Penta in Vara (Sweden), carrying out different experiments based on a verified and validated model of the real production system built in FACTS Analyzer, in order to improve the key performance indicators while comparing the optimization performance of four algorithms (NSGA-II, NSGA-III, PSO and DE) and extracting knowledge using data mining. Following the experimental design methodology, four main conclusions are obtained regarding the production line. First, simulation can be a successful tool for detecting bottlenecks. Second, optimization over resources such as the number of operators, forklifts and buffers can improve the outputs of the current system by 22% just by rearranging the assets around the bottleneck, and also helps predict the behaviour of the system when these available resources are increased or decreased. Third, since the algorithms do not all have the same performance, it is important to cross-check the results of different algorithms to ensure their validity. Lastly, knowledge extraction can help decision-makers by providing sets of rules that the selected solution areas of the optimization follow.
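As a rough companion to the optimization part of this study, the sketch below runs NSGA-II (via the pymoo library) over a toy analytic stand-in for the simulation model, trading throughput against resource cost. The decision variables, the toy throughput formula, and the cost coefficients are assumptions for illustration; the project itself optimized a FACTS Analyzer model of the real line.

```python
# Sketch: multi-objective optimization of a production line's resource levels
# (operators, forklifts, buffer slots) with NSGA-II via pymoo. The analytic
# "simulation" below is a toy stand-in for the FACTS Analyzer model.
import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize

class LineResourceProblem(ElementwiseProblem):
    def __init__(self):
        # decision variables: operators, forklifts, buffer slots
        super().__init__(n_var=3, n_obj=2,
                         xl=np.array([1, 1, 1]), xu=np.array([10, 5, 30]))

    def _evaluate(self, x, out, *args, **kwargs):
        ops, forks, buf = np.round(x)
        # toy throughput model with diminishing returns (stand-in for DES output)
        throughput = 100 * (1 - np.exp(-0.3 * ops)) * (1 - np.exp(-0.5 * forks)) \
                         * (1 - np.exp(-0.1 * buf))
        cost = 5 * ops + 8 * forks + 0.5 * buf
        out["F"] = [-throughput, cost]   # maximize throughput, minimize cost

res = minimize(LineResourceProblem(), NSGA2(pop_size=40), ("n_gen", 60),
               seed=1, verbose=False)
# Pareto front: each row shows a trade-off between (negated) throughput and cost.
print(np.round(np.column_stack([res.X, res.F]), 2)[:5])
```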
49

Data-Driven Simulation Modeling of Construction and Infrastructure Operations Using Process Knowledge Discovery

Akhavian, Reza 01 January 2015 (has links)
Within the architecture, engineering, and construction (AEC) domain, simulation modeling is mainly used to facilitate decision-making by enabling the assessment of different operational plans and resource arrangements that are otherwise difficult (if not impossible), expensive, or time-consuming to evaluate in real-world settings. The accuracy of such models directly affects their reliability as a basis for important decisions such as project completion time estimation and resource allocation. Compared to other industries, this is particularly important in construction and infrastructure projects due to the high resource costs and the societal impacts of these projects. Discrete event simulation (DES) is a decision-making tool that can benefit the design, control, and management of construction operations. Despite recent advancements, most DES models used in construction are created during the early planning and design stage, when the lack of factual information from the project prohibits the use of realistic data in simulation modeling. The resulting models, therefore, are often built using rigid (subjective) assumptions and design parameters (e.g. precedence logic, activity durations). In all such cases, and in the absence of an inclusive methodology to incorporate real field data as the project evolves, modelers rely on information from previous projects (a.k.a. secondary data), expert judgments, and subjective assumptions to generate simulations to predict future performance. These and similar shortcomings have to a large extent limited the use of traditional DES tools to preliminary studies and long-term planning of construction projects. In the realm of business process management, process mining, as a relatively new research domain, seeks to automatically discover a process model by observing activity records and extracting information about processes. The research presented in this Ph.D. dissertation was in part inspired by the prospect of construction process mining using sensory data collected from field agents, which enables the extraction of the operational knowledge necessary to generate and maintain the fidelity of simulation models. A preliminary study was conducted to demonstrate the feasibility and applicability of data-driven knowledge-based simulation modeling, with a focus on data collection using a wireless sensor network (WSN) and a rule-based taxonomy of activities. The resulting knowledge-based simulation models performed very well in predicting key performance measures of real construction systems. Next, a pervasive mobile data collection and mining technique was adopted and an activity recognition framework for construction equipment and worker tasks was developed. Data was collected using smartphone accelerometers and gyroscopes from construction entities to generate significant statistical time- and frequency-domain features. The extracted features served as the input to different types of machine learning algorithms that were applied to various construction activities. The trained predictive algorithms were then used to extract activity durations and calculate probability distributions to be fused into the corresponding DES models. Results indicated that the generated data-driven knowledge-based simulation models outperform static models created based upon engineering assumptions and estimations with regard to the compatibility of performance-measure outputs with reality.
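The sensing-to-simulation pipeline described above can be sketched roughly as follows: segment the accelerometer signal into windows, extract time- and frequency-domain features, train a classifier to label each window with an activity, and derive activity durations from runs of consecutive same-activity windows. The synthetic signal, window length, and feature set below are illustrative assumptions, not the dissertation's actual data or configuration.

```python
# Sketch: accelerometer windows -> time/frequency features -> activity classifier.
# The synthetic signal, window size and feature set are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
fs, win = 50, 100          # 50 Hz sampling, 2-second windows

def make_windows(freq, noise, n_windows, label):
    """Synthetic accelerometer magnitude for one equipment activity."""
    t = np.arange(win) / fs
    X = [np.sin(2 * np.pi * freq * t) + noise * rng.standard_normal(win)
         for _ in range(n_windows)]
    return np.array(X), np.full(n_windows, label)

X_idle, y_idle = make_windows(freq=0.5, noise=0.1, n_windows=200, label=0)
X_work, y_work = make_windows(freq=4.0, noise=0.5, n_windows=200, label=1)
signal_windows = np.vstack([X_idle, X_work])
y = np.concatenate([y_idle, y_work])

def features(w):
    """Simple time- and frequency-domain features for one window."""
    spectrum = np.abs(np.fft.rfft(w))
    return [w.mean(), w.std(), np.abs(np.diff(w)).mean(),
            spectrum.argmax(), spectrum.max()]

X = np.array([features(w) for w in signal_windows])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(3))
# Predicted labels over consecutive windows would then yield activity durations,
# whose fitted distributions feed the data-driven DES model.
```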
50

財務報表舞弊之探索研究 / Exploring financial reporting fraud

徐國英 Unknown Date (has links)
Financial reporting fraud leads not only to significant investment risks for external stockholders but also to financial crises for the capital market. Although fraudulent financial reporting has drawn much attention, relevant research is much scarcer than that on predicting financial distress or bankruptcy. Furthermore, one purpose of exploring the various forms of financial reporting fraud is to obtain a better understanding of a corporation by investigating its financial and corporate-governance indicators. This study addresses this challenge by proposing an approach with four phases: (1) identify a set of financial and corporate-governance indicators that are significantly correlated with financial reporting fraud; (2) use the Growing Hierarchical Self-Organizing Map (GHSOM) to cluster data from normal and fraudulent listed companies; (3) extract knowledge about financial reporting fraud by observing the hierarchical relationships displayed in the trained GHSOM; and (4) justify the extracted knowledge. The proposed approach is feasible because researchers claim that the GHSOM can discover hidden hierarchical relationships in high-dimensional data.
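The clustering phase could be prototyped along the lines below. A full GHSOM implementation is not available in common Python libraries, so this sketch uses a flat self-organizing map (the minisom package) as a simplified stand-in, with synthetic financial-indicator data; both the library choice and the data are assumptions for illustration only.

```python
# Sketch: clustering normal vs. fraudulent firms' indicators on a self-organizing map.
# A flat SOM (minisom) stands in for the thesis's Growing Hierarchical SOM, and the
# indicator data below is synthetic.
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(0)

# synthetic financial / governance indicators; fraud firms drawn from a shifted distribution
normal = rng.normal(loc=0.0, scale=1.0, size=(300, 6))
fraud  = rng.normal(loc=1.5, scale=1.2, size=(30, 6))
X = np.vstack([normal, fraud])
labels = np.array([0] * 300 + [1] * 30)   # 1 = known fraud case

som = MiniSom(6, 6, input_len=6, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(X, 5000)

# Map each firm to its best-matching unit; cells dominated by fraud cases are
# the regions a hierarchical SOM would expand and inspect further.
cells = [som.winner(x) for x in X]
fraud_rate = {c: labels[[i for i, w in enumerate(cells) if w == c]].mean()
              for c in set(cells)}
suspicious = sorted(fraud_rate.items(), key=lambda kv: -kv[1])[:5]
print("cells with highest share of fraud cases:", suspicious)
```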
