1241

Porovnanie metód machine learningu pre analýzu kreditného rizika / Comparison of machine learning methods for credit risk analysis

Bušo, Bohumír January 2015
Recently, machine learning has increasingly been connected with the field known as "Big Data". In this field, large amounts of data are typically available, and the task is to extract useful information from them. Nowadays, as ever more data is generated through the use of mobile phones, credit cards, and similar sources, the need for high-performance methods is pressing. In this work, we describe six methods that serve this purpose: logistic regression, neural networks and deep neural networks, and bagging, boosting, and stacking. The last three methods form a group known as ensemble learning. We apply all six methods to real data generously provided by a loan provider. These methods can help the provider distinguish between good and bad prospective borrowers when a loan decision is being made. Finally, we compare the results of the individual methods and briefly outline possible ways of interpreting them.
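As a rough illustration of the comparison described above (not the thesis's own code: the loan data is confidential, so a synthetic stand-in and default hyperparameters are used), the six model families can be benchmarked side by side with scikit-learn; a deep network would simply use more hidden layers in the MLP:

```python
# Illustrative sketch only -- synthetic data standing in for the provider's records.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Imbalanced "good vs. bad borrower" data, as is typical for credit default.
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8, 0.2],
                           random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
    "bagging": BaggingClassifier(n_estimators=100),
    "boosting": GradientBoostingClassifier(),
    "stacking": StackingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("gb", GradientBoostingClassifier())],
        final_estimator=LogisticRegression(max_iter=1000)),
}

for name, model in models.items():
    # AUC is a common metric when default cases are rare.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC = {scores.mean():.3f}")
```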
1242

A probabilistic perspective on ensemble diversity

Zanda, Manuela January 2010
We study diversity in classifier ensembles from a broader perspective than the 0/1 loss function, mainly because the bias-variance decomposition of the 0/1 loss function is not unique, and therefore the relationship between ensemble accuracy and diversity is still unclear. In the parallel field of regression ensembles, where the loss function of interest is the mean squared error, this decomposition not only exists, but it has been shown that diversity can be managed via the Negative Correlation (NC) framework. In the field of probabilistic modelling the expected value of the negative log-likelihood loss function is given by its conditional entropy; this result suggests that interaction information might provide some insight into the trade-off between accuracy and diversity. Our objective is to improve our understanding of classifier diversity by focusing on two loss functions: the mean squared error and the negative log-likelihood. In a study of mean squared error functions, we reformulate the Tumer & Ghosh model for the classification error as a regression problem, and we show how the NC learning framework can be deployed to manage diversity in classification problems. In an empirical study of classifiers that minimise the negative log-likelihood loss function, we discuss model diversity, as opposed to error diversity, in ensembles of Naive Bayes classifiers. We observe that diversity in low-variance classifiers has to be structurally inferred. We apply interaction information to the problem of monitoring diversity in classifier ensembles. We present empirical evidence that interaction information can capture the trade-off between accuracy and diversity, and that diversity occurs at different levels of interaction between base classifiers. We use properties of interaction information to build ensembles of structurally diverse averaged Augmented Naive Bayes classifiers. Our empirical study shows that this novel ensemble approach is computationally more efficient than an accuracy-based approach, while not negatively affecting ensemble classification performance.
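For reference, the mean-squared-error decomposition the abstract alludes to is usually stated as the ambiguity decomposition of Krogh & Vedelsby (the notation here is ours, not the thesis's): for ensemble members f_i with mean prediction \bar{f} and target y,

\[
(\bar{f} - y)^2 \;=\; \frac{1}{M}\sum_{i=1}^{M}(f_i - y)^2 \;-\; \frac{1}{M}\sum_{i=1}^{M}(f_i - \bar{f})^2,
\qquad \bar{f} = \frac{1}{M}\sum_{i=1}^{M} f_i,
\]

so the ensemble error is the average member error minus a diversity term. NC learning manages that second term explicitly by training each member on a penalised loss of the form

\[
e_i \;=\; \tfrac{1}{2}(f_i - y)^2 \;+\; \lambda\,(f_i - \bar{f})\sum_{j \neq i}(f_j - \bar{f}),
\]

where the parameter \(\lambda\) trades individual accuracy against correlation with the rest of the ensemble. No analogous unique decomposition exists for the 0/1 loss, which is the gap the thesis addresses.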
1243

Quantitative planetary image analysis via machine learning

Tar, Paul David January 2014
Over recent decades enormous quantities of image data have been acquired from planetary missions. High-resolution imagery is available for many of the inner planets, gas giant systems, and some asteroids and comets. Yet the scientific value of these images will only be fully realised if sufficient analytic power can be applied to their large-scale and detailed interpretation. Unfortunately, the quantity of data has now surpassed researchers' ability to manually analyse each image, whilst available automated approaches are limited in their scope and reliability. To mitigate against this, citizen science projects are becoming increasingly common, allowing large numbers of volunteers to assist in image interpretation using web-based resources. Yet human involvement, expert or otherwise, introduces additional problems of subjectivity and consistency. This thesis argues that what is required is an objective, quantitative, automated alternative. It advocates a quantitative approach to making automated measurements of a range of surface features, including varied terrains and counts of impact craters. Existing pattern recognition systems, and established practices within the imaging science and machine learning communities, are critically assessed against strict quantitative criteria. These criteria are designed to accommodate the needs of scientists undertaking quantitative research into the evolution of planetary surfaces, permitting measurements to be used with confidence. A new and unique method of pattern recognition, facilitating the meaningful interpretation of extracted information, is presented. What makes the new system unique is the inclusion of a comprehensive predictive theory of measurement errors and additional safeguards to ensure the trustworthiness and integrity of results. The resulting supervised machine learning/pattern recognition system is applied to Monte Carlo distributions, Martian image data, and citizen science lunar crater data. The conclusion drawn is that applying such quantitative techniques in practice is difficult, but possible, given appropriately encoded data and application-specific extensions to theories and methods. It is also concluded that existing imaging science practices and methods would benefit from a change in ethos towards a quantitative agenda, and that planetary scientists wishing to use such methods will need to develop an understanding of their properties and limitations.
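One minimal sketch of what a "quantitative measurement with predicted errors" can look like (an illustration of the general idea only; the thesis's estimator and error theory are its own): correct raw classifier label counts using a known confusion matrix, and propagate Poisson counting errors through the inversion so the measurement carries an error bar rather than a raw count:

```python
# Hedged illustration: recover true class fractions from classifier output
# counts, with a predicted uncertainty. The confusion matrix values are made up.
import numpy as np

# C[i, j] = P(classifier outputs class i | true class j), from validation data.
C = np.array([[0.9, 0.2],
              [0.1, 0.8]])

observed = np.array([640.0, 360.0])          # predicted-label counts in an image
true_counts = np.linalg.solve(C, observed)   # unbias the raw counts

# Approximate predicted errors: propagate Poisson variance of the observed
# counts through the linear inversion, Cov = C^-1 diag(observed) C^-T.
Cinv = np.linalg.inv(C)
cov = Cinv @ np.diag(observed) @ Cinv.T
sigma = np.sqrt(np.diag(cov))

for k, (n, s) in enumerate(zip(true_counts, sigma)):
    print(f"class {k}: {n:.0f} +/- {s:.0f}")
```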
1244

Systematic approaches for modelling and visualising responses to perturbation of transcriptional regulatory networks

Han, Nam Shik January 2013
One of the greatest challenges in modern biology is to understand quantitatively the mechanisms underlying messenger ribonucleic acid (mRNA) transcription within the cell. To this end, integrated functional genomics attempts to use the vast wealth of data produced by modern large-scale genomic projects to understand how the genome is deployed to create a diversity of tissues and species. In these projects, the expression levels of tens or hundreds of thousands of genes are profiled at multiple time points or under different experimental conditions. The profiling results are deposited in large-scale quantitative data files that cannot be analysed without systematic computational methods. In particular, it is much more difficult to experimentally measure the concentration of transcription factor proteins and their affinity for the promoter regions of genes than it is to measure the result of transcription using experimental techniques such as microarrays. In the absence of such biological experiments, it becomes necessary to use in silico techniques to determine transcription factor regulatory activities from existing gene expression profile data. This presents significant challenges and opportunities to the computer science community. This PhD project made use of one such in silico technique to determine the differences (if any) in transcription factor regulatory activities across different experimental conditions and time points. The research aim of the project was to understand the transcriptional regulatory mechanism that controls the sophisticated process of gene expression in cells, and in particular the differences in downstream signalling through which transcription factors can play a role in predisposition to diseases such as parasitic disease, cancer, and neuroendocrine disease. To address this question I had access to large integrated genomics datasets generated in studies on parasitic disease, lung cancer, and endocrine (hormone) disease. The current state of the art takes existing knowledge and asks, "How do these data relate to what we already know?" By applying machine learning approaches, the project explored the role that such data can play in uncovering new biological knowledge.
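A minimal sketch of the in silico idea (under assumed conditions: a linear model of expression, an invented connectivity matrix and noise level; the thesis's actual technique may differ): given expression profiles and a TF-gene connectivity matrix, hidden transcription factor activities per condition can be estimated by least squares:

```python
# Sketch: estimate hidden TF activities A from expression E, assuming E ~ C @ A.
# Real methods (e.g. network component analysis) add identifiability constraints;
# this is plain least squares on synthetic data for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_tfs, n_conditions = 200, 5, 8

C = (rng.random((n_genes, n_tfs)) < 0.2).astype(float)   # who regulates whom
A_true = rng.normal(size=(n_tfs, n_conditions))           # hidden TF activities
E = C @ A_true + 0.1 * rng.normal(size=(n_genes, n_conditions))

# Recover activities for each condition/time point from expression alone.
A_hat, *_ = np.linalg.lstsq(C, E, rcond=None)

# Differences in inferred activity between two conditions, per TF.
print(A_hat[:, 1] - A_hat[:, 0])
```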
1245

Polarizable multipolar electrostatics driven by kriging machine learning for a peptide force field : assessment, improvement and up-scaling

Fletcher, Timothy January 2014
Typical potential-driven force fields have been usefully applied to small molecules for decades. However, complex effects such as polarisation, π systems, and hydrogen bonding remain difficult to model, even as they become increasingly relevant; indeed, these electronic effects become crucial when considering larger biological molecules in solution. Machine learning can instead be used to recognise patterns in chemical behaviour and predict them, sacrificing computational efficiency for accuracy and completeness of the force field. The kriging machine learning method can take the geometric features of a molecule and predict its electrostatic properties after being trained on ab initio data for the same system. We present significant improvements in the functionality, application, and understanding of kriging machine learning as part of an electrostatic force field, alongside an up-scaling of the problems the force field is applied to. The force field predicts electrostatic energies for all common amino acids with a mean error of 4.2 kJ mol-1 (1 kcal mol-1), for cholesterol with a mean error of 3.9 kJ mol-1, and for a 10-alanine helix with a mean error of 6.4 kJ mol-1. Kriging has been shown to work equally well with charged systems, π systems, and hydrogen-bonded systems. This work details how different chemical environments and parameters affect kriging model quality and assesses optimal methods for computationally efficient kriging of multipole moments. In addition, the kriging models have been used to predict moments for atoms for which they had no training data, with little loss in accuracy; kriging has thus been shown to produce transferable models.
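Kriging is Gaussian-process regression, so the core predictive step can be sketched as follows (feature definitions, kernel choice, and data are placeholders, not the thesis's actual training setup):

```python
# Sketch of the kriging step under stated assumptions: geometric features of an
# atom's environment in, one electrostatic moment out, with a predictive variance.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(1)
X_train = rng.random((300, 6))           # e.g. internal coordinates of a fragment
y_train = np.sin(X_train).sum(axis=1)    # stand-in for an ab initio moment

# Kriging == Gaussian-process regression with a fitted covariance kernel.
gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(X_train, y_train)

X_new = rng.random((5, 6))
moment, std = gp.predict(X_new, return_std=True)  # prediction + its uncertainty
print(moment, std)
```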
1246

Knowledge-Driven Board-Level Functional Fault Diagnosis

Ye, Fangming January 2014
The semiconductor industry continues to relentlessly advance silicon technology scaling into the deep-submicron (DSM) era. High integration levels and structured design methods enable complex systems that can be manufactured in high volume. However, due to increasing integration densities and high operating speeds, subtle manifestations of defects lead to functional failures at the board level. Functional fault diagnosis is therefore necessary for board-level product qualification. However, ambiguous diagnosis results can lead to long debug times and wrong repair actions, which significantly increase repair cost and adversely impact yield.

A state-of-the-art diagnosis system involves several key components: (1) design of functional test programs, (2) collection of functional-failure syndromes, (3) building of the diagnosis engine, (4) isolation of root causes, and (5) evaluation of the diagnosis engine. Advances in each of these components can pave the way for a more effective diagnosis system, thus improving diagnosis accuracy and reducing diagnosis time. Machine-learning techniques offer an unprecedented opportunity to develop an automated and adaptive diagnosis system that increases diagnosis accuracy and speed. This dissertation targets all of the above components of an advanced diagnosis system by leveraging various machine-learning techniques.

The thesis first describes a diagnosis system based on support-vector machines (SVMs), multi-kernel SVMs (MK-SVMs), and incremental learning. The MK-SVM method leverages a linear combination of single kernels to achieve accurate root-cause isolation, and the MK-SVMs thus generated can also be updated through incremental learning. Furthermore, a data-fusion technique, namely majority-weighted voting, is used to combine multiple learning techniques for diagnosis.

The diagnosis time is considerable for complex boards because of the large number of syndromes that must be used to ensure diagnostic accuracy; syndrome collection and analysis are major bottlenecks in state-of-the-art diagnosis procedures. This thesis therefore describes an adaptive diagnosis method based on decision trees (DTs), in which the number of syndromes required for diagnosis is significantly reduced compared to the number used for system training. Furthermore, an incremental version of DTs is used to facilitate online learning, so as to bridge the knowledge obtained at the test-design stage with the knowledge gained during volume production.

This dissertation also includes an evaluation and enhancement framework, based on information theory, for guiding diagnosis systems using syndrome and root-cause analysis. Syndrome analysis based on subset selection provides a representative set of syndromes. Root-cause analysis measures the discriminative ability to differentiate a given root cause from others. The metrics obtained from the proposed framework can provide guidelines for test redesign to enhance diagnosis. In addition, traditional diagnosis systems fail to provide appropriate repair suggestions when the diagnostic logs are fragmented and some syndromes are not available; the ability to handle missing syndromes using imputation methods has therefore been added to the diagnosis system.

Finally, to tackle the bottleneck of data acquisition during the initial product ramp-up phase, a knowledge-discovery method and a knowledge-transfer method are proposed for enriching the training data set, thus facilitating board-level functional fault diagnosis. In summary, this dissertation targets the realization of an automated diagnosis system featuring high accuracy, low diagnosis time, self-evaluation, self-learning, and the ability to learn selectively from other diagnosis systems. Machine learning and information-theoretic techniques have been adopted to enable these features. The proposed diagnosis system is expected to contribute to quality assurance, accelerated product release, and manufacturing-cost reduction in the semiconductor industry. / Dissertation
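As a sketch of the majority-weighted-voting fusion mentioned above (the learners, weights, and synthetic "syndrome" data are illustrative assumptions, not the dissertation's setup), each learner's vote can be weighted by its held-out accuracy:

```python
# Rough sketch: several learners vote on the root cause, each vote weighted by
# that learner's validation accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic "syndrome -> root cause" data: 30 features, 4 candidate root causes.
X, y = make_classification(n_samples=1500, n_features=30, n_classes=4,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

learners = [("svm", SVC(probability=True)),
            ("dt", DecisionTreeClassifier(max_depth=8)),
            ("nb", GaussianNB())]

# Weight each learner by its own held-out accuracy, then fuse with soft voting.
weights = [clf.fit(X_tr, y_tr).score(X_te, y_te) for _, clf in learners]
fused = VotingClassifier(estimators=learners, voting="soft", weights=weights)
fused.fit(X_tr, y_tr)
print("fused accuracy:", fused.score(X_te, y_te))
```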
1247

Adaptive Planning and Prediction in Agent-Supported Distributed Collaboration.

Hartness, Ken T. N. 12 1900
Agents that act as user assistants will become invaluable as the number of information sources continues to proliferate. Such agents can support the work of users by learning to automate time-consuming tasks and by filtering information to manageable levels. Although considerable advances have been made in this area, it remains a fertile area for further development. One application of agents under careful scrutiny is the automated negotiation of conflicts between different users' needs and desires. Many techniques require explicit user models in order to function. This dissertation explores a technique for dynamically constructing user models and the impact of using them to anticipate the need for negotiation. Negotiation is reduced by adding an advising aspect to the agent, which can use this anticipation of conflict to adjust user behavior.
1248

End-to-end single-rate multicast congestion detection using support vector machines

Liu, Xiaoming January 2008
Magister Scientiae - MSc / IP multicast is an efficient mechanism for simultaneously transmitting bulk data to multiple receivers. Many applications can benefit from multicast, such as audio and video conferencing, multi-player games, multimedia broadcasting, distance education, and data replication. For either technical or policy reasons, IP multicast has still not been deployed in today's Internet. Congestion is one of the most important issues impeding the development and deployment of IP multicast and multicast applications.
1249

Computação inteligente no estudo de variantes de hemoglobina / Intelligent computation applied to the study of hemoglobin variants

Thaís Helena Samed e Sousa 29 October 2004
In vitro evolution is a laboratory method developed for the evolution of molecules, mainly proteins. By producing mutations, the method searches for new molecular properties, aiming to create new proteins and thereby advance the study and treatment of diseases through the development of new drugs. The great challenge of in vitro evolution is to generate the largest possible number of protein molecules that attain the desired properties, since only an infinitesimal fraction of the diversity generated from DNA sequences yields molecules with the desired function. In addition, the technique demands considerable time and financial support. With the goal of computationally evaluating the functionality of protein variants from their amino acid sequences, and thereby reducing the cost and time spent in the laboratory, this work proposes the use of intelligent computation techniques (in silico evolution) based on machine learning and evolutionary computation. Machine learning techniques generally require large databases. To meet this requirement, hemoglobin variants were chosen as the object of study, since the amount of information available about them in the literature is very extensive. The results obtained show that it is possible to develop efficient algorithms to determine the functionality of hemoglobin variants. With these results, we seek to contribute to the development of computationally supported directed-evolution techniques.
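A hedged sketch of the general approach (the features, labels, and model below are invented placeholders; the thesis's actual feature set and algorithms are not reproduced here): represent each hemoglobin variant by simple physicochemical features of its substitution and learn a functional/non-functional classifier:

```python
# Sketch: classify variants from substitution features. All data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Per-variant features, e.g. hydrophobicity change, charge change, volume
# change at the mutated residue, and position in the chain (all invented).
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```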
1250

Autogenerative Networks

Chang, Oscar January 2021
Artificial intelligence powered by deep neural networks has seen tremendous improvements in the last decade, achieving superhuman performance on a diverse range of tasks. Many worry that it can one day develop the ability to recursively self-improve, leading to an intelligence explosion known as the Singularity. Autogenerative networks, or neural networks that generate neural networks, are one major plausible pathway towards realizing this possibility. The object of this thesis is to study various challenges and applications of small-scale autogenerative networks in domains such as artificial life, reinforcement learning, neural network initialization and optimization, gradient-based meta-learning, and logical networks. Chapters 2 and 3 describe novel mechanisms for generating neural network weights and embeddings. Chapters 4 and 5 identify problems and propose solutions to fix optimization difficulties in differentiable mechanisms of neural network generation known as Hypernetworks. Chapters 6 and 7 study implicit models of network generation, such as backpropagating through gradient descent itself and integrating discrete solvers into continuous functions. Together, the chapters in this thesis contribute novel proposals for non-differentiable neural network generation mechanisms, significant improvements to existing differentiable network generation mechanisms, and an assimilation of different learning paradigms in autogenerative networks.
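A minimal numpy sketch of the hypernetwork idea at the heart of autogenerative networks (all sizes and the task are arbitrary illustrations): one network emits the weights of another:

```python
# Sketch: a hypernetwork (here a single linear map) generates the weight
# matrix of a target layer from a learned embedding.
import numpy as np

rng = np.random.default_rng(0)
emb_dim, in_dim, out_dim = 8, 4, 3

# Hypernetwork parameters: map a layer embedding to a flattened weight matrix.
H = rng.normal(scale=0.1, size=(emb_dim, in_dim * out_dim))
z = rng.normal(size=emb_dim)            # learned embedding for this layer

W = (z @ H).reshape(in_dim, out_dim)    # generated target-layer weights

x = rng.normal(size=in_dim)
y = np.tanh(x @ W)                      # target network forward pass
print(y)

# Training would backpropagate the task loss through W into H and z, so the
# generator, not the target layer, is what gets optimized.
```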
