About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
481

Adaptive game AI

Spronck, Pieter Hubert Marie. January 2005 (has links)
Doctoral dissertation, Universiteit Maastricht. / With index and bibliography. With a summary in Dutch.
482

Machine Learning Approaches to Modeling the Physiochemical Properties of Small Peptides

Jensen, Kyle, Styczynski, Mark, Stephanopoulos, Gregory 01 1900 (has links)
Peptide and protein sequences are most commonly represented as strings: a series of letters selected from the twenty-character alphabet of abbreviations for the naturally occurring amino acids. Here, we experiment with representations of small peptide sequences that incorporate more physiochemical information. Specifically, we develop three different physiochemical representations for a set of roughly 700 HIV-1 protease substrates. These different representations are used as input to an array of six different machine learning models which are used to predict whether or not a given peptide is likely to be an acceptable substrate for the protease. Our results show that, in general, higher-dimensional physiochemical representations tend to have better performance than representations incorporating fewer dimensions selected on the basis of high information content. We contend that such representations are more biologically relevant than simple string-based representations and are likely to more accurately capture peptide characteristics that are functionally important. / Singapore-MIT Alliance (SMA)
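As a flavour of the physiochemical encoding described above, the sketch below maps each residue of a peptide to numeric property values (hydropathy and residue mass here) and trains a classifier on the resulting vectors. The property table, toy peptides, labels, and the choice of a random forest are all illustrative assumptions; the paper evaluates six model types on roughly 700 real HIV-1 protease substrates.

```python
# Sketch: encode short peptides as physicochemical vectors and classify them.
# The two property scales and the toy data are illustrative assumptions,
# not the descriptors or substrate set used in the paper.
from sklearn.ensemble import RandomForestClassifier

# Kyte-Doolittle hydropathy and approximate residue mass (Da).
PROPS = {
    "A": (1.8,  89.1), "R": (-4.5, 174.2), "N": (-3.5, 132.1),
    "D": (-3.5, 133.1), "C": (2.5, 121.2), "E": (-3.5, 147.1),
    "G": (-0.4,  75.1), "I": (4.5, 131.2), "L": (3.8, 131.2),
    "K": (-3.9, 146.2), "F": (2.8, 165.2), "V": (4.2, 117.1),
}

def encode(peptide):
    """Concatenate per-residue property values into one flat feature vector."""
    return [x for aa in peptide for x in PROPS[aa]]

# Toy 4-mer substrates with binary cleavage labels (illustrative only).
peptides = ["ALVF", "GKDE", "IVLF", "NDRK", "CLIA", "EKGN"]
labels   = [1, 0, 1, 0, 1, 0]

X = [encode(p) for p in peptides]
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.predict([encode("ALIF")]))  # predict substrate acceptability
```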
483

Delving deep into fetal neurosonography : an image analysis approach

Huang, Ruobing January 2017 (has links)
Ultrasound screening has been used for decades as the main modality to examine fetal brain development and to diagnose possible anomalies. However, basic clinical ultrasound examination of the fetal head is limited to axial planes of the brain and linear measurements, which may have restrained its potential and efficacy. The recent introduction of three-dimensional (3D) ultrasound provides the opportunity to navigate to different anatomical planes and to evaluate structures in 3D within the developing brain. Regardless of acquisition method, interpreting 2D/3D ultrasound fetal brain images requires considerable skill and time. In this thesis, a series of automatic image analysis algorithms are proposed that exploit the rich sonographic patterns captured by the scans and help to simplify clinical examination. The original contributions include:
1. An original skull detection method for 3D ultrasound images, which achieves a mean accuracy of 2.2 ± 1.6 mm compared to the ground truth (GT). In addition, the algorithm is utilised for accurate automated measurement of essential biometry in standard examinations: biparietal diameter (mean accuracy: 2.1 ± 1.4 mm) and head circumference (mean accuracy: 4.5 ± 3.7 mm).
2. A plane detection algorithm, which automatically extracts the mid-sagittal plane that provides visualization of midline structures, which are crucial for assessing central nervous system malformations. The automated planes are in accordance with manual ones (within 3.0 ± 3.5°).
3. A general segmentation framework for delineating fetal brain structures in 2D images. The automatically generated predictions are found to agree with the manual delineations (mean Dice similarity coefficient: 0.79 ± 0.07). As a by-product, the algorithm generates automated biometry, which might be further utilized for morphological evaluation in future research.
4. An efficient localization model that is able to pinpoint the 3D locations of five key brain structures that are examined in a routine clinical examination. The predictions correlate with the ground truth: the average centre deviation is 1.8 ± 1.4 mm, and the size difference between them is 1.9 ± 1.5 mm. The application of this model may greatly reduce the time required for routine examination in clinical practice.
5. A 3D affine registration pipeline. Leveraging the power of convolutional neural networks, the model takes raw 3D brain images as input and geometrically transforms fetal brains into a unified coordinate system (proposed as a Fetal Brain Talairach system).
The integration of these algorithms into computer-assisted analysis tools may greatly reduce the time and effort clinicians need to evaluate 3D fetal neurosonography. Furthermore, they will assist understanding of fetal brain maturation by distilling 2D/3D information directly from the uterus.
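For concreteness, head circumference in fetal biometry is conventionally reported as the perimeter of an ellipse fitted to the detected skull. A minimal sketch of that final step is shown below, using Ramanujan's approximation for the ellipse perimeter; the axis lengths are placeholder values, and the thesis's actual skull-detection and fitting pipeline is far more involved than this.

```python
import math

def head_circumference(a_mm, b_mm):
    """Perimeter of an ellipse with semi-axes a, b (Ramanujan's approximation).

    In fetal biometry, HC is routinely derived from an ellipse fitted to
    the outer skull boundary.
    """
    h = ((a_mm - b_mm) / (a_mm + b_mm)) ** 2
    return math.pi * (a_mm + b_mm) * (1 + 3 * h / (10 + math.sqrt(4 - 3 * h)))

# Placeholder semi-axes from a hypothetical ellipse fit (mm).
occipitofrontal, biparietal = 105.0 / 2, 82.0 / 2
print(f"HC ~= {head_circumference(occipitofrontal, biparietal):.1f} mm")
```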
484

Local learning by partitioning

Wang, Joseph 12 March 2016 (has links)
In many machine learning applications data is assumed to be locally simple, where examples near each other have similar characteristics such as class labels or regression responses. Our goal is to exploit this assumption to construct locally simple yet globally complex systems that improve performance or reduce the cost of common machine learning tasks. To this end, we address three main problems: discovering and separating local non-linear structure in high-dimensional data, learning low-complexity local systems to improve performance of risk-based learning tasks, and exploiting local similarity to reduce the test-time cost of learning algorithms. First, we develop a structure-based similarity metric, where low-dimensional non-linear structure is captured by solving a non-linear, low-rank representation problem. We show that this problem can be kernelized, has a closed-form solution, naturally separates independent manifolds, and is robust to noise. Experimental results indicate that incorporating this structural similarity in well-studied problems such as clustering, anomaly detection, and classification improves performance. Next, we address the problem of local learning, where a partitioning function divides the feature space into regions where independent functions are applied. We focus on the problem of local linear classification using linear partitioning and local decision functions. Under an alternating minimization scheme, learning the partitioning functions can be reduced to solving a weighted supervised learning problem. We then present a novel reformulation that yields a globally convex surrogate, allowing for efficient, joint training of the partitioning functions and local classifiers. We then examine the problem of learning under test-time budgets, where acquiring sensors (features) for each example during test-time has a cost. Our goal is to partition the space into regions, with only a small subset of sensors needed in each region, reducing the average number of sensors required per example. Starting with a cascade structure and expanding to binary trees, we formulate this problem as an empirical risk minimization and construct an upper-bounding surrogate that allows for sequential decision functions to be trained jointly by solving a linear program. Finally, we present preliminary work extending the notion of test-time budgets to the problem of adaptive privacy.
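A minimal sketch of the alternating scheme described above: each example is assigned to the region whose local linear classifier currently explains it best, and the local classifiers are then retrained on their regions. This toy version hard-assigns training points and omits both the learned partitioning function and the convex surrogate reformulation the thesis develops; the XOR-style data and the choice of two regions are assumptions for illustration.

```python
# Sketch: local linear classification by alternating partition/classifier updates.
# Hard assignments and a fixed number of regions are simplifying assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy XOR-like data: not linearly separable globally, but locally simple.
X = rng.normal(size=(400, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)

n_regions = 2
assign = rng.integers(n_regions, size=len(X))   # random initial partition
models = [LogisticRegression() for _ in range(n_regions)]

for _ in range(10):                             # alternating minimization
    for r in range(n_regions):                  # (1) fit local classifiers
        mask = assign == r
        if len(np.unique(y[mask])) == 2:        # needs both classes to fit
            models[r].fit(X[mask], y[mask])
    # (2) reassign each example to the region whose classifier best fits its label
    scores = np.stack([m.predict_proba(X)[np.arange(len(X)), y] for m in models])
    assign = scores.argmax(axis=0)

pred = np.empty(len(X), dtype=int)
for r in range(n_regions):
    if (assign == r).any():
        pred[assign == r] = models[r].predict(X[assign == r])
print("training accuracy:", (pred == y).mean())
```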
485

Crystallization properties of molecular materials : prediction and rule extraction by machine learning

Wicker, Jerome January 2017 (has links)
Crystallization is an increasingly important process in a variety of applications from drug development to single crystal X-ray diffraction structure determination. However, while there is a good deal of research into prediction of molecular crystal structure, the factors that cause a molecule to be crystallizable have so far remained poorly understood. The aim of this project was to answer the seemingly straightforward question: can we predict how easily a molecule will crystallize? The Cambridge Structural Database contains almost a million examples of materials from the scientific literature that have crystallized. Models for the prediction of crystallization propensity of organic molecular materials were developed by training machine learning algorithms on carefully curated sets of molecules which are either observed or not observed to crystallize, extracted from a database of commercially available molecules. The models were validated computationally and experimentally, while feature extraction methods and high resolution powder diffraction studies were used to understand the molecular and structural features that determine the ease of crystallization. This led to the development of a new molecular descriptor which encodes information about the conformational flexibility of a molecule. The best models gave error rates of less than 5% for both cross-validation data and previously-unseen test data, demonstrating that crystallization propensity can be predicted with a high degree of accuracy. Molecular size, flexibility and nitrogen atom environments were found to be the most influential factors in determining the ease of crystallization, while microstructural features determined by powder diffraction showed almost no correlation with the model predictions. Further predictions on co-crystals show scope for extending the methodology to other relevant applications.
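The modelling pipeline described above can be pictured as: featurise each molecule with descriptors capturing size and conformational flexibility, then train a classifier on crystallizing versus non-crystallizing examples. The descriptors, SMILES strings, and labels in the sketch below are illustrative assumptions; they are not the curated training sets or the bespoke flexibility descriptor developed in the thesis.

```python
# Sketch: predict crystallization propensity from simple molecular descriptors.
# Requires RDKit. SMILES and labels are toy placeholders, not curated data.
from rdkit import Chem
from rdkit.Chem import Descriptors
from sklearn.ensemble import RandomForestClassifier

def featurise(smiles):
    mol = Chem.MolFromSmiles(smiles)
    return [
        Descriptors.MolWt(mol),                # molecular size
        Descriptors.NumRotatableBonds(mol),    # crude flexibility proxy
        Descriptors.NumHDonors(mol),
        Descriptors.NumHAcceptors(mol),
    ]

# Toy examples: (SMILES, observed-to-crystallize label), illustrative only.
data = [
    ("c1ccccc1C(=O)O", 1),        # benzoic acid
    ("CC(=O)Nc1ccc(O)cc1", 1),    # paracetamol
    ("CCCCCCCCCCCCCCCC", 0),      # hexadecane (flexible chain)
    ("CCOCCOCCOCCOCC", 0),        # flexible polyether
]
X = [featurise(s) for s, _ in data]
y = [label for _, label in data]

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(clf.predict([featurise("OC(=O)c1ccccc1O")]))  # salicylic acid
```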
486

Bayesian matrix factorisation : inference, priors, and data integration

Brouwer, Thomas Alexander January 2017 (has links)
In recent years the amount of biological data has increased exponentially. Most of these data can be represented as matrices relating two different entity types, such as drug-target interactions (relating drugs to protein targets), gene expression profiles (relating drugs or cell lines to genes), and drug sensitivity values (relating drugs to cell lines). Not only is the size of these datasets increasing, but so is the number of different entity types that they relate. Furthermore, not all values in these datasets are typically observed, and some are very sparse. Matrix factorisation is a popular group of methods that can be used to analyse these matrices. The idea is that each matrix can be decomposed into two or more smaller matrices, such that their product approximates the original one. This factorisation of the data reveals patterns in the matrix, and gives us a lower-dimensional representation. Not only can we use this technique to identify clusters and other biological signals, we can also predict the unobserved entries, allowing us to prune biological experiments. In this thesis we introduce and explore several Bayesian matrix factorisation models, focusing on how best to use them for predicting these missing values in biological datasets. Our main hypothesis is that matrix factorisation methods, and in particular Bayesian variants, are an extremely powerful paradigm for predicting values in biological datasets, as well as other applications, and especially for sparse and noisy data. We demonstrate the competitiveness of these approaches compared to other state-of-the-art methods, and explore the conditions under which they perform best. We consider several aspects of the Bayesian approach to matrix factorisation. Firstly, the effect of the inference approach used to find the factorisation on predictive performance. Secondly, we identify different likelihood and Bayesian prior choices that we can use for these models, and explore when they are most appropriate. Finally, we introduce a Bayesian matrix factorisation model that can be used to integrate multiple biological datasets, and hence improve predictions. This model combines different matrix factorisation models and Bayesian priors in a hybrid fashion. Through these models and experiments we support our hypothesis and provide novel insights into the best ways to use Bayesian matrix factorisation methods for predictive purposes.
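A stripped-down illustration of the core idea: approximate an incompletely observed matrix R as a product of two low-rank factors, updating only against the observed entries and using the factorisation to fill in the rest. The sketch below does MAP-style estimation with L2 penalties standing in for Gaussian priors; the thesis studies fully Bayesian inference over such models, which this deliberately omits.

```python
# Sketch: matrix factorisation with missing values, R ~= U @ V.T.
# L2 terms stand in for Gaussian priors here; the thesis's models
# use full Bayesian inference rather than this MAP simplification.
import numpy as np

rng = np.random.default_rng(1)
n_drugs, n_cells, k = 30, 20, 4

R_true = rng.normal(size=(n_drugs, k)) @ rng.normal(size=(k, n_cells))
mask = rng.random((n_drugs, n_cells)) < 0.6    # 60% of entries observed

U = rng.normal(scale=0.1, size=(n_drugs, k))
V = rng.normal(scale=0.1, size=(n_cells, k))
lam, lr = 0.1, 0.01                            # prior strength, step size

for step in range(2000):
    E = mask * (R_true - U @ V.T)              # error on observed entries only
    U += lr * (E @ V - lam * U)
    V += lr * (E.T @ U - lam * V)

rmse = np.sqrt((((R_true - U @ V.T) * ~mask) ** 2).sum() / (~mask).sum())
print(f"RMSE on held-out entries: {rmse:.3f}")
```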
487

Learning natural coding conventions

Allamanis, Miltiadis January 2017 (has links)
Coding conventions are ubiquitous in software engineering practice. Maintaining a uniform coding style allows software development teams to communicate through code by making the code clear and, thus, readable and maintainable, two important properties of good code since developers spend the majority of their time maintaining software systems. This dissertation introduces a set of probabilistic machine learning models of source code that learn coding conventions directly from source code written in a mostly conventional style. This alleviates the coding convention enforcement problem, where conventions first need to be formulated clearly into unambiguous rules and then be coded in order to be enforced: a tedious and costly process. First, we introduce the problem of inferring a variable’s name given its usage context and address this problem by creating Naturalize, a machine learning framework that learns to suggest conventional variable names. Two machine learning models, a simple n-gram language model and a specialized neural log-bilinear context model, are trained to understand the role and function of each variable and suggest new stylistically consistent variable names. The neural log-bilinear model can even suggest previously unseen names by composing them from subtokens (i.e. sub-components of code identifiers). The suggestions of the models achieve 90% accuracy when suggesting variable names at the top 20% most confident locations, rendering the suggestion system usable in practice. We then turn our attention to the significantly harder method naming problem. Learning to name methods, by looking only at the code tokens within their body, requires a good understanding of the semantics of the code contained in a single method. To achieve this, we introduce a novel neural convolutional attention network that learns to generate the name of a method by sequentially predicting its subtokens. This is achieved by focusing on different parts of the code and potentially directly using body (sub)tokens even when they have never been seen before. This model achieves an F1 score of 51% on the top five suggestions when naming methods of real-world open-source projects. Learning about naming code conventions uses the syntactic structure of the code to infer names that implicitly relate to code semantics. However, syntactic similarities and differences obscure code semantics. Therefore, to capture features of semantic operations with machine learning, we need methods that learn semantic continuous logical representations. To achieve this ambitious goal, we focus our investigation on logic and algebraic symbolic expressions and design a neural equivalence network architecture that learns semantic vector representations of expressions in a syntax-driven way, while solely retaining semantics. We show that equivalence networks learn significantly better semantic vector representations compared to other, existing, neural network architectures. Finally, we present an unsupervised machine learning model for mining syntactic and semantic code idioms. Code idioms are conventional “mental chunks” of code that serve a single semantic purpose and are commonly used by practitioners. To achieve this, we employ Bayesian nonparametric inference on tree substitution grammars.
We present a wide range of evidence that the resulting syntactic idioms are meaningful, demonstrating that they do indeed recur across software projects and that they occur more frequently in illustrative code examples collected from a Q&A site. These syntactic idioms can be used as a form of automatic documentation of the coding practices of a programming language or an API. We also mine semantic loop idioms, i.e. highly abstracted but semantics-preserving idioms of loop operations. We show that semantic idioms provide data-driven guidance during the creation of software engineering tools by mining common semantic patterns, such as candidate refactoring locations. This gives tool, API and language designers data-based evidence about general, domain-specific and project-specific coding patterns; instead of relying solely on their intuition, they can use semantic idioms to achieve greater coverage of their tool, new API or language feature. We demonstrate this by creating a tool that suggests refactoring loops into functional constructs in LINQ. Semantic loop idioms also provide data-driven evidence for introducing new APIs or programming language features.
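As a flavour of the Naturalize-style approach described above, the sketch below scores candidate variable names with a token-level bigram language model trained on a tiny corpus: each candidate is substituted into the usage context, and the candidate whose resulting token sequence is most probable wins. The corpus and add-one smoothing are toy assumptions; the actual framework uses much larger n-gram and neural models trained on real codebases.

```python
# Sketch: rank candidate variable names with a bigram language model.
# Corpus and smoothing are toy assumptions, not the Naturalize system itself.
from collections import Counter
import math

corpus = [
    "for item in items : total += item . price",
    "for item in items : print ( item . name )",
    "total = 0 ; for entry in log : total += entry . size",
]
tokens = [t for line in corpus for t in line.split()]
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
V = len(unigrams)

def logprob(seq):
    """Add-one-smoothed bigram log-probability of a token sequence."""
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
               for a, b in zip(seq, seq[1:]))

# Score candidates substituted into a usage context; higher is better.
context = "for {name} in items : total += {name} . price"
for cand in ["item", "x", "entry"]:
    seq = context.format(name=cand).split()
    print(cand, round(logprob(seq), 2))
```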
488

Machine learning-based human observer analysis of video sequences

Al-Raisi, Seema F. A. R. January 2017 (has links)
The research contributes to the field of video analysis by proposing novel approaches to automatically generating human observer performance patterns that can be effectively used in advancing modern video analytic and forensic algorithms. Eye tracker and eye movement analysis technology are employed in medical research, psychology, cognitive science and advertising. The data collected on human eye movement from the eye tracker can be analyzed using machine and statistical learning approaches. Therefore, the study attempts to understand the visual attention patterns of people when observing captured CCTV footage. It intends to establish whether the eye gaze of observers, which determines their behaviour, depends on the instructions given or on the knowledge they acquire from the surveillance task. The research attempts to understand whether observers' attention to human objects being identified and tracked differs across different areas of the tracked object's body. It also asks whether pattern analysis and machine learning can effectively replace the current conceptual and statistical approaches to the analysis of eye-tracking data captured within a CCTV surveillance task. A pilot study was conducted that took around 30 minutes per participant and involved observing 13 different pre-recorded CCTV clips of public space. The participants were provided with a clear written description of the targets they should find in each video. The study included a total of 24 participants with varying levels of experience in analyzing CCTV video. A Tobii eye tracking system was employed to record the eye movements of the participants. The data captured by the eye tracking sensor was analyzed using statistical data analysis approaches in SPSS and machine learning algorithms in WEKA. The research concluded that differences in behavioural patterns exist which can be used to classify study participants if appropriate machine learning algorithms are employed. Prior research on video analytics was perceived to be limited to a few projects in which the observed human object was viewed as a single object; hence, detailed analysis of human observer attention patterns based on human body part articulation had not been investigated. All previous attempts at analysing human observer visual attention patterns in CCTV video analytics and forensics used either conceptual or statistical approaches, which were limited with regard to making predictions and detecting hidden patterns. A novel approach to articulating the human objects to be identified and tracked in a visual surveillance task led to constrained results, which demanded the use of advanced machine learning algorithms for the classification of participants. The research conducted within the context of this thesis encountered several practical data collection and analysis challenges during formal CCTV operator based surveillance tasks, which made it difficult to obtain appropriate cooperation from expert CCTV operators for data collection. Therefore, if expert operators had been employed in the study rather than novice operators, a more discriminative and accurate classification might have been achieved. Machine learning approaches like ensemble learning and tree-based algorithms can be applied in cases where a more detailed analysis of human behaviour is needed.
Traditional machine learning approaches are challenged by recent advances in the field of convolutional neural networks and deep learning. Therefore, future research could replace the traditional machine learning approaches employed in this study with convolutional neural networks. The current research was limited to 13 different videos with different descriptions given to the participants for identifying and tracking different individuals. The research can be expanded to include more complex demands with regard to changes in the analysis process.
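The classification step described above can be pictured as follows: summarise each participant's gaze recording into a fixed-length feature vector (for example, fixation counts and durations on body-part regions of interest) and train a classifier to separate behavioural groups. The feature names, synthetic data, and choice of classifier below are assumptions for illustration; the study itself used Tobii exports analysed in SPSS and WEKA.

```python
# Sketch: classify observers from summary gaze features.
# Synthetic features stand in for real Tobii eye-tracker exports.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 24  # one row per participant, as in the pilot study

# Hypothetical per-participant features: mean fixation duration (ms),
# fixation counts on head/torso/legs regions, mean saccade amplitude (deg).
experienced = rng.integers(0, 2, size=n)        # 0 = novice, 1 = experienced
X = np.column_stack([
    rng.normal(250 - 30 * experienced, 40, n),  # shorter fixations if expert
    rng.poisson(40 + 15 * experienced, n),      # more fixations on heads
    rng.poisson(30, n),
    rng.poisson(10, n),
    rng.normal(5.0, 1.0, n),
])

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, experienced, cv=4).mean())
```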
489

Optimization and personalization of a web service based on temporal information

Wallin, Jonatan January 2018 (has links)
Developments in information and communication technology have increased attention to personalization in the 21st century, and the benefits to both marketers and customers are claimed to be many. The need to efficiently deliver personalized content in different web applications has increased interest in the field of machine learning. The aim of this thesis project is to develop a decision model that autonomously optimizes a commercial web service to increase its click-through rate. The model should be based on previously collected data about usage of the web service. Different requirements for efficiency and storage must be fulfilled while the model produces valuable results. An algorithm for a binary decision tree is presented in this report. The growth of the binary tree is controlled by an entropy-minimizing heuristic together with three specified stopping criteria. Tests on both synthetic and real data sets were performed to evaluate the accuracy and efficiency of the algorithm. The results showed that the running time is dominated by different parameters depending on the sizes of the test sets. The model is capable of capturing inherent patterns in the available data.
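The heart of the algorithm described above is choosing, at each node, the split that minimises the weighted entropy of the resulting children. A minimal sketch of that split-selection step for binary features follows; the three stopping criteria and the efficiency and storage considerations discussed in the thesis are omitted, and the click-log data is a placeholder.

```python
# Sketch: entropy-minimising split selection for a binary decision tree.
# The toy click-log rows and binary features are illustrative assumptions.
import math

def entropy(labels):
    """Shannon entropy of a list of 0/1 click labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def best_split(rows, labels):
    """Pick the feature whose split minimises weighted child entropy."""
    n, best = len(labels), None
    for f in range(len(rows[0])):
        left  = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or score < best[0]:
            best = (score, f)
    return best  # (weighted entropy after split, feature index)

# Toy rows: [is_weekend, is_evening, is_mobile] -> clicked?
rows   = [[0, 0, 1], [0, 1, 1], [1, 1, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1]]
clicks = [0, 1, 1, 0, 1, 1]
print(best_split(rows, clicks))
```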
490

Studies on Designing Distributed and Cooperative Systems for Solving Constraint Satisfaction Problems of Container Loading

Liu, Yuan 24 March 2008 (has links)
Kyoto University / 0048 / New degree system, doctoral program / Doctor of Engineering / Kō No. 13812 / Kōhaku No. 2916 / Call number: 新制||工||1431 (University Library) / 26028 / UT51-2008-C728 / Department of Mechanical Engineering and Science, Graduate School of Engineering, Kyoto University / Examiners: Professor Tetsuo Sawaragi (chair), Professor Masataka Yoshimura, Professor Atsushi Matsubara / Fulfils Article 4, Paragraph 1 of the Degree Regulations
