Global ETD Search

291	Machine learning in systems biology at different scales : from molecular biology to ecology Aderhold, Andrej January 2015 (has links) Machine learning has been a source for continuous methodological advances in the field of computational learning from data. Systems biology has profited in various ways from machine learning techniques but in particular from network inference, i.e. the learning of interactions given observed quantities of the involved components or data that stem from interventional experiments. Originally this domain of systems biology was confined to the inference of gene regulation networks but recently expanded to other levels of organization of biological and ecological systems. Especially the application to species interaction networks in a varying environment is of mounting importance in order to improve our understanding of the dynamics of species extinctions, invasions, and population behaviour in general. The aim of this thesis is to demonstrate an extensive study of various state-of-the-art machine learning techniques applied to a genetic regulation system in plants and to expand and modify some of these methods to infer species interaction networks in an ecological setting. The first study attempts to improve the knowledge about circadian regulation in the plant Arabidopsis thaliana from the viewpoint of machine learning and gives suggestions on what methods are best suited for inference, how the data should be processed and modelled mathematically, and what quality of network learning can be expected by doing so. To achieve this, I generate a rich and realistic synthetic data set that is used for various studies under consideration of different effects and method setups. The best method and setup is applied to real transcriptional data, which leads to a new hypothesis about the circadian clock network structure. 
The ecological study is focused on the development of two novel inference methods that exploit a common principle from transcriptional time-series, which states that expression profiles over time can be temporally heterogeneous. A corresponding concept in a spatial domain of 2 dimensions is that species interaction dynamics can be spatially heterogeneous, i.e. can change in space dependent on the environment and other factors. I will demonstrate the expansion from the 1-dimensional time domain to the 2-dimensional spatial domain, introduce two distinct space segmentation schemes, and consider species dispersion effects with spatial autocorrelation. The two novel methods display a significant improvement in species interaction inference compared to competing methods and display a high confidence in learning the spatial structure of different species neighbourhoods or environments. 570.285
292	Autotuning wavefront patterns for heterogeneous architectures Mohanty, Siddharth January 2015 (has links) Manual tuning of applications for heterogeneous parallel systems is tedious and complex. Optimizations are often not portable, and the whole process must be repeated when moving to a new system, or sometimes even to a different problem size. Pattern based parallel programming models were originally designed to provide programmers with an abstract layer, hiding tedious parallel boilerplate code, and allowing a focus on only application specific issues. However, the constrained algorithmic model associated with each pattern also enables the creation of pattern-specific optimization strategies. These can capture more complex variations than would be accessible by analysis of equivalent unstructured source code. These variations create complex optimization spaces. Machine learning offers well established techniques for exploring such spaces. In this thesis we use machine learning to create autotuning strategies for heterogeneous parallel implementations of applications which follow the wavefront pattern. In a wavefront, computation starts from one corner of the problem grid and proceeds diagonally like a wave to the opposite corner in either two or three dimensions. Our framework partitions and optimizes the work created by these applications across systems comprising multicore CPUs and multiple GPU accelerators. The tuning opportunities for a wavefront include controlling the amount of computation to be offloaded onto GPU accelerators, choosing the number of CPU and GPU threads to process tasks, tiling for both CPU and GPU memory structures, and trading redundant halo computation against communication for multiple GPUs. Our exhaustive search of the problem space shows that these parameters are very sensitive to the combination of architecture, wavefront instance and problem size. 
We design and investigate a family of autotuning strategies, targeting single and multiple CPU + GPU systems, and both two and three dimensional wavefront instances. These yield an average of 87% of the performance found by offline exhaustive search, with up to 99% in some cases. 004
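The wavefront traversal described in this abstract can be illustrated with a minimal sketch. It is not the thesis framework: the names `wavefront_order` and `lcs_length` are hypothetical, the longest-common-subsequence table is only an assumed example of a two-dimensional wavefront instance, and the code runs serially. It shows why cells on the same anti-diagonal are independent and could, in principle, be split across CPU and GPU workers.

```python
def wavefront_order(rows, cols):
    """Yield one list of (i, j) cells per anti-diagonal, corner to corner."""
    for d in range(rows + cols - 1):
        i_lo = max(0, d - cols + 1)
        i_hi = min(rows - 1, d)
        yield [(i, d - i) for i in range(i_lo, i_hi + 1)]


def lcs_length(a, b):
    """Longest common subsequence length, filled diagonal by diagonal."""
    rows, cols = len(a), len(b)
    # The table carries an extra zero-filled border row and column.
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for diagonal in wavefront_order(rows, cols):
        # Every cell on this diagonal depends only on earlier diagonals,
        # so this inner loop is the part a parallel version would split up.
        for i, j in diagonal:
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    return table[rows][cols]


if __name__ == "__main__":
    for diagonal in wavefront_order(2, 3):
        print(diagonal)
    print(lcs_length("AGGTAB", "GXTXAYB"))
```

The tunables named in the abstract (offload fraction, thread counts, tile sizes, halo width) would all act on how each diagonal's cell list is partitioned; none of them is modelled here.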
293	Exploiting Application Characteristics for Efficient System Support of Data-Parallel Machine Learning Cui, Henggang 01 May 2017 (has links) Large scale machine learning has many characteristics that can be exploited in the system designs to improve its efficiency. This dissertation demonstrates that the characteristics of the ML computations can be exploited in the design and implementation of parameter server systems, to greatly improve the efficiency by an order of magnitude or more. We support this thesis statement with three case study systems, IterStore, GeePS, and MLtuner. IterStore is an optimized parameter server system design that exploits the repeated data access pattern characteristic of ML computations. The designed optimizations allow IterStore to reduce the total run time of our ML benchmarks by up to 50×. GeePS is a parameter server that is specialized for deep learning on distributed GPUs. By exploiting the layer-by-layer data access and computation pattern of deep learning, GeePS provides almost linear scalability from single-machine baselines (13× more training throughput with 16 machines), and also supports neural networks that do not fit in GPU memory. MLtuner is a system for automatically tuning the training tunables of ML tasks. It exploits the characteristic that the best tunable settings can often be decided quickly with just a short trial time. By making use of optimization-guided online trial-and-error, MLtuner can robustly find and re-tune tunable settings for a variety of machine learning applications, including image classification, video classification, and matrix factorization, and is over an order of magnitude faster than traditional hyperparameter tuning approaches. Big Data Analytics Large-Scale Machine Learning
294	Logistic regression with conjugate gradient descent for document classification Namburi, Sruthi January 1900 (has links) Master of Science / Department of Computing and Information Sciences / William H. Hsu / Logistic regression is a model for function estimation that measures the relationship between independent variables and a categorical dependent variable by approximating a conditional probability density function using a logistic function, also known as a sigmoid function. Multinomial logistic regression is used to predict categorical variables where there can be more than two categories or classes. The most common type of algorithm for optimizing the cost function for this model is gradient descent. In this project, I implemented logistic regression using conjugate gradient descent (CGD). I used the 20 Newsgroups data set collected by Ken Lang. I compared the results with those for existing implementations of gradient descent. The conjugate gradient optimization methodology outperforms existing implementations. Document Classification Machine Learning Logistic Regression
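As a rough illustration of the idea in this abstract, the sketch below fits a binary logistic regression with a nonlinear conjugate gradient method. It is not the author's implementation: the Polak-Ribière+ update, the backtracking line search, the L2 penalty `l2`, the function names, and the four-point toy data set are all assumptions chosen to keep the example small, and it handles two classes rather than the multinomial case.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def loss_and_grad(w, X, y, l2=1e-2):
    """Mean negative log-likelihood plus an L2 penalty, and its gradient."""
    n = len(X)
    loss = 0.5 * l2 * sum(wi * wi for wi in w)
    grad = [l2 * wi for wi in w]
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # log(1 + exp(z)) - y*z, written to avoid overflow
        loss += (max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z) / n
        p = sigmoid(z)
        for j, xj in enumerate(xi):
            grad[j] += (p - yi) * xj / n
    return loss, grad


def fit_cg(X, y, iters=50, tol=1e-10):
    """Nonlinear conjugate gradient (Polak-Ribiere+) with backtracking."""
    w = [0.0] * len(X[0])
    loss, g = loss_and_grad(w, X, y)
    d = [-gi for gi in g]
    for _ in range(iters):
        gg = sum(gi * gi for gi in g)
        if gg < tol:
            break
        gd = sum(gi * di for gi, di in zip(g, d))
        if gd >= 0:  # not a descent direction: restart with steepest descent
            d, gd = [-gi for gi in g], -gg
        step = 1.0
        while True:  # Armijo backtracking line search
            w_new = [wi + step * di for wi, di in zip(w, d)]
            loss_new, g_new = loss_and_grad(w_new, X, y)
            if loss_new <= loss + 1e-4 * step * gd or step < 1e-12:
                break
            step *= 0.5
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g)) / gg)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        w, loss, g = w_new, loss_new, g_new
    return w


def predict(w, x):
    """Return 1 if the modelled probability exceeds 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x))) > 0.5 else 0


if __name__ == "__main__":
    # Toy data: first feature is a constant bias term.
    X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
    y = [0, 0, 1, 1]
    w = fit_cg(X, y)
    print(w, [predict(w, x) for x in X])
```

For document classification the feature vectors would be term counts or TF-IDF weights and the model would need one weight vector per class; the search-direction update stays the same.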
295	Predicting sentiment-mention associations in product reviews Vaswani, Vishwas January 1900 (has links) Master of Science / Department of Computing and Information Sciences / Doina Caragea / With the rising trend in social networking, more people express their opinions on the web. As a consequence, there has been an increase in the number of blogs where people write reviews about the products they buy or services they experience. These reviews can be very helpful to other potential customers who want to know the pros and cons of a product, and also to manufacturers who want to get feedback from customers about their products. Sentiment analysis of online data (such as review blogs) is a rapidly growing field of research in Machine Learning, which can leverage online reviews and quickly extract the sentiment of a whole blog. The accuracy of a sentiment analyzer relies heavily on correctly identifying associations between a sentiment (opinion) word and the targeted mention (token or object) in blog sentences. In this work, we focus on the task of automatically identifying sentiment-mention associations; in other words, we identify the target mention that is associated with a sentiment word in a sentence. A Support Vector Machine (SVM), a supervised machine learning algorithm, was used to learn classifiers for this task. Syntactic and semantic features extracted from sentences were used as input to the SVM algorithm. The dataset used in this work contains reviews from the car and camera domains. The work is divided into two phases. In the first phase, we learned domain-specific classifiers for the car and camera domains, respectively. To further improve the predictions of the domain-specific classifiers, we investigated the use of transfer learning techniques in the second phase. More precisely, the goal was to use knowledge from a source domain to improve predictions for a target domain. 
We considered two transfer learning approaches: a feature level fusion approach and a classifier level fusion approach. Experimental results show that transfer learning can help to improve the predictions made using the domain specific classifier approach. While both the feature level and classifier level fusion approaches were shown to improve the prediction accuracy, the classifier level fusion approach gave better results. Sentiment analysis Machine learning Computer Science (0984)
296	The automatic acquisition of knowledge about discourse connectives Hutchinson, Ben January 2005 (has links) This thesis considers the automatic acquisition of knowledge about discourse connectives. It focuses in particular on their semantic properties, and on the relationships that hold between them. There is a considerable body of theoretical and empirical work on discourse connectives. For example, Knott (1996) motivates a taxonomy of discourse connectives based on relationships between them, such as HYPONYMY and EXCLUSIVE, which are defined in terms of substitution tests. Such work requires either great theoretical insight or manual analysis of large quantities of data. As a result, to date no manual classification of English discourse connectives has achieved complete coverage. For example, Knott gives relationships between only about 18% of pairs obtained from a list of 350 discourse connectives. This thesis explores the possibility of classifying discourse connectives automatically, based on their distributions in texts. This thesis demonstrates that state-of-the-art techniques in lexical acquisition can successfully be applied to acquiring information about discourse connectives. Central to this thesis is the hypothesis that distributional similarity correlates positively with semantic similarity. Support for this hypothesis has previously been found for word classes such as nouns and verbs (Miller and Charles, 1991; Resnik and Diab, 2000, for example), but there has been little exploration of the degree to which it also holds for discourse connectives. We investigate the hypothesis through a number of machine learning experiments. These experiments all use unsupervised learning techniques, in the sense that they do not require any manually annotated data, although they do make use of an automatic parser. 
First, we show that a range of semantic properties of discourse connectives, such as polarity and veridicality (whether or not the semantics of a connective involves some underlying negation, and whether the connective implies the truth of its arguments, respectively), can be acquired automatically with a high degree of accuracy. Second, we consider the tasks of predicting the similarity and substitutability of pairs of discourse connectives. To assist in this, we introduce a novel information theoretic function based on variance that, in combination with distributional similarity, is useful for learning such relationships. Third, we attempt to automatically construct taxonomies of discourse connectives capturing substitutability relationships. We introduce a probability model of taxonomies, and show that this can improve accuracy on learning substitutability relationships. Finally, we develop an algorithm for automatically constructing or extending such taxonomies which uses beam search to help find the optimal taxonomy. 006.3
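The central hypothesis of this abstract, that distributional similarity tracks semantic similarity, can be made concrete with a small sketch. Everything in it is an assumption for illustration: the thesis uses parser-derived features and its own similarity functions, whereas this example uses plain word-window co-occurrence counts, cosine similarity, a three-sentence toy corpus, and the hypothetical helper names `context_counts` and `cosine`.

```python
import math
from collections import Counter


def context_counts(sentences, target, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for k, t in enumerate(tokens[lo:hi], start=lo) if k != i)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    corpus = [
        "it rained but we went out",
        "it rained however we went out",
        "we stayed home because it rained",
    ]
    but = context_counts(corpus, "but")
    however = context_counts(corpus, "however")
    because = context_counts(corpus, "because")
    # Connectives used in similar contexts receive a higher score.
    print(cosine(but, however), cosine(but, because))
```

On this toy corpus the two contrastive connectives share all of their context words, while the causal connective shares only some, which is the kind of signal a substitutability classifier could build on.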
297	A recurrent neural network approach to quantification of risks surrounding the Swedish property market Vikström, Filip January 2016 (has links) As the real estate market plays a central role in a country's financial situation, Skandia, as a life insurer, a bank and a property developer, wants a method for better assessing the risks connected to the real estate market. The goal of this paper is to increase the understanding of property market risk and its covariate risks and to conduct an analysis of how a fall in real estate prices could affect Skandia’s exposed assets. This paper explores a recurrent neural network model with the aim of quantifying identified risk factors using exogenous data. The recurrent neural network model is compared to a vector autoregressive model with exogenous inputs that represent economic conditions. The results of this paper are inconclusive as to which method produces the most accurate model under the specified settings. The recurrent neural network approach produces what seem to be better results in out-of-sample validation, but both the recurrent neural network model and the vector autoregressive model fail to capture the hypothesized relationship between the exogenous and modeled variables. Although the results do not fit previous assumptions, further research into artificial neural networks and tests with additional variables and longer sample series for calibration are suggested, as the model preconditions are promising. Artificial neural networks Machine learning RNN
298	Smart cropping tools with help of machine learning Kanwar, John January 2019 (has links) Machine learning has been around for a long time, and its applications span a wide variety of subjects, from self-driving cars to data mining. When a person takes a picture with their mobile phone, the photo easily ends up slightly crooked. People also take spontaneous photos with their phones, which can result in something irrelevant ending up in the corner of the image. This thesis combines machine learning with photo editing tools. It explores how machine learning can be used to automatically crop images in an aesthetically pleasing way and how it can be used to create a portrait cropping tool. It also covers how a straightening function can be implemented with the help of machine learning. Finally, it compares these tools with the automatic cropping tools of other software. Machine Learning Cropping Computer Systems
299	Deep learning and SVM methods for lung diseases detection and direction recognition Li, Lei January 2018 (has links) University of Macau / Faculty of Science and Technology. / Department of Computer and Information Science Machine learning Neural networks (Computer science)
300	Machine Learning Methods to Understand Textual Data Unknown Date (has links) The amount of textual data produced every minute on the internet is extremely high. Processing this tremendous volume of mostly unstructured data is not a straightforward task. But the enormous amount of useful information contained in it motivates scientists to investigate efficient and effective techniques and algorithms to discover meaningful patterns. Social network applications provide opportunities for people around the world to be in contact and share their valuable knowledge through features such as chat, comments, and discussion boards. People usually do not care about spelling and accurate grammatical construction of a sentence in everyday conversations. Therefore, extracting information from such datasets is more complicated. Text mining can be a solution to this problem. Text mining is a knowledge discovery process used to extract patterns from natural language. Application of text mining techniques to social networking websites can reveal a significant amount of information. Text mining in conjunction with social networks can be used for finding a general opinion about any particular subject, human thinking patterns, and group identification. In this study, we investigate machine learning methods for textual data in six chapters. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection Machine learning Internet--Data processing Text Mining
