Global ETD Search

1271	Earthquake Detection using Deep Learning Based Approaches Audretsch, James 17 March 2020 (has links) Earthquake detection is an important task, focusing on detecting seismic events in past data or in real time from seismic time series. In the past few decades, due to the increasing amount of available seismic data, research in seismic event detection has shown remarkable success using neural networks and other machine learning techniques. However, creating high quality labeled data sets is still a manual process that demands a tremendous amount of time and expert knowledge, and is stifling big data innovation. When compiling a data set, it is unclear how many earthquake and noise samples are mislabeled. Another challenge is how to promote the general applicability of machine learning based models to different geographical regions. Models trained on data sets from one location should be applicable to detection at other locations. This thesis explores the most popular deep learning model, convolutional neural networks (CNN), to build a single location detection model. In addition, we build more robust generalized earthquake detection models using transfer learning and meta learning. We also introduce a process for generating high quality labeled datasets. Our technique achieves high detection accuracy even on low signal to noise ratio events. The AI techniques explored in this research have the potential to be transferred to other domains that utilize signal processing. There are a myriad of potential applications, with audio processing probably being one of the most directly relevant. Any field that deals with waveforms (e.g. seismic, audio, light) can utilize the developed techniques. Machine Learning Seismology Deep Learning Earthquake Detection Meta Learning
1272	Predicting Gene Functions and Phenotypes by combining Deep Learning and Ontologies Kulmanov, Maxat 08 April 2020 (has links) The amount of available protein sequences is rapidly increasing, mainly as a consequence of the development and application of high throughput sequencing technologies in the life sciences. It is a key question in the life sciences to identify the functions of proteins, and furthermore to identify the phenotypes that may be associated with a loss (or gain) of function in these proteins. Protein functions are generally determined experimentally, and it is clear that experimental determination of protein functions will not scale to the current (and rapidly increasing) amount of available protein sequences (over 300 million). Furthermore, identifying phenotypes resulting from loss of function is even more challenging, as the phenotype is modified by whole organism interactions and environmental variables. It is clear that accurate computational prediction of protein functions and loss of function phenotypes would be of significant value both to academic research and to the biotechnology industry. We developed and expanded novel methods for representation learning and for predicting protein functions and their loss of function phenotypes. We use deep neural network algorithms and combine them with symbolic inference into neural-symbolic algorithms. Our work significantly improves previously developed methods for predicting protein functions through methodological advances in machine learning, incorporation of broader data types that may be predictive of functions, and improved systems for neural-symbolic integration. The methods we developed are generic and can be applied to other domains in which similar types of structured and unstructured information exist. In the future, our methods can be applied to prediction of protein function for metagenomic samples in order to evaluate the potential for discovery of novel proteins of industrial value. 
Our methods can also be applied to the prediction of loss of function phenotypes in human genetics, with the results incorporated in a variant prioritization tool that can be applied to diagnose patients with Mendelian disorders. gene functions phenotypes ontologies embeddings deep neural networks machine learning
1273	An Empirical Analysis of Network Traffic: Device Profiling and Classification Anbazhagan, Mythili Vishalini 02 July 2019 (has links) Time and again we have seen the Internet grow and evolve at an unprecedented scale. The number of online users in 1995 was 40 million, but in 2020 the number of online devices is predicted to reach 50 billion, which would be 7 times the human population on earth. Up until now, the revolution was in the digital world. But now, the revolution is happening in the physical world that we live in; IoT devices are employed in all sorts of environments, such as domestic houses, hospitals, industrial spaces, and nuclear plants. Since they are employed in many mission-critical or even life-critical environments, their security and reliability are of paramount importance, because compromising them can lead to grave consequences. IoT devices are, by nature, different from conventional Internet connected devices such as laptops and smart phones. They have small memory, limited storage, and low processing power. They also operate with little to no human intervention. Hence it becomes very important to understand IoT devices better. How do they behave in a network? How different are they from traditional Internet connected devices? Can they be identified from their network traffic? Is it possible for anyone to identify them just by looking at the network data that leaks outside the network, without even joining the network? Answering these questions is the aim of this thesis. To the best of our knowledge, no study has collected data from outside the network, without joining the network, with the intention of finding out if IoT devices can be identified from this data. We also identify parameters that distinguish IoT from non-IoT devices. We then group similar devices manually and subsequently perform the grouping automatically, using clustering algorithms. This helps in grouping devices of a similar nature and creating a profile for each kind of device. 
IoT Network Data Analytics Machine Learning Clustering K-Means
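1274	Information Exploration and Exploitation for Machine Learning with Small Data Hayashi, Shogo 23 March 2021 (has links) Kyoto University / New system, course-based doctorate / Doctor of Informatics / Degree No. Kō 23313 / Informatics Doctorate No. 749 / Shinsei||Jō||128 (Main Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Hisashi Kashima, Professor Akihiro Yamamoto, Professor Masatoshi Yoshikawa / Qualified under Article 4, Paragraph 1 of the Degree Regulations / DFAM machine learning small data generalized distillation Bayesian optimization 007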
1275	A Semi-Automatic Grading Experience for Digital Ink Quizzes Rhees, Brooke Ellen 01 January 2017 (has links) Teachers who want to assess student learning and provide quality feedback are faced with a challenge when trying to grade assignments quickly. There is currently no system which will provide both a fast-to-grade quiz and a rich testing experience. Previous attempts to speed up grading time include NLP-based text analysis to automate grading and scanning in documents for manual grading with recyclable feedback. However, automated NLP systems all focus solely on text-based problems, and manual grading is still linear in the number of students. Machine learning algorithms exist that can interactively train a computer to quickly classify digital ink strokes. We used stroke recognition and interactive machine learning concepts to build a grading interface for digital ink quizzes, to allow non-text open-ended questions that can then be semi-automatically graded. We tested this system on a Computer Science class with 361 students using a set of quiz questions which their teacher provided, evaluated its effectiveness, and determined some of its limitations. Adaptations to the interface and the training process as well as further work to resolve intrinsic stroke perversity are required to make this a truly effective system. However, using the system we were able to reduce grading time by as much as 10x for open-ended responses. education grading digital ink interactive machine learning Computer Sciences
1276	Machine Learning Applications in Proteins: Interaction Prediction and Structure Prediction Sun, Mengzhen January 2021 (has links) This thesis focuses on two research projects that apply machine learning techniques to protein-related topics. The first project uses protein sequences and the interaction graph to address the protein-protein interaction prediction problem. The second project leverages the sequences of protein loops within and beyond homologs to predict protein loop structures. In the protein-protein interaction prediction project, we applied pretrained language models, which were trained on large sets of protein sequences, as one of the protein feature extraction methods. The other feature extraction method is graph learning on the protein interaction graph. The graph learning embeddings and the language model embeddings were fed into classification models to predict whether two proteins interact. We trained and tested our methods on the S. cerevisiae dataset and the human dataset. Our results are comparable to or better than other state-of-the-art methods, with the advantages that our method is faster at the sample preparation step and has a larger application scope because it requires only protein sequences. We also ran experiments on the human dataset with different similarity cutoffs between the train and test sets, and our method showed effective prediction ability even with a strict similarity cutoff. In the protein loop prediction project, we utilized attention-based encoder-decoder language models to predict protein loop inter-residue distances from protein loop sequences. We fed the model the loop sequences and received arrays of numbers representing the distances between each C_α pair in the loops. We utilized two different strategies to reconstruct the loops from the predicted distances. 
The first was to calculate the C_α coordinates from the predicted distances, and then apply a fast full-atom reconstruction method starting from the C_α coordinates to build the local loop structures. The local loop structure predictions from this method are very competitive, with low local RMSDs and the lowest standard deviations. The second was to integrate the predicted inter-residue distances as constraints into the de novo loop prediction method PLOP (Jacobson et al. 2004). We tested the loop reconstruction process on the 8-res and 12-res loop benchmark sets. This method performed best among the state-of-the-art methods compared, and incorporating the machine learning step decreased the computing time of the standalone PLOP program. Chemistry Machine learning Amino acid sequence Protein-protein interactions
1277	MACHINE LEARNING APPROACH FOR VEGETATION CLASSIFICATION USING UAS MULTISPECTRAL IMAGERY Unknown Date (has links) Vegetation monitoring plays a significant role in improving the quality of life above the earth's surface. However, vegetation resources management is challenging due to climate change, global warming, and urban development. The research aims to identify and extract vegetation communities for Jupiter Inlet Lighthouse Outstanding Natural Area (JILONA) using a developed Unmanned Aerial System (UAS) carrying the five-band RedEdge MicaSense multispectral sensor. UAS has a lot of potential for various applications as it provides high-resolution imagery at lower altitudes. In this study, spectral reflectance values for each vegetation species were collected using a spectroradiometer. Those values were correlated with the five-band UAS image values to assess the sensor's performance and to examine similarities and divergences in reflectance among vegetation species. Pixel-based and object-based classification methods were applied to 0.15 ft multispectral imagery to identify the vegetation classes. The supervised machine learning algorithms Support Vector Machine (SVM) and Random Forest (RF), together with topographical information, were used to produce thematic vegetation maps. The pixel-based procedure using the SVM algorithm achieved an overall accuracy and kappa coefficient above 90 percent. Both classification approaches produced aesthetic thematic vegetation maps. According to statistical cross-validation findings and visual interpretation of vegetation communities, the pixel-based classification method outperformed object-based classification. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2021. / FAU Electronic Theses and Dissertations Collection Vegetation classification Machine learning Multispectral imaging Unmanned aerial vehicles
1278	Efficient Machine Learning Algorithms for Identifying Risk Factors of Prostate and Breast Cancers among Males and Females Unknown Date (has links) One of the most common types of cancer among women is breast cancer. It is one of the diseases leading to a high number of deaths among women. Prostate cancer, in turn, is the second most frequent malignancy in men worldwide. The early detection of prostate cancer is fundamental to reducing mortality and increasing the survival rate. A comparison of six machine learning models (Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, k Nearest Neighbors, and Naïve Bayes) has been performed. This research aims to identify the most efficient machine learning algorithms for identifying the most significant risk factors of prostate and breast cancers. For this purpose, the National Health Interview Survey (NHIS) and Prostate, Lung, Colorectal, and Ovarian (PLCO) datasets are used. A comprehensive comparison of risk factors leading to these two crucial cancers can significantly impact early detection and progressive improvement in survival. / Includes bibliography. / Thesis (P.S.M.)--Florida Atlantic University, 2021. / FAU Electronic Theses and Dissertations Collection Machine learning Algorithms Cancer--Risk factors Breast--Cancer Prostate--Cancer
1279	Utilization of Machine Learning to Predict Bio-Oil and Biochar Yields from CoPyrolysis of Biomass with Waste Polymers Alabdrabalnabi, Aessa 11 1900 (has links) With 220 billion dry tons available, biomass is one of the world’s most abundant energy sources; it could also be a reliable energy source. As of 2019, the human population produces 275 million tons of plastic waste annually, which has to be managed to facilitate a circular carbon economy. Pyrolysis of biomass has emerged as an attractive option for converting waste into bioenergy. Because of its high oxygen content, acidity and viscosity, pyrolysis bio-oil is generally a low-quality product that requires upgrading before being used directly as a drop-in fuel and a fuel additive; this upgrade is achieved by co-pyrolysis of biomass with waste polymers. Since polymers are a rich source of hydrogen, the pyrolysis vapors are upgraded; the advantage of co-pyrolysis is that a separate hydroprocessing unit becomes unnecessary after process optimization. Machine learning is emerging as a growing field for predicting and optimizing energy-related processes. The process can be fine-tuned using models trained on existing experimental data. In this research, machine learning models were developed to predict product yields from the co-pyrolysis of biomass and polymers. Data from the literature on the co-pyrolysis of lignocellulosic biomass and polymers were used to build a tool to predict these outcomes. Machine learning algorithms were examined and trained on datasets acquired for biochar and bio-oil yields, with cross-validation and hyperparameter tuning, to fit the ultimate and proximate analysis of the reactants and the physical conditions of the reactions. XGBoost predicted biochar yield with an RMSE of 1.77 and an R² of 0.96, and a dense neural network (DNN) predicted bio-oil yield with an RMSE of 2.6 and an R² of 0.96. Proximate analysis features were a necessary addition to the bio-oil model. 
SHAP (SHapley Additive exPlanations) analysis for the DNN liquid model attributed mean absolute SHAP values of 0.11, 0.09, and 0.06 to biomass fixed carbon, biomass moisture, and biomass volatile matter, respectively. The machine learning models provided a convenient predictive tool for the co-pyrolysis reaction within the range of the models’ errors and training features. These models also offered insight into the development of municipal solid waste pyrolysis in a circular carbon economy. Machine Learning Co-pyrolysis Biochar Bio-oil Biomass Waste Polymers
1280	The prediction of condensation flow patterns by using artificial intelligence (AI) techniques Seal, Michael Kevin January 2021 (has links) Multiphase flow provides a solution to the high heat flux and precision required by modern-day gadgets and heat transfer devices as phase change processes make high heat transfer rates achievable at moderate temperature differences. An application of multiphase flow commonly used in industry is the condensation of refrigerants in inclined tubes. The identification of two-phase flow patterns, or flow regimes, is fundamental to the successful design and subsequent optimisation given that the heat transfer efficiency and pressure gradient are dependent on the flow structure of the working fluid. This study showed that with visualisation data and artificial neural networks (ANN), a machine could learn, and subsequently classify the separate flow patterns of condensation of R-134a refrigerant in inclined smooth tubes with more than 98% accuracy. The study considered 10 classes of flow pattern images acquired from previous experimental works that cover a wide range of flow conditions and the full range of tube inclination angles. Two types of classifiers were considered, namely multilayer perceptron (MLP) and convolutional neural networks (CNN). Although not the focus of this study, the use of a principal component analysis (PCA) allowed feature dimensionality reduction, dataset visualisation, and decreased associated computational cost when used together with multilayer perceptron neural networks. The superior two-dimensional spatial learning capability of convolutional neural networks allowed improved image classification and generalisation performance across all 10 flow pattern classes. In both cases, the classification was done sufficiently fast to enable real-time implementation in two-phase flow systems. 
The analysis sequence led to the development of a predictive tool for the classification of multiphase flow patterns in inclined tubes, with the goal that the features learnt through visualisation would apply to a broad range of flow conditions, fluids, tube geometries and orientations, and would even generalise well to identify adiabatic and boiling two-phase flow patterns. The method was validated by the prediction of flow pattern images found in the existing literature. / Dissertation (MEng)--University of Pretoria, 2021. / NRF / Mechanical and Aeronautical Engineering / MEng / Restricted convolutional neural network condensation flow pattern machine learning UCTD

Search results