121 |
On Platforms and Algorithms for Human-Centric Sensing / Shaabana, Ala / 05 1900
The decreasing cost of chip manufacturing has greatly increased the distribution and availability of sensors, to the point that they have become embedded in virtually all physical objects and are able to send and receive data -- giving rise to the Internet of Things (IoT). These embedded sensors are typically endowed with intelligent algorithms to transform information into real-time actionable insights. Recently, humans have taken on a larger role in the information-to-action path with the emergence of human-centric sensing. This has made it possible to observe various processes and infer information in complex personal and social spaces that may not be obtainable otherwise. However, a caveat of human-centric sensing is the high cost associated with high-precision systems.
In this dissertation, we present two low-cost, high-performing end-to-end solutions for human-centric sensing of physiological phenomena. Additionally, we present a post-hoc, data-driven sensor synchronization framework that exploits independent, omnipresent information in the data to synchronize multiple sensors. We first propose XTREMIS -- a low-cost, portable ECG/EMG/EEG platform with a small form factor and a sample rate comparable to research-grade EMG machines. We evaluate XTREMIS at the signal level, and also use it in tandem with a Gaussian Mixture Hidden Markov Model to detect finger movements in a rapid, fine-grained activity: typing on a keyboard. Experiments show that not only does XTREMIS functionally outperform current wearable technologies, but its signal quality is also high enough to achieve classification accuracy similar to research-grade EMG machines, making it a suitable platform for further research. We then present SiCILIA -- a platform that extracts physical and personal variables of a user's thermal environment to infer their clothing insulation. An individual's thermal sensation is directly correlated with the amount of clothing they are wearing, and a person's thermal comfort is crucial to their productivity, physical wellness, and morale. It is therefore important to be aware of actions such as adding or removing clothing, as they are indicators of current thermal sensation. The proposed inference algorithm builds upon theories of body heat transfer and is corroborated by empirical data. SiCILIA was tested in a vehicle with a passenger-controlled HVAC system. Experimental results show that the algorithm is capable of accurately predicting an occupant's thermal insulation with a low mean prediction error. In the third part of the thesis we present CRONOS -- a sensor data synchronization framework that takes advantage of events observed by two or more sensors to synchronize their internal clocks using only their data streams. Experimental results on pairwise and multi-sensor synchronization show a significant improvement in total drift and a very low mean absolute synchronization error for multi-sensor synchronization. / Thesis / Doctor of Philosophy (PhD)
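As an illustration of the classification step mentioned above, the following is a minimal sketch (not the thesis's actual pipeline) of per-class GMM-HMM scoring for windowed EMG features, using the hmmlearn library; the feature layout, class labels, and hyperparameters are all assumptions.

```python
# Minimal sketch: one GMM-HMM per finger/key class; classify a window by
# maximum log-likelihood. Feature extraction is assumed to happen upstream.
import numpy as np
from hmmlearn.hmm import GMMHMM  # pip install hmmlearn

def train_models(train_windows, n_states=4, n_mix=2):
    """train_windows: dict mapping class label -> list of (T_i, d) feature
    arrays, e.g. per-channel RMS/zero-crossing features (hypothetical)."""
    models = {}
    for label, windows in train_windows.items():
        X = np.vstack(windows)                    # concatenate all sequences
        lengths = [w.shape[0] for w in windows]   # hmmlearn needs sequence lengths
        m = GMMHMM(n_components=n_states, n_mix=n_mix,
                   covariance_type="diag", n_iter=100)
        m.fit(X, lengths)
        models[label] = m
    return models

def classify(models, window):
    """Assign a window to the class whose model scores it highest."""
    return max(models, key=lambda label: models[label].score(window))
```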
|
122 |
Clustering Gaussian Processes: A Modified EM Algorithm for Functional Data Analysis with Application to British Columbia Coastal Rainfall Patterns / Paton, Forrest / January 2018
Functional data analysis is a statistical framework where data are assumed to follow some functional form. This method of analysis is commonly applied to time series data, where time, measured continuously or in discrete intervals, serves as the location for a function's value. In this thesis Gaussian processes, a generalization of the multivariate normal distribution to function space, are used. When multiple processes are observed on a comparable interval, clustering them into sub-populations can provide significant insights. A modified EM algorithm is developed for clustering processes. The model presented clusters processes based on how similar their underlying covariance kernels are. In other words, cluster formation arises from modelling correlation between inputs (as opposed to magnitude between process values). The method is applied to both simulated data and British Columbia coastal rainfall patterns. Results show clustering yearly processes can accurately classify extreme weather patterns. / Thesis / Master of Science (MSc)
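A toy sketch of the kind of kernel-based EM clustering described above, assuming zero-mean curves on a shared grid and clusters that differ only in an RBF length-scale; the grid-search M-step is a simplification of the thesis's estimator, not a reproduction of it.

```python
# EM over GP marginal likelihoods: E-step computes responsibilities from each
# cluster's covariance, M-step updates mixing weights and length-scales.
import numpy as np
from scipy.stats import multivariate_normal

def rbf_kernel(t, ell, noise=1e-2):
    d2 = (t[:, None] - t[None, :]) ** 2
    return np.exp(-0.5 * d2 / ell**2) + noise * np.eye(len(t))

def em_cluster_gps(Y, t, ells_init, grid, n_iter=50):
    """Y: (N, n) curves observed at grid t; ells_init: one length-scale per cluster."""
    K = len(ells_init)
    ells = list(ells_init)
    pis = np.full(K, 1.0 / K)
    zero = np.zeros(len(t))
    for _ in range(n_iter):
        # E-step: responsibilities from GP marginal log-likelihoods
        logp = np.stack([multivariate_normal.logpdf(Y, mean=zero,
                                                    cov=rbf_kernel(t, ells[k]))
                         + np.log(pis[k]) for k in range(K)], axis=1)
        logp -= logp.max(axis=1, keepdims=True)       # numerical stability
        r = np.exp(logp); r /= r.sum(axis=1, keepdims=True)
        # M-step: mixing weights, then length-scales by weighted grid search
        pis = r.mean(axis=0)
        for k in range(K):
            scores = [(r[:, k] * multivariate_normal.logpdf(
                           Y, mean=zero, cov=rbf_kernel(t, ell))).sum()
                      for ell in grid]
            ells[k] = grid[int(np.argmax(scores))]
    return r.argmax(axis=1), ells
```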
|
123 |
Towards Automating Code Reviews / Fadhel, Muntazir / January 2020
Existing software engineering tools have proved useful in automating some aspects of the code review process, from uncovering defects to refactoring code. However, given that software teams still spend large amounts of time performing code reviews despite the use of such tools, much more research remains to be carried out in this area. This dissertation presents two major contributions to this field. First, we perform a text classification experiment over thirty thousand GitHub review comments to understand what code reviewers typically discuss in reviews. Next, in an attempt to offer an innovative, data-driven approach to automating code reviews, we leverage probabilistic models of source code and graph embedding techniques to perform human-like code inspections. Our experimental results indicate that the proposed algorithm is able to emulate human-like code inspection behaviour in code reviews with a macro F1-score of 62%, representing a notable contribution to the relatively unexplored research domain of automated code review tools. / Thesis / Master of Applied Science (MASc)
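For the text-classification experiment mentioned above, a standard baseline might look like the sketch below; the TF-IDF features, logistic regression model, and category labels are assumptions for illustration, not the dissertation's actual setup.

```python
# Baseline: TF-IDF features + logistic regression, scored with macro F1
# (the same metric quoted in the abstract).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def macro_f1_baseline(comments, labels):
    """comments: list of review-comment strings; labels: hypothetical
    categories such as "style", "defect", "question"."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        comments, labels, test_size=0.2, random_state=0, stratify=labels)
    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # word + bigram features
        LogisticRegression(max_iter=1000))
    clf.fit(X_tr, y_tr)
    return f1_score(y_te, clf.predict(X_te), average="macro")
```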
|
124 |
Application of deep learning in geotechnical engineering with a focus on constitutive models / Motevali Haghighi, Ehsan / January 2024
Constitutive models, which provide a relationship between stress and strain to predict the response of a material to external stimuli, are essential to solving boundary value problems. Constitutive models were traditionally developed by selecting analytical relationships whose parameters were obtained from experimental observations. Due to the limitations of traditional experimental setups, constitutive models were initially limited to certain loading and boundary conditions. With the advent of new experimental setups such as digital image correlation, X-ray computed tomography, digital volume correlation, and computational methods, the potential to obtain large stress-strain databases that account for complex loading and boundary conditions has significantly increased. Moreover, advances in statistical modeling, specifically deep learning methods, along with computing capabilities, have provided new tools for extracting insights and patterns from datasets. As such, deep learning methods have improved the accuracy of traditional constitutive models by either replacing or complementing them. Although deep learning-derived constitutive models have been shown to yield cohesive and complete frameworks, the reliability of their predictions is linked to the quality of the training dataset. Accordingly, the objectives of this study are to identify methods and test their effectiveness in evaluating the quality, completeness, and consistency of databases for developing stress-strain relationships via deep learning. The study includes linear and nonlinear elastic constitutive models, domain heterogeneity, and load path dependency, along with different machine learning techniques. Complete, biased, and distorted stress-strain datasets were constructed to evaluate the effectiveness of various methods in determining the quality of the dataset. Lastly, deep learning constitutive model predictions were assessed using simulations of well-documented geotechnical engineering problems. / Thesis / Doctor of Philosophy (PhD)
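To make the deep-learning constitutive-model idea concrete, here is a minimal PyTorch sketch of a strain-to-stress surrogate trained on synthetic isotropic linear-elastic data; the network size, material constants, and Voigt-notation dataset are illustrative assumptions, not the thesis's models or data.

```python
# A feedforward surrogate sigma = f(eps) in Voigt notation, trained on a
# synthetic linear-elastic "database"; a biased or distorted version of this
# dataset is the kind of input the thesis's quality checks would flag.
import torch
import torch.nn as nn

class StressNet(nn.Module):
    def __init__(self, n_in=6, n_out=6, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_out))
    def forward(self, eps):
        return self.net(eps)

torch.manual_seed(0)
E, nu = 30e3, 0.2                                  # hypothetical stiffness params
lam, mu = E * nu / ((1 + nu) * (1 - 2 * nu)), E / (2 * (1 + nu))
C = torch.zeros(6, 6)
C[:3, :3] = lam
idx = torch.arange(3); C[idx, idx] += 2 * mu       # normal components
idx2 = torch.arange(3, 6); C[idx2, idx2] = mu      # shear (engineering strain)
eps = 1e-3 * torch.randn(2048, 6)
sig = eps @ C.T

model, loss_fn = StressNet(), nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):                              # simple full-batch training
    opt.zero_grad()
    loss = loss_fn(model(eps), sig)
    loss.backward()
    opt.step()
```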
|
125 |
Essays in Econometrics and Machine Learning / Yao, Qingsong / January 2024
Thesis advisor: Shakeeb Khan / Thesis advisor: Zhijie Xiao / This dissertation consists of three chapters demonstrating how current econometric problems can be solved using machine learning techniques. In the first chapter, I propose new approaches to estimating large-dimensional monotone index models. This class of models has been popular in the applied and theoretical econometrics literatures, as it includes discrete choice, nonparametric transformation, and duration models. A main advantage of my approach is computational. For instance, rank estimation procedures such as those proposed in Han (1987) and Cavanagh and Sherman (1998), which optimize a nonsmooth, non-convex objective function, are difficult to use with more than a few regressors, which limits their use with economic data sets. For such monotone index models with increasing dimension, we propose a new class of estimators based on batched gradient descent (BGD) involving nonparametric methods such as kernel or sieve estimation, and study their asymptotic properties. The BGD algorithm uses an iterative procedure whose key step exploits a strictly convex objective function, resulting in computational advantages. A contribution of my approach is that the model is large-dimensional and semiparametric, and so does not require parametric distributional assumptions. The second chapter studies the estimation of semiparametric monotone index models when the sample size n is extremely large and conventional approaches fail due to devastating computational burdens. Motivated by the mini-batch gradient descent (MBGD) algorithm that is widely used as a stochastic optimization tool in machine learning, this chapter proposes a novel subsample- and iteration-based estimation procedure. In particular, starting from any initial guess of the true parameter, the estimator is progressively updated using a sequence of subsamples randomly drawn from the data set, each much smaller than n. The update is based on the gradient of a well-chosen loss function, where the nonparametric component in the model is replaced with its Nadaraya-Watson kernel estimator, also constructed from the random subsamples. The proposed algorithm essentially generalizes the MBGD algorithm to the semiparametric setup. Since the new method uses only a subsample to perform the Nadaraya-Watson kernel estimation and conduct the update, compared with the full-sample-based iterative method it reduces the computational time by a factor of roughly n if the subsample size and the kernel function are chosen properly, and so can be easily applied when the sample size n is large. Moreover, this chapter shows that if averages are further taken across the estimators produced during the iterations, the difference between the averaged estimator and the full-sample-based estimator will be 1/√n-trivial. Consequently, the averaged estimator is 1/√n-consistent and asymptotically normally distributed. In other words, the new estimator substantially improves the computational speed while maintaining estimation accuracy. Finally, extensive Monte Carlo experiments and real data analysis illustrate the excellent computational efficiency of the novel algorithm when the sample size is extremely large. The third chapter studies a robust inference procedure for treatment effects in panel data with flexible relationships across units via the random forest method. The key contribution of this chapter is twofold. First, it proposes a direct construction of prediction intervals for the treatment effect by exploiting information in the joint distribution of the cross-sectional units to construct counterfactuals using random forests. In particular, it proposes a Quantile Control Method (QCM) using the Quantile Random Forest (QRF) to accommodate flexible cross-sectional structure as well as high dimensionality. Second, it establishes the asymptotic consistency of QRF under the panel/time series setup with high dimensionality, which is of theoretical interest in its own right. In addition, Monte Carlo simulations show that prediction intervals via the QCM have excellent coverage probability for the treatment effects compared to existing methods in the literature, and are robust to heteroskedasticity, autocorrelation, and various types of model misspecification. Finally, an empirical application studying the effect of economic integration between Hong Kong and mainland China on Hong Kong's economy is conducted to highlight the potential of the proposed method. / Thesis (PhD) — Boston College, 2024. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Economics.
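The subsample-and-iterate scheme of the second chapter can be caricatured as follows. This sketch uses a finite-difference gradient and ad hoc tuning constants in place of the chapter's analytic update and theory-driven choices, so it illustrates the structure (mini-batches, a Nadaraya-Watson plug-in, and iterate averaging) rather than the proposed estimator itself.

```python
# Monotone index model Y = g(X'b) + e: each step draws a mini-batch, rebuilds
# a Nadaraya-Watson estimate of g on that batch, and descends a squared loss.
import numpy as np

def nw_fit(idx_train, y_train, h):
    """Return a Nadaraya-Watson regression function built on (index, y) pairs."""
    def g(u):
        w = np.exp(-0.5 * ((u[:, None] - idx_train[None, :]) / h) ** 2)
        return (w @ y_train) / np.maximum(w.sum(axis=1), 1e-12)
    return g

def batch_loss(beta, Xb, yb, h):
    g = nw_fit(Xb @ beta, yb, h)          # plug-in estimate of g on the batch
    return np.mean((yb - g(Xb @ beta)) ** 2)

def mbgd_estimate(X, y, batch=256, steps=500, lr=0.1, h=0.2, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d); beta[0] = 1.0     # scale normalization: fix b_1 = 1
    avg = np.zeros(d)
    for t in range(steps):
        rows = rng.choice(n, size=batch, replace=False)
        Xb, yb = X[rows], y[rows]
        grad = np.zeros(d)
        base = batch_loss(beta, Xb, yb, h)
        for j in range(1, d):             # finite-difference gradient (free coords)
            e = np.zeros(d); e[j] = 1e-4
            grad[j] = (batch_loss(beta + e, Xb, yb, h) - base) / 1e-4
        beta = beta - lr * grad
        avg += (beta - avg) / (t + 1)     # average iterates for accuracy
    return avg
```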
|
126 |
Advances to Convolutional Neural Network Architectures for Prediction and Classification with Applications in the First Dimensional Space / Kim, Hae Jin / 08 1900
In the vast field of signal processing, machine learning is rapidly expanding into all realms. As a constituent of this expansion, this thesis presents contributive work on advancements in machine learning algorithms, building on the shoulders of giants. The first chapter of this thesis contains enhancements to a CNN (convolutional neural network) for better classification of heartbeat arrhythmia. The network goes through a two-stage development: the first being augmentations to the network, and the second being the implementation of dropout. Chapter 2 involves the combination of CNN and LSTM (long short-term memory) networks for the task of short-term energy-use regression. Exploiting the benefits of two of the most powerful neural network architectures, a novel combined network is created to effectively predict future energy use. The final section concludes this work with directions for future work.
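A minimal PyTorch sketch of the CNN-LSTM combination described in Chapter 2; the layer sizes, single input feature, and one-step-ahead target are assumptions rather than the thesis's architecture.

```python
# Conv1d extracts local temporal features; the LSTM models their sequence;
# a linear head regresses the next energy-use value from the last state.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, n_features=1, conv_ch=32, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, conv_ch, kernel_size=3, padding=1),
            nn.ReLU())
        self.lstm = nn.LSTM(conv_ch, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, seq_len, n_features)
        z = self.conv(x.transpose(1, 2))   # Conv1d expects (batch, ch, seq)
        out, _ = self.lstm(z.transpose(1, 2))
        return self.head(out[:, -1])       # predict the next value

model = CNNLSTM()
x = torch.randn(8, 48, 1)                  # e.g. 48 past readings per sample
print(model(x).shape)                      # torch.Size([8, 1])
```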
|
127 |
Flexible Sparse Learning of Feature SubspacesMa, Yuting January 2017 (has links)
It is widely observed that the performance of many traditional statistical learning methods degenerates when confronted with high-dimensional data. One promising approach to prevent this downfall is to identify the intrinsic low-dimensional spaces where the true signals are embedded and to pursue the learning process on these informative feature subspaces. This thesis focuses on the development of flexible sparse learning methods of feature subspaces for classification. Motivated by the success of some existing methods, we aim at learning informative feature subspaces for high-dimensional data of complex nature with better flexibility, sparsity and scalability.
The first part of this thesis is inspired by the success of distance metric learning in casting flexible feature transformations by utilizing local information. We propose a nonlinear sparse metric learning algorithm, named sDist, that uses a boosting-based nonparametric solution to address the metric learning problem for high-dimensional data. Leveraging a rank-one decomposition of the symmetric positive semi-definite weight matrix of the Mahalanobis distance metric, we restructure a hard global optimization problem into a forward stage-wise learning of weak learners through a gradient boosting algorithm. In each step, the algorithm progressively learns a sparse rank-one update of the weight matrix by imposing an L1 regularization. Nonlinear feature mappings are adaptively learned by a hierarchical expansion of interactions integrated within the boosting framework. Meanwhile, an early stopping rule is imposed to control the overall complexity of the learned metric. As a result, without relying on computationally intensive tools, our approach automatically guarantees three desirable properties of the final metric: positive semi-definiteness, low rank and element-wise sparsity. Numerical experiments show that our learning model compares favorably with state-of-the-art methods in the current metric learning literature.
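Schematically, one sDist-style boosting stage might look like the numpy sketch below, where the weak learner is a soft-thresholded leading eigenvector of the symmetrized negative gradient; the direction choice, step size, and threshold are placeholders of my own, since the actual weak-learner fit is not specified here.

```python
# One stage-wise rank-one update W <- W + rho * xi xi^T with xi sparsified by
# soft-thresholding: each such step preserves positive semi-definiteness,
# keeps the rank increment at one, and yields element-wise sparsity in xi.
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def rank_one_boost_step(W, G, rho=0.1, lam=0.05):
    """W: current PSD weight matrix; G: gradient of the loss w.r.t. W."""
    vals, vecs = np.linalg.eigh(-(G + G.T) / 2)   # steepest-descent directions
    xi = soft_threshold(vecs[:, -1], lam)          # sparse rank-one direction
    if np.linalg.norm(xi) > 0:
        xi /= np.linalg.norm(xi)
        W = W + rho * np.outer(xi, xi)
    return W

# Toy usage with a stand-in gradient matrix
W = np.zeros((5, 5))
G = np.random.default_rng(0).standard_normal((5, 5))
W = rank_one_boost_step(W, G)
```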
The second problem arises from the observation of high instability and feature selection bias when applying online methods to highly sparse data of large dimensionality in sparse learning problems. Due to the heterogeneity in feature sparsity, existing truncation-based methods incur slow convergence and high variance. To mitigate this problem, we introduce a stabilized truncated stochastic gradient descent algorithm. We employ a soft-thresholding scheme on the weight vector in which the imposed shrinkage is adaptive to the amount of information available in each feature. The variability in the resulting sparse weight vector is further controlled by stability selection integrated with the informative truncation. To facilitate better convergence, we adopt an annealing strategy on the truncation rate. We show that, when the true parameter space is of low dimension, the stabilization with annealing strategy helps to achieve a lower regret bound in expectation.
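The following sketch illustrates the flavor of the stabilized truncated SGD just described: periodic soft-thresholding whose strength anneals over time and adapts to how often each feature has carried information. The exact schedules and the squared loss are my assumptions, not the thesis's specification.

```python
# Truncated SGD with per-feature adaptive shrinkage and an annealed rate.
import numpy as np

def soft_threshold(w, lam):
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def truncated_sgd(X, y, lam0=0.05, eta0=0.5, K=10, seed=0):
    """Squared-loss SGD; shrinkage is applied every K steps."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    counts = np.zeros(d)                       # how often each feature was active
    for t in range(1, n + 1):
        i = rng.integers(n)
        xi = X[i]
        counts += (xi != 0)
        g = (xi @ w - y[i]) * xi               # gradient of 0.5*(x'w - y)^2
        w -= (eta0 / np.sqrt(t)) * g
        if t % K == 0:
            # shrink less where more information has accrued; anneal overall
            lam = lam0 / np.sqrt(t) / np.sqrt(1.0 + counts)
            w = soft_threshold(w, lam)
    return w
```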
|
128 |
A Study of Real Time Search in Flood Scenes from UAV Videos Using Deep Learning Techniques / Gagandeep Singh Khanuja / 17 October 2019
Following a natural disaster, one of the most important factors influencing a person's chances of survival is the time within which they are rescued. Traditional search operations involving dogs, ground robots, and humanitarian intervention are time-intensive and can be a major bottleneck. The main aim of these operations is to rescue victims without critical delay, in the shortest time possible, which can be realized in real time by using UAVs. With advancements in computational devices and the ability to learn from complex data, deep learning can be leveraged in real-time environments for search and rescue operations. This research aims to improve on traditional search operations by using deep learning for real-time object detection and photogrammetry for precise geo-location mapping of objects (person, car) in real time. To do so, pre-trained algorithms such as Mask R-CNN, SSD300, and YOLOv3, as well as trained algorithms such as YOLOv3, have been deployed and their results compared as means of addressing the search operation in real time.
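As an illustration of the frame-level detection component, here is a minimal sketch using an off-the-shelf pretrained detector from torchvision, standing in for the Mask R-CNN/SSD300/YOLOv3 models benchmarked in the thesis; the score threshold and synthetic frame are assumptions.

```python
# Filter a pretrained detector's output down to the two classes of interest
# for search and rescue: person and car (COCO category ids 1 and 3).
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

COCO_PERSON, COCO_CAR = 1, 3

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

@torch.no_grad()
def detect(frame, score_thresh=0.6):
    """frame: float tensor (3, H, W) in [0, 1]; returns person/car boxes."""
    out = model([frame])[0]
    keep = (out["scores"] > score_thresh) & (
        (out["labels"] == COCO_PERSON) | (out["labels"] == COCO_CAR))
    return out["boxes"][keep], out["labels"][keep]

boxes, labels = detect(torch.rand(3, 480, 640))  # stand-in for a UAV frame
```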
|
129 |
Graph based semi-supervised learning in computer vision / Huang, Ning / January 2009
Thesis (Ph. D.)--Rutgers University, 2009. / "Graduate Program in Biomedical Engineering." Includes bibliographical references (p. 54-55).
|
130 |
Kernel methods in supervised and unsupervised learning / Tsang, Wai-Hung / January 2003
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2003. / Includes bibliographical references (leaves 46-49). Also available in electronic version. Access restricted to campus users.
|