1

Data Reduction Techniques in Classification Processes

Lozano Albalate, Maria Teresa 25 July 2007 (has links)
The learning process consists of different steps: building a Training Set (TS), training the system, testing its behaviour, and finally classifying unknown objects. When using a distance-based rule as a classifier, e.g. the 1-Nearest Neighbour (1-NN) rule, the first step (building a training set) includes editing and condensing the data. The main reason is that distance-based rules need a long time to classify each unlabelled sample x, since the distance from x to every point in the training set must be calculated. So, the more the training set is reduced, the shorter the time needed for each new classification. This thesis is mainly focused on building a training set from some already given data, and especially on condensing it; however, different classification techniques are also compared. The aim of any condensing technique is to obtain a reduced training set in order to spend as little time as possible on classification, without a significant loss in classification accuracy. Some new approaches to training set size reduction based on prototypes are presented. These schemes basically consist of defining a small number of prototypes that represent all the original instances. That includes approaches that select among the already existing examples (selective condensing algorithms) and those that generate new representatives (adaptive condensing algorithms). These new reduction techniques are experimentally compared to some traditional ones, for data represented in feature spaces. In order to test them, the classical 1-NN rule is applied here. However, other (fast) classifiers have also been considered, such as linear and quadratic classifiers constructed in dissimilarity spaces based on prototypes, in order to see how the editing and condensing concepts work for this different family of classifiers. Although the goal of the algorithms proposed in this thesis is to obtain a strongly reduced set of representatives, the performance is empirically evaluated over eleven real data sets by comparing not only the reduction rate but also the classification accuracy with those of other condensing techniques. Therefore, the ultimate aim is not only to find a strongly reduced set, but also a balanced one. Several ways to solve the same problem can be found. So, in the case of using a distance-based rule as a classifier, reducing the training set is not the only option: a different family of approaches consists of applying efficient search methods. Therefore, results obtained with the algorithms presented here are compared, in terms of classification accuracy and time, to several efficient search techniques. Finally, the main contributions of this thesis can be briefly summarised in four principal points. First, two selective algorithms based on the idea of surrounding neighbourhood; they obtain better results than the other algorithms presented here, as well as than other traditional schemes. Second, a generative approach based on mixtures of Gaussians; it gives better results in classification accuracy and size reduction than traditional adaptive algorithms, and results similar to those of LVQ. Third, it is shown that classification rules other than the 1-NN can be used, even leading to better results. And finally, it is concluded from the experiments carried out that, for some databases (such as the ones used here), the approaches presented execute the classification processes in less time than the efficient search techniques.
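To illustrate the kind of selective condensing the abstract refers to, here is a minimal sketch of Hart's Condensed Nearest Neighbour rule, one classical selective scheme: it keeps only the prototypes needed for the 1-NN rule to classify the remaining training points correctly. This is an illustration of the general idea, not the thesis's own algorithms; names and the single seed point are arbitrary choices.

```python
# Minimal sketch of selective condensing (Hart's CNN) for the 1-NN rule.
import numpy as np

def condense_1nn(X, y):
    """Return indices of a reduced prototype set for the 1-NN rule."""
    keep = [0]                      # seed the condensed set with one sample
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in keep:
                continue
            # classify sample i with 1-NN over the current condensed set
            d = np.linalg.norm(X[keep] - X[i], axis=1)
            pred = y[keep][np.argmin(d)]
            if pred != y[i]:        # misclassified -> add it as a prototype
                keep.append(i)
                changed = True
    return np.array(keep)

# Usage: X is an (n, d) feature matrix, y the class labels.
# idx = condense_1nn(X, y); X_reduced, y_reduced = X[idx], y[idx]
```

The reduced set trades a little accuracy for much faster classification, since each new sample is compared against far fewer prototypes.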
2

On the Sample Complexity of Privately Learning Gaussians and their Mixtures / Privately Learning Gaussians and their Mixtures

Aden-Ali, Ishaq January 2021 (has links)
Multivariate Gaussians: We provide sample complexity upper bounds for semi-agnostically learning multivariate Gaussians under the constraint of approximate differential privacy. These are the first finite sample upper bounds for general Gaussians which do not impose restrictions on the parameters of the distribution. Our bounds are near-optimal in the case when the covariance is known to be the identity, and conjectured to be near-optimal in the general case. From a technical standpoint, we provide analytic tools for arguing the existence of global "locally small" covers from local covers of the space. These are exploited using modifications of recent techniques for differentially private hypothesis selection. Mixtures of Gaussians: We consider the problem of learning mixtures of Gaussians under the constraint of approximate differential privacy. We provide the first sample complexity upper bounds for privately learning mixtures of unbounded axis-aligned (or even unbounded univariate) Gaussians. To prove our results, we design a new technique for privately learning mixture distributions. A class of distributions F is said to be list-decodable if there is an algorithm that, given "heavily corrupted" samples from a distribution f in F, outputs a list of distributions, H, such that one of the distributions in H approximates f. We show that if F is privately list-decodable then we can privately learn mixtures of distributions in F. Finally, we show axis-aligned Gaussian distributions are privately list-decodable, thereby proving mixtures of such distributions are privately learnable. / Thesis / Master of Science (MSc) / Is it possible to estimate an unknown probability distribution given random samples from it? This is a fundamental problem known as distribution learning (or density estimation) that has been studied by statisticians for decades, and in recent years has become a topic of interest for computer scientists. While distribution learning is a mature and well understood problem, in many cases the samples (or data) we observe may consist of sensitive information belonging to individuals, and well-known solutions may inadvertently result in the leakage of private information. In this thesis we study distribution learning under the assumption that the data is generated from high-dimensional Gaussians (or their mixtures), with the aim of understanding how many samples an algorithm needs before it can guarantee a good estimate. Furthermore, to protect against leakage of private information, we consider approaches that satisfy differential privacy, the gold standard for modern private data analysis.
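As a toy illustration of the kind of guarantee discussed here, the sketch below privately estimates a Gaussian mean with the textbook Gaussian mechanism: clip each sample, average, and add noise calibrated to (eps, delta)-differential privacy. The clipping radius R and the calibration formula are the standard recipe, not the estimators developed in the thesis, which work without such boundedness assumptions.

```python
# Toy (eps, delta)-DP mean estimate via the standard Gaussian mechanism.
import numpy as np

def private_mean(samples, R, eps, delta, rng=None):
    """Clip each sample to L2 norm R, average, and add calibrated Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = samples.shape
    norms = np.linalg.norm(samples, axis=1, keepdims=True)
    clipped = samples * np.minimum(1.0, R / np.maximum(norms, 1e-12))
    mean = clipped.mean(axis=0)
    sensitivity = 2.0 * R / n                                   # L2 sensitivity of the clipped mean
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return mean + rng.normal(scale=sigma, size=d)

# More samples (larger n) shrink the added noise; the sample complexity bounds
# above quantify how many samples are needed for a target accuracy.
```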
3

Laser Scanning Imaging for Increased Depth-Of-Focus

Shin, Dong-Ik 20 August 2002 (has links)
Over the decades, different techniques have been proposed to improve the depth-of-focus in optical microscopy. Common techniques like optical sectioning microscopy and scanning confocal microscopy have innate problems. By simply modifying the pupil function of a microscope imaging system, we can also extend the depth-of-focus. A scanning system with a thin annular pupil has a high depth-of-focus and can scan the whole object, but the output light is too dim to be detected well by a photodetector. In this thesis, we propose a scanning technique employing an optical heterodyne scanning system with a difference-of-Gaussians (DoG) pupil. The object is illuminated by a combined beam consisting of two Gaussian beams with different waists, frequencies, and amplitudes. Unlike the annular pupil system, this system does not block most of the light, and it can still achieve a high depth-of-focus. The main objective of the thesis is to extend the depth-of-focus using the proposed system. The depth-of-focus characteristics of the DoG pupil function are examined and compared with those of well-known functions such as the circular, annular, and Gaussian pupils. / Master of Science
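A short numerical sketch of the difference-of-Gaussians pupil described above: two Gaussian profiles with different waists and amplitudes are subtracted to form the pupil amplitude. The waist and amplitude values are illustrative placeholders, not the parameters used in the thesis.

```python
# Difference-of-Gaussians (DoG) pupil amplitude as a function of radius.
import numpy as np

def dog_pupil(r, w1=1.0, w2=0.5, a1=1.0, a2=0.8):
    """Two Gaussian profiles with different waists (w1, w2) and amplitudes (a1, a2)."""
    return a1 * np.exp(-(r / w1) ** 2) - a2 * np.exp(-(r / w2) ** 2)

r = np.linspace(0.0, 2.0, 512)   # radial coordinate, arbitrary units
p = dog_pupil(r)
# Compared with a thin annulus, this pupil passes far more light while still
# attenuating the central region that limits depth-of-focus.
```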
4

Information theoretic approach to tactile encoding and discrimination

Saal, Hannes January 2011 (has links)
The human sense of touch integrates feedback from a multitude of touch receptors, but how this information is represented in the neural responses such that it can be extracted quickly and reliably is still largely an open question. At the same time, dexterous robots equipped with touch sensors are becoming more common, necessitating better methods for representing sequentially updated information and new control strategies that aid in extracting relevant features for object manipulation from the data. This thesis uses information theoretic methods for two main aims: First, the neural code for tactile processing in humans is analyzed with respect to how much information is transmitted about tactile features. Second, machine learning approaches are used in order to influence both what data is gathered by a robot and how it is represented by maximizing information theoretic quantities. The first part of this thesis contains an information theoretic analysis of data recorded from primary tactile neurons in the human peripheral somatosensory system. We examine the differences in information content of two coding schemes, namely spike timing and spike counts, along with their spatial and temporal characteristics. It is found that estimates of the neurons’ information content based on the precise timing of spikes are considerably larger than those based on spike counts. Moreover, the information estimated based on the timing of the very first elicited spike is at least as high as that provided by spike counts, but in many cases considerably higher. This suggests that first spike latencies can serve as a powerful mechanism to transmit information quickly. However, in natural object manipulation tasks, different tactile impressions follow each other quickly, so we asked whether the hysteretic properties of the human fingertip affect neural responses and information transmission. We find that past stimuli affect both the precise timing of spikes and spike counts of peripheral tactile neurons, resulting in increased neural noise and decreased information about ongoing stimuli. Interestingly, the first spike latencies of a subset of afferents convey information primarily about past stimulation, hinting at a mechanism to resolve ambiguity resulting from mechanical skin properties. The second part of this thesis focuses on using machine learning approaches in a robotics context in order to influence both what data is gathered and how it is represented by maximizing information theoretic quantities. During robotic object manipulation, often not all relevant object features are known, but have to be acquired from sensor data. Touch is an inherently active process and the question arises of how to best control the robot’s movements so as to maximize incoming information about the features of interest. To this end, we develop a framework that uses active learning to help with the sequential gathering of data samples by finding highly informative actions. The viability of this approach is demonstrated on a robotic hand-arm setup, where the task involves shaking bottles of different liquids in order to determine the liquid’s viscosity from tactile feedback only. The shaking frequency and the rotation angle of shaking are optimized online. Additionally, we consider the problem of how to better represent complex probability distributions that are sequentially updated, as approaches for minimizing uncertainty depend on an accurate representation of that uncertainty.
A mixture of Gaussians representation is proposed and optimized using a deterministic sampling approach. We show how our method improves on similar approaches and demonstrate its usefulness in active learning scenarios. The results presented in this thesis highlight how information theory can provide a principled approach for both investigating how much information is contained in sensory data and suggesting ways for optimization, either by using better representations or actively influencing the environment.
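The comparison of coding schemes described above boils down to estimating how much information a response variable carries about the stimulus. Below is a minimal plug-in estimate of mutual information between a discrete stimulus label and a discretised response (a spike count or a binned first-spike latency). This is only a sketch of the style of comparison; the thesis uses more careful estimators and bias corrections.

```python
# Plug-in mutual information (in bits) between two discrete variables.
import numpy as np

def mutual_information(stimuli, responses):
    stimuli, responses = np.asarray(stimuli), np.asarray(responses)
    n = len(stimuli)
    joint = {}
    for s, r in zip(stimuli, responses):
        joint[(s, r)] = joint.get((s, r), 0) + 1
    ps = {s: np.mean(stimuli == s) for s in set(stimuli)}        # marginal over stimuli
    pr = {r: np.mean(responses == r) for r in set(responses)}    # marginal over responses
    mi = 0.0
    for (s, r), count in joint.items():
        p = count / n
        mi += p * np.log2(p / (ps[s] * pr[r]))
    return mi

# mi_counts  = mutual_information(stim_labels, spike_counts)
# mi_latency = mutual_information(stim_labels, latency_bins)
```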
5

Active Learning with Statistical Models

Cohn, David A., Ghahramani, Zoubin, Jordan, Michael I. 21 March 1995 (has links)
For many types of learners one can compute the statistically 'optimal' way to select data. We review how these techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
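The sketch below shows one of the two learners mentioned above, a mixture of Gaussians fitted to the joint input-output density and used for regression, together with a simple variance-based query rule. Note the hedge: the paper derives the statistically optimal (expected-variance-minimizing) query in closed form, whereas here we only pick the candidate input with the largest predictive variance, which is a cruder proxy; the toy data and component count are arbitrary.

```python
# Regression with a joint Gaussian mixture and a simple max-variance query rule.
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

def gmm_predict(gmm, x):
    """Conditional mean and variance of y given scalar x under a joint GMM on (x, y)."""
    means, covs, weights = gmm.means_, gmm.covariances_, gmm.weights_
    resp = np.array([w * norm.pdf(x, m[0], np.sqrt(c[0, 0]))
                     for w, m, c in zip(weights, means, covs)])
    resp /= resp.sum()
    cond_mean = np.array([m[1] + c[0, 1] / c[0, 0] * (x - m[0]) for m, c in zip(means, covs)])
    cond_var = np.array([c[1, 1] - c[0, 1] ** 2 / c[0, 0] for c in covs])
    mean = resp @ cond_mean
    var = resp @ (cond_var + cond_mean ** 2) - mean ** 2        # law of total variance
    return mean, var

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = np.sin(x) + 0.1 * rng.normal(size=200)
gmm = GaussianMixture(n_components=4, random_state=0).fit(np.column_stack([x, y]))

candidates = np.linspace(-3, 3, 50)
variances = [gmm_predict(gmm, c)[1] for c in candidates]
next_query = candidates[int(np.argmax(variances))]              # ask for a label here next
```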
6

Trasování pohybu objektů s pomocí počítačového vidění / Object tracking using computer vision

Klapal, Matěj January 2017 (has links)
This diploma thesis deals with the possibilities of tracking object movement using computer vision algorithms. The first chapters contain a review of methods used for background subtraction; basic detection approaches are also listed, and algorithms that allow tracking and movement prediction are discussed. The next part of this work describes the algorithms implemented in the resulting software and its graphical user interface. An evaluation and comparison of the original and modified algorithms is presented at the end of this text.
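For context, here is a hedged sketch of one common background-subtraction pipeline of the kind such a survey covers: OpenCV's mixture-of-Gaussians background model (MOG2) followed by simple contour-based detection. The video path, thresholds, and area cut-off are placeholders, and this is not claimed to be the pipeline implemented in the thesis.

```python
# Background subtraction with OpenCV MOG2 plus contour-based detection.
import cv2

cap = cv2.VideoCapture("input.avi")          # placeholder path
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                               # per-pixel foreground mask
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)   # drop shadow pixels (value 127)
    mask = cv2.medianBlur(mask, 5)                               # suppress isolated noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    detections = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 200]
    # detections would then be handed to a tracker for association and prediction
cap.release()
```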
7

Inversion of seismic attributes for petrophysical parameters and rock facies

Shahraeeni, Mohammad Sadegh January 2011 (has links)
Prediction of rock and fluid properties such as porosity, clay content, and water saturation is essential for exploration and development of hydrocarbon reservoirs. Rock and fluid property maps obtained from such predictions can be used for optimal selection of well locations for reservoir development and production enhancement. Seismic data are usually the only source of information available throughout a field that can be used to predict the 3D distribution of properties with appropriate spatial resolution. The main challenge in inferring properties from seismic data is the ambiguous nature of geophysical information. Therefore, any estimate of rock and fluid property maps derived from seismic data must also represent its associated uncertainty. In this study we develop a computationally efficient mathematical technique based on neural networks to integrate measured data and a priori information in order to reduce the uncertainty in rock and fluid properties in a reservoir. The post-inversion (a posteriori) information about rock and fluid properties is represented by the joint probability density function (PDF) of porosity, clay content, and water saturation. In this technique the a posteriori PDF is modelled by a weighted sum of Gaussian PDFs. A so-called mixture density network (MDN) estimates the weights, mean vector, and covariance matrix of the Gaussians given any measured data set. We solve several inverse problems with the MDN and compare results with the Monte Carlo (MC) sampling solution, and show that the MDN inversion technique provides a good estimate of the MC sampling solution. However, the computational cost of training and using the neural network is much lower than that of the MC sampling solution (by more than a factor of 10^4 in some cases). We also discuss the design, implementation, and training procedure of the MDN, and its limitations in estimating the solution of an inverse problem. In this thesis we focus on data from a deep offshore field in Africa. Our goal is to apply the MDN inversion technique to obtain maps of petrophysical properties (i.e., porosity, clay content, water saturation) and petrophysical facies from 3D seismic data. Petrophysical facies (i.e., non-reservoir, oil- and brine-saturated reservoir facies) are defined probabilistically based on geological information and values of the petrophysical parameters. First, we investigate the relationship (i.e., petrophysical forward function) between compressional- and shear-wave velocity and petrophysical parameters. The petrophysical forward function depends on different properties of rocks and varies from one rock type to another. Therefore, after acquisition of well logs or seismic data from a geological setting, the petrophysical forward function must be calibrated with data and observations. The uncertainty of the petrophysical forward function comes from uncertainty in measurements and uncertainty about the type of facies. We present a method to construct the petrophysical forward function with its associated uncertainty from both sources above. The results show that introducing uncertainty in facies improves the accuracy of the petrophysical forward function predictions. Then, we apply the MDN inversion method to solve four different petrophysical inverse problems. In particular, we invert P- and S-wave impedance logs for the joint PDF of porosity, clay content, and water saturation using a calibrated petrophysical forward function.
Results show that the posterior PDF of the model parameters provides reasonable estimates of measured well logs. Errors in the posterior PDF are mainly due to errors in the petrophysical forward function. Finally, we apply the MDN inversion method to predict 3D petrophysical properties from attributes of seismic data. In this application, the inversion objective is to estimate the joint PDF of porosity, clay content, and water saturation at each point in the reservoir, from the compressional- and shear-wave impedance obtained from the inversion of AVO seismic data. Uncertainty in the a posteriori PDF of the model parameters is due to different sources such as variations in effective pressure, bulk modulus and density of hydrocarbon, uncertainty of the petrophysical forward function, and random noise in recorded data. Results show that the standard deviations of all model parameters are reduced after inversion, which shows that the inversion process provides information about all parameters. We also applied the result of the petrophysical inversion to estimate the 3D probability maps of non-reservoir facies, and brine- and oil-saturated reservoir facies. The accuracy of the predicted oil-saturated facies at the well location is good, but due to errors in the petrophysical inversion the predicted non-reservoir and brine-saturated facies are ambiguous. Although the accuracy of results may vary due to different sources of error in different applications, the fast, probabilistic method of solving non-linear inverse problems developed in this study can be applied to invert well logs and large seismic data sets for petrophysical parameters in different applications.
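To make the mixture density network idea concrete, here is a hedged sketch of an MDN head: a small MLP maps the input (e.g. seismic attributes or impedances) to the weights, means, and variances of a Gaussian mixture over the target parameters. Diagonal covariances and the layer sizes are simplifications for brevity, not the thesis's architecture, which estimates full covariance matrices.

```python
# Mixture density network head with a Gaussian-mixture negative log-likelihood.
import math
import torch
import torch.nn as nn

class MDN(nn.Module):
    def __init__(self, in_dim, out_dim, n_components, hidden=64):
        super().__init__()
        self.out_dim, self.k = out_dim, n_components
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh())
        self.pi = nn.Linear(hidden, n_components)                 # mixture weights
        self.mu = nn.Linear(hidden, n_components * out_dim)       # component means
        self.log_var = nn.Linear(hidden, n_components * out_dim)  # component variances

    def forward(self, x):
        h = self.body(x)
        pi = torch.softmax(self.pi(h), dim=-1)
        mu = self.mu(h).view(-1, self.k, self.out_dim)
        var = torch.exp(self.log_var(h)).view(-1, self.k, self.out_dim)
        return pi, mu, var

def mdn_nll(pi, mu, var, y):
    """Negative log-likelihood of targets y under the predicted mixture."""
    y = y.unsqueeze(1)                                            # (batch, 1, out_dim)
    log_comp = -0.5 * (((y - mu) ** 2) / var + torch.log(var) + math.log(2 * math.pi)).sum(-1)
    return -torch.logsumexp(torch.log(pi) + log_comp, dim=-1).mean()

# e.g. model = MDN(in_dim=2, out_dim=3, n_components=5) for mapping P- and
# S-impedance to (porosity, clay content, water saturation).
```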
8

Multiple Hypothesis Tracking For Multiple Visual Targets

Turker, Burcu 01 April 2010 (has links) (PDF)
The visual target tracking problem consists of two topics: obtaining targets from camera measurements, and tracking them. Even though it has been studied for more than 30 years, some problems are still not completely solved. Especially in the case of multiple targets, the association of measurements to targets and the creation of new targets and deletion of old ones are among those. Moreover, it is very important to deal suitably with occlusion and with crossing targets. We believe that a slightly modified version of multiple hypothesis tracking can deal with most of the aforementioned problems with sufficient success. Distance, track size, track color, gate size and track history are used as parameters to evaluate the hypotheses generated for the measurement-to-track association problem, whereas size and color are used as parameters for the occlusion problem. The overall tracker has been fine-tuned over some scenarios, and it has been observed that it performs well over the testing scenarios as well. Furthermore, the performance of the tracker is analyzed with respect to those parameters in both association and occlusion-handling situations.
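As a rough illustration of the kind of hypothesis scoring the abstract describes, the sketch below gates a measurement by its distance to the predicted track position and then scores the association from distance, size, and color similarity. The weights, gate threshold, and dictionary fields are illustrative placeholders, not the thesis's actual scoring function.

```python
# Toy gating and scoring of a measurement-to-track association hypothesis.
import numpy as np

def score_association(track, meas, gate=3.0, w_dist=0.5, w_size=0.25, w_color=0.25):
    """Return a score in [0, 1], or None if the measurement falls outside the gate."""
    diff = meas["position"] - track["predicted_position"]
    d2 = diff @ np.linalg.inv(track["innovation_cov"]) @ diff    # squared Mahalanobis distance
    if d2 > gate ** 2:
        return None                                              # outside the gate: not a candidate
    dist_score = np.exp(-0.5 * d2)
    size_score = min(track["size"], meas["size"]) / max(track["size"], meas["size"])
    color_score = np.minimum(track["color_hist"], meas["color_hist"]).sum()  # histogram intersection
    return w_dist * dist_score + w_size * size_score + w_color * color_score
```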
9

Ensemble registration : combining groupwise registration and segmentation

Purwani, Sri January 2016 (has links)
Registration of a group of images generally gives only a pointwise, dense correspondence defined over the whole image plane or volume, without any specific description of the common structure that exists in every image. Furthermore, identifying tissue classes and structures that are significant across the group is often required for analysis, as well as the correspondence. The overall aim here is instead to perform registration, segmentation, and modelling simultaneously, so that the registration can assist the segmentation, and vice versa. However, structural information does play a role in conventional registration, in that if the registration is successful, structures would be expected to be aligned to some extent. Hence, we perform initial experiments to investigate whether there is explicit structural information about the optimum present in the shape of the registration objective function. We perturbed one image locally with a diffeomorphism, and found interesting structure in the shape of the quality-of-fit function. Then, we proceed to add explicit structural information into the registration framework, using various types of structural information derived from the original intensity images. For the case of MR brain images, we augment each intensity image with its own set of tissue-fraction images, plus intensity-gradient images, which form an image ensemble for each example. Then, we perform groupwise registration using these ensembles of images. We apply the method to four different real-world datasets, for which ground-truth annotation is available. It is shown that the method can give a greater than 25% improvement on the three difficult datasets, compared to using intensity-based registration alone. On the easier dataset, it improves upon intensity-based registration, and achieves results comparable with the previous method.
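A minimal sketch of assembling the image ensemble described above: the intensity image is stacked with its gradient-magnitude image and a set of tissue-fraction maps, and all channels are then registered jointly. The tissue fractions are assumed to come from a prior soft segmentation, and the final registration call is hypothetical; names are illustrative only.

```python
# Build a per-example ensemble of intensity, gradient, and tissue-fraction channels.
import numpy as np

def build_ensemble(intensity, tissue_fractions):
    """Stack intensity, gradient magnitude, and tissue-fraction images as channels."""
    gy, gx = np.gradient(intensity.astype(float))
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)
    channels = [intensity.astype(float), grad_mag] + [tf.astype(float) for tf in tissue_fractions]
    return np.stack(channels, axis=0)            # (n_channels, H, W)

# ensembles = [build_ensemble(img, fractions) for img, fractions in dataset]
# groupwise_register(ensembles)                  # hypothetical registration call
```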
