31

Voronoi-based nearest neighbor search for multi-dimensional uncertain databases

Zhang, Peiwu., 张培武. January 2012 (has links)
In Voronoi-based nearest neighbor search, the Voronoi cell of every point p in a database can be used to check whether p is the closest to some query point q. We extend the notion of Voronoi cells to support uncertain objects, whose attribute values are inexact. Particularly, we propose the Possible Voronoi cell (or PV-cell). A PV-cell of a multi-dimensional uncertain object o is a region R, such that for any point p ∈ R, o may be the nearest neighbor of p. If the PV-cells of all objects in a database S are known, they can be used to identify objects that have a chance to be the nearest neighbor of q. However, there is no efficient algorithm for computing an exact PV-cell. We hence study how to derive an axis-parallel hyper-rectangle (called the Uncertain Bounding Rectangle, or UBR) that tightly contains a PV-cell. We further develop the PV-index, a structure that stores UBRs, to evaluate probabilistic nearest neighbor queries over uncertain data. An advantage of the PV-index is that upon updates on S, it can be incrementally updated. Extensive experiments on both synthetic and real datasets are carried out to validate the performance of the PV-index. / published_or_final_version / Computer Science / Master / Master of Philosophy
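A minimal sketch of the kind of candidate filtering that axis-parallel bounding rectangles enable, in the spirit of the UBRs described above; the rectangle coordinates and object names are hypothetical, and the thesis's actual PV-index construction is not reproduced here:

```python
# Illustrative sketch only: filtering nearest-neighbour candidates with
# axis-parallel bounding rectangles (UBR-style). Rectangles and names are
# hypothetical placeholders.

def contains(rect, q):
    """rect = (lower, upper), each a tuple of per-dimension bounds."""
    lower, upper = rect
    return all(lo <= x <= hi for lo, x, hi in zip(lower, q, upper))

def candidate_nearest_neighbours(ubrs, q):
    """Return objects whose UBR contains q; only these can be the NN of q."""
    return [obj for obj, rect in ubrs.items() if contains(rect, q)]

# Hypothetical 2-D example: each uncertain object maps to a rectangle
# enclosing its possible Voronoi cell.
ubrs = {
    "o1": ((0.0, 0.0), (4.0, 3.0)),
    "o2": ((2.5, 1.0), (7.0, 6.0)),
    "o3": ((6.0, 5.0), (9.0, 9.0)),
}
print(candidate_nearest_neighbours(ubrs, (3.0, 2.0)))  # ['o1', 'o2']
```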
32

Location Sensing Using Bluetooth for GPS Suppression

Mair, Nicholas 06 September 2012 (has links)
With the ubiquity of mobile devices, there has been increased interest in determining how they can be used with location-based services. These services work best when the device can sense its location frequently while still maintaining enough battery life to carry out its normal daily functions. Since the battery life of a mobile device is already so limited, ways of preserving that energy have become an important issue. The goal of this thesis is to demonstrate that Bluetooth can assist in providing energy-efficient mobile device localization. This goal is achieved through a proposed Bluetooth Location Service Discovery framework, which provides an API that can be incorporated into third-party applications. The API allows BlackBerry devices to use surrounding Bluetooth devices to make a prediction about their current location. Predictions are made with the assistance of the K-Nearest Neighbour data mining algorithm and can be used as an alternative to invoking the GPS. Experimental results demonstrate that these predictions are comparable to those obtained with GPS.
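A hedged sketch of k-nearest-neighbour location prediction from Bluetooth signal strengths; the fingerprint layout (one RSSI value per known nearby device) and the training data are illustrative assumptions, not the framework's actual API:

```python
# Hedged sketch: k-NN location prediction from Bluetooth RSSI fingerprints.
# Feature layout and training data are assumptions for illustration only.
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of (rssi_vector, location_label); query: rssi_vector."""
    dists = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical fingerprints: RSSI (dBm) seen from three fixed Bluetooth devices.
train = [
    ((-40, -70, -90), "lab"),
    ((-45, -65, -88), "lab"),
    ((-80, -42, -75), "hallway"),
    ((-85, -40, -70), "hallway"),
    ((-90, -78, -41), "office"),
]
print(knn_predict(train, (-44, -68, -86)))  # -> 'lab'
```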
33

Integrated feature, neighbourhood, and model optimization for personalised modelling and knowledge discovery

Liang, Wen January 2009 (has links)
“Machine learning is the process of discovering and interpreting meaningful information, such as new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques” (Larose, 2005). From my understanding, machine learning is a process of using different analysis techniques to observe previously unknown, potentially meaningful information, and to discover strong patterns and relationships in a large dataset. Professor Kasabov (2007b) classified computational models into three categories (namely global, local, and personalised models), which are widely used for data analysis and decision support in general, and in medicine and bioinformatics in particular. Most recently, the concept of personalised modelling has been widely applied to various disciplines such as personalised medicine and personalised drug design for known diseases (e.g. cancer, diabetes, brain disease), as well as to other modelling problems in ecology, business, finance, crime prevention, and so on. The philosophy behind the personalised modelling approach is that every person is different from others, and thus he or she will benefit from a personalised model and treatment. However, personalised modelling is not without issues, such as defining the correct number of neighbours or an appropriate number of features. As a result, the principal goal of this research is to study and address these issues and to create a novel framework and system for personalised modelling. The framework allows users to select and optimise the most important features and nearest neighbours for a new input sample in relation to a certain problem, based on a weighted variable distance measure, in order to obtain more precise prognostic accuracy and personalised knowledge than global and local modelling approaches.
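A minimal sketch of the weighted variable distance idea: each feature receives a weight, and the nearest neighbours of a new input sample are selected under the weighted distance. The weights and data below are hypothetical:

```python
# Minimal sketch: neighbour selection under a feature-weighted Euclidean
# distance. Weights and data are hypothetical placeholders.
import numpy as np

def weighted_neighbours(X, weights, x_new, k=5):
    """Return indices of the k nearest rows of X to x_new under a
    feature-weighted Euclidean distance."""
    diffs = X - x_new
    dists = np.sqrt(((diffs ** 2) * weights).sum(axis=1))
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))             # 100 samples, 4 features
weights = np.array([0.6, 0.1, 0.2, 0.1])  # assumed feature importances
x_new = rng.normal(size=4)
print(weighted_neighbours(X, weights, x_new, k=5))
```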
35

Aggregate nearest neighbor queries /

Hui, Michael Chun Kit. January 2004 (has links)
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2004. / Includes bibliographical references (leaves 91-95). Also available in electronic version. Access restricted to campus users.
36

K-nearest-neighbor queries with non-spatial predicates on range attributes /

Wong, Wing Sing. January 2005 (has links)
Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2005. / Includes bibliographical references (leaves 60-61). Also available in electronic version.
37

New LSH-based Algorithm for Approximate Nearest Neighbor

Andoni, Alexandr, Indyk, Piotr 04 November 2005 (has links)
We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of O(dn^{1/c^2+o(1)}) and space O(dn + n^{1+1/c^2+o(1)}).
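For context, a hedged sketch of a basic p-stable (Gaussian projection) LSH family for Euclidean distance; this illustrates the general locality-sensitive hashing idea only, not the specific scheme that attains the n^{1/c^2+o(1)} exponent:

```python
# Hedged sketch: a single hash table built from a p-stable (Gaussian
# projection) LSH family for Euclidean distance. Parameters are illustrative.
import numpy as np

class EuclideanLSH:
    def __init__(self, dim, num_hashes=8, bucket_width=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.a = rng.normal(size=(num_hashes, dim))        # random projections
        self.b = rng.uniform(0, bucket_width, size=num_hashes)
        self.w = bucket_width

    def hash(self, x):
        """Concatenated bucket indices form the hash key of point x."""
        return tuple(np.floor((self.a @ x + self.b) / self.w).astype(int))

    def build(self, points):
        self.table = {}
        for i, p in enumerate(points):
            self.table.setdefault(self.hash(p), []).append(i)
        self.points = np.asarray(points)

    def query(self, q):
        """Candidates colliding with q; verify by exact distance."""
        cand = self.table.get(self.hash(q), [])
        if not cand:
            return None
        d = np.linalg.norm(self.points[cand] - q, axis=1)
        return cand[int(np.argmin(d))]

pts = np.random.default_rng(1).normal(size=(1000, 16))
lsh = EuclideanLSH(dim=16)
lsh.build(pts)
print(lsh.query(pts[42] + 0.01))  # likely returns 42
```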
38

Classification Analytics in Functional Neuroimaging: Calibrating Signal Detection Parameters

Fisher, Julia Marie January 2015 (has links)
Classification analyses are a promising way to localize signal, especially scattered signal, in functional magnetic resonance imaging data. However, there is not yet a consensus on the most effective analysis pathway. We explore the efficacy of k-Nearest Neighbors classifiers on simulated functional magnetic resonance imaging data. We utilize a novel construction of the classification data. Additionally, we vary the spatial distribution of signal, the design matrix of the linear model used to construct the classification data, and the feature set available to the classifier. Results indicate that the k-Nearest Neighbors classifier is not sufficient under the current paradigm to adequately classify neural data and localize signal. Further exploration of the data using k-means clustering indicates that this is likely due in part to the amount of noise present in each data point. Suggestions are made for further research.
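A hedged sketch of a plain Lloyd's-algorithm k-means, of the kind that could be used to probe how noise blurs cluster structure in simulated data; the two-cluster toy data below is a stand-in, not the fMRI-derived features used in the thesis:

```python
# Hedged sketch: Lloyd's k-means on noisy synthetic data, to illustrate how
# added noise degrades cluster structure. Data below is a toy stand-in.
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(X[:, None] - centers, axis=2), axis=1)
        centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels, centers

# Two "signal" clusters plus heavy additive noise.
rng = np.random.default_rng(1)
signal = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
noisy = signal + rng.normal(0, 2.0, signal.shape)
labels, _ = kmeans(noisy, k=2)
print(np.bincount(labels))  # cluster sizes; more noise -> more mixing
```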
39

Improved tree species discrimination at leaf level with hyperspectral data combining binary classifiers

Dastile, Xolani Collen January 2011 (has links)
The purpose of the present thesis is to show that hyperspectral data can be used for discrimination between different tree species. The data set used in this study contains hyperspectral measurements of leaves of seven savannah tree species. The data is high-dimensional and shows large within-class variability combined with small between-class variability, which makes discrimination between the classes challenging. We employ two classification methods: k-nearest neighbour and feed-forward neural networks. For both methods, direct 7-class prediction results in high misclassification rates. However, binary classification works better. We constructed binary classifiers for all possible binary classification problems and combined them with Error Correcting Output Codes. In particular, we show that the use of 1-nearest neighbour binary classifiers results in no improvement compared to a direct 1-nearest neighbour 7-class predictor. In contrast to this negative result, the use of neural network binary classifiers improves accuracy by 10% compared to a direct neural network 7-class predictor, and error rates become acceptable. This can be further improved by choosing only suitable binary classifiers for combination.
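An illustrative sketch of the Error Correcting Output Codes combining rule: each class is assigned a binary codeword, each column of the code matrix defines one binary problem, and a sample is assigned to the class whose codeword is closest in Hamming distance to the vector of binary outputs. The code matrix and the binary outputs below are hypothetical:

```python
# Illustrative sketch of ECOC decoding. The code matrix and the binary
# classifier outputs are hypothetical placeholders.
import numpy as np

code_matrix = np.array([   # rows = classes, columns = binary problems
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0],
])

def ecoc_decode(binary_outputs, code_matrix):
    """Return the class index whose codeword has minimal Hamming distance
    to the vector of binary classifier outputs."""
    dists = (code_matrix != np.asarray(binary_outputs)).sum(axis=1)
    return int(np.argmin(dists))

# Five binary classifiers output these bits for one sample (one of them
# wrong); ECOC decoding still recovers class 1.
print(ecoc_decode([0, 1, 0, 1, 0], code_matrix))  # -> 1
```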
40

k-Nearest Neighbour Classification of Datasets with a Family of Distances

Hatko, Stan January 2015 (has links)
The k-nearest neighbour (k-NN) classifier is one of the oldest and most important supervised learning algorithms for classifying datasets. Traditionally the Euclidean norm is used as the distance for the k-NN classifier. In this thesis we investigate the use of alternative distances for the k-NN classifier. We start by introducing some background notions in statistical machine learning. We define the k-NN classifier and discuss Stone's theorem and the proof that k-NN is universally consistent on the normed space R^d. We then prove that k-NN is universally consistent if we take a sequence of random norms (that are independent of the sample and the query) from a family of norms that satisfies a particular boundedness condition. We extend this result by replacing norms with distances based on uniformly locally Lipschitz functions that satisfy certain conditions. We discuss the limitations of Stone's lemma and Stone's theorem, particularly with respect to quasinorms and adaptively choosing a distance for k-NN based on the labelled sample. We show the universal consistency of a two stage k-NN type classifier where we select the distance adaptively based on a split labelled sample and the query. We conclude by giving some examples of improvements of the accuracy of classifying various datasets using the above techniques.
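A hedged sketch of a k-NN classifier parameterised by a norm drawn from a family of l_p norms, echoing (but not reproducing) the thesis's setting of norms chosen independently of the sample and the query:

```python
# Hedged sketch: k-NN under an l_p distance, with p drawn independently of
# the sample. The data and the range of p are illustrative assumptions.
from collections import Counter
import numpy as np

def knn_classify(X, y, query, k=5, p=2.0):
    """k-NN majority vote under the l_p distance; p selects a family member."""
    dists = np.sum(np.abs(X - query) ** p, axis=1) ** (1.0 / p)
    nearest = np.argsort(dists)[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(3, 1, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
p = rng.uniform(1.0, 3.0)   # norm chosen independently of the labelled sample
print(knn_classify(X, y, np.array([2.8, 3.1, 2.9]), k=5, p=p))  # -> 1
```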
