441 |
Entropy based techniques with applications in data mining / Okafor, Anthony. January 2005
Thesis (Ph. D.)--University of Florida, 2005. / Title from title page of source document. Document formatted into pages; contains 97 pages. Includes vita. Includes bibliographical references.
|
442 |
Association rule based classification / Palanisamy, Senthil Kumar. January 2006
Thesis (M.S.)--Worcester Polytechnic Institute. / Keywords: Itemset Pruning, Association Rules, Adaptive Minimal Support, Associative Classification, Classification. Includes bibliographical references (p.70-74).
|
443 |
Pattern discovery in spatial, image, and biological data / Qian, Yu. January 2006
Thesis (Ph. D.)--University of Texas at Dallas, 2006. / Includes vita. Includes bibliographical references (leaves 195-205).
|
444 |
Applying data mining to job-shop scheduling using regression analysis / Innani, Alok D. January 2004
Thesis (M.S.)--Ohio University, August, 2004. / Title from PDF t.p. Includes bibliographical references (p. 84-87)
|
445 |
Use of data mining for investigation of crime patterns / Padhye, Manoday D. January 2006
Thesis (M.S.)--West Virginia University, 2006. / Title from document title page. Document formatted into pages; contains viii, 108 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 80-81).
|
446 |
Online Aggregation über Datenströmen mit Verfahren der mathematischen Statistik in grossen Datenbanksystemen [Online aggregation over data streams using methods of mathematical statistics in large database systems] / Blohsfeld, Björn. Unknown Date
Universität, Diss., 2002--Marburg.
|
447 |
On building predictive models with company annual reports / Qiu, Xin Ying. January 2007
Thesis (Ph. D.)--University of Iowa, 2007. / Supervisor: Padmini Srinivasan. Includes bibliographical references (leaves 93-100).
|
448 |
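Beschleunigte Entwicklung von Katalysatorsystemen und Polymeren durch Automatisierung, kombinatorische Methoden, schnelle Analytik und Datenanalyse [Accelerated development of catalyst systems and polymers through automation, combinatorial methods, rapid analytics, and data analysis] / Tuchbreiter, Arno. Unknown Date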
Universität, Diss., 2003--Freiburg (Breisgau).
|
449 |
Συγκριτική μελέτη κατανεμημένων και παράλληλων αλγόριθμων παραγωγής κανόνων συσχέτισης [Comparative study of distributed and parallel algorithms for association rule generation] / Γερολυμάτος, Αντώνιος. 23 August 2010
|
450 |
Pivot-based Data Partitioning for Distributed k Nearest Neighbor Mining / Kuhlman, Caitlin Anne. 20 January 2017
This thesis addresses the need for a scalable distributed solution for k-nearest-neighbor (kNN) search, a fundamental data mining task. This unsupervised method poses particular challenges on shared-nothing distributed architectures, where global information about the dataset is not available to individual machines. The distance within which to search for neighbors is not known a priori, so a dynamic data partitioning strategy is required to guarantee that the exact kNN can be found autonomously on each machine. Pivot-based partitioning has been shown to facilitate bounding of partitions; however, state-of-the-art methods suffer from prohibitive data duplication (upwards of 20x the size of the dataset). In this work, an innovative method for solving exact distributed kNN search, called PkNN, is presented. The key idea is to perform computation over several rounds, leveraging pivot-based data partitioning at each stage. Aggressive data-driven bounds limit communication costs, and a number of optimizations are designed for efficient computation. An experimental study on large real-world data (over 1 billion points) compares PkNN to the state-of-the-art distributed solution, demonstrating that the benefits of additional stages of computation in the PkNN method heavily outweigh the added I/O overhead. PkNN achieves a data duplication rate close to 1 and significant speedup over previous solutions, and it scales effectively in data cardinality and dimension. PkNN can facilitate distributed solutions to other unsupervised learning methods that rely on kNN search as a critical building block. As one example, a distributed framework for the Local Outlier Factor (LOF) algorithm is given. Testing on large real-world and synthetic data with varying characteristics measures the scalability of PkNN and the distributed LOF framework in data size and dimensionality.
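The general idea behind pivot-based partitioning, as mentioned in the abstract, can be illustrated with a small single-machine sketch. This is not the PkNN method from the thesis: it is a generic illustration, assuming Euclidean distance, in which each point is assigned to its nearest pivot and a partition is skipped whenever the triangle inequality shows it cannot hold a closer neighbor. The function names `partition_by_pivots` and `knn_with_pruning` are hypothetical.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.dist(a, b)


def partition_by_pivots(points, pivots):
    """Assign every point to its nearest pivot and record each cell's radius."""
    cells = {i: [] for i in range(len(pivots))}
    for p in points:
        nearest = min(range(len(pivots)), key=lambda i: dist(p, pivots[i]))
        cells[nearest].append(p)
    # Radius = distance from the pivot to the farthest point in its cell.
    radii = {
        i: max((dist(p, pivots[i]) for p in members), default=0.0)
        for i, members in cells.items()
    }
    return cells, radii


def knn_with_pruning(query, k, pivots, cells, radii):
    """Exact kNN: scan the query's own cell first, then prune other cells."""
    home = min(range(len(pivots)), key=lambda i: dist(query, pivots[i]))
    order = [home] + [i for i in range(len(pivots)) if i != home]
    best = []  # max-heap of (-distance, point), holding the k best so far
    visited = 0
    for i in order:
        if len(best) == k:
            kth = -best[0][0]
            # Every point in cell i is at least dist(query, pivot) - radius away,
            # so the cell cannot improve the result if that lower bound exceeds kth.
            if dist(query, pivots[i]) - radii[i] > kth:
                continue
        visited += 1
        for p in cells[i]:
            d = dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    neighbors = sorted((-nd, p) for nd, p in best)
    return neighbors, visited
```

In a shared-nothing setting each cell would live on a different machine, and the pruning test indicates which remote cells a query would still need to consult; how PkNN bounds and schedules that communication across rounds is described in the thesis itself, not in this sketch.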
|