  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
441

Entropy based techniques with applications in data mining

Okafor, Anthony. January 2005
Thesis (Ph. D.)--University of Florida, 2005. / Title from title page of source document. Document formatted into pages; contains 97 pages. Includes vita. Includes bibliographical references.
442

Association rule based classification

Palanisamy, Senthil Kumar. January 2006
Thesis (M.S.)--Worcester Polytechnic Institute. / Keywords: Itemset Pruning, Association Rules, Adaptive Minimal Support, Associative Classification, Classification. Includes bibliographical references (p. 70-74).
443

Pattern discovery in spatial, image, and biological data

Qian, Yu. January 2006
Thesis (Ph. D.)--University of Texas at Dallas, 2006. / Includes vita. Includes bibliographical references (leaves 195-205).
444

Applying data mining to job-shop scheduling using regression analysis

Innani, Alok D. January 2004
Thesis (M.S.)--Ohio University, August, 2004. / Title from PDF t.p. Includes bibliographical references (p. 84-87).
445

Use of data mining for investigation of crime patterns

Padhye, Manoday D. January 2006
Thesis (M.S.)--West Virginia University, 2006. / Title from document title page. Document formatted into pages; contains viii, 108 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 80-81).
446

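Online aggregation over data streams using methods of mathematical statistics in large database systems [Online Aggregation über Datenströmen mit Verfahren der mathematischen Statistik in grossen Datenbanksystemen]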

Blohsfeld, Björn. Unknown Date
Universität, Diss., 2002--Marburg.
447

On building predictive models with company annual reports

Qiu, Xin Ying. January 2007
Thesis (Ph. D.)--University of Iowa, 2007. / Supervisor: Padmini Srinivasan. Includes bibliographical references (leaves 93-100).
448

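Accelerated development of catalyst systems and polymers through automation, combinatorial methods, rapid analytics, and data analysis [Beschleunigte Entwicklung von Katalysatorsystemen und Polymeren durch Automatisierung, kombinatorische Methoden, schnelle Analytik und Datenanalyse]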

Tuchbreiter, Arno. Unknown Date
Universität, Diss., 2003--Freiburg (Breisgau).
449

Comparative study of distributed and parallel algorithms for generating association rules [Συγκριτική μελέτη κατανεμημένων και παράλληλων αλγόριθμων παραγωγής κανόνων συσχέτισης]

Γερολυμάτος, Αντώνιος. 23 August 2010
450

Pivot-based Data Partitioning for Distributed k Nearest Neighbor Mining

Kuhlman, Caitlin Anne. 20 January 2017
This thesis addresses the need for a scalable distributed solution for k-nearest-neighbor (kNN) search, a fundamental data mining task. This unsupervised method poses particular challenges on shared-nothing distributed architectures, where global information about the dataset is not available to individual machines. The distance to search for neighbors is not known a priori, so a dynamic data partitioning strategy is required to guarantee that exact kNN can be found autonomously on each machine. Pivot-based partitioning has been shown to facilitate bounding of partitions; however, state-of-the-art methods suffer from prohibitive data duplication (upwards of 20x the size of the dataset). In this work, an innovative method for solving exact distributed kNN search, called PkNN, is presented. The key idea is to perform computation over several rounds, leveraging pivot-based data partitioning at each stage. Aggressive data-driven bounds limit communication costs, and a number of optimizations are designed for efficient computation. An experimental study on large real-world data (over 1 billion points) compares PkNN to the state-of-the-art distributed solution, demonstrating that the benefits of additional stages of computation in the PkNN method heavily outweigh the added I/O overhead. PkNN achieves a data duplication rate close to 1 and a significant speedup over previous solutions, and it scales effectively in data cardinality and dimension. PkNN can facilitate distributed solutions to other unsupervised learning methods that rely on kNN search as a critical building block. As one example, a distributed framework for the Local Outlier Factor (LOF) algorithm is given. Testing on large real-world and synthetic data with varying characteristics measures the scalability of PkNN and the distributed LOF framework in data size and dimensionality.
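
The general idea of pivot-based partitioning with a bound can be illustrated with a small single-machine sketch. This is not the PkNN algorithm from the thesis, whose details the abstract does not give. It assumes Euclidean distance and distinct pivots, and every function name (`assign_to_pivots`, `boundary_distance`, `local_knn`) is hypothetical. Each point goes to its nearest pivot, kNN is computed inside each cell, and a point is flagged as unresolved when its k-th local neighbor lies farther away than the nearest cell boundary, since a closer neighbor could then sit in another cell.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_pivots(points, pivots):
    """Map each pivot index to the indices of the points closest to it."""
    cells = {i: [] for i in range(len(pivots))}
    for idx, p in enumerate(points):
        nearest = min(range(len(pivots)), key=lambda i: dist(p, pivots[i]))
        cells[nearest].append(idx)
    return cells


def boundary_distance(p, own, pivots):
    """Distance from p to the closest bisecting hyperplane of its cell.

    Assumes Euclidean space and distinct pivots (illustrative assumption).
    """
    d_own = dist(p, pivots[own])
    best = math.inf
    for j, q in enumerate(pivots):
        if j == own:
            continue
        d_other = dist(p, q)
        # Distance from p to the hyperplane equidistant from both pivots.
        gap = (d_other ** 2 - d_own ** 2) / (2 * dist(pivots[own], q))
        best = min(best, gap)
    return best


def local_knn(points, cells, pivots, k):
    """Compute kNN within each cell and flag points that need more work.

    Returns (neighbors, unresolved). A point is unresolved when its cell
    holds fewer than k other points, or when its k-th local neighbor is
    farther away than the nearest cell boundary.
    """
    neighbors, unresolved = {}, set()
    for cell, members in cells.items():
        for idx in members:
            cand = sorted(
                (dist(points[idx], points[j]), j) for j in members if j != idx
            )[:k]
            neighbors[idx] = [j for _, j in cand]
            bound = boundary_distance(points[idx], cell, pivots)
            if len(cand) < k or cand[-1][0] > bound:
                unresolved.add(idx)
    return neighbors, unresolved


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(200)]
    pivots = pts[:4]  # arbitrary pivot choice, for illustration only
    cells = assign_to_pivots(pts, pivots)
    nbrs, unresolved = local_knn(pts, cells, pivots, k=3)
    print(len(unresolved), "of", len(pts), "points need a further round")
```

In this sketch, resolved points are exact without any data leaving their cell, and only the unresolved points would need candidates from neighboring cells in a later round. How PkNN itself bounds and exchanges those candidates is described in the thesis, not here.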
