21

COPLINK Knowledge Management for Law Enforcement: Text Analysis, Visualization and Collaboration

Atabakhsh, Homa, Schroeder, Jennifer, Chen, Hsinchun, Chau, Michael, Xu, Jennifer J., Zhang, Jing, Bi, Haidong January 2001 (has links)
Artificial Intelligence Lab, Department of MIS, University of Arizona / Crime and police report information is rapidly migrating from paper records to automated records management databases. Most mid- and large-sized police agencies have such systems, which provide access to information for their own personnel but lack any efficient means of providing that information to other agencies. Criminals show no regard for jurisdictional boundaries and in fact take advantage of the lack of communication across jurisdictions. Federal standards initiatives, such as the National Incident Based Reporting System (NIBRS; US Department of Justice 1998), are attempting to provide reporting standards to police agencies to facilitate future reporting and information sharing among agencies as these electronic reporting systems become more widespread. We integrated platform independence, stability, scalability, and an intuitive graphical user interface to develop the COPLINK system, which is currently being deployed at the Tucson Police Department (TPD). User evaluations of the application allowed us to study the impact of COPLINK on law enforcement personnel as well as to identify requirements for improving the system and extending the project. We are currently extending the functionality of COPLINK in several areas, including textual analysis, collaboration, visualization, and geo-mapping.
22

Efficient algorithms for mining association rules in large databases of customer transactions

Savasere, Ashok January 1998 (has links)
No description available.
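
No abstract accompanies this record, but the title names the classic frequent-itemset problem behind association-rule mining. For orientation only, the sketch below shows generic Apriori-style level-wise support counting over customer transactions; it is not the algorithm developed in this thesis, and the names (`transactions`, `min_support`, the toy baskets) are illustrative assumptions.

```python
from collections import Counter

def frequent_itemsets(transactions, min_support):
    """Generic level-wise (Apriori-style) frequent-itemset mining -- illustrative only.

    transactions : list of sets of items (one set per customer transaction)
    min_support  : minimum number of transactions an itemset must occur in
    """
    # Count single items and keep those meeting the support threshold.
    item_counts = Counter(item for t in transactions for item in t)
    frequent = {frozenset([i]) for i, c in item_counts.items() if c >= min_support}
    result = {s: item_counts[next(iter(s))] for s in frequent}

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Support counting: one pass over the transactions per level.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:          # candidate itemset contained in this transaction
                    counts[c] += 1
        frequent = {c for c in candidates if counts[c] >= min_support}
        result.update({c: counts[c] for c in frequent})
        k += 1
    return result

if __name__ == "__main__":
    baskets = [{"milk", "bread"}, {"milk", "bread", "eggs"},
               {"bread", "eggs"}, {"milk", "eggs"}]
    print(frequent_itemsets(baskets, min_support=2))
```
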
23

DataMapX: a tool for cross-mapping entities and attributes between bioinformatics databases

Kanchinadam, Krishna M. January 2008 (has links)
Thesis (M.S.)--George Mason University, 2008. / Vita: p. 29. Thesis director: Jennifer Weller. Submitted in partial fulfillment of the requirements for the degree of Master of Science in Bioinformatics. Title from PDF t.p. (viewed July 7, 2008). Includes bibliographical references (p. 28). Also issued in print.
24

Use of probabilistic topic models for search

Draeger, Marco. January 2009 (has links) (PDF)
Thesis (M.S. in Operations Research)--Naval Postgraduate School, September 2009. / Thesis Advisor(s): Squire, Kevin M. "September 2009." Description based on title screen as viewed on November 5, 2009. Author(s) subject terms: Document modeling, information retrieval, semantic search, Bayesian nonparametric methods, hierarchical Bayes. Includes bibliographical references (p. 67-71). Also available in print.
25

The effects of cognitive complexity, need for cognition, and orientation toward learning on information search strategies

Chang, Ching-Kuch, January 1991 (has links)
Thesis (Ph. D.)--Purdue University, 1991. / Description based on print version record. Includes bibliographical references (p. 84-85).
26

Similarity searching in sequence databases under time warping.

January 2004 (has links)
Wong, Siu Fung. / Thesis submitted in: December 2003. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 77-84). / Abstracts in English and Chinese.
Contents:
  Abstract --- p.ii
  Acknowledgement --- p.vi
  Chapter 1  Introduction --- p.1
  Chapter 2  Preliminary --- p.6
    2.1  Dynamic Time Warping (DTW) --- p.6
    2.2  Spatial Indexing --- p.10
    2.3  Relevance Feedback --- p.11
  Chapter 3  Literature Review --- p.13
    3.1  Searching Sequences under Euclidean Metric --- p.13
    3.2  Searching Sequences under Dynamic Time Warping Metric --- p.17
  Chapter 4  Subsequence Matching under Time Warping --- p.21
    4.1  Subsequence Matching --- p.22
      4.1.1  Sequential Search --- p.22
      4.1.2  Indexing Scheme --- p.23
    4.2  Lower Bound Technique --- p.25
      4.2.1  Properties of Lower Bound Technique --- p.26
      4.2.2  Existing Lower Bound Functions --- p.27
    4.3  Point-Based Indexing --- p.28
      4.3.1  Lower Bound for Subsequence Matching --- p.28
      4.3.2  Algorithm --- p.35
    4.4  Rectangle-Based Indexing --- p.37
      4.4.1  Lower Bound for Subsequence Matching --- p.37
      4.4.2  Algorithm --- p.41
    4.5  Experimental Results --- p.43
      4.5.1  Candidate ratio vs width of warping window --- p.44
      4.5.2  CPU time vs number of subsequences --- p.45
      4.5.3  CPU time vs width of warping window --- p.46
      4.5.4  CPU time vs threshold --- p.46
    4.6  Summary --- p.47
  Chapter 5  Relevance Feedback under Time Warping --- p.49
    5.1  Integrating Relevance Feedback with DTW --- p.49
    5.2  Query Reformulation --- p.53
      5.2.1  Constraint Updating --- p.53
      5.2.2  Weight Updating --- p.55
      5.2.3  Overall Strategy --- p.58
    5.3  Experiments and Evaluation --- p.59
      5.3.1  Effectiveness of the strategy --- p.61
      5.3.2  Efficiency of the strategy --- p.63
      5.3.3  Usability --- p.64
    5.4  Summary --- p.71
  Chapter 6  Conclusion --- p.72
  Appendix A  Deduction of Data Bounding Hyper-rectangle --- p.74
  Appendix B  Proof of Theorem 2 --- p.76
  Bibliography --- p.77
  Publications --- p.84
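
For orientation, the sketch below gives the textbook dynamic-programming formulation of the Dynamic Time Warping distance with a Sakoe-Chiba warping window, the measure underlying the chapters above; it does not reproduce the thesis's point-based or rectangle-based indexing schemes or its relevance-feedback strategy, and the function and parameter names are illustrative.

```python
import math

def dtw_distance(x, y, window):
    """Standard DTW under a Sakoe-Chiba band of half-width `window`.

    Local cost is the squared difference; the square root of the accumulated
    cost is returned, as is common in similarity search.  `window` should be
    at least abs(len(x) - len(y)) for a warping path to exist.
    """
    n, m = len(x), len(y)
    INF = float("inf")
    # dp[i][j] = cost of the best warping path aligning x[:i] with y[:j]
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            dp[i][j] = cost + min(dp[i - 1][j],      # stretch x
                                  dp[i][j - 1],      # stretch y
                                  dp[i - 1][j - 1])  # match
    return math.sqrt(dp[n][m])

if __name__ == "__main__":
    a = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
    b = [0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
    print(dtw_distance(a, b, window=2))   # small: b is a time-shifted copy of a
```
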
27

Efficient similarity search in time series data. / CUHK electronic theses & dissertations collection

January 2007 (has links)
Time series data is ubiquitous in the real world, and similarity search over time series is of great importance to many applications. The problem has two major parts: how to define similarity between time series and how to search for similar time series efficiently. The Euclidean distance is a good starting point for a similarity measure, but it has several limitations. First, it is sensitive to shifting and scaling transformations. Under a geometric model, we analyze this problem extensively and propose an angle-based similarity measure that is invariant to shifting and scaling. We then extend the conical index to support the proposed angle-based similarity measure efficiently. Besides distortions along the amplitude axis, the Euclidean distance is also sensitive to distortion along the time axis; the Dynamic Time Warping (DTW) distance is invariant to such time distortion, but its high time complexity inhibits its application to large datasets. Indexing under the DTW distance is a common solution to this problem, and lower-bound functions play an important role in such indexes. We explain the existing lower-bound functions under a unified framework and propose a group of new, tighter lower-bound functions. Based on the proposed lower-bound functions, an efficient index structure under the DTW distance is implemented. In spite of its great success, DTW is not well suited to the time-scaling search problem, where the time distortion is too large. We modify the traditional DTW distance and propose the Segment-wise Time Warping (STW) distance to adapt to the time-scaling search problem. Finally, we devise an efficient search algorithm for online pattern detection in data streams under the DTW distance. / Zhou, Mi. / "January 2007." / Adviser: Man Hon Wong. / Source: Dissertation Abstracts International, Volume: 68-09, Section: B, page: 6100. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (p. 167-180). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.
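
The lower-bound filtering idea discussed in this abstract can be illustrated with LB_Keogh, a well-known envelope-based lower bound for DTW under a Sakoe-Chiba band; the thesis's own, tighter bounds and its conical index are not reproduced here. The sketch below assumes equal-length sequences and squared local costs, and all names are illustrative; candidates that survive the cheap lower-bound test must still be verified with the exact DTW distance in a refine step.

```python
def envelope(query, window):
    """Upper/lower envelope of the query under a Sakoe-Chiba band of half-width `window`."""
    n = len(query)
    upper, lower = [], []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        upper.append(max(query[lo:hi]))
        lower.append(min(query[lo:hi]))
    return upper, lower

def lb_keogh(candidate, upper, lower):
    """LB_Keogh: cheap lower bound on DTW(query, candidate) for equal-length sequences."""
    total = 0.0
    for c, u, l in zip(candidate, upper, lower):
        if c > u:
            total += (c - u) ** 2   # candidate above the envelope
        elif c < l:
            total += (c - l) ** 2   # candidate below the envelope
    return total ** 0.5

def filter_candidates(query, database, epsilon, window):
    """Filter step of a range query: discard sequences whose lower bound already
    exceeds epsilon; survivors still require exact DTW verification."""
    upper, lower = envelope(query, window)
    return [seq for seq in database if lb_keogh(seq, upper, lower) <= epsilon]

if __name__ == "__main__":
    q = [0.0, 1.0, 2.0, 1.0, 0.0]
    db = [[0.1, 1.1, 2.0, 0.9, 0.0],   # close to q: survives the filter
          [5.0, 5.0, 5.0, 5.0, 5.0]]   # far from q: pruned without computing DTW
    print(filter_candidates(q, db, epsilon=1.0, window=1))
```
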
28

Fast algorithms for sequence data searching.

January 1997 (has links)
by Sze-Kin Lam. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 71-76).
Contents:
  Abstract --- p.i
  Acknowledgement --- p.iii
  Chapter 1  Introduction --- p.1
  Chapter 2  Related Work --- p.6
    2.1  Sequence query processing --- p.8
    2.2  Text sequence searching --- p.8
    2.3  Numerical sequence searching --- p.11
    2.4  Indexing schemes --- p.17
  Chapter 3  Sequence Data Searching using the Projection Algorithm --- p.21
    3.1  Sequence Similarity --- p.21
    3.2  Searching Method --- p.24
      3.2.1  Sequential Algorithm --- p.24
      3.2.2  Projection Algorithm --- p.25
    3.3  Handling Scaling Problem by the Projection Algorithm --- p.33
  Chapter 4  Sequence Data Searching using the Hashing Algorithm --- p.37
    4.1  Sequence Similarity --- p.37
    4.2  Hashing Algorithm --- p.39
      4.2.1  Motivation of the Algorithm --- p.40
      4.2.2  Hashing Algorithm using dynamic hash function --- p.44
      4.2.3  Handling Scaling Problem by the Hashing Algorithm --- p.47
  Chapter 5  Comparisons between algorithms --- p.50
    5.1  Performance comparison with the sequence searching algorithms --- p.54
    5.2  Comparison between indexing structures --- p.54
    5.3  Comparison between sequence searching algorithms in coping with some deficits --- p.55
  Chapter 6  Performance Evaluation --- p.58
    6.1  Performance Evaluation using Projection Algorithm --- p.58
    6.2  Performance Evaluation using Hashing Algorithm --- p.61
  Chapter 7  Conclusion --- p.66
    7.1  Motivation of the thesis --- p.66
      7.1.1  Insufficiency of Euclidean distance --- p.67
      7.1.2  Insufficiency of orthonormal transforms --- p.67
      7.1.3  Insufficiency of multi-dimensional indexing structure --- p.68
    7.2  Major contribution --- p.68
      7.2.1  Projection algorithm --- p.68
      7.2.2  Hashing algorithm --- p.69
    7.3  Future work --- p.70
  Bibliography --- p.71
29

Learning from large data: Bias, variance, sampling, and learning curves

Brain, Damien, mikewood@deakin.edu.au January 2003 (has links)
One of the fundamental machine learning tasks is that of predictive classification. Given that organisations collect an ever-increasing amount of data, predictive classification methods must be able to effectively and efficiently handle large amounts of data. However, present requirements push existing algorithms to, and sometimes beyond, their limits, since many classification prediction algorithms were designed when currently common data set sizes were beyond imagination. This has led to a significant amount of research into ways of making classification learning algorithms more effective and efficient. Although substantial progress has been made, a number of key questions have not been answered. This dissertation investigates two of these key questions. The first is whether types of algorithms different from those currently employed are required when using large data sets. This is answered by analysis of the way in which the bias plus variance decomposition of predictive classification error changes as training set size is increased. Experiments find that larger training sets require different types of algorithms from those currently used. Some insight into the characteristics of suitable algorithms is provided, and this may give some direction for the development of future classification prediction algorithms that are specifically designed for use with large data sets. The second question investigated is that of the role of sampling in machine learning with large data sets. Sampling has long been used as a means of avoiding the need to scale up algorithms to suit the size of the data set by scaling down the size of the data set to suit the algorithm. However, the costs of performing sampling have not been widely explored. Two popular sampling methods are compared with learning from all available data in terms of predictive accuracy, model complexity, and execution time. The comparison shows that sub-sampling generally produces models with accuracy close to, and sometimes greater than, that obtainable from learning with all available data. This result suggests that it may be possible to develop algorithms that take advantage of the sub-sampling methodology to reduce the time required to infer a model while sacrificing little if any accuracy. Methods of improving effective and efficient learning via sampling are also investigated, and new sampling methodologies are proposed. These methodologies include using a varying proportion of instances to determine the next inference step and using a statistical calculation at each inference step to determine sufficient sample size. Experiments show that using a statistical calculation of sample size can not only substantially reduce execution time but can do so with only a small loss, and occasional gain, in accuracy. One of the common uses of sampling is in the construction of learning curves. Learning curves are often used to attempt to determine the optimal training size, one that maximally reduces execution time while not being detrimental to accuracy. An analysis of the performance of methods for detecting convergence of learning curves is performed, with the focus on methods that calculate the gradient of the tangent to the curve. Given that such methods can be susceptible to local accuracy plateaus, an investigation into the frequency of local plateaus is also performed. It is shown that local accuracy plateaus are a common occurrence, and that ensuring a small loss of accuracy often results in greater computational cost than learning from all available data. These results cast doubt on the applicability of gradient-of-tangent methods for detecting convergence, and on the viability of learning curves for reducing execution time in general.
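
One way to picture the sampling and learning-curve machinery discussed above is progressive sampling with a slope-based stopping rule: train on geometrically growing subsamples and stop once the gradient of the tangent to the learning curve falls below a tolerance. The sketch below is an illustrative assumption throughout (a nearest-centroid classifier on synthetic Gaussian data, with made-up thresholds); it is not the dissertation's experimental setup or its statistical sample-size calculation.

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Train a nearest-centroid classifier and return its accuracy on the test set."""
    centroids = {c: X_train[y_train == c].mean(axis=0) for c in np.unique(y_train)}
    labels = np.array(sorted(centroids))
    dists = np.stack([np.linalg.norm(X_test - centroids[c], axis=1) for c in labels], axis=1)
    preds = labels[dists.argmin(axis=1)]
    return float((preds == y_test).mean())

def progressive_sampling(X, y, X_test, y_test, start=100, growth=2.0, slope_tol=1e-5):
    """Grow the training sample geometrically and stop when the learning curve's
    slope (accuracy gained per extra instance) drops below slope_tol."""
    order = np.random.default_rng(0).permutation(len(X))
    curve, n, prev_n, prev_acc = [], start, 0, 0.0
    while n <= len(X):
        acc = nearest_centroid_accuracy(X[order[:n]], y[order[:n]], X_test, y_test)
        curve.append((n, acc))
        slope = (acc - prev_acc) / (n - prev_n)       # gradient of the tangent between points
        if prev_n > 0 and abs(slope) < slope_tol:     # plateau: more data buys little accuracy
            break
        prev_n, prev_acc, n = n, acc, int(n * growth)
    return curve

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 1.0, size=(6000, 5)),
                   rng.normal(1.0, 1.0, size=(6000, 5))])
    y = np.repeat([0, 1], 6000)
    perm = rng.permutation(len(X))
    X, y = X[perm], y[perm]
    # hold out the last 2000 instances as a fixed test set
    print(progressive_sampling(X[:10000], y[:10000], X[10000:], y[10000:]))
```
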
30

Online searching and connection caching

Stafford, Matthew, January 2000 (has links)
Thesis (Ph. D.)--University of Texas at Austin, 2000. / Vita. Includes bibliographical references (leaves 125-127). Available also in a digital version from Dissertation Abstracts.
