Global ETD Search

421	Exploratory Visualization of Data Pattern Changes in Multivariate Data Streams Xie, Zaixian 21 October 2011 " More and more researchers are focusing on the management, querying and pattern mining of streaming data. The visualization of streaming data, however, is still a very new topic. Streaming data is very similar to time-series data since each datapoint has a time dimension. Although the latter has been well studied in the area of information visualization, a key characteristic of streaming data, unbounded and large-scale input, is rarely investigated. Moreover, most techniques for visualizing time-series data focus on univariate data and seldom convey multidimensional relationships, which is an important requirement in many application areas. Therefore, it is necessary to develop appropriate techniques for streaming data instead of directly applying time-series visualization techniques to it. As one of the main contributions of this dissertation, I introduce a user-driven approach for the visual analytics of multivariate data streams based on effective visualizations via a combination of windowing and sampling strategies. To help users identify and track how data patterns change over time, not only the current sliding window content but also abstractions of past data in which users are interested are displayed. Sampling is applied within each single time window to help reduce visual clutter as well as preserve data patterns. Sampling ratios scheduled for different windows reflect the degree of user interest in the content. A degree of interest (DOI) function is used to represent a user's interest in different windows of the data. Users can apply two types of pre-defined DOI functions, namely RC (recent change) and PP (periodic phenomena) functions. The developed tool also allows users to interactively adjust DOI functions, in a manner similar to transfer functions in volume visualization, to enable a trial-and-error exploration process.
In order to visually convey the change of multidimensional correlations, four layout strategies were designed. User studies showed that three of these are effective techniques for conveying data pattern changes compared to traditional time-series data visualization techniques. Based on this evaluation, a guide for the selection of appropriate layout strategies was derived, considering the characteristics of the targeted datasets and data analysis tasks. Case studies were used to show the effectiveness of DOI functions and the various visualization techniques. A second contribution of this dissertation is a data-driven framework that merges, and thus condenses, time windows with little or no change, thereby distorting the time axis. Only significant changes are shown to users. Pattern vectors are introduced as a compact format for representing the discovered data model. Three views (juxtaposed views, pattern vector views, and pattern change views) were developed for conveying data pattern changes. The first shows more details of the data but needs more canvas space; the last two need much less canvas space by conveying only the pattern parameters, but lose many data details. The experiments showed that the proposed merge algorithms preserve more change information than an intuitive pattern-blind averaging. A user study was also conducted to confirm that the proposed techniques can help users find pattern changes more quickly than with a non-distorted time axis. A third contribution of this dissertation is a set of history views, with related interaction techniques, developed to work under two modes: non-merge and merge. In the former mode, the framework can use natural hierarchical time units, or units defined by domain experts, to represent timelines. This can help users navigate across long time periods. Grid or virtual calendar views were designed to provide a compact overview of the history data.
In addition, MDS pattern starfields, distance maps, and pattern brushes were developed to enable users to quickly investigate the degree of pattern similarity among different time periods. For the merge mode, merge algorithms were applied to selected time windows to generate a merge-based hierarchy. The contiguous time windows having similar patterns are merged first. Users can choose different levels of merging with the tradeoff between more details in the data and less visual clutter in the visualizations. The usability evaluation demonstrated that most participants could understand the concepts of the history views correctly and finished assigned tasks with a high accuracy and relatively fast response time. " Multivariate Data Visualization Streaming Data Visualization Time-series Data Visualization Data streams
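The abstract of entry 421 says that sampling ratios per time window follow a degree of interest (DOI) function. The sketch below is only an illustration of that idea, not the dissertation's method: the exponential form of the "recent change" style function, the `half_life` parameter, and the names `recent_change_doi` and `sample_windows` are all assumptions, since the abstract does not define the RC or PP functions.

```python
import random
from typing import Callable, List, Sequence


def recent_change_doi(age: int, half_life: float = 3.0) -> float:
    """Hypothetical RC-style DOI: interest halves every `half_life` windows.

    `age` is 0 for the newest window and grows for older windows.
    The exponential shape is an assumption made for this sketch.
    """
    return 0.5 ** (age / half_life)


def sample_windows(
    windows: Sequence[Sequence[object]],
    doi: Callable[[int], float],
    seed: int = 0,
) -> List[List[object]]:
    """Sample each window with a ratio equal to its DOI value.

    Windows are ordered oldest first. The DOI value is clamped to [0, 1]
    and used directly as the fraction of records kept in that window.
    """
    rng = random.Random(seed)
    newest = len(windows) - 1
    sampled = []
    for index, window in enumerate(windows):
        ratio = max(0.0, min(1.0, doi(newest - index)))
        keep = round(ratio * len(window))
        # Uniform random sampling without replacement inside one window.
        sampled.append(rng.sample(list(window), keep))
    return sampled


if __name__ == "__main__":
    stream = [list(range(100)) for _ in range(4)]  # four windows, oldest first
    kept = sample_windows(stream, recent_change_doi)
    print([len(w) for w in kept])
```

Uniform sampling is used here for brevity; the abstract states that the sampling should also preserve data patterns, which a plain random draw does not guarantee.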
422	Exploratory Visualization of Data with Variable Quality Huang, Shiping 11 January 2005 Data quality, which refers to correctness, uncertainty, completeness and other aspects of data, has become more and more prevalent and has been addressed across multiple disciplines. Data quality could be introduced and presented in any of the data manipulation processes such as data collection, transformation, and visualization. Data visualization is a process of data mining and analysis using graphical presentation and interpretation. The correctness and completeness of the visualization discoveries to a large extent depend on the quality of the original data. Without the integration of quality information with data presentation, the analysis of data using visualization is incomplete at best and can lead to inaccurate or incorrect conclusions at worst. This thesis addresses the issue of data quality visualization. Incorporating data quality measures into the data displays is challenging in that the display is apt to be cluttered when faced with multiple dimensions and data records. We both investigate the incorporation of data quality information into traditional multivariate data display techniques and develop novel visualization and interaction tools that operate in data quality space. We validate our results using several data sets that have variable quality associated with dimensions, records, and data values. Visualization Uncertainty Missing Data Imputation Data Quality Electronic data processing Quality control Visualization Data processing
423	The implementation of a subset data dictionary verifier Cline, Jacquelyn Fern January 2010 Typescript (photocopy). / Digitized by Kansas Correctional Industries
424	System development and its effect on management in planning for the use of electronic data processing equipment Hokansson, Nils C.I. January 1962 Thesis (M.B.A.)--Boston University Management Data processing
425	Development of an optical system for dynamic evaluation of phase recovery algorithms Palani, Ananta January 2015 No description available. 620 Optical data processing
426	Automated identification of digital evidence across heterogeneous data resources Mohammed, Hussam J. January 2018 Digital forensics has become an increasingly important tool in the fight against cyber and computer-assisted crime. However, with an increasing range of technologies at people's disposal, investigators find themselves having to process and analyse many systems with large volumes of data (e.g., PCs, laptops, tablets, and smartphones) within a single case. Unfortunately, current digital forensic tools operate in an isolated manner, investigating systems and applications individually. The heterogeneity and volume of evidence place time constraints and a significant burden on investigators. Examples of heterogeneity include applications such as messaging (e.g., iMessenger, Viber, Snapchat, and WhatsApp), web browsers (e.g., Firefox and Google Chrome), and file systems (e.g., NTFS, FAT, and HFS). Being able to analyse and investigate evidence from across devices and applications in a universal and harmonized fashion would enable investigators to query all data at once. In addition, successfully prioritizing evidence and reducing the volume of data to be analysed reduces the time taken and cognitive load on the investigator. This thesis focuses on the examination and analysis phases of the digital investigation process. It explores the feasibility of dealing with big and heterogeneous data sources in order to correlate the evidence from across these evidential sources in an automated way. Therefore, a novel approach was developed to solve the heterogeneity issues of big data using three developed algorithms. The three algorithms include the harmonising, clustering, and automated identification of evidence (AIE) algorithms. The harmonisation algorithm seeks to provide an automated framework to merge similar datasets by characterising similar metadata categories and then harmonising them in a single dataset.
This algorithm overcomes heterogeneity issues and makes the examination and analysis easier by analysing and investigating the evidential artefacts across devices and applications based on the categories to query data at once. Based on the merged datasets, the clustering algorithm is used to identify the evidential files and isolate the non-related files based on their metadata. Afterwards, the AIE algorithm tries to identify the cluster holding the largest number of evidential artefacts through searching based on two methods: criminal profiling activities and some information from the criminals themselves. Then, the related clusters are identified through timeline analysis and a search of associated artefacts of the files within the first cluster. A series of experiments using real-life forensic datasets were conducted to evaluate the algorithms across five different categories of datasets (i.e., messaging, graphical files, file system, internet history, and emails), each containing data from different applications across different devices. The results of the characterisation and harmonisation process show that the algorithm can merge all fields successfully, with the exception of some binary-based data found within the messaging datasets (contained within Viber and SMS). The error occurred because of a lack of information for the characterisation process to make a useful determination. However, on further analysis, it was found that the error had a minimal impact on subsequent merged data. The results of the clustering process and AIE algorithm showed the two algorithms can collaborate and identify more than 92% of evidential files. Digital Forensics ; HETEROGENEOUS DATA
427	Fast fingerprint verification using sub-regions of fingerprint images. January 2004 Chan Ka Cheong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 77-85). / Abstracts in English and Chinese. / Chapter 1. --- Introduction --- p.1 / Chapter 1.1 --- Introduction to Fingerprint Verification --- p.1 / Chapter 1.1.1 --- Biometrics --- p.1 / Chapter 1.1.2 --- Fingerprint History --- p.2 / Chapter 1.1.3 --- Fingerprint characteristics --- p.4 / Chapter 1.1.4 --- A Generic Fingerprint Matching System Architecture --- p.6 / Chapter 1.1.5 --- Fingerprint Verification and Identification --- p.8 / Chapter 1.1.7 --- Biometric metrics --- p.10 / Chapter 1.2 --- Embedded system --- p.12 / Chapter 1.2.1 --- Introduction to embedded systems --- p.12 / Chapter 1.2.2 --- Embedded systems characteristics --- p.12 / Chapter 1.2.3 --- Performance evaluation of a StrongARM processor --- p.13 / Chapter 1.3 --- Objective - An embedded fingerprint verification system --- p.16 / Chapter 1.4 --- Organization of the Thesis --- p.17 / Chapter 2 --- Literature Reviews --- p.18 / Chapter 2.1 --- Fingerprint matching overviews --- p.18 / Chapter 2.1.1 --- Minutiae-based fingerprint matching --- p.20 / Chapter 2.2 --- Fingerprint image enhancement --- p.21 / Chapter 2.3 --- Orientation field Computation --- p.22 / Chapter 2.4 --- Fingerprint Segmentation --- p.24 / Chapter 2.5 --- Singularity Detection --- p.25 / Chapter 2.6 --- Fingerprint Classification --- p.27 / Chapter 2.7 --- Minutia extraction --- p.30 / Chapter 2.7.1 --- Binarization and thinning --- p.30 / Chapter 2.7.2 --- Direct gray scale approach --- p.32 / Chapter 2.7.3 --- Comparison of the minutiae extraction approaches --- p.35 / Chapter 2.8 --- Minutiae matching --- p.37 / Chapter 2.8.1 --- Point matching --- p.37 / Chapter 2.8.2 --- Structural matching technique --- p.38 / Chapter 2.9 --- Summary --- p.40 / Chapter 3.
--- Implementation --- p.41 / Chapter 3.1 --- Fast Fingerprint Matching System Overview --- p.41 / Chapter 3.1.1 --- Typical Fingerprint Matching System --- p.41 / Chapter 3.1.2 --- Fast Fingerprint Matching System Overview --- p.41 / Chapter 3.2 --- Orientation computation --- p.43 / Chapter 3.2.1 --- Orientation computation --- p.43 / Chapter 3.2.2 --- Smooth orientation field --- p.43 / Chapter 3.3 --- Fingerprint image segmentation --- p.45 / Chapter 3.4 --- Reference Point Extraction --- p.46 / Chapter 3.5 --- A Classification Scheme --- p.51 / Chapter 3.6 --- Finding A Small Fingerprint Matching Area --- p.54 / Chapter 3.7 --- Fingerprint Matching --- p.57 / Chapter 3.8 --- Minutiae extraction --- p.59 / Chapter 3.8.1 --- Ridge tracing --- p.59 / Chapter 3.8.2 --- cross sectioning --- p.60 / Chapter 3.8.3 --- local maximum determination --- p.61 / Chapter 3.8.4 --- Ridge tracing marking --- p.62 / Chapter 3.8.5 --- Ridge tracing stop criteria --- p.63 / Chapter 3.9 --- Optimization technique --- p.65 / Chapter 3.10 --- Summary --- p.66 / Chapter 4. --- Experimental results --- p.67 / Chapter 4.1 --- Experimental setup --- p.67 / Chapter 4.2 --- Fingerprint database --- p.67 / Chapter 4.3 --- Reference point accuracy --- p.67 / Chapter 4.4 --- Variable number of matching minutiae results --- p.68 / Chapter 4.5 --- Contribution of the verification prototype --- p.72 / Chapter 5. --- Conclusion and Future Research --- p.74 / Chapter 5.1 --- Conclusion --- p.74 / Chapter 5.2 --- Future Research --- p.74 / Bibliography --- p.77
428	A new approach to clustering large databases in data mining. January 2004 Lau Hei Yuet. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 74-76). / Abstracts in English and Chinese. / Abstract --- p.i / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Cluster Analysis --- p.1 / Chapter 1.2 --- Dissimilarity Measures --- p.3 / Chapter 1.2.1 --- Continuous Data --- p.4 / Chapter 1.2.2 --- Categorical and Nominal Data --- p.4 / Chapter 1.2.3 --- Mixed Data --- p.5 / Chapter 1.2.4 --- Missing Data --- p.6 / Chapter 1.3 --- Outline of the thesis --- p.6 / Chapter 2 --- Clustering Algorithms --- p.9 / Chapter 2.1 --- The k-means Algorithm Family --- p.9 / Chapter 2.1.1 --- The Algorithms --- p.9 / Chapter 2.1.2 --- Choosing the Number of Clusters - the MaxMin Algorithm --- p.12 / Chapter 2.1.3 --- Starting Configuration - the MaxMin Algorithm --- p.16 / Chapter 2.2 --- Clustering Using Unidimensional Scaling --- p.16 / Chapter 2.2.1 --- Unidimensional Scaling --- p.16 / Chapter 2.2.2 --- Procedures --- p.17 / Chapter 2.2.3 --- Guttman's Updating Algorithm --- p.18 / Chapter 2.2.4 --- Pliner's Smoothing Algorithm --- p.18 / Chapter 2.2.5 --- Starting Configuration --- p.19 / Chapter 2.2.6 --- Choosing the Number of Clusters --- p.21 / Chapter 2.3 --- Cluster Validation --- p.23 / Chapter 2.3.1 --- Continuous Data --- p.23 / Chapter 2.3.2 --- Nominal Data --- p.24 / Chapter 2.3.3 --- Resampling Method --- p.25 / Chapter 2.4 --- Conclusion --- p.27 / Chapter 3 --- Experimental Results --- p.29 / Chapter 3.1 --- Simulated Data 1 --- p.29 / Chapter 3.2 --- Simulated Data 2 --- p.35 / Chapter 3.3 --- Iris Data --- p.41 / Chapter 3.4 --- Wine Data --- p.47 / Chapter 3.5 --- Mushroom Data --- p.53 / Chapter 3.6 --- Conclusion --- p.59 / Chapter 4 --- Large Database --- p.61 / Chapter 4.1 --- Sliding Windows Algorithm --- p.61 / Chapter 4.2 --- Two-stage Algorithm --- p.63 / Chapter 4.3 --- Three-stage
Algorithm --- p.65 / Chapter 4.4 --- Experimental Results --- p.66 / Chapter 4.5 --- Conclusion --- p.68 / Chapter A --- Algorithms --- p.69 / Chapter A.1 --- MaxMin Algorithm --- p.69 / Chapter A.2 --- Sliding Windows Algorithm --- p.70 / Chapter A.3 --- Two-stage Algorithm - Stage One --- p.72 / Chapter A.4 --- Two-stage Algorithm - Stage Two --- p.73 / Bibliography --- p.74 Data mining Cluster analysis
429	Induction of classification rules and decision trees using genetic algorithms. January 2005 Ng Sai-Cheong. / Thesis submitted in: December 2004. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (leaves 172-178). / Abstracts in English and Chinese. / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Data Mining --- p.1 / Chapter 1.2 --- Problem Specifications and Motivations --- p.3 / Chapter 1.3 --- Contributions of the Thesis --- p.5 / Chapter 1.4 --- Thesis Roadmap --- p.6 / Chapter 2 --- Related Work --- p.9 / Chapter 2.1 --- Supervised Classification Techniques --- p.9 / Chapter 2.1.1 --- Classification Rules --- p.9 / Chapter 2.1.2 --- Decision Trees --- p.11 / Chapter 2.2 --- Evolutionary Algorithms --- p.19 / Chapter 2.2.1 --- Genetic Algorithms --- p.19 / Chapter 2.2.2 --- Genetic Programming --- p.24 / Chapter 2.2.3 --- Evolution Strategies --- p.26 / Chapter 2.2.4 --- Evolutionary Programming --- p.32 / Chapter 2.3 --- Applications of Evolutionary Algorithms to Induction of Classification Rules --- p.33 / Chapter 2.3.1 --- SCION --- p.33 / Chapter 2.3.2 --- GABIL --- p.34 / Chapter 2.3.3 --- LOGENPRO --- p.35 / Chapter 2.4 --- Applications of Evolutionary Algorithms to Construction of Decision Trees --- p.35 / Chapter 2.4.1 --- Binary Tree Genetic Algorithm --- p.35 / Chapter 2.4.2 --- OC1-GA --- p.36 / Chapter 2.4.3 --- OC1-ES --- p.38 / Chapter 2.4.4 --- GATree --- p.38 / Chapter 2.4.5 --- Induction of Linear Decision Trees using Strong Typing GP --- p.39 / Chapter 2.5 --- Spatial Data Structures and its Applications --- p.40 / Chapter 2.5.1 --- Spatial Data Structures --- p.40 / Chapter 2.5.2 --- Applications of Spatial Data Structures --- p.42 / Chapter 3 --- Induction of Classification Rules using Genetic Algorithms --- p.45 / Chapter 3.1 --- Introduction --- p.45 / Chapter 3.2 --- Rule Learning using Genetic Algorithms --- p.46 / Chapter
3.2.1 --- Population Initialization --- p.47 / Chapter 3.2.2 --- Fitness Evaluation of Chromosomes --- p.49 / Chapter 3.2.3 --- Token Competition --- p.50 / Chapter 3.2.4 --- Chromosome Elimination --- p.51 / Chapter 3.2.5 --- Rule Migration --- p.52 / Chapter 3.2.6 --- Crossover --- p.53 / Chapter 3.2.7 --- Mutation --- p.55 / Chapter 3.2.8 --- Calculating the Number of Correctly Classified Training Samples in a Rule Set --- p.56 / Chapter 3.3 --- Performance Evaluation --- p.56 / Chapter 3.3.1 --- Performance Comparison of the GA-based CPRLS and Various Supervised Classification Algorithms --- p.57 / Chapter 3.3.2 --- Performance Comparison of the GA-based CPRLS and RS-based CPRLS --- p.68 / Chapter 3.3.3 --- Effects of Token Competition --- p.69 / Chapter 3.3.4 --- Effects of Rule Migration --- p.70 / Chapter 3.4 --- Chapter Summary --- p.73 / Chapter 4 --- Genetic Algorithm-based Quadratic Decision Trees --- p.74 / Chapter 4.1 --- Introduction --- p.74 / Chapter 4.2 --- Construction of Quadratic Decision Trees --- p.76 / Chapter 4.3 --- Evolving the Optimal Quadratic Hypersurface using Genetic Algorithms --- p.77 / Chapter 4.3.1 --- Population Initialization --- p.80 / Chapter 4.3.2 --- Fitness Evaluation --- p.81 / Chapter 4.3.3 --- Selection --- p.81 / Chapter 4.3.4 --- Crossover --- p.82 / Chapter 4.3.5 --- Mutation --- p.83 / Chapter 4.4 --- Performance Evaluation --- p.84 / Chapter 4.4.1 --- Performance Comparison of the GA-based QDT and Various Supervised Classification Algorithms --- p.85 / Chapter 4.4.2 --- Performance Comparison of the GA-based QDT and RS-based QDT --- p.92 / Chapter 4.4.3 --- Effects of Changing Parameters of the GA-based QDT --- p.93 / Chapter 4.5 --- Chapter Summary --- p.109 / Chapter 5 --- Induction of Linear and Quadratic Decision Trees using Spatial Data Structures --- p.111 / Chapter 5.1 --- Introduction --- p.111 / Chapter 5.2 --- Construction of k-D Trees --- p.113 / Chapter 5.3 --- Construction of Generalized Quadtrees ---
p.119 / Chapter 5.4 --- Induction of Oblique Decision Trees using Spatial Data Structures --- p.124 / Chapter 5.5 --- Induction of Quadratic Decision Trees using Spatial Data Structures --- p.130 / Chapter 5.6 --- Performance Evaluation --- p.139 / Chapter 5.6.1 --- Performance Comparison with Various Supervised Classification Algorithms --- p.142 / Chapter 5.6.2 --- Effects of Changing the Minimum Number of Training Samples at Each Node of a k-D Tree --- p.155 / Chapter 5.6.3 --- Effects of Changing the Minimum Number of Training Samples at Each Node of a Generalized Quadtree --- p.157 / Chapter 5.6.4 --- Effects of Changing the Size of Datasets --- p.158 / Chapter 5.7 --- Chapter Summary --- p.160 / Chapter 6 --- Conclusions --- p.164 / Chapter 6.1 --- Contributions --- p.164 / Chapter 6.2 --- Future Work --- p.167 / Chapter A --- Implementation of Data Mining Algorithms Specified in the Thesis --- p.170 / Bibliography --- p.178 Data mining Genetic algorithms
430	Design and control of a controllable hybrid mechanical metal forming press. / CUHK electronic theses & dissertations collection January 2008 A real-time dynamic feedback control system is developed. An improved PID algorithm, called the integral separated piecewise PID scheme, is used in the control system. This algorithm is able to limit the contribution of the integral component in the PID calculation to avoid integral windup. In addition, it can use different PID parameters to adapt to different segments within one punch motion cycle. Hence, the error of the punch motion, whether resulting from the machine assembly or from the machine dynamics, can be compensated for by tuning the velocity of the servomotor. This is a unique feature of the new press that ensures its accuracy. / Based on the novel structure, the detailed design is then carried out, which includes the mechanical design, kinematics and inverse kinematics analysis, static force analysis, parametric design and other related design work. A calibration method based on experiment and computer simulation is proposed for the new press, which is also useful for parallel mechanisms. In cooperation with Guangdong Metal Forming Machine Works Co. Ltd., a 250 kN prototype has been built and tested. / In order to ensure the desirable performance, dynamic control is necessary. The thesis uses two dynamic modeling methods to study the dynamics of the press. One is the kineto-static method. Also called D'Alembert's principle, it rearranges Newton's second law and converts a dynamic problem into an equivalent static problem by adding the inertial forces and inertial torques to the system. The model can then be analyzed easily and exactly as a static system subjected to the inertial forces and torques and the external forces. The other method is the Lagrangian method, which derives the dynamic model from the energy perspective.
Based on the model, the dynamics of the press is studied by means of computer simulation and is validated experimentally. / In this thesis, a controllable hybrid mechanical metal forming press is developed, which is driven by a CSM with a flywheel and a servomotor. From a mechanism point of view, it is a closed-loop 2-DOF parallel planar five-bar mechanism with four revolute joints and one prismatic joint. Thanks to the use of the servomotor, the punch motion of the new press can be controlled by tuning the velocity of the servomotor. Accordingly, desired punch motions for different stamping processes can be obtained. In other words, the new press is flexible and controllable like the servo mechanical press and the hydraulic press. Moreover, the CSM with flywheel provides the main power during the stamping operation, and hence, it is energy efficient. In addition, it is not expensive to build, as it uses only a small servomotor. / Metal forming is one of the oldest production processes and yet is still one of the most commonly used processes today. Every day, millions of parts are produced by metal forming, ranging from battery caps to automobile body panels. Therefore, even a small improvement may add up to a significant corporate gain. / The thesis also describes the trajectory planning method for the press, which is based on the combination of the inverse kinematics and cubic spline interpolation. The trajectory is optimized under multiple constraints on velocity, acceleration and jerk of the servomotor. It guarantees the new press is controllable and energy efficient. / Two typical stamping processes, drawing and forging, are taken as examples for the operations of the new press. The results of the simulation and the experiment match well.
Based on the simulation and experiments, it is found that the CSM provides the main power for the metal forming operations, while the servomotor is mainly responsible for overcoming the inertia forces to realize the desired punch motion. The experiments show that the new press is energy efficient, fast, controllable and inexpensive to build. It combines the advantages of both the mechanical press and the hydraulic press and performs well. The new press is expected to have great potential for the metal forming industry. (Abstract shortened by UMI.) / He, Kai. / "February 2008." / Adviser: Ruxu Du. / Source: Dissertation Abstracts International, Volume: 70-03, Section: B, page: 1902. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (p. 147-149). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307. Metal stamping--Data processing
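The abstract of entry 430 names an integral separated piecewise PID scheme that limits the integral contribution and switches PID parameters between segments of the punch cycle. A minimal sketch of that general idea follows, under stated assumptions: the class and field names, the segment labels, the threshold rule, and the choice to freeze the accumulator while the error is large are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Gains:
    kp: float
    ki: float
    kd: float


class IntegralSeparatedPID:
    """Sketch of integral separation with per-segment gains (assumed design)."""

    def __init__(self, segment_gains: Dict[str, Gains],
                 separation_threshold: float, dt: float) -> None:
        self.segment_gains = segment_gains      # one gain set per motion segment
        self.threshold = separation_threshold   # |error| above this disables I
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, segment: str, setpoint: float, measured: float) -> float:
        gains = self.segment_gains[segment]     # piecewise: pick gains by segment
        error = setpoint - measured
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        if abs(error) <= self.threshold:
            # Small error: integrate as in an ordinary PID controller.
            self.integral += error * self.dt
            integral_term = gains.ki * self.integral
        else:
            # Large error: drop the integral term and freeze the accumulator,
            # which is what keeps it from winding up (a variant assumed here).
            integral_term = 0.0
        return gains.kp * error + integral_term + gains.kd * derivative


if __name__ == "__main__":
    pid = IntegralSeparatedPID(
        {"approach": Gains(2.0, 1.0, 0.0), "forming": Gains(4.0, 1.0, 0.0)},
        separation_threshold=0.5,
        dt=0.1,
    )
    print(pid.update("approach", 10.0, 0.0))
    print(pid.update("forming", 10.0, 9.8))
```

In the setting the abstract describes, the returned value would adjust the servomotor velocity; the gains and threshold above are placeholders, not tuned values.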
