Global ETD Search

1	An Obstruction-Check Approach to Mining Closed Sequential Patterns in Data Streams Chin, Tsz-lin 21 June 2010 (has links) Online mining of sequential patterns over data streams is an important problem in data mining. Sequential patterns in data streams have many applications, such as market analysis, network security, sensor networks, and web tracking. Previous studies have shown that mining closed patterns provides more benefits than mining the complete set of frequent patterns, since closed pattern mining leads to compact results. A sequential pattern is closed if it has no supersequence with the same support. Chang et al. proposed a time-based sliding window model. The time-based sliding window has two features: a new item is inserted at the front of a sequence, and an obsolete item is removed from the tail of a sequence. To solve the mining problem in the time-based sliding window, Chang et al. proposed an algorithm called SeqStream. It uses a data structure called IST (Inverse Closed Sequence Tree) to keep the result, and IST can be incrementally updated by the SeqStream algorithm. Although SeqStream divides the time-based sliding window to speed up the updating of IST, it still scans the sliding window many times when IST needs to be updated. In this thesis, we propose an obstruction-check approach to maintain the set of closed sequential patterns. Our approach is based on the lattice structure, in which a parent is a supersequence of its children. Using this feature, we place an obstruction link between a parent and a child if their supports are the same. If a node has no obstruction link to a parent, the node is a closed sequential pattern. We can then use this feature to traverse the lattice structure locally. 
Moreover, we can fully utilize the features of the time-based sliding window model to traverse the lattice structure locally. Based on the lattice structure, we propose the EULB (Exact Update based on Lattice structure with Bit stream)-Lattice algorithm. The EULB-Lattice algorithm is an exact method for mining data streams. We record additional information instead of scanning the entire sliding window. We conduct several experiments using different synthetic data sets. The simulation results show that the proposed algorithm outperforms the SeqStream algorithm. Closed Sequential Pattern Lattice Sliding Window Sequential Pattern Data Stream
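The definition of a closed sequential pattern in the abstract above can be illustrated with a small brute-force sketch. This is not the EULB-Lattice or SeqStream algorithm; it only shows the closedness test on a static toy database. The function names, the single-item-per-element sequence model, and the example data are assumptions made for illustration.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` consumes the iterator up to the first match,
    # so later items must appear after earlier ones.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def frequent_patterns(database, min_support):
    """Brute-force level-wise enumeration; requires min_support >= 1."""
    items = sorted({x for seq in database for x in seq})
    found = {}
    frontier = [()]
    while frontier:
        next_frontier = []
        for prefix in frontier:
            for x in items:
                candidate = prefix + (x,)
                s = support(candidate, database)
                if s >= min_support:
                    found[candidate] = s
                    next_frontier.append(candidate)
        frontier = next_frontier
    return found


def closed_patterns(patterns):
    """Keep patterns that have no longer supersequence with equal support."""
    closed = {}
    for p, s in patterns.items():
        obstructed = any(
            len(q) > len(p) and sq == s and is_subsequence(p, q)
            for q, sq in patterns.items()
        )
        if not obstructed:
            closed[p] = s
    return closed


# Hypothetical toy database: each string is one sequence of single items.
toy_db = ["abc", "abc", "ab", "b"]
print(closed_patterns(frequent_patterns(toy_db, min_support=2)))
```

In this toy database the pattern (a) is not closed because its supersequence (a, b) has the same support, which is the situation the abstract describes as an obstruction link between parent and child.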
2	Contrasting sequence groups by emerging sequences Deng, Kang. January 2009 (has links) Thesis (M. Sc.)--University of Alberta, 2009. / Title from PDF file main screen (viewed on Nov. 27, 2009). "A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science, Department of Computing Science, University of Alberta." Includes bibliographical references.
3	Pattern Mining and Concept Discovery for Multimodal Content Analysis Li, Hongzhi January 2016 (has links) With recent advances in computer vision, researchers have been able to demonstrate impressive, near-human-level performance in difficult tasks such as image recognition. For example, for images taken under typical conditions, computer vision systems can now recognize whether a dog, cat, or car appears in an image. These advances are made possible by the massive volume of image datasets and label annotations, which include category labels and sometimes bounding boxes around the objects of interest within the image. However, one major limitation of current solutions is that when users apply recognition models to new domains, they need to manually define the target classes and label the training data required to train the recognition models. Manually identifying the target classes and constructing the concept ontology for a new domain are time-consuming tasks, as they require the users to be familiar with the content of the image collection, and the manual process of defining target classes is difficult to scale up to a large number of classes. In addition, there has been significant interest in developing knowledge bases to improve content analysis and information retrieval. A knowledge base is an object model (ontology) with classes, subclasses, attributes, instances, and relations among them. The knowledge base generation problem is to identify the (sub)classes and their structured relations for a given domain of interest. As with ontology construction, a knowledge base is usually built manually by human experts, which is a time-consuming and difficult task. 
Thus, it is important to find a way to explore the semantic concepts and structural relations that matter for a target data collection or domain of interest, so that we can construct an ontology or knowledge base for visual data or multimodal content automatically or semi-automatically. Visual patterns are the discriminative and representative image content found in objects or local image regions seen in an image collection. Visual patterns can also be used to summarize the major visual concepts in an image collection. Therefore, automatic discovery of visual patterns can help users understand the content and structure of a data collection and in turn help users construct the ontology and knowledge base mentioned earlier. In this dissertation, we aim to answer the following question: given a new target domain and associated data corpora, how do we rapidly discover nameable content patterns that are semantically coherent, visually consistent, and can be automatically named with semantic concepts related to the events of interest in the target domains? We will develop pattern discovery methods that focus on visual content as well as multimodal data including text and visual content. Traditional visual pattern mining methods focus only on analysis of the visual content and cannot automatically name the patterns. To address this shortcoming, we propose a new multimodal visual pattern mining and naming method. The named visual patterns can be used as discovered semantic concepts relevant to the target data corpora. By combining information from multiple modalities, we can ensure that the discovered patterns are not only visually similar but also consistent in meaning. The capability of accurately naming the visual patterns is also important for finding relevant classes or attributes in the knowledge base construction process mentioned earlier. 
Our framework contains a visual model and a text model to jointly represent the text and visual content. We use the joint multimodal representation and the association rule mining technique to discover semantically coherent and visually consistent visual patterns. To discover better visual patterns, we further improve the visual model in the multimodal visual pattern mining pipeline by developing a convolutional neural network (CNN) architecture that allows for the discovery of scale-invariant patterns. In this dissertation, we use news as an example domain and image caption pairs as example multimodal corpora to demonstrate the effectiveness of the proposed methods. However, the overall proposed framework is general and can be easily extended to other domains. The problem of concept discovery is made more challenging if the target application domain involves fine-grained object categories (e.g., highly related dog categories or consumer product categories). In such cases, the content of different classes could be quite similar, making automatic separation of classes difficult. In the proposed multimodal pattern mining framework, representation models for visual and text data play an important role, as they shape the pool of candidates that are fed to the pattern mining process. General models like the CNN models trained on ImageNet, though shown to be generalizable to various domains, are unable to capture the small differences in the fine-grained dataset. To address this problem, we propose a new representation model that uses an end-to-end artificial neural network architecture to discover visual patterns. This model can be fine-tuned on a fine-grained dataset so that the convolutional layers can be optimized to capture the features and patterns from the fine-grained image set. 
It can discover visual patterns from fine-grained image datasets because the convolutional layers of the CNN can be optimized to capture the features and patterns from the fine-grained images. Finally, to demonstrate the advantage of the proposed multimodal visual pattern mining and naming framework, we apply the proposed technique to two applications. In the first application, we use the visual pattern mining technique to find visual anchors to summarize video news events. In the second application, we use the visual patterns as important cues to link video news events to social media events. The contributions of this dissertation can be summarized as follows: (1) We develop a novel multimodal mining framework for discovering visual patterns and nameable concepts from a collection of multimodal data and automatically naming the discovered patterns, producing a large pool of semantic concepts specifically relevant to a high-level event. The framework combines visual representation based on CNN and text representation based on embedding. The named visual patterns can be used to construct the event schema needed in the knowledge base construction process. (2) We propose a scale-invariant visual pattern mining model to improve the multimodal visual pattern mining framework. The improved visual model leads to better overall performance in discovering and naming concepts. To localize the visual patterns discovered in this framework, we propose a deconvolutional neural network model that localizes the visual patterns within the image. (3) To directly learn from data in the target domain, we propose a novel end-to-end neural network architecture called PatternNet for finding high-quality visual patterns even for datasets that consist of fine-grained classes. (4) We demonstrate novel uses of visual pattern mining in two applications: video news event summarization and video news event linking. 
Data mining Computer science Computer vision Sequential pattern mining
4	Constructing Bayesian Networks with Sequential Patterns for Hemodialysis Wang, Woei-Ru 05 August 2002 (has links) In this thesis, I introduce a multivariate discretization algorithm to discretize the continuous variables of clinical pathways of Hemodialysis and use a clustering algorithm to shift time stamps to reduce the number of nodes of Bayesian networks. The generalized sequential patterns algorithm is used to find the possible patterns that have a far-reaching effect on the next nodes of the Bayesian networks of Hemodialysis. A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest and easily incorporates new instances to keep rules up to date. Bayesian networks are used to represent knowledge of frequent state transitions in medical logs. Bayesian networks and sequential pattern algorithms can only handle discrete or categorical data. Therefore, we have to discretize the continuous variables with a suitable technique to generalize the nodes, and shift the time stamps of nodes to reduce the variations in time. With these generalizations, we reduce the over-fitting of the Bayesian networks of Hemodialysis. We expect that the discovered patterns can give more information to medical professionals and help them build the reciprocal cycle of knowledge management of Hemodialysis. knowledge management data mining Bayesian network Hemodialysis clustering sequential pattern
5	Applying Data Mining Technique to Analyze Sequential Patterns in the Stock Market in Taiwan Yeh, Ming-Wei 10 July 2003 (has links) Our research applies data mining techniques to historical stock market data to build an analysis model that assists investment decisions. The performance of the stock market is the aggregate of all individuals' decisions. In Taiwan's stock market, for instance, the prices of stocks in the same industry tend to rise in turn, and many corporations and investors invest more actively in one industry and then move to another in sequence, according to corporate strategy or other reasons. In addition, based on the theory of recurring prosperity, investors and corporations choose investment targets according to the characteristics of an industry and the state of its prosperity, which produces a recurring investment strategy. This phenomenon of sequential investment can be discovered with data mining techniques, especially sequential pattern analysis. Sequential pattern analysis is used to analyze the sequential relation between two things, and the technique has improved greatly in recent years. Using it to analyze the behavior of the stock market is a new research topic. The objective of this research is to derive sequential patterns of investment in Taiwan's stock market. Based on the historical transaction data of Taiwan's stock market, we mine sequential patterns among different stocks and then build a behavior model of the market to help stock investors make correct decisions. Data Mining Association Rule Stock Analysis Sequential pattern
6	The Impact of Gender Difference on Response Strategy in E-Negotiation Hu, Chia-hua 05 August 2009 (has links) Today people are already accustomed to doing business on the Internet. Electronic negotiation has also become popular because of its advantages. Furthermore, more and more females hold high positions in their companies and often engage in important activities such as electronic negotiation on the company's behalf. If negotiators could understand the differences between males and females in behavioral sequence and response strategy, they could interact better during negotiation no matter what their counterpart's gender is. This study explores the relation between gender composition and response strategy in E-Negotiation. We design an algorithm to find significant sequential patterns and then group them into three kinds of response strategies. Lastly, we use the Chi-Square Independence Test to examine the correlation and Column Comparison to see which gender composition has a significantly higher proportion of each of the three types of response strategies. The result suggests that gender compositions and response strategies are interrelated. Negotiators in inter-gender dyads are more likely to respond with a reciprocal strategy, and negotiators in intra-gender dyads are more likely to respond with a structural strategy. Moreover, female-only dyads are more likely to respond with all kinds of strategies compared to male-only dyads. Finally, females would respond to males with more reciprocal strategies and to females with more complementary and structural strategies. On the other hand, males would respond to female counterparts with more reciprocal strategies and to male counterparts with more structural strategies. Electronic negotiation Response strategy Sequential pattern Gender difference
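The Chi-Square Independence Test mentioned in the abstract above can be sketched as follows. This is a generic textbook computation of the Pearson statistic, not the study's analysis; the function name and the counts in the example table are invented for illustration and do not come from the thesis.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts (NOT from the thesis).
# Rows: inter-gender dyads, intra-gender dyads.
# Columns: reciprocal, complementary, structural responses.
hypothetical_counts = [
    [40, 25, 15],
    [20, 25, 35],
]

stat, dof = chi_square_independence(hypothetical_counts)
# Standard critical value of the chi-square distribution for 2 degrees
# of freedom at the 0.05 significance level.
CRITICAL_VALUE_DF2_ALPHA_05 = 5.991
print(stat, dof, stat > CRITICAL_VALUE_DF2_ALPHA_05)
```

A statistic above the critical value would lead to rejecting independence between dyad composition and response strategy for that table; a follow-up comparison of column proportions would then be needed to say which strategy differs.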
7	AV space for efficiently learning classification rules from large datasets / Wang, Linyan. January 2006 (has links) Thesis (M.Sc.)--York University, 2006. Graduate Programme in Computer Science. / Typescript. Includes bibliographical references (leaves 130-134). Also available on the Internet. MODE OF ACCESS via web browser by entering the following URL: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&res_dat=xri:pqdiss&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft_dat=xri:pqdiss:MR19748
8	SNIF TOOL - Sniffing for patterns in continuous streams Mukherji, Abhishek. January 2008 (has links) Thesis (M.S.)--Worcester Polytechnic Institute. / Keywords: continuous queries; streaming time-series; similarity queries; pattern matching. Includes bibliographical references (p. 58-61).
9	Cybersecurity Testing and Intrusion Detection for Cyber-Physical Power Systems Pan, Shengyi 13 December 2014 (has links) Power systems will increasingly rely on synchrophasor systems for reliable and high-performance wide area monitoring and control (WAMC). Synchrophasor systems rely heavily on information communication technologies (ICT) for data exchange, and these technologies are vulnerable to cyber-attacks. Prior to installation of a synchrophasor system, a set of cyber security requirements must be developed, and new devices must undergo vulnerability testing to ensure that proper security controls are in place to protect the synchrophasor system from unauthorized access. This dissertation describes vulnerability analysis and testing performed on synchrophasor system components. Two network fuzzing frameworks are proposed: one for the IEEE C37.118 protocol and one for an energy management system (EMS). While fixing the identified vulnerabilities in information infrastructures is imperative to secure a power system, it is likely that successful intrusions will still occur. The ability to detect intrusions is necessary to mitigate the negative effects of successful attacks. The emergence of synchrophasor systems provides real-time data with millisecond precision, which makes the observation of a sequence of fast events feasible. Different power system scenarios present different patterns in the observed fast event sequences. This dissertation proposes a data mining approach called mining common paths to accurately extract patterns for power system scenarios, including disturbances, control and protection actions, and cyber-attacks, from synchrophasor data and logs of system components. In this dissertation, such a pattern is called a common path, which is represented as a sequence of critical system states in temporal order. The process of automatically discovering common paths and building a state machine for detecting power system scenarios and attacks is introduced. 
The classification results show that the proposed approach can accurately detect these scenarios even with variation in fault locations and load conditions. This dissertation also describes a hybrid intrusion detection framework that employs the mining common paths algorithm to enable a systematic and automatic IDS construction process. An IDS prototype was validated on a 2-line 3-bus power transmission system protected by a distance protection scheme. The results show that the IDS prototype accurately classifies 25 power system scenarios, including disturbances, normal control operations, and cyber-attacks. Fuzzing Common Paths Sequential Pattern Mining Test Bed
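The idea of a common path as a temporally ordered sequence of critical system states can be sketched with a simple matcher. This is only an illustration of matching an observed state sequence against known paths; it is not the dissertation's mining algorithm or its state machine, and every scenario name and state label below is hypothetical.

```python
def matches_path(path, observed):
    """Return True if every state in `path` is seen in `observed`, in order."""
    position = 0
    for state in observed:
        # Advance only when the next expected state of the path appears.
        if position < len(path) and state == path[position]:
            position += 1
    return position == len(path)


def classify(observed, common_paths):
    """Names of all scenarios whose common path is matched by `observed`."""
    return [
        name for name, path in common_paths.items()
        if matches_path(path, observed)
    ]


# Hypothetical scenarios and state labels, invented for this sketch.
common_paths = {
    "line_fault": ["normal", "overcurrent", "relay_trip", "breaker_open"],
    "command_injection": ["normal", "remote_trip_cmd", "breaker_open"],
}

observed_states = ["normal", "overcurrent", "relay_trip", "breaker_open"]
print(classify(observed_states, common_paths))
```

Returning a list rather than a single label is a deliberate choice in this sketch: two scenarios can share states, so an observation may match more than one path and would need further evidence to disambiguate.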
10	Data Mining and Mathematical Models for Direct Market Campaign Optimization for Fred Meyer Jewelers Lin, Lebin January 2016 (has links) No description available. Industrial Engineering Operations Research Customer Relationship Management Sequential Pattern Mining Data Mining Optimization Time Based Sequential Pattern
