11

Hybrid recommender system using association rules: a thesis submitted to Auckland University of Technology in partial fulfilment of the requirements for the degree of Master of Computer and Information Sciences (MCIS), 2009

Cristache, Alex. January 2009
Thesis (MCIS)--AUT University, 2009. / Includes bibliographical references. Also held in print ( leaves : ill. ; 30 cm.) in the Archive at the City Campus (T 006.312 CRI)
12

Association rule based classification

Palanisamy, Senthil Kumar. January 2006
Thesis (M.S.)--Worcester Polytechnic Institute. / Keywords: Itemset Pruning, Association Rules, Adaptive Minimal Support, Associative Classification, Classification. Includes bibliographical references (p. 70-74).
13

Use of data mining for investigation of crime patterns

Padhye, Manoday D. January 2006
Thesis (M.S.)--West Virginia University, 2006. / Title from document title page. Document formatted into pages; contains viii, 108 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 80-81).
14

Granule-based knowledge representation for intra and inter transaction association mining

Yang, Wanzhong. January 2009
With the phenomenal growth of electronic data and information, there is great demand for efficient and effective systems (tools) to perform data mining tasks on multidimensional databases. Association rules describe associations between items in the same transaction (intra) or in different transactions (inter). Association mining attempts to find interesting or useful association rules in databases; this is a crucial issue for the application of data mining in the real world. Association mining can be used in many application areas, such as discovering associations between customers' locations and shopping behaviours in market basket analysis. Association mining includes two phases. The first phase, called pattern mining, is the discovery of frequent patterns. The second phase, called rule generation, is the discovery of interesting and useful association rules among the discovered patterns. The first phase, however, often takes a long time to find all frequent patterns, and the discovered patterns include much noise. The second phase is also time-consuming and can generate many redundant rules. To improve the quality of association mining in databases, this thesis provides an alternative technique, granule-based association mining, for knowledge discovery in databases, where a granule refers to a predicate that describes common features of a group of transactions. The new technique first transfers transaction databases into basic decision tables, then uses multi-tier structures to integrate pattern mining and rule generation in one phase for both intra- and inter-transaction association rule mining. To evaluate the proposed technique, this research defines the concept of meaningless rules by considering the correlations between data dimensions for intra-transaction association rule mining. It also uses precision to evaluate the effectiveness of inter-transaction association rules. The experimental results show that the proposed technique is promising.
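The granule idea above can be illustrated in a few lines: group transactions by their shared attribute values and treat each group as one granule whose support is the number of transactions it covers. This is only a sketch under invented attribute names and data, not the thesis's multi-tier algorithm.

```python
from collections import defaultdict

# A granule is a predicate describing the common features of a group of
# transactions; here it is keyed by a tuple of (attribute, value) pairs,
# and its support is the number of transactions it covers.
def build_granules(transactions, attributes):
    granules = defaultdict(list)
    for tid, t in enumerate(transactions):
        key = tuple((a, t[a]) for a in attributes)
        granules[key].append(tid)
    return granules

# Hypothetical market-basket style records (attribute names invented).
transactions = [
    {"location": "north", "basket": "groceries"},
    {"location": "north", "basket": "groceries"},
    {"location": "south", "basket": "electronics"},
]
for granule, tids in build_granules(transactions, ["location", "basket"]).items():
    print(granule, "support =", len(tids))
```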
15

A Meaningful Candidate Approach to Mining Bi-Directional Traversal Patterns on the WWW

Chen, Jiun-rung. 27 July 2004
Since the World Wide Web (WWW) appeared, more and more useful information has become available on it. To find this information, the application of data mining techniques to the WWW, referred to as Web mining, has become a research area of increasing importance. Mining traversal patterns is one of the important topics in Web mining; it focuses on finding the page sequences that users browse frequently. Although algorithms for mining association rules (e.g., the Apriori and DHP algorithms) could be applied to mine traversal patterns, they do not exploit the structure of Web transactions and generate too many invalid candidate patterns, so they cannot provide good performance. Wu et al. proposed the SpeedTracer algorithm for mining traversal patterns, which utilizes the property of Web transactions, i.e., the continuity of traversal patterns in the Web structure. Although it decreases the number of candidate patterns generated in the mining process, it does not efficiently use this property to reduce the number of checks when verifying the subsets of each candidate pattern. In this thesis, we design three algorithms that improve the SpeedTracer algorithm for mining traversal patterns. The first, SpeedTracer*-I, utilizes the property of Web transactions to generate and count all candidate patterns directly from user sessions, and also uses it to improve the checking step when candidate patterns are generated. Building on SpeedTracer*-I, we then propose the SpeedTracer*-II and SpeedTracer*-III algorithms, which improve performance by reducing the number of database scans. In SpeedTracer*-II, given a parameter n, we apply SpeedTracer*-I to find Ln first and use Ln to generate all Ck, where k > n; after generating all candidate patterns, we scan the database once to count them, and the frequent patterns can then be determined. In SpeedTracer*-III, given a parameter n, we also apply SpeedTracer*-I to find Ln first, and then directly generate and count Ck from user sessions based on Ln, where k > n. The simulation results show that SpeedTracer*-I outperforms SpeedTracer in terms of processing time, and that SpeedTracer*-II and SpeedTracer*-III outperform both SpeedTracer and SpeedTracer*-I because they require fewer database scans. Moreover, our simulation results show that all of the proposed algorithms provide better performance than Apriori-like algorithms (e.g., the FS and FDLP algorithms) in terms of processing time.
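The continuity property exploited here can be sketched as follows: because a traversal pattern must be a consecutive run of pages within a session, length-k candidates can be generated and counted directly with a sliding window over each session, with no Apriori-style join step. The sessions and threshold below are invented; this is a sketch of the general idea, not the thesis's implementation.

```python
from collections import Counter

# Count contiguous length-k page sequences across sessions and keep those
# meeting a minimum support; each pattern is counted once per session.
def count_traversal_patterns(sessions, k, min_support):
    counts = Counter()
    for session in sessions:
        seen = set()
        for i in range(len(session) - k + 1):
            seen.add(tuple(session[i:i + k]))
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}

sessions = [["A", "B", "C", "D"], ["A", "B", "C"], ["B", "C", "D"]]
print(count_traversal_patterns(sessions, 2, min_support=2))
```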
16

Constructing Directed Domain Knowledge Structure Map Using Association Rule - An Example of MIS Domain

Cheng, Pai-shung. 31 August 2006
In the coming knowledge-based economy era, the knowledge structure map (KSM) has become more and more important. Learners who study without the support of a knowledge structure map face the problem of learning alone. To construct a real KSM, we targeted the MIS domain. Using the National Dissertation and Thesis Abstract System as the input source, we first extract different research subjects from keywords and then calculate the relation strength between each pair of keywords. An automatic approach has been developed for constructing KSMs for different periods of time. The constructed KSM can help learners reduce the isolation of learning alone and provides a good reference for new researchers seeking related research directions. The proposed method can also be applied in enterprises, which can adopt it to construct a KSM for their own professional domain; such a KSM would help new employees learn better. Furthermore, with the support of a KSM, executives can make better decisions, as the KSM captures internal and external competitive advantages and future directions.
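A minimal sketch of the pairwise relation-strength computation described above, using rule confidence support({a, b}) / support({a}) as a directed strength between keywords; the keyword sets and threshold are hypothetical, and the thesis's actual strength measure may differ.

```python
from collections import Counter
from itertools import permutations

# Directed relation strength between keywords, measured as rule confidence:
# strength(a -> b) = support({a, b}) / support({a}).
def keyword_edges(documents, min_confidence=0.5):
    single, pair = Counter(), Counter()
    for keywords in documents:
        ks = set(keywords)
        single.update(ks)
        pair.update(permutations(ks, 2))  # ordered pairs -> directed edges
    return {(a, b): pair[(a, b)] / single[a]
            for (a, b) in pair
            if pair[(a, b)] / single[a] >= min_confidence}

# Invented keyword sets, one per thesis abstract.
docs = [{"data mining", "CRM"}, {"data mining", "e-learning"},
        {"data mining", "CRM", "e-commerce"}]
print(keyword_edges(docs))
```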
17

Discovery of fuzzy temporal and periodic association rules

Lee, Wan-Jui. 29 January 2008
With the rapidly growing volumes of data from various sources, new tools and computational theories are required to extract useful information (knowledge) from large databases. Data mining techniques such as association rules have proved effective in searching for hidden knowledge in large databases. However, to extract knowledge from data with temporal components, it becomes necessary to incorporate temporal semantics into traditional data mining techniques. As mining techniques evolve, more mathematical techniques are brought in to improve the quality and diversity of mining; fuzzy theory is one that has been adopted for this purpose. Many approaches have been proposed to discover temporal association rules or fuzzy association rules, respectively, but no work has addressed mining fuzzy temporal patterns. We propose in this thesis two data mining systems, for discovering fuzzy temporal association rules and fuzzy periodic association rules, respectively. The mined patterns are expressed as fuzzy temporal and periodic association rules that satisfy the temporal requirements specified by the user. Temporal requirements specified by human beings tend to be ill-defined or uncertain. To deal with this kind of uncertainty, a fuzzy calendar algebra is developed to allow users to describe desired temporal requirements in fuzzy calendars easily and naturally. Moreover, the fuzzy calendar algebra helps construct the desired time intervals in which interesting patterns are discovered and presented as fuzzy temporal and periodic association rules. In our system for mining fuzzy temporal association rules, a border-based mining algorithm is proposed to find association rules incrementally. By keeping useful information about the database in a border, candidate itemsets can be computed efficiently. Updating the discovered knowledge after transactions are added or deleted can also be done efficiently: the kept information helps save counting work, and unnecessary scans over the updated database are avoided. Simulation results show the effectiveness of the proposed system for mining fuzzy temporal association rules. In our system for discovering fuzzy periodic association rules, we develop techniques for discovering patterns with periodicity. Patterns with periodicity are those that occur at regular time intervals, so there are two aspects to the problem: finding the pattern and determining the periodicity. The difficulty of the task lies in discovering these regular time intervals, i.e., the periodicity. Periodicities in the database are usually imprecise, subject to disturbances, and may occur at time intervals in multiple time granularities. To discover patterns with fuzzy periodicity, we utilize the information of crisp periodic patterns to obtain a lower bound for generating candidate itemsets with fuzzy periodicities. Experimental results show that our system is effective in discovering fuzzy periodic association rules.
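The role of the fuzzy calendar can be sketched with a toy triangular membership function: each transaction contributes its degree of membership in the user's fuzzy time specification to the support count, instead of a hard 0/1. The calendar shape, items, and timestamps below are invented for illustration, not the thesis's fuzzy calendar algebra.

```python
# Triangular membership for a fuzzy "around the start of the month" calendar:
# full membership at day `center`, falling linearly to 0 over `spread` days.
def around_day(day, center=5, spread=5):
    return max(0.0, 1.0 - abs(day - center) / spread)

# Fuzzy support: weight each transaction containing the itemset by its
# calendar membership, normalized by the total membership mass.
def fuzzy_support(transactions, itemset):
    itemset = set(itemset)
    total = sum(around_day(day) for day, _ in transactions)
    hits = sum(around_day(day) for day, items in transactions
               if itemset <= set(items))
    return hits / total if total else 0.0

transactions = [(3, ["bread", "milk"]), (5, ["bread"]), (20, ["bread", "milk"])]
print(fuzzy_support(transactions, ["bread", "milk"]))  # 0.375
```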
18

An Efficient Parameter-Relationship-Based Approach for Projected Clustering

Huang, Tsun-Kuei. 16 June 2008
The clustering problem has been discussed extensively in the database literature as a tool for many applications, for example bioinformatics. Traditional clustering algorithms consider all dimensions of an input dataset in an attempt to learn as much as possible about each object described. In high-dimensional data, however, many of the dimensions are often irrelevant; projected clustering has therefore been proposed. A projected cluster is a subset C of data points together with a subset D of dimensions such that the points in C are closely clustered in the subspace of dimensions D. Many algorithms have been proposed to find projected clusters; most of them fall into three classes: partitioning, density-based, and hierarchical. The DOC algorithm is a well-known density-based algorithm for projected clustering. It uses a Monte Carlo procedure to iteratively compute projected clusters and proposes a formula for the quality of a cluster. The FPC algorithm, an extended version of DOC, uses large-itemset mining to find the dimensions of a projected cluster. Finding large itemsets is the main goal of mining association rules, where a large itemset is a combination of items whose number of occurrences in the dataset exceeds a given threshold. Although the FPC algorithm uses large-itemset mining to speed up finding projected clusters, it still needs many user-specified parameters. Moreover, in its first step, FPC chooses the medoid by applying a random approach several times, which takes a long time and may still yield a bad medoid. Furthermore, the cluster quality measure can be refined by taking the weight of dimensions into consideration. In this thesis, we therefore propose an algorithm that addresses these disadvantages. First, we observe the relationship between parameters and propose a parameter-relationship-based algorithm that needs only two parameters, instead of the three required by most projected clustering algorithms. Next, our algorithm chooses the medoid using the median; the medoid is chosen only once, and the quality of our clusters is better than in the FPC algorithm. Finally, our quality measure considers the weight of each dimension of the cluster, giving different values according to how often each dimension occurs. This makes the quality of projected clustering based on our algorithm better than that of the FPC algorithm and avoids clusters containing too many irrelevant dimensions. Our simulation results show that our algorithm outperforms the FPC algorithm in terms of both execution time and clustering quality.
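For context, the quality formula DOC proposes is mu(|C|, |D|) = |C| * (1/beta)^|D|, which trades off cluster size against the number of relevant dimensions (beta is in (0, 1), so more relevant dimensions raise the score). A dimension-weighted variant of the kind this thesis argues for might look like the sketch below; the weighting scheme here is a guess at the refinement, not the thesis's actual formula.

```python
# DOC quality: more points and more relevant dimensions both raise the score.
def doc_quality(num_points, num_dims, beta=0.25):
    return num_points * (1.0 / beta) ** num_dims

# Hypothetical weighted variant: each relevant dimension counts fractionally,
# according to a weight in (0, 1] reflecting how strongly it recurs.
def weighted_quality(num_points, dim_weights, beta=0.25):
    return num_points * (1.0 / beta) ** sum(dim_weights)

print(doc_quality(100, 3))                     # 6400.0
print(weighted_quality(100, [1.0, 0.8, 0.5]))  # weaker dimensions add less
```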
19

Validating cohesion metrics by mining open source software data with association rules

Singh, Pariksha. January 2008
Dissertation submitted in fulfillment of the requirements for the degree of Masters in Information Technology, Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, 2008. / Competitive pressure on the software industry encourages organizations to examine the effectiveness of their software development and evolutionary processes. It is therefore important that software be measured in order to improve its quality. The question is not whether we should measure software but how it should be measured. Software measurement has existed for over three decades and is still in the process of becoming a mature science. The many influences of new software development technologies have led to diverse growth in software measurement technologies, which has resulted in various definitions and validation techniques. An important aspect of software measurement is the measurement of design, which nowadays often means the measurement of object-oriented design. Chidamber and Kemerer (1994) designed a metric suite for object-oriented design, which has provided a new foundation for metrics and acts as a starting point for the further development of the software measurement science. This study documents theoretical object-oriented cohesion metrics and calculates those metrics for classes extracted from a sample of open source software packages. For each package, the following data is recorded: software size, age, domain, number of developers, number of bugs, support requests, feature requests, etc. The study then uses association rules to test hypotheses about which theoretical cohesion metrics are validated: that older software is more cohesive than younger software, that bigger packages are less cohesive than smaller packages, and that smaller software programs are more maintainable. In this way, the study attempts to validate existing theoretical object-oriented cohesion metrics by mining open source software data with association rules.
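One way to read "testing hypotheses by means of association rules" is to discretize each package's attributes into categorical items and measure rule confidence; the sketch below does this for the "older software is more cohesive" hypothesis. The packages, cut-offs, and cohesion labels are fabricated for illustration only.

```python
# Each package is described by discretized attributes (all values invented).
packages = [
    {"age": "old", "size": "small", "cohesion": "high"},
    {"age": "old", "size": "large", "cohesion": "high"},
    {"age": "young", "size": "large", "cohesion": "low"},
    {"age": "young", "size": "small", "cohesion": "high"},
]

# confidence(A -> B) = support(A and B) / support(A), over attribute-value rows.
def confidence(rows, antecedent, consequent):
    a = sum(1 for r in rows if all(r[k] == v for k, v in antecedent.items()))
    ab = sum(1 for r in rows
             if all(r[k] == v for k, v in {**antecedent, **consequent}.items()))
    return ab / a if a else 0.0

# Hypothesis "older software is more cohesive" as the rule age=old -> cohesion=high.
print(confidence(packages, {"age": "old"}, {"cohesion": "high"}))  # 1.0
```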
