71 |
AN EFFICIENT SET-BASED APPROACH TO MINING ASSOCIATION RULES / Hsieh, Yu-Ming. 28 July 2000
Discovery of association rules is an important problem in the area of data
mining. Given a database of sales transactions, it is desirable to discover
the important associations among items such that the presence of some items
in a transaction will imply the presence of other items in the same
transaction.
Since mining association rules may require repeatedly scanning a large
transaction database to find different association patterns, the amount of
processing could be huge, and performance improvement is an essential
concern.
Within this problem, efficiently counting large
itemsets is the major task, where a large itemset is a set of items
appearing in a sufficient number of transactions.
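
As an illustration of the definition above, the following is a minimal brute-force sketch of counting large itemsets. It is not the SETM or SETM* algorithm; the function name `large_itemsets`, the `min_support` threshold parameter, and the toy transactions are all hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def large_itemsets(transactions, k, min_support):
    """Return the k-itemsets contained in at least min_support transactions.

    Brute-force illustration of the definition only; hypothetical helper,
    not the set-based method proposed in the thesis.
    """
    counts = Counter()
    for items in transactions:
        # Each k-subset of a transaction counts as one occurrence of that itemset.
        for itemset in combinations(sorted(set(items)), k):
            counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical toy sales transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

pairs = large_itemsets(transactions, k=2, min_support=2)
```

Here every pair of items appears in exactly two transactions, so all three pairs are large at a threshold of 2 and none is large at a threshold of 3.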
In this thesis, we propose efficient algorithms for mining association
rules based on a high-level set-based approach.
A set-based approach allows a clear expression of what needs to be done,
as opposed to a low-level approach, which specifies exactly how the operations are carried out
and retrieves one tuple from the database at a time.
The advantage of a set-based approach, such as the SETM algorithm,
is that it is simple and stable over the range of parameter values.
However, the SETM algorithm proposed by Houtsma and Swami may generate too many invalid candidate itemsets.
Therefore, in this thesis, we propose a set-based algorithm called SETM*,
which provides the same advantages as the SETM algorithm
while avoiding its disadvantages.
In the SETM* algorithm, we reduce the size of the candidate database by
modifying the way of constructing it,
where a candidate database is a transaction database formed with candidate
$k$-itemsets.
Then, based on the new way to construct the candidate database in the SETM*
algorithm, we propose the SETM*-2K, SETM*-MaxK and SETM*-Lmax algorithms.
In the SETM*-2K algorithm, given a $k$, we efficiently construct $L_{k}$
based on $L_{w}$, where $w=2^{\lceil \log_{2} k \rceil - 1}$,
instead of step by step.
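
To make the choice of $w$ concrete, here is a small sketch that evaluates the formula, reading it as $w = 2^{\lceil \log_{2} k \rceil - 1}$. The function name `base_level` is hypothetical, and the restriction to $k \geq 2$ is an assumption made here so that $w$ is a positive integer; the abstract does not state it.

```python
def base_level(k):
    """Return w = 2 ** (ceil(log2(k)) - 1) for an integer k >= 2.

    Hypothetical helper. Uses integer arithmetic: for k >= 1,
    (k - 1).bit_length() equals ceil(log2(k)), which avoids
    floating-point rounding near exact powers of two.
    """
    if k < 2:
        raise ValueError("k must be at least 2 (assumption of this sketch)")
    return 1 << ((k - 1).bit_length() - 1)


# Under this reading, L_5 and L_8 are both built from L_4, and L_9 from L_8.
examples = {k: base_level(k) for k in (2, 3, 5, 8, 9)}
```

Under this reading, $w$ is the largest power of two strictly below $k$, so the level used as the starting point doubles each time $k$ passes a power of two.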
In the SETM*-MaxK algorithm, we efficiently find the $L_{k}$ based on $L_{w}$,
where $L_{k} \neq \emptyset$, $L_{k+1} = \emptyset$ and $w=2^{\lceil \log_{2} k \rceil - 1}$,
instead of step by step.
In the SETM*-Lmax algorithm, we use a forward approach to find all maximal large itemsets from $L_{k}$,
where a large $k$-itemset is maximal if it is not included in the $k$-subsets of any large $j$-itemset
(every itemset in $L_{MaxK}$ is maximal), with $1 \leq k < j \leq MaxK$,
$L_{MaxK} \neq \emptyset$ and $L_{MaxK+1} = \emptyset$.
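
The notion of a maximal large itemset can be sketched as a simple filter. This is only an illustration of the definition, assuming that an itemset is maximal when no other large itemset strictly contains it; it is not the forward approach of SETM*-Lmax, and the name `maximal_large_itemsets` and the sample levels are hypothetical.

```python
def maximal_large_itemsets(levels):
    """Return the large itemsets that no other large itemset strictly contains.

    `levels` maps k to the set of large k-itemsets, each a frozenset.
    Naive quadratic check written for clarity (hypothetical helper).
    """
    all_large = [s for itemsets in levels.values() for s in itemsets]
    # `s < t` on frozensets tests whether s is a proper subset of t.
    return {s for s in all_large if not any(s < t for t in all_large)}


# Hypothetical levels: L_1 = {a, b, c}, L_2 = {ab}, so MaxK = 2.
levels = {
    1: {frozenset("a"), frozenset("b"), frozenset("c")},
    2: {frozenset("ab")},
}
maximal = maximal_large_itemsets(levels)
```

In this toy input, `{a, b}` is maximal because it sits at the top level, and `{c}` is maximal because no large 2-itemset contains it.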
We conduct several experiments using different synthetic relational databases.
The simulation results show that the
SETM* algorithm outperforms the SETM algorithm in terms of storage space or the
execution time for all relational database settings.
Moreover, we show that the proposed SETM*-2K
and SETM*-MaxK algorithms also require less time to achieve their goals than the SETM or SETM* algorithms.
Furthermore, we show that the proposed forward approach (SETM*-Lmax)
to find all maximal large itemsets requires less time than the backward approach proposed by Agrawal.
|
72 |
The Research of Population Census with Data Mining Technology and GIS / Chang, Chin-jui. 17 July 2009
This article offers the results of creative research: (1) Suggestions on a research structure for data mining visualization with GIS. (2) A search for the distribution of various population groups in society using the 2000 census as research background.
Single-parent families, aborigines and the elderly have long been considered disadvantaged social classes, and their widening problems will have a tremendous impact on society. This study applies data mining techniques to investigate the demographic features of socially disadvantaged groups in Taipei, Kaohsiung and Los Angeles County, using population data collected in the 2000 census, to provide a reference for social welfare decision makers in understanding these groups and forming policy. For heads of household in single-parent families and in families living in poverty, the demographic features, marital features and educational attainment were investigated. For aborigines, the demographic features, educational attainment and marital status were analyzed. For the elderly, the marital features, educational attainment, care and life patterns were studied.
|
73 |
Effective and efficient analysis of spatio-temporal data / Zhang, Zhongnan. January 2008
Thesis (Ph.D.)--University of Texas at Dallas, 2008. / Includes vita. Includes bibliographical references (leaves 106-114)
|
74 |
Efficient mining of association rules using conjectural information / Loo, Kin-kong. January 2001
Thesis (M. Phil.)--University of Hong Kong, 2001. / Includes bibliographical references (leaves 71-72).
|
75 |
Data mining-driven approaches for process monitoring and diagnosis / Sukchotrat, Thuntee. January 2008
Thesis (Ph.D.) -- University of Texas at Arlington, 2008.
|
76 |
Detecting malicious software by dynamic execution / Dai, Jianyong. January 2009
Thesis (Ph.D.)--University of Central Florida, 2009. / Adviser: Ratan K. Guha. Includes bibliographical references (p. 98-106).
|
77 |
A multi-level biomedical ontology-enabled broker dynamic service-based data source integration / Fu, Sheng-Chieh Jack. January 2008
Thesis ( Ph.D. ) -- University of Texas at Arlington, 2008.
|
78 |
Data mining medication administration incident data to identify opportunities for improving patient safety / Gray, Michael David. Thomas, Robert Evans. January 2009
Dissertation (Ph.D.)--Auburn University, 2009. / Abstract. Vita. Includes bibliographic references.
|
79 |
Financial market predictions using Web mining approaches / Ma, Yao. January 2009
Includes bibliographical references (p. 62-67).
|
80 |
Sequence mining algorithms / Zhang, Minghua, 張明華. January 2004
published_or_final_version / Computer Science and Information Systems / Doctoral / Doctor of Philosophy
|