Efficient database management based on complex association rules

Applications accumulate large amounts of data, which are stored in databases. At this scale, name conflicts and missing values sometimes occur, and they prevent certain types of analysis. In this work, we solve the name conflict problem by comparing the similarity of the data and converting the test data into the form of a given template dataset. Many methods exist for discovering knowledge from a given dataset. One popular method is association rule mining, which finds associations between items. This study unifies the incomplete data using association rules. However, most rules produced by traditional association rule mining are item-to-item rules, which are a less than perfect solution to the problem. The data recovery system is therefore based on complex association rules, which can find two more types of association rules: prefix pattern-to-item rules and suffix pattern-to-item rules. Using complex association rules, several missing values are filled in. To find the frequent prefixes and frequent suffixes, the system uses an FP-tree to reduce time, cost, and redundancy. The system can also use the segment phrases method, which splits a sentence into several phrases based on the viscosity of two words. Additionally, data compression and a hash map were used to speed up the search.
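As a rough illustration of the prefix pattern-to-item idea, the sketch below counts which item follows each prefix in a template dataset and uses the most confident rule to fill a missing value. It is not the thesis's implementation: the function names, the thresholds, and the sample records are all hypothetical, and plain hash-map counting stands in for the FP-tree that the thesis uses to find frequent prefixes.

```python
from collections import Counter, defaultdict


def mine_prefix_rules(records, min_support=2, min_confidence=0.6):
    """Return {prefix: (item, confidence)} for the best prefix -> next-item rule.

    Assumption: each record is an ordered list of items. Thresholds are
    illustrative defaults, not values taken from the thesis.
    """
    follow = defaultdict(Counter)  # prefix -> counts of the item that follows it
    for rec in records:
        for i in range(1, len(rec)):
            follow[tuple(rec[:i])][rec[i]] += 1

    rules = {}
    for prefix, counts in follow.items():
        total = sum(counts.values())
        item, hits = counts.most_common(1)[0]
        # Keep a rule only if it is both frequent and confident enough.
        if hits >= min_support and hits / total >= min_confidence:
            rules[prefix] = (item, hits / total)
    return rules


def fill_missing(record, rules):
    """Replace None entries left to right when a rule matches the preceding prefix."""
    filled = list(record)
    for i, value in enumerate(filled):
        prefix = tuple(filled[:i])
        if value is None and prefix in rules:
            filled[i] = rules[prefix][0]
    return filled


# Made-up template dataset, for illustration only.
template = [
    ["SE", "Sundsvall", "851"],
    ["SE", "Sundsvall", "851"],
    ["SE", "Sundsvall", "852"],
    ["SE", "Stockholm", "111"],
]
rules = mine_prefix_rules(template)
repaired = fill_missing(["SE", "Sundsvall", None], rules)
print(repaired)
```

A suffix pattern-to-item rule could be mined the same way by reversing each record before counting. Records whose prefix has no sufficiently supported rule are left unchanged.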

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:miun-31917
Date: January 2017
Creators: Zhang, Heng
Publisher: Mittuniversitetet, Avdelningen för informationssystem och -teknologi
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
