Global ETD Search

Efficient database management based on complex association rules

Applications accumulate large amounts of data, which are stored in databases. At this scale, name conflicts and missing values sometimes occur, and they prevent certain types of analysis. In this work, we solve the name conflict problem by comparing the similarity of the data and converting the test data into the form of a given template dataset. Many methods exist for discovering knowledge in a given dataset. One popular method is association rule mining, which finds associations between items. This study completes the incomplete data based on association rules. However, most rules produced by traditional association rule mining are item-to-item rules, which do not fully solve the problem. The data recovery system is therefore based on complex association rules, which add two further rule types: prefix pattern-to-item rules and suffix pattern-to-item rules. Using complex association rules, several missing values are filled in. To find the frequent prefixes and frequent suffixes, the system uses an FP-tree to reduce time, cost and redundancy. The system can also use the segment phrases method, which splits a sentence into several phrases based on the viscosity of two words. Additionally, data compression and a hash map are used to speed up the search.
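As an illustration of the prefix pattern-to-item idea described in the abstract, the following is a minimal sketch, not the thesis implementation. It assumes each record is a fixed-order tuple of field values with `None` marking a missing value, and it counts prefixes with plain dictionaries rather than the FP-tree the thesis uses. The function names, the thresholds `min_support` and `min_confidence`, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict


def mine_prefix_rules(records, min_support=2, min_confidence=0.6):
    """Learn prefix pattern-to-item rules: {prefix: (next_item, confidence)}.

    Assumption: plain counting stands in for the FP-tree used in the thesis.
    """
    prefix_counts = Counter()
    follow_counts = defaultdict(Counter)
    for rec in records:
        if None in rec:
            continue  # learn only from complete records
        for k in range(1, len(rec)):
            prefix = tuple(rec[:k])
            prefix_counts[prefix] += 1
            follow_counts[prefix][rec[k]] += 1

    rules = {}
    for prefix, total in prefix_counts.items():
        if total < min_support:
            continue  # prefix is not frequent enough
        item, hits = follow_counts[prefix].most_common(1)[0]
        confidence = hits / total
        if confidence >= min_confidence:
            rules[prefix] = (item, confidence)
    return rules


def fill_missing(record, rules):
    """Fill None values left to right when a rule matches the preceding prefix."""
    filled = list(record)
    for i, value in enumerate(filled):
        if value is None and i > 0:
            rule = rules.get(tuple(filled[:i]))
            if rule is not None:
                filled[i] = rule[0]
    return filled


# Hypothetical sample data, for illustration only.
records = [
    ("Sweden", "Sundsvall", "MIUN"),
    ("Sweden", "Sundsvall", "MIUN"),
    ("Sweden", "Uppsala", "UU"),
    ("Sweden", "Sundsvall", None),
]
rules = mine_prefix_rules(records)
print(fill_missing(records[3], rules))
```

A suffix pattern-to-item rule could be sketched the same way by reversing each record before mining. A value with no matching rule is left as `None`, so the sketch never guesses below the chosen confidence threshold.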

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:miun-31917
Date	January 2017
Creators	Zhang, Heng
Publisher	Mittuniversitetet, Avdelningen för informationssystem och -teknologi
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
