
Predictive Techniques and Methods for Decision Support in Situations with Poor Data Quality

Today, decision support systems based on predictive modeling are becoming more common, since organizations often collect more data than decision makers can handle manually. Predictive models are used to find potentially valuable patterns in the data, or to predict the outcome of some event. There are numerous predictive techniques, ranging from simple techniques such as linear regression, to complex, powerful ones like artificial neural networks. Complex models usually obtain better predictive performance, but are opaque and thus cannot be used to explain predictions or discovered patterns.

The design choice of which predictive technique to use becomes even harder since no technique outperforms all others over a large set of problems. It is even difficult to find the best parameter values for a specific technique, since these settings are also problem dependent. One way to simplify this vital decision is to combine several models, possibly created with different settings and techniques, into an ensemble. Ensembles are known to be more robust and powerful than individual models, and ensemble diversity can be used to estimate the uncertainty associated with each prediction.

In real-world data mining projects, data is often imprecise, contains uncertainties, or is missing important values, making it impossible to create models with sufficient performance for fully automated systems. In these cases, predictions need to be manually analyzed and adjusted. Here, opaque models like ensembles have a disadvantage, since the analysis requires understandable models. To overcome this deficiency of opaque models, researchers have developed rule extraction techniques that try to extract comprehensible rules from opaque models, while retaining sufficient accuracy.

This thesis suggests a straightforward but comprehensive method for predictive modeling in situations with poor data quality. First, ensembles are used for the actual modeling, since they are powerful, robust and require few design choices. Next, ensemble uncertainty estimations pinpoint predictions that need special attention from a decision maker. Finally, rule extraction is performed to support the analysis of uncertain predictions. Using this method, ensembles can be used for predictive modeling, in spite of their opacity and sometimes insufficient global performance, while the involvement of a decision maker is minimized.

The main contributions of this thesis are three novel techniques that enhance the performance of the proposed method. The first technique deals with ensemble uncertainty estimation and is based on a successful approach often used in weather forecasting. The other two are improvements of a rule extraction technique, resulting in increased comprehensibility and more accurate uncertainty estimations.

Sponsorship: This work was supported by the Information Fusion Research Program (www.infofusion.se) at the University of Skövde, Sweden, in partnership with the Swedish Knowledge Foundation under grant 2003/0104.
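The three-step method outlined in the abstract (ensemble modeling, uncertainty-based flagging, rule extraction) can be illustrated with a small sketch. The abstract does not specify the thesis's actual techniques, so everything here is an assumption chosen for brevity: bagged one-split regression stumps stand in for the ensemble, the standard deviation of member predictions stands in for the uncertainty estimate, and a single stump fitted to the ensemble's output stands in for rule extraction. All function names and the toy data are hypothetical.

```python
import random
import statistics

def fit_stump(xs, ys):
    """Fit a one-split regression stump; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[1:]:  # rule is "x < t", so both sides are non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = statistics.fmean(left), statistics.fmean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # only one distinct x value: fall back to a constant model
        m = statistics.fmean(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x < t else rm

def fit_ensemble(xs, ys, n_members=50, seed=0):
    """Step 1: train each member on a bootstrap resample of the data (bagging)."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return members

def predict_with_uncertainty(members, x):
    """Ensemble prediction plus member disagreement (population std. deviation)."""
    preds = [predict_stump(m, x) for m in members]
    return statistics.fmean(preds), statistics.pstdev(preds)

def flag_uncertain(members, xs, k):
    """Step 2: return the k inputs whose member predictions disagree most."""
    return sorted(xs, key=lambda x: predict_with_uncertainty(members, x)[1],
                  reverse=True)[:k]

def extract_rule(members, xs):
    """Step 3: fit one stump to the ensemble's own predictions (a surrogate model)."""
    targets = [predict_with_uncertainty(members, x)[0] for x in xs]
    t, lm, rm = fit_stump(xs, targets)
    return t, f"IF x < {t} THEN {lm:.2f} ELSE {rm:.2f}"

# Hypothetical toy data: a step at x = 10 with two conflicting labels at the boundary.
xs = list(range(20))
ys = [0.0] * 9 + [1.0, 0.0] + [1.0] * 9

members = fit_ensemble(xs, ys)
print("needs attention:", flag_uncertain(members, xs, k=3))
print("extracted rule:", extract_rule(members, xs)[1])
```

In this sketch, the flagged inputs are the ones a decision maker would review by hand, and the extracted rule gives a readable approximation of what the ensemble as a whole has learned. A real application would replace each stand-in with the technique appropriate to the data at hand.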

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:hb-3517
Date: January 2009
Creators: König, Rikard
Publisher: Högskolan i Borås, Institutionen Handels- och IT-högskolan
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Licentiate thesis, monograph, info:eu-repo/semantics/masterThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
