Global ETD Search

21	Extração de conhecimento simbólico em técnicas de aprendizado de máquina caixa-preta por similaridade de rankings / Symbolic knowledge extraction from black-box machine learning techniques with ranking similarities Rodrigo Elias Bianchi 26 September 2008 (has links) Técnicas de Aprendizado de Máquina não-simbólicas, como Redes Neurais Artificiais, Máquinas de Vetores de Suporte e combinação de classificadores têm mostrado um bom desempenho quando utilizadas para análise de dados. A grande limitação dessas técnicas é a falta de compreensibilidade do conhecimento armazenado em suas estruturas internas. Esta Tese apresenta uma pesquisa realizada sobre métodos de extração de representações compreensíveis do conhecimento armazenado nas estruturas internas dessas técnicas não-simbólicas, aqui chamadas de caixa preta, durante seu processo de aprendizado. A principal contribuição desse trabalho é a proposta de um novo método pedagógico para extração de regras que expliquem o processo de classificação seguido por técnicas não-simbólicas. Esse novo método é baseado na otimização (maximização) da similaridade entre rankings de classificação produzidos por técnicas de Aprendizado de Máquina simbólicas e não simbólicas (de onde o conhecimento interno esta sendo extraído). Experimentos foram realizados com vários conjuntos de dados e os resultados obtidos sugerem um bom potencial para o método proposto / Non-symbolic Machine Learning techniques, like Artificial Neural Networks, Support Vector Machines and Ensembles of classifiers have shown a good performance when they are used in data analysis. The strong limitation regarding the use of these techniques is the lack of comprehensibility of the knowledge stored in their internal structure. This Thesis presents an investigation of methods capable of extracting comprehensible representations of the knowledge acquired by these non-symbolic techniques, here named black box, during their learning process. The main contribution of this work is the proposal of a new pedagogical method for rule extraction that explains the classification process followed by non-symbolic techniques. This new method is based on the optimization (maximization) of the similarity between classification rankings produced by symbolic and non-symbolic (from where the internal knowledge is being extracted) Machine Learning techniques. Experiments were performed for several datasets and the results obtained suggest a good potential of the proposed method Aprendizado de máquina Extração de conhecimento Extração de regras Máquinas de vetores suporte Redes neurais Knowledge extraction Machine learning Neural networks Rule extraction Support vector machines
22	Predictive Techniques and Methods for Decision Support in Situations with Poor Data Quality König, Rikard January 2009 (has links) Today, decision support systems based on predictive modeling are becoming more common, since organizations often collectmore data than decision makers can handle manually. Predictive models are used to find potentially valuable patterns in the data, or to predict the outcome of some event. There are numerous predictive techniques, ranging from simple techniques such as linear regression,to complex powerful ones like artificial neural networks. Complexmodels usually obtain better predictive performance, but are opaque and thus cannot be used to explain predictions or discovered patterns.The design choice of which predictive technique to use becomes even harder since no technique outperforms all others over a large set of problems. It is even difficult to find the best parameter values for aspecific technique, since these settings also are problem dependent.One way to simplify this vital decision is to combine several models, possibly created with different settings and techniques, into an ensemble. Ensembles are known to be more robust and powerful than individual models, and ensemble diversity can be used to estimate the uncertainty associated with each prediction.In real-world data mining projects, data is often imprecise, contain uncertainties or is missing important values, making it impossible to create models with sufficient performance for fully automated systems.In these cases, predictions need to be manually analyzed and adjusted.Here, opaque models like ensembles have a disadvantage, since theanalysis requires understandable models. To overcome this deficiencyof opaque models, researchers have developed rule extractiontechniques that try to extract comprehensible rules from opaquemodels, while retaining sufficient accuracy.This thesis suggests a straightforward but comprehensive method forpredictive modeling in situations with poor data quality. First,ensembles are used for the actual modeling, since they are powerful,robust and require few design choices. Next, ensemble uncertaintyestimations pinpoint predictions that need special attention from adecision maker. Finally, rule extraction is performed to support theanalysis of uncertain predictions. Using this method, ensembles can beused for predictive modeling, in spite of their opacity and sometimesinsufficient global performance, while the involvement of a decisionmaker is minimized.The main contributions of this thesis are three novel techniques that enhance the performance of the purposed method. The first technique deals with ensemble uncertainty estimation and is based on a successful approach often used in weather forecasting. The other twoare improvements of a rule extraction technique, resulting in increased comprehensibility and more accurate uncertainty estimations. / <p><b>Sponsorship</b>:</p><p>This work was supported by the Information Fusion Research</p><p>Program (www.infofusion.se) at the University of Skövde, Sweden, in</p><p>partnership with the Swedish Knowledge Foundation under grant</p><p>2003/0104.</p> rule extraction genetic programming uncertainty estimation machine learning artificial neural networks information fusion Computer Science data mining Computer and Information Sciences Data- och informationsvetenskap Information Systems
23	Σύγκριση μεθόδων δημιουργίας έμπειρων συστημάτων με κανόνες για προβλήματα κατηγοριοποίησης από σύνολα δεδομένων Τζετζούμης, Ευάγγελος 31 January 2013 (has links) Σκοπός της παρούσας εργασίας είναι η σύγκριση διαφόρων μεθόδων κατηγοριοποίησης που στηρίζονται σε αναπαράσταση γνώσης με κανόνες μέσω της δημιουργίας έμπειρων συστημάτων από γνωστά σύνολα δεδομένων. Για την εφαρμογή των μεθόδων και τη δημιουργία και υλοποίηση των αντίστοιχων έμπειρων συστημάτων χρησιμοποιούμε διάφορα εργαλεία όπως: (α) Το ACRES, το οποίο είναι ένα εργαλείο αυτόματης παραγωγής έμπειρων συστημάτων με συντελεστές βεβαιότητας. Οι συντελεστές βεβαιότητος μπορούν να υπολογίζονται κατά δύο τρόπους και επίσης παράγονται δύο τύποι έμπειρων συστημάτων που στηρίζονται σε δύο διαφορετικές μεθόδους συνδυασμού των συντελεστών βεβαιότητας (κατά MYCIN και μιας γενίκευσης αυτής του MYCIN με χρήση βαρών που υπολογίζονται μέσω ενός γενετικού αλγορίθμου). (β) Το WEKA, το οποίο είναι ένα εργαλείο που περιέχει αλγόριθμους μηχανικής μάθησης. Συγκεκριμένα, στην εργασία χρησιμοποιούμε τον αλγόριθμο J48, μια υλοποίηση του γνωστού αλγορίθμου C4.5, που παράγει δένδρα απόφασης, δηλ. κανόνες. (γ) Το CLIPS, το οποίο είναι ένα κέλυφος για προγραμματισμό με κανόνες. Εδώ, εξάγονται οι κανόνες από το δέντρο απόφασης του WEKA και υλοποιούνται στο CLIPS με ενδεχόμενες μετατροπές. (δ) Το FuzzyCLIPS, το οποίο επίσης είναι ένα κέλυφος για την δημιουργία ασαφών ΕΣ. Είναι μια επέκταση του CLIPS που χρησιμοποιεί ασαφείς κανόνες και συντελεστές βεβαιότητος. Εδώ, το έμπειρο σύστημα που παράγεται μέσω του CLIPS μετατρέπεται σε ασαφές έμπειρο σύστημα με ασαφοποίηση κάποιων μεταβλητών. (ε) Το GUI Ant-Miner, το οποίο είναι ένα εργαλείο για την εξαγωγή κανόνων κατηγοριοποίησης από ένα δοσμένο σύνολο δεδομένων. με τη χρήση ενός μοντέλου ακολουθιακής κάλυψης, όπως ο αλγόριθμος AntMiner. Με βάση τις παραπάνω μεθόδους-εργαλεία δημιουργήθηκαν έμπειρα συστήματα από πέντε σύνολα δεδομένων κατηγοριοποίησης από τη βάση δεδομένων UCI Machine Learning Repository. Τα συστήματα αυτά αξιολογήθηκαν ως προς την ταξινόμηση με βάση γνωστές μετρικές (ορθότητα, ευαισθησία, εξειδίκευση και ακρίβεια). Από τη σύγκριση των μεθόδων και στα πέντε σύνολα δεδομένων, εξάγουμε τα παρακάτω συμπεράσματα: (α) Αν επιθυμούμε αποτελέσματα με μεγαλύτερη ακρίβεια και μεγάλη ταχύτητα, θα πρέπει μάλλον να στραφούμε στην εφαρμογή WEKA. (β) Αν θέλουμε να κάνουμε και παράλληλους υπολογισμούς, η μόνη εφαρμογή που μας παρέχει αυτή τη δυνατότητα είναι το FuzzyCLIPS, θυσιάζοντας όμως λίγη ταχύτητα και ακρίβεια. (γ) Όσον αφορά το GUI Ant-Miner, λειτουργεί τόσο καλά όσο και το WEKA όσον αφορά την ακρίβεια αλλά είναι πιο αργή μέθοδος. (δ) Σχετικά με το ACRES, λειτουργεί καλά όταν δουλεύουμε με υποσύνολα μεταβλητών, έτσι ώστε να παράγεται σχετικά μικρός αριθμός κανόνων και να καλύπτονται σχεδόν όλα τα στιγμιότυπα στο σύνολο έλεγχου. Στα σύνολα δεδομένων μας το ACRES δεν θεωρείται πολύ αξιόπιστο υπό την έννοια ότι αναγκαζόμαστε να δουλεύουμε με υποσύνολο μεταβλητών και όχι όλες τις μεταβλητές του συνόλου δεδομένων. Όσο πιο πολλές μεταβλητές πάρουμε ως υποσύνολο στο ACRES, τόσο πιο αργό γίνεται. / The aim of this thesis is the comparison of several classification methods that are based on knowledge representation with rules via the creation of expert systems from known data sets. For the application of those methods and the creation and implementation of the corresponding expert systems, we use various tools such as: (a) ACRES, which is a tool for automatic production of expert systems with certainty factors. The certainty factors can be calculated via two different methods and also two different types of expert systems can be produced based on different methods of certainty propagation (that of MYCIN and a generalized version of MYCIN one that uses weights calculated via a genetic algorithm). (b) WEKA, which is a tool that contains machine learning algorithms. Specifically, we use J48, an implementation of the known algorithm C4.5, which produces decision trees, which are coded rules. (c) CLIPS, which is a shell for rule based programming. Here, the rules encoded on the decision true produced by WEKA are extracted and codified in CLIPS with possible changes. (d) FuzzyCLIPS, which is a shell for creating fuzzy expert systems. It's an extension of CLIPS that uses fuzzy rules and certainty factors. Here, the expert system created via CLIPS is transferred to a fuzzy expert system by making some variables fuzzy. (e) GUI Ant-Miner, which is a tool for classification rules extraction from a given data set, using a sequential covering model, such as the AntMiner algorithm. Based on the above methods-tools, expert systems were created from five (5) classification data sets from the UCI Machine Learning Repository. Those systems have been evaluated according to their classification capabilities based on known metrics (accuracy, sensitivity, specificity and precision). From the comparison of the methods on the five data sets, we conclude the following: (a) if we want results with greater accuracy and high speed, we should probably turn into WEKA. (b) if we want to do parallel calculations too, the only tool that provides us this capability is FuzzyCLIPS, sacrificing little speed and accuracy. (c) With regards to GUI Ant-Miner, it works as well as WEKA in terms of accuracy, but it is slower. (d) About ACRES, it works well when we work with subsets of the variables, so that it produces a relatively small number or rules and covers almost all the instances of the test set. For our datasets, ACRES is not considered very reliable in the sense that we should work with subsets of variables, not all the variables of the dataset. The more variables we consider as a subset in ACRES, the slower it becomes. Έμπειρα συστήματα Ακολουθιακή κάλυψη 006.33 Expert systems Classification rule extraction Sequential covering Certainty factors Automatic creation of expert systems Machine learning algorithms
24	Automatické ladění vah pravidlových bází znalostí / Automated Weight Tuning for Rule-Based Knowledge Bases Valenta, Jan January 2009 (has links) This dissertation thesis introduces new methods of automated knowledge-base creation and tuning in information and expert systems. The thesis is divided in the two following parts. The first part is focused on the legacy expert system NPS32 developed at the Faculty of Electrical Engineering and Communication, Brno University of Technology. The mathematical base of the system is expression of the rule uncertainty using two values. Thus, it extends information capability of the knowledge-base by values of the absence of the information and conflict in the knowledge-base. The expert system has been supplemented by a learning algorithm. The learning algorithm sets weights of the rules in the knowledge base using differential evolution algorithm. It uses patterns acquired from an expert. The learning algorithm is only one-layer knowledge-bases limited. The thesis shows a formal proof that the mathematical base of the NPS32 expert system can not be used for gradient tuning of the weights in the multilayer knowledge-bases. The second part is focused on multilayer knowledge-base learning algorithm. The knowledge-base is based on a specific model of the rule with uncertainty factors. Uncertainty factors of the rule represents information impact ratio. Using a learning algorithm adjusting weights of every single rule in the knowledge base structure, the modified back propagation algorithm is used. The back propagation algorithm is modified for the given knowledge-base structure and rule model. For the purpose of testing and verifying the learning algorithm for knowledge-base tuning, the expert system RESLA has been developed in C#. With this expert system, the knowledge-base from medicine field, was created. The aim of this knowledge base is verify learning ability for complex knowledge-bases. The knowledge base represents heart malfunction diagnostic base on the acquired ECG (electrocardiogram) parameters. For the purpose of the comparison with already existing knowledge-basis, created by the expert and knowledge engineer, the expert system was compared with professionally designed knowledge-base from the field of agriculture. The knowledge-base represents system for suitable cultivar of winter wheat planting decision support. The presented algorithms speed up knowledge-base creation while keeping all advantages, which arise from using rules. Contrary to the existing solution based on neural network, the presented algorithms for knowledge-base weights tuning are faster and more simple, because it does not need rule extraction from another type of the knowledge representation.
25	Neural-Symbolic Integration Bader, Sebastian 05 October 2009 (has links) In this thesis, we discuss different techniques to bridge the gap between two different approaches to artiﬁcial intelligence: the symbolic and the connectionist paradigm. Both approaches have quite contrasting advantages and disadvantages. Research in the area of neural-symbolic integration aims at bridging the gap between them. Starting from a human readable logic program, we construct connectionist systems, which behave equivalently. Afterwards, those systems can be trained, and later the reﬁned knowledge be extracted. info:eu-repo/classification/ddc/004 ddc:004

Page generated in 0.0896 seconds