  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
171

A Search For the Standard Model Higgs Boson Produced in Association with Top Quarks in the Lepton + Jets Channel at CMS

Smith, Geoffrey N. 18 August 2014 (has links)
No description available.
172

An adaptive ensemble classifier for mining concept drifting data streams

Farid, D.M., Zhang, L., Hossain, A., Rahman, C.M., Strachan, R., Sexton, G., Dahal, Keshav P. January 2013 (has links)
It is challenging to use traditional data mining techniques to deal with real-time data stream classification. Existing mining classifiers need to be updated frequently to adapt to the changes in data streams. To address this issue, in this paper we propose an adaptive ensemble approach for classification and novel class detection in concept-drifting data streams. The proposed approach uses traditional mining classifiers and updates the ensemble model automatically so that it represents the most recent concepts in the data stream. For novel class detection we use the idea that data points belonging to the same class should be close to each other and far apart from data points belonging to other classes. If a data point is well separated from the existing data clusters, it is identified as a novel class instance. We tested the performance of the proposed stream classification model against existing mining algorithms using real benchmark datasets from the UCI (University of California, Irvine) machine learning repository. The experimental results show that our approach is flexible and robust in detecting novel classes under concept drift and outperforms traditional classification models in challenging real-life data stream applications. (C) 2013 Elsevier Ltd. All rights reserved.
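The separation-based novelty idea in the abstract, that same-class points cluster together while a genuinely new class falls far from every known cluster, can be sketched in a few lines. This is an illustrative toy, not the authors' algorithm: the class name, the centroid-plus-radius chunk summaries, and the `novelty_factor` threshold are all assumptions.

```python
import math
from collections import defaultdict

def _centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class StreamEnsemble:
    """Summarize each recent data chunk as per-class (centroid, radius)
    clusters; classify by nearest centroid, flagging points that are well
    separated from every known cluster as a potential novel class."""

    def __init__(self, max_chunks=3, novelty_factor=2.0):
        self.max_chunks = max_chunks          # how many recent concepts to keep
        self.novelty_factor = novelty_factor  # separation threshold multiplier
        self.chunks = []                      # each chunk: {label: (centroid, radius)}

    def train_chunk(self, data):
        """data: list of (features, label) pairs from one stream chunk."""
        by_label = defaultdict(list)
        for x, y in data:
            by_label[y].append(x)
        model = {}
        for y, pts in by_label.items():
            c = _centroid(pts)
            model[y] = (c, max(_dist(p, c) for p in pts) or 1e-9)
        self.chunks.append(model)
        if len(self.chunks) > self.max_chunks:
            self.chunks.pop(0)                # forget the oldest concept

    def predict(self, x):
        """Nearest known label, or 'novel' if x is far from every cluster."""
        d, y, r = min(((_dist(x, c), y, r)
                       for m in self.chunks for y, (c, r) in m.items()),
                      key=lambda t: t[0])
        return y if d <= self.novelty_factor * r else "novel"
```

Updating the ensemble chunk by chunk is what lets the model track drifting concepts: old chunk summaries age out while recent ones dominate prediction.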
173

Automation and Expert System Framework for Coupled Shell-Solid Finite Element Modeling of Complex Structures

Palwankar, Manasi Prafulla 25 March 2022 (has links)
Finite Element (FE) analysis is a powerful numerical technique widely utilized to simulate the real-world response of complex engineering structures. With the advancements in adaptive optimization frameworks, multi-fidelity (coupled shell-solid) FE models are increasingly sought during the early design stages, when a large design space is being explored, because multi-fidelity models have the potential to provide accurate solutions at a much lower computational cost. However, the time and effort required to create accurate and optimal multi-fidelity models with acceptable meshes for highly complex structures are still significant and remain a major bottleneck in the FE modeling process. Additionally, there is a significant level of subjectivity in the decision-making about the multi-fidelity element topology, owing to a high dependence on the analyst's experience and expertise, which often leads to disagreements between analysts regarding the optimal modeling approach and to heavy losses from schedule delays. Moreover, this analyst-to-analyst variability can also result in significantly different final engineering designs. Thus, there is a clear need to accelerate the FE modeling process by automating the development of robust and adaptable multi-fidelity models and by eliminating the subjectivity and art involved in their development. This dissertation presents techniques and frameworks for accelerating the finite element modeling process for multi-fidelity models. A framework for the automated development of multi-fidelity models with adaptable 2-D/3-D topology using parameterized full-fidelity and structural-fidelity models is presented. Additionally, issues related to the automated meshing of highly complex assemblies are discussed, and a strategic volume decomposition technique blueprint is proposed for achieving robust hexahedral meshes in complicated assembly models.
A comparison of the full-solid, full-shell, and several multi-fidelity models of a highly complex stiffened thin-walled pressure vessel under external and internal tank pressure is presented. Results reveal that automating multi-fidelity model generation in an integrated fashion, including geometry creation, meshing and post-processing, can considerably reduce cost and effort. Secondly, the issue of analyst-to-analyst variability is addressed using a Decision Tree (DT) based Fuzzy Inference System (FIS) for recommending the optimal 2D-3D element topology for a multi-fidelity model. Specifically, the FIS takes the structural geometry and desired accuracy as inputs (for a range of load cases) and infers the optimal 2D-3D topology distribution. Once developed, the FIS can provide real-time optimal choices along with an interpretability that gives the analyst confidence in the modeling choices. The proposed techniques and frameworks can be generalized to more complex problems, including non-linear finite element models as well as adaptable mesh generation schemes. / Doctor of Philosophy / Structural analysis is the process of determining the response (mainly deformation and stresses) of a structure under specified loads and external conditions. This is often performed using computational modeling of the structure to approximate its response under real-life conditions. The Finite Element Method (FEM) is a powerful and widely used numerical technique for evaluating the physical performance of structures in several engineering disciplines, including aerospace and ocean engineering. As optimum designs are increasingly sought in industry, computationally efficient models become necessary to explore a large design space.
As such, optimal multi-fidelity models are preferred: they use a higher-fidelity computational domain in critical areas and a lower-fidelity domain in less critical areas, providing an optimal trade-off between accuracy and efficiency. However, developing such optimal models demands a high level of expertise in making a-priori and a-posteriori modeling decisions. Such experience-based variability between analysts is often a major cause of schedule delays and of considerable differences in final engineering designs. A combination of automated model development and optimization, together with an expert system that relieves the analyst of the need for experience and expertise in making software and theoretical assumptions for the model, can yield a powerful and cost-effective computational modeling process that accelerates technological advancement. This dissertation proposes techniques for the automated, robust development of complex multi-fidelity models. Along with these techniques, a data-driven expert system framework is proposed that makes optimal multi-fidelity modeling choices based on the structural configuration and desired accuracy level.
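As a rough illustration of how a fuzzy inference system can map geometry and accuracy inputs to a 2-D/3-D topology recommendation, here is a minimal Mamdani-style sketch. The triangular memberships, the three rules, and the normalized inputs (`thinness` and `accuracy` in [0, 1]) are invented for illustration and are not the dissertation's trained FIS.

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def recommend_solid_fraction(thinness, accuracy):
    """Toy Mamdani-style inference: thin regions with modest accuracy
    demands lean toward shell (2-D) elements; thick or high-accuracy
    regions lean toward solid (3-D).  Returns a solid fraction in [0, 1].
    The rule set and membership shapes are illustrative assumptions."""
    rules = [
        # (firing strength, recommended solid fraction)
        (min(tri(thinness, -0.1, 0.0, 0.5), tri(accuracy, -0.1, 0.0, 0.6)), 0.1),
        (min(tri(thinness, 0.0, 0.5, 1.0), tri(accuracy, 0.2, 0.5, 0.8)), 0.5),
        (min(tri(thinness, 0.5, 1.0, 1.1), tri(accuracy, 0.4, 1.0, 1.1)), 0.9),
    ]
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.5   # weighted-average defuzzification
```

The appeal of this form, as the abstract notes, is interpretability: each rule reads as a plain sentence an analyst can inspect, unlike a black-box recommender.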
174

Interpretable Binary and Multiclass Prediction Models for Insolvencies and Credit Ratings

Obermann, Lennart 10 May 2016 (has links)
Insolvency prediction and rating are important tasks in the financial industry and serve to assess the creditworthiness of companies. One way to approach this field is machine learning, in which prediction models are built from example data. Methods from this field are advantageous because they can be automated, which makes human expertise unnecessary in most cases and thereby offers a higher degree of objectivity. However, these approaches are not perfect either and therefore cannot entirely replace human expertise. They do lend themselves as decision aids and can be used as such by experts, which is why interpretable models are desirable. Unfortunately, only few learning algorithms yield interpretable models. Moreover, some tasks, such as rating, are often multiclass problems. Multiclass classification is frequently achieved through meta-algorithms that train several binary classifiers, and most of the commonly used meta-algorithms eliminate any interpretability that may be present. In this dissertation we investigate the predictive accuracy of interpretable models compared to non-interpretable models for insolvency prediction and credit rating. We use disjunctive normal forms and decision trees with thresholds on financial ratios as interpretable models. Random forests, artificial neural networks, and support vector machines serve as non-interpretable models. In addition, we developed our own learning algorithm, Thresholder, which generates disjunctive normal forms and interpretable multiclass models. For the insolvency prediction task we show that interpretable models are not inferior to non-interpretable ones.
To this end, a first case study uses a database employed in practice, containing annual financial statements of 5152 companies, to measure the predictive accuracy of all the models mentioned above. In a second case study on rating prediction we demonstrate that interpretable models are even superior to non-interpretable ones. The predictive accuracy of all models is determined on three datasets used in practice, each with three rating classes. In the case studies we compare the various interpretable approaches with respect to model size and form of interpretability. We present exemplary models based on the respective datasets and offer ways of interpreting them. Our results show that interpretable, threshold-based models are well suited to the classification problems of the financial industry; in this domain they are not inferior to more complex models such as support vector machines. Our algorithm Thresholder produces the smallest models while its predictive accuracy remains comparable to that of the other interpretable models. In our rating case study the interpretable models deliver markedly better results than in the insolvency prediction study (see above). A possible explanation is that ratings, unlike insolvencies, are man-made: ratings rest on decisions by humans, who think in interpretable rules such as logical combinations of thresholds. We therefore expect that interpretable models fit these problems and can recognize and reproduce those interpretable rules.
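An interpretable model of the kind compared here, a disjunctive normal form over thresholds on financial ratios, can be evaluated in a few lines. The rule set below is a hypothetical example with invented feature names and thresholds, not output of the Thresholder algorithm.

```python
def dnf_predict(x, rules):
    """Evaluate a disjunctive normal form over threshold literals.
    x: dict of financial ratios; rules: list of conjunctions, each a list
    of (feature, op, threshold) with op in {'<=', '>'}.  Returns True
    (e.g. 'predicted insolvent') if any conjunction fires."""
    ops = {"<=": lambda v, t: v <= t, ">": lambda v, t: v > t}
    return any(all(ops[op](x[f], t) for f, op, t in conj) for conj in rules)

# A hypothetical rule set -- feature names and thresholds are invented
# for illustration, not learned from the dissertation's data:
insolvency_rules = [
    [("equity_ratio", "<=", 0.05)],                           # clause 1
    [("equity_ratio", "<=", 0.15), ("roa", "<=", -0.02)],     # clause 2
]

print(dnf_predict({"equity_ratio": 0.10, "roa": -0.05}, insolvency_rules))  # True
print(dnf_predict({"equity_ratio": 0.30, "roa": 0.04}, insolvency_rules))   # False
```

The appeal is exactly what the abstract argues: each clause reads as a human-checkable rule ("low equity and negative return on assets"), which a credit analyst can audit, unlike a support vector machine's decision function.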
175

Vers une simplification de la conception de comportements stratégiques pour les opposants dans les jeux vidéo de stratégie / Towards a simplification of strategic behaviors design for opponents in strategy video games

Lemaitre, Juliette 21 March 2017 (has links)
Cette thèse aborde la problématique de la création d’intelligences artificielles (IA) contrôlant la prise de décision haut-niveau dans les jeux de stratégie. Ce type de jeux propose des environnements complexes nécessitant de manipuler de nombreuses ressources en faisant des choix d’actions dépendant d’objectifs à long terme. La conception de ces IA n’est pas simple car il s’agit de fournir une expérience pour le joueur qui soit divertissante et intéressante à jouer. Ainsi, le but n’est pas d’obtenir des comportements d’IA imbattables, mais plutôt de refléter différents traits de personnalités permettant au joueur d’être confronté à des adversaires diversifiés. Leur conception fait intervenir des game designers qui vont définir les différentes stratégies en fonction de l’expérience qu’ils souhaitent créer pour le joueur, et des développeurs qui programment et intègrent ces stratégies au jeu. La collaboration entre eux nécessite de nombreux échanges et itérations de développement pour obtenir un résultat qui correspond aux attentes des designers. L’objectif de cette thèse est de proposer une solution de modélisation de stratégies accessible aux game designers en vue d’améliorer et de simplifier la création de comportements stratégiques. Notre proposition prend la forme d’un moteur stratégique choisissant des objectifs à long terme et vient se placer au dessus d’un module tactique qui gère l’application concrète de ces objectifs. La solution proposée n’impose pas de méthode pour résoudre ces objectifs et laisse libre le fonctionnement du module tactique. Le moteur est couplé à un modèle de stratégie permettant à l’utilisateur d’exprimer des règles permettant au moteur de choisir les objectifs et de leur allouer des ressources. Ces règles permettent d’exprimer le choix d’objectifs en fonction du contexte, mais également d’en choisir plusieurs en parallèle et de leur donner des importances relatives afin d’influencer la répartition des ressources. 
Pour améliorer l’intelligibilité nous utilisons un modèle graphique inspiré des machines à états finis et des behavior trees. Les stratégies créées à l’aide de notre modèle sont ensuite exécutées par le moteur de stratégie pour produire des directives qui sont données au module tactique. Ces directives se présentent sous la forme d’objectifs stratégiques et de ressources qui leur sont allouées en fonction de leurs besoins et de l’importance relative qui leur a été donnée. Le module stratégique permet donc de rendre accessible la conception du niveau stratégique d’une IA contrôlant un adversaire dans un jeu de stratégie. / This PhD thesis addresses the topic of creating artificial intelligence (AI) to control high-level decision-making in strategy games. This kind of game offers complex environments that require manipulating a large number of resources while choosing actions that depend on long-term goals. Designing such an AI is not simple, because it must provide the player with an entertaining and interesting experience. Hence, the aim is not to create unbeatable behaviors, but rather to display several personality traits so that the player can face diverse opponents. Their creation involves game designers, who define the strategies according to the experience they want to provide to the player, and game developers, who implement those strategies in the game. The collaboration between them requires many exchanges and development iterations to obtain a result that matches the designers' expectations. The objective of this PhD thesis is to improve and simplify the creation of strategic behaviors by proposing a strategy model that is intelligible to game designers and can easily be interfaced with the developers' work. For game designers, a strategy model has been created that lets them express rules guiding the choice of goals and the resources allocated to them.
These rules let game designers express which goals to choose according to the context, select several goals in parallel, and give them relative importance in order to influence the distribution of resources. To improve intelligibility we use a graphical model inspired by finite state machines and behavior trees. Our proposal also includes a strategy engine that executes the strategies created with the model. This execution produces directives in the form of a list of selected strategic goals together with the resources allocated to each according to its importance and needs. These directives are intended for a tactical module in charge of applying them; the developers are then responsible for implementing this tactical module. Our solution enables game designers to design the strategic level of an AI directly, which facilitates their cooperation with game developers and simplifies the entire creation process of the AI.
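The importance-weighted allocation the engine performs, splitting a resource pool across parallel goals in proportion to their relative importance while respecting each goal's needs, can be sketched as a proportional split with redistribution. The function name, goal tuples, and greedy scheme are assumptions for illustration, not the thesis's engine.

```python
def allocate_resources(goals, total):
    """Split `total` resource units across active strategic goals in
    proportion to their relative importance, capped by each goal's need.
    goals: list of (name, importance, need).  Leftover capacity from
    capped goals is redistributed among the still-unsatisfied ones."""
    alloc = {name: 0.0 for name, _, _ in goals}
    remaining = float(total)
    active = list(goals)
    while remaining > 1e-12 and active:
        weight = sum(imp for _, imp, _ in active)
        next_active, spent = [], 0.0
        for name, imp, need in active:
            share = remaining * imp / weight      # importance-proportional share
            give = min(share, need - alloc[name]) # never exceed the goal's need
            alloc[name] += give
            spent += give
            if alloc[name] < need:
                next_active.append((name, imp, need))
        if spent == 0.0:
            break                                 # every remaining goal is satisfied
        remaining -= spent
        active = next_active
    return alloc
```

With two goals of importance 2 and 1 and ample needs, 90 units split 60/30; if a high-importance goal needs little, its surplus flows to the others, which matches the abstract's account of importance influencing, rather than dictating, the distribution.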
176

Aplicação de minerador de dados na obtenção de relações entre padrões de encadeamento de viagens codificados e características sócio-econômicas / Applicability of a data miner for obtaining relationships between trip-chaining patterns and urban trip-makers' socioeconomic characteristics

Sandra Matiko Ichikawa 29 November 2002 (has links)
O principal objetivo deste trabalho é analisar a aplicabilidade de um minerador de dados para obter relações entre padrões de viagens encadeadas e características sócio-econômicas de viajantes urbanos. Para representar as viagens encadeadas, as viagens correspondentes a cada indivíduo do banco de dados foram codificadas em termos de seqüência de letras que indicam uma ordem cronológica em que atividades são desenvolvidas. O minerador de dados utilizado neste trabalho é árvore de decisão e classificação, uma ferramenta de análise disponível no software S-Plus. A análise foi baseada na pesquisa origem-destino realizada pelo Metrô-SP na região metropolitana de São Paulo, por meio de entrevistas domiciliares, em 1987. Um dos importantes resultados é que indivíduos que têm atributos sócio-econômicos e de viagens similares não se comportam de maneira similar; pelo contrário, eles fazem diferentes padrões de viagens encadeadas, as quais podem ser descritas em termos de probabilidade ou freqüência associada a cada padrão. Portanto, o minerador de dados deve possuir a habilidade para representar essa distribuição. A consistência do resultado foi analisada comparando-os com alguns resultados encontrados na literatura referente a análise de viagem baseada em atividades. A principal conclusão é que árvore de decisão e classificação aplicada a dados individuais, contendo encadeamento de viagem codificado e atributos socioeconômicos e de viagem, permite extrair conhecimento e informações ocultas que ajudam a compreender o comportamento de viagem de viajantes urbanos. / The main aim of this work is to analyze the applicability of a data miner for obtaining relationships between trip-chaining patterns and urban trip-makers' socioeconomic characteristics. In order to represent the trip-chains, the trips corresponding to each individual in the data set were coded as a sequence of letters indicating the chronological order in which activities are performed.
The data miner applied in this work is a decision and classification tree, an analysis tool available in the S-Plus software package. The analysis was based on the origin-destination home-interview survey carried out by Metrô-SP in the São Paulo metropolitan area in 1987. One of the important findings is that individuals with similar socioeconomic and trip attributes do not behave in a similar way; on the contrary, they produce different trip-chaining patterns, which may be described in terms of the probability or frequency associated with each pattern. Therefore, the data miner should have the ability to represent that distribution. The consistency of the results was analyzed by comparing them with results found in the literature on activity-based travel analysis. The main conclusion is that a decision and classification tree applied to individual data, containing coded trip-chains and socioeconomic and trip attributes, allows extracting hidden knowledge and information that helps to understand the travel behaviour of urban trip-makers.
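The two representations the abstract describes, coding a person's day as a letter sequence and describing behavior as a probability distribution over patterns, can be sketched directly. The letter codes (H=home, W=work, S=school, O=other) and function names are assumptions for illustration, not the study's actual coding scheme.

```python
from collections import Counter

def code_trip_chain(trips):
    """Encode one traveller's day as a letter sequence in chronological
    order.  trips: list of (departure_time, activity_letter) pairs; the
    letters are assumed codes, e.g. H=home, W=work, S=school, O=other."""
    return "".join(activity for _, activity in sorted(trips))

def pattern_distribution(records):
    """records: list of (socio_group, chain).  Returns, per group, the
    relative frequency of each trip-chaining pattern -- the probability
    distribution the abstract says the data miner must represent."""
    counts = {}
    for group, chain in records:
        counts.setdefault(group, Counter())[chain] += 1
    return {g: {pat: n / sum(c.values()) for pat, n in c.items()}
            for g, c in counts.items()}
```

The second function makes the abstract's key finding concrete: a socioeconomic group maps not to one pattern but to a distribution over patterns, so a useful classifier must output (or at least respect) those frequencies.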
177

Inteligentni softverski sistem za dijagnostiku metaboličkog sindroma / Intelligent Software System for Metabolic Syndrome Diagnostics

Ivanović Darko 16 April 2018 (has links)
Doktorska disertacija razmatra problem algoritamske dijagnostike metaboličkog sindroma na osnovu lako merljivih parametara: pol, starosna dob, indeks telesne mase, odnos obima struka i visine, sistolni i dijastolni krvni pritisak. U istraživanju su primenjene i eksperimentalno ispitane tri različite metode mašinskog učenja: stabla odluke, linearna regresija i veštačke neuronske mreže. Pokazano je da veštačke neuronske mreže daju visok nivo prediktivnih vrednosti dovoljan za primenu u praksi. Korišćenjem dobijenog rezultata definisan je i implementiran inteligentni softverski sistem za dijagnostiku metaboličkog sindroma. / The doctoral dissertation examines the problem of algorithmic diagnostics of the metabolic syndrome based on easily measurable parameters: sex, age, body mass index, waist-to-height ratio, and systolic and diastolic blood pressure. In the study, three different machine learning methods were applied and experimentally examined: decision trees, linear regression and artificial neural networks. It has been shown that artificial neural networks give a level of predictive value high enough to be applied in practice. Using the obtained result, an intelligent software system for the diagnosis of metabolic syndrome has been defined and implemented.
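To make the setup concrete, here is a one-neuron sketch over the same easily measurable inputs. The weights below are invented placeholders for illustration only, not the network trained in the dissertation; a real model would learn them from patient data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def metabolic_risk(sex, age, bmi, whtr, sys_bp, dia_bp):
    """One-neuron sketch of the kind of model the dissertation trains.
    sex: 0=female, 1=male; whtr = waist circumference / height.
    The weights and offsets are invented placeholders, NOT fitted values;
    they merely encode the plausible direction of each effect."""
    z = (0.02 * (age - 40)        # older -> higher risk
         + 0.15 * (bmi - 25)      # above-normal BMI -> higher risk
         + 4.0 * (whtr - 0.5)     # waist-to-height above 0.5 -> higher risk
         + 0.03 * (sys_bp - 120)  # elevated systolic pressure
         + 0.02 * (dia_bp - 80)   # elevated diastolic pressure
         + 0.2 * sex - 1.0)
    return sigmoid(z)             # probability-like score in (0, 1)
```

A trained multi-layer network would replace this hand-set linear score with learned hidden-layer weights, but the input/output shape, easily measured parameters in, a risk score out, is the same.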
179

Multistage stochastic programming models for the portfolio optimization of oil projects

Chen, Wei, 1974- 20 December 2011 (has links)
Exploration and production (E&P) involves the upstream activities from looking for promising reservoirs to extracting oil and selling it to downstream companies. E&P is the most profitable business in the oil industry, but it is also the most capital-intensive and risky. Hence, proper assessment of E&P projects, with effective management of uncertainties, is crucial to the success of any upstream business. This dissertation concentrates on developing portfolio optimization models to manage E&P projects. The idea is not new, but it has mostly been restricted to the conceptual level because of the inherent complications of capturing interactions among projects. We disentangle these complications by modeling the project portfolio optimization problem as multistage stochastic programs with mixed integer programming (MIP) techniques. Because the two cases face disparate uncertainties, we consider explored and unexplored oil fields separately, modeling portfolios of real options for the former and portfolios of decision trees for the latter. The resulting project portfolio models provide rigorous and consistent treatments to optimally balance total reward against overall risk. For explored oil fields, oil price fluctuations dominate the geologic risk; the field development process can hence be modeled and assessed as sequentially compounded options with our optimization-based option pricing models, and the portfolio of real options solves the dynamic capital budgeting problem for oil projects. For unexplored oil fields, the geologic risk plays the dominant role in determining how a field is optimally explored and developed. We model the E&P process as a decision tree in the form of an optimization model with MIP techniques. By applying inventory-style budget constraints, we pool multiple project-specific decision trees to obtain the multistage E&P project portfolio optimization (MEPPO) model.
The resulting large-scale MILP is efficiently solved by a decomposition-based primal heuristic algorithm. The MEPPO model requires a scenario tree to approximate the stochastic process of the geologic parameters. We apply statistical learning, Monte Carlo simulation, and scenario reduction methods to generate the scenario tree, in which prior beliefs can be progressively refined with new information.
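The core budget-constrained selection can be illustrated with a deliberately tiny stand-in: each project's decision tree collapsed to one stage of probability-weighted geologic outcomes, and the subset choice brute-forced rather than solved as a MILP. All project data below are hypothetical.

```python
from itertools import combinations

def expected_npv(project):
    """Probability-weighted payoff over geologic outcomes (a one-stage
    collapse of the project's decision tree) minus exploration cost."""
    return sum(p * v for p, v in project["outcomes"]) - project["cost"]

def best_portfolio(projects, budget):
    """Brute-force the inventory-style budget constraint: pick the subset
    of projects whose total cost fits the budget and whose summed expected
    NPV is largest.  A toy stand-in for the MEPPO MILP, which handles many
    stages and full scenario trees rather than this one-shot collapse."""
    best, best_val = set(), 0.0
    names = list(projects)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            if sum(projects[n]["cost"] for n in combo) > budget:
                continue                       # violates the budget
            val = sum(expected_npv(projects[n]) for n in combo)
            if val > best_val:
                best, best_val = set(combo), val
    return best, best_val

# Hypothetical projects: (probability, payoff) pairs per geologic outcome.
projects = {
    "A": {"cost": 50.0, "outcomes": [(0.3, 400.0), (0.7, 0.0)]},   # E[NPV] = 70
    "B": {"cost": 80.0, "outcomes": [(0.6, 200.0), (0.4, 50.0)]},  # E[NPV] = 60
    "C": {"cost": 40.0, "outcomes": [(0.5, 70.0), (0.5, 20.0)]},   # E[NPV] = 5
}
chosen, value = best_portfolio(projects, budget=100.0)  # -> {'A', 'C'}, 75.0
```

Note how the budget constraint couples the projects: B has the second-best expected NPV, yet the optimum pairs A with the modest C because B would crowd A out of the budget. Capturing exactly this kind of interaction at scale is what the MIP formulation is for.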
180

Σχεδιασμός και ανάπτυξη αλγορίθμου συσταδοποίησης μεγάλης κλίμακας δεδομένων / Design and development of a clustering algorithm for large-scale data

Γούλας, Χαράλαμπος January 2015 (has links)
Υπό το φάσμα της νέας, ανερχόμενης κοινωνίας της πληροφορίας, η σύγκλιση των υπολογιστών με τις τηλεπικοινωνίες έχει οδηγήσει στην συνεχώς αυξανόμενη παραγωγή και αποθήκευση τεράστιου όγκου δεδομένων σχεδόν για οποιονδήποτε τομέα της ανθρώπινης ενασχόλησης. Αν, λοιπόν, τα δεδομένα αποτελούν τα καταγεγραμμένα γεγονότα της ανθρώπινης ενασχόλησης, οι πληροφορίες αποτελούν τους κανόνες, που τα διέπουν. Και η κοινωνία στηρίζεται και αναζητά διακαώς νέες πληροφορίες. Το μόνο που απομένει, είναι η ανακάλυψη τους. Ο τομέας, που ασχολείται με την συστηματική ανάλυση των δεδομένων με σκοπό την εξαγωγή χρήσιμης γνώσης ονομάζεται μηχανική μάθηση. Υπό αυτό, λοιπόν, το πρίσμα, η παρούσα διπλωματική πραγματεύεται την μηχανική μάθηση ως μια ελπίδα των επιστημόνων να αποσαφηνίσουν τις δομές που διέπουν τα δεδομένα και να ανακαλύψουν και να κατανοήσουν τους κανόνες, που “κινούν” τον φυσικό κόσμο. Αρχικά, πραγματοποιείται μια πρώτη περιγραφή της μηχανικής μάθησης ως ένα από τα βασικότερα δομικά στοιχεία της τεχνητής νοημοσύνης, παρουσιάζοντας ταυτόχρονα μια πληθώρα προβλημάτων, στα οποία μπορεί να βρει λύση, ενώ γίνεται και μια σύντομη ιστορική αναδρομή της πορείας και των κομβικών της σημείων. Ακολούθως, πραγματοποιείται μια όσο το δυνατόν πιο εμπεριστατωμένη περιγραφή, μέσω χρήσης εκτεταμένης βιβλιογραφίας, σχεδιαγραμμάτων και λειτουργικών παραδειγμάτων των βασικότερων κλάδων της, όπως είναι η επιβλεπόμενη μάθηση (δέντρα αποφάσεων, νευρωνικά δίκτυα), η μη-επιβλεπόμενη μάθηση (συσταδοποίηση δεδομένων), καθώς και πιο εξειδικευμένων μορφών της, όπως είναι η ημί-επιβλεπόμενη μηχανική μάθηση και οι γενετικοί αλγόριθμοι. Επιπρόσθετα, σχεδιάζεται και υλοποιείται ένας νέος πιθανοτικός αλγόριθμος συσταδοποίησης (clustering) δεδομένων, ο οποίος ουσιαστικά αποτελεί ένα υβρίδιο ενός ιεραρχικού αλγορίθμου ομαδοποίησης και ενός αλγορίθμου διαμέρισης. 
Ο αλγόριθμος δοκιμάστηκε σε ένα πλήθος διαφορετικών συνόλων, πετυχαίνοντας αρκετά ενθαρρυντικά αποτελέσματα, συγκριτικά με άλλους γνωστούς αλγορίθμους, όπως είναι ο k-means και ο single-linkage. Πιο συγκεκριμένα, ο αλγόριθμος κατασκευάζει συστάδες δεδομένων, με μεγαλύτερη ομοιογένεια κατά πλειοψηφία σε σχέση με τους παραπάνω, ενώ το σημαντικότερο πλεονέκτημά του είναι ότι δεν χρειάζεται κάποια αντίστοιχη παράμετρο k για να λειτουργήσει. Τέλος, γίνονται προτάσεις τόσο για περαιτέρω βελτίωση του παραπάνω αλγορίθμου, όσο και για την ανάπτυξη νέων τεχνικών και μεθόδων, εναρμονισμένων με τις σύγχρονες τάσεις της αγοράς και προσανατολισμένων προς τις απαιτητικές ανάγκες της νέας, αναδυόμενης κοινωνίας της πληροφορίας. / In the spectrum of the new, emerging information society, the convergence of computers and telecommunications has led to the continuously increasing production and storage of huge amounts of data in almost every field of human activity. If the data are the recorded facts of human activity, then information is the set of rules that governs them; and society depends on, and earnestly seeks, new information. All that remains is its discovery. The field of computer science that deals with the systematic analysis of data in order to extract useful information is called machine learning. In this light, this thesis discusses machine learning as the scientists' hope of elucidating the structures that govern data and of discovering and understanding the rules that "move" the natural world. First, machine learning is described as one of the main components of artificial intelligence, together with a variety of problems to which it can provide solutions, as well as a brief historical overview of its progress.
Second, its major research areas are described in more detail, using extensive literature, diagrams and working examples: supervised learning (decision trees, neural networks), unsupervised learning (clustering algorithms), and more specialized forms such as semi-supervised learning and genetic algorithms. In addition, a new probabilistic clustering algorithm is designed and implemented, a hybrid of a hierarchical clustering algorithm and a partitioning algorithm. The algorithm was tested on a number of different datasets, achieving quite encouraging results compared to other well-known algorithms such as k-means and single-linkage. More specifically, the algorithm constructs clusters that are in most cases more homogeneous than those of the algorithms above, while its most important advantage is that it needs no corresponding parameter k to operate. Finally, suggestions are made both for further improving the algorithm and for developing new techniques and methods in keeping with current market trends and oriented toward the demanding needs of the new, emerging information society.
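The thesis's headline property, clustering without a user-supplied k, can be illustrated with a tiny single-linkage agglomeration that stops at a distance threshold instead of a cluster count. This is an illustrative sketch, not the hybrid hierarchical/partitioning algorithm developed in the thesis; the function name and `merge_limit` parameter are assumptions.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def threshold_agglomerative(points, merge_limit):
    """Single-linkage agglomeration that stops once the closest pair of
    clusters is farther apart than merge_limit -- so, as in the thesis's
    algorithm, the number of clusters is not supplied up front."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the globally closest pair under single linkage.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > merge_limit:
            break                       # no pair close enough: stop merging
        clusters[i] += clusters.pop(j)  # merge j into i
    return clusters
```

The stopping rule replaces k: the data's own gap structure decides how many clusters survive, which is the practical advantage the abstract claims over k-means.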
