Global ETD Search

1	An Investigation of the Cost and Accuracy Tradeoffs of Supplanting AFDs with Bayes Network in Query Processing in the Presence of Incompleteness in Autonomous Databases January 2011 abstract: As the information available to lay users through autonomous data sources continues to increase, mediators become important to ensure that this wealth of information is tapped effectively. A key challenge for these information mediators is the varying level of incompleteness in the underlying databases, in the form of missing attribute values. Existing approaches such as Query Processing over Incomplete Autonomous Databases (QPIAD) mine and use Approximate Functional Dependencies (AFDs) to predict and retrieve relevant incomplete tuples. These approaches make independence assumptions about missing values, which critically hobbles their performance when tuples contain missing values for multiple correlated attributes. In this thesis, I present a principled probabilistic alternative that views an incomplete tuple as defining a distribution over the complete tuples that it stands for. I learn this distribution in terms of Bayes networks. My approach involves mining ("learning") Bayes networks from a sample of the database and using them for both imputation (predicting a missing value) and query rewriting (retrieving relevant results with incompleteness on the query-constrained attributes when the data sources are autonomous). I present empirical studies to demonstrate that (i) at higher levels of incompleteness, when multiple attribute values are missing, Bayes networks provide significantly higher classification accuracy and (ii) the relevant possible answers retrieved by queries reformulated using Bayes networks have higher precision and recall than those retrieved using AFDs, while query processing costs remain manageable. / Dissertation/Thesis / M.S. Computer Science 2011
Keywords: Computer science, Autonomous Databases, Bayes Networks, Incompleteness, Uncertainty
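The idea in entry 1 of treating an incomplete tuple as a distribution over its possible completions can be illustrated with a minimal Python sketch. It assumes the simplest possible Bayes network (a naive Bayes structure in which the missing attribute is the only parent), not the network structure learned in the thesis. The car table, the attribute names, and the functions `fit` and `posterior` are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

def fit(sample, target):
    """Estimate counts for a naive-Bayes-shaped network: target -> every other attribute."""
    prior = Counter(row[target] for row in sample)
    cond = defaultdict(Counter)   # (attribute, target value) -> counts of attribute values
    domains = defaultdict(set)    # attribute -> set of values seen in the sample
    for row in sample:
        for attr, value in row.items():
            domains[attr].add(value)
            if attr != target:
                cond[(attr, row[target])][value] += 1
    return prior, cond, domains

def posterior(incomplete, target, model, alpha=1.0):
    """Distribution over the missing `target` value given the observed attributes."""
    prior, cond, domains = model
    scores = {}
    for t, n_t in prior.items():
        score = n_t  # unnormalized prior
        for attr, value in incomplete.items():
            if attr == target or value is None:
                continue  # other missing attributes are simply marginalized out
            # Laplace-smoothed estimate of P(attr = value | target = t)
            score *= (cond[(attr, t)][value] + alpha) / (n_t + alpha * len(domains[attr]))
        scores[t] = score
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}

# Hypothetical complete sample drawn from the database.
SAMPLE = [
    {"make": "Honda", "model": "Civic", "body": "sedan"},
    {"make": "Honda", "model": "Civic", "body": "sedan"},
    {"make": "Honda", "model": "Civic", "body": "coupe"},
    {"make": "Honda", "model": "CR-V", "body": "suv"},
    {"make": "Toyota", "model": "RAV4", "body": "suv"},
    {"make": "Toyota", "model": "Corolla", "body": "sedan"},
]

model = fit(SAMPLE, target="body")
dist = posterior({"make": "Honda", "model": "Civic", "body": None}, "body", model)
imputed = max(dist, key=dist.get)  # imputation picks the most probable completion
```

Under these assumptions, a query constrained on `body` could rank incomplete tuples by the posterior probability of the requested value rather than discarding them.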
2	Estudo comparativo avaliando três modalidades de diagnóstico médico parecer médico, buscas no Google e sistema especialista de apoio à decisão médica / Comparative study evaluating three modalities of medical diagnosis: medical opinion, Google searches, and an expert system for medical decision support / Souza, Ademar Rosa de January 2020 Advisor: Luís Cuadrado Martin / Abstract: Knowledge about any pathology can easily be found on the internet, but it is hard to find a tool that analyzes and reasons over a patient's data to obtain the most probable diagnosis. With growing demand in health care, there is an increasing need for fast and accurate medical diagnoses. A Medical Decision Support System was therefore developed to make medical diagnosis faster and more reliable. The idea is to bring quality and agility to medical practice by adopting technology as a basic tool: “Whoever has more information is in a better position to choose and make decisions”. The system was built on a relational database (MySQL) and applies artificial intelligence techniques: decision trees, unsupervised learning, and Bayes networks (suited to knowledge domains with a significant degree of uncertainty, as is the case in medicine). By combining these techniques, the system selects and ranks the most probable diseases, which the physician can then examine in more detail, giving greater confidence in the choice of possible diagnoses. To broaden and speed up the dissemination of knowledge, the system was made available on the internet (www.danton.med.br). To design the project, a prospective, randomized, crossover, open study was carried out, in which 3 groups of doctors (called gr... (Complete abstract: click electronic access below) / Doctorate
Keywords: Medicine, Artificial intelligence, Bayes networks, Decision trees
3	Agrégation d'information pour la localisation d'un robot mobile sur une carte imparfaite / Information aggregation for the localization of a mobile robot using a non-perfect map Delobel, Laurent 04 May 2018 Most large modern cities suffer from pollution and traffic jams. One possible solution would be to regulate personal car access to city centers and possibly to replace public transportation with pollution-free autonomous vehicles that could dynamically change their planned trajectory to transport people fully on demand. These vehicles could also be used to transport employees within a large industrial facility or a critical infrastructure area with regulated access. To perform such a task, a vehicle must be able to localize itself in its area of operation. Most currently popular localization methods for such environments are based on "Simultaneous Localization and Mapping" (SLAM). These methods dynamically construct a map of the environment and locate the vehicle inside it. Although they have demonstrated their robustness, most implementations do not use a map that could be shared among vehicles (because of map size, structure, etc.). In addition, these methods frequently ignore already existing information, such as an existing city map, and instead construct the map from scratch. To go beyond these limitations, we propose to use existing semantic high-level maps, such as OpenStreetMap, as the a priori map, and to let the vehicle localize itself on such a map. These maps can contain the location of roads, traffic signs, traffic lights, buildings, etc. Such maps almost always come with some degree of imprecision (mostly in position); they can also be wrong, lacking elements (landmarks) that exist but are not described, or containing elements that no longer exist. To manage such imperfections in the map data and to let a vehicle localize itself with it, we propose a new strategy. First, to manage the classical problem of data incest in data fusion in the presence of strong correlations, together with the map scalability problem, we propose to manage the whole map with a Split Covariance Intersection filter. Second, we propose to remove landmarks that are still present in the map data but may no longer exist, by estimating their probability of existence from the vehicle's sensor detections and removing those with a low probability. Finally, we propose to periodically scan the sensor data for possible new landmarks that the map does not yet include and to integrate them into the map data. Experiments show the feasibility of such a dynamic high-level map that could be updated on the fly.
Keywords: Overconvergence, Kalman filter, Covariance Intersection filter, Dynamic Map Construction/Management, Bayes Networks, Integrity Management
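The landmark-removal step described in entry 3 (estimating a landmark's probability of existence from sensor detections and dropping landmarks with a low value) can be sketched as a recursive Bayes update. This is only an illustration under stated assumptions, not the thesis's estimator: the detection rate `p_detect`, the false-alarm rate `p_false`, the pruning `threshold`, and the function names are all hypothetical, and each observation is assumed to be taken while the landmark was inside the sensor's field of view.

```python
def update_existence(p, detected, p_detect=0.8, p_false=0.1):
    """One Bayes update of P(landmark exists) after a single observation.

    p_detect: assumed P(detection | landmark exists)
    p_false:  assumed P(detection | landmark does not exist)
    """
    like_exists = p_detect if detected else 1.0 - p_detect
    like_absent = p_false if detected else 1.0 - p_false
    numerator = like_exists * p
    return numerator / (numerator + like_absent * (1.0 - p))

def prune_map(landmarks, observations, threshold=0.05):
    """Update every landmark with its observations and keep those above the threshold."""
    kept = {}
    for name, p in landmarks.items():
        for detected in observations.get(name, []):
            p = update_existence(p, detected)
        if p >= threshold:
            kept[name] = p
    return kept

# Hypothetical map with uninformative priors and a few sensor passes.
landmarks = {"traffic_light": 0.5, "sign": 0.5}
observations = {
    "traffic_light": [True, True],   # seen on both passes
    "sign": [False, False, False],   # expected in view but never detected
}
kept = prune_map(landmarks, observations)
```

With these assumed rates, repeated misses push the sign below the threshold so it is dropped, while the repeatedly detected traffic light is kept.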
4	Personalisierung im E-Commerce – zur Wirkung von E-Mail-Personalisierung auf ausgewählte ökonomische Kennzahlen des Konsumentenverhaltens / Personalization in e-commerce: on the effect of e-mail personalization on selected economic indicators of consumer behavior Fassauer, Roland 26 May 2016 Personalization is an important area of internet marketing for which there are few experimental studies with large numbers of participants. The successful use of recommendation methods requires extensive data on buyer behavior. This thesis addresses both problems. It examines the cross-shop individual buying behavior of up to 126,000 newsletter recipients of a German online bonus system, both with selected data mining methods and experimentally. To this end, prototypes of a data mining system, an A/B test software component, and a recommender system component are developed and evaluated through data mining and online field experiments. For this user group, one experiment shows, even with a simple recommendation method, that the cross-shop individual behavioral data of the online bonus system are suitable for generating recommendations, and that the recommendations generated in this way led to significantly more orders than the best recommendation based on average buyer behavior. Further experiments, conducted while evaluating the A/B test component, showed that absolute discount offers led to significantly more orders than relative discount offers only when they were combined with a call to action. The thesis thus belongs to the research on influencing buyer behavior through personalization and through different discount presentations, and contributes the results and artifacts named above.
Keywords: A/B test, absolute discounts, affiliate marketing, Bayes networks, bonus system, cashback, buyer behavior, consumer behaviour, data mining, e-commerce, e-mail, newsletter, online field experiment, online marketing, personalization, price framing, discount, discount presentation, recommender systems, relative discounts, split-run test, ddc:330
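The claim in entry 4 that one newsletter variant produced "significantly more orders" than another is the kind of statement an A/B test evaluates. The abstract does not say which statistical test was used, so the following Python sketch assumes a standard two-proportion z-test. The function name and the order counts are invented for illustration and are not results from the thesis.

```python
import math

def two_proportion_z_test(orders_a, n_a, orders_b, n_b):
    """Two-sided z-test for a difference between two order rates."""
    rate_a, rate_b = orders_a / n_a, orders_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (orders_a + orders_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Invented example: control newsletter (A) vs. personalized newsletter (B).
z, p_value = two_proportion_z_test(orders_a=600, n_a=60_000, orders_b=690, n_b=60_000)
significant = p_value < 0.05
```

The 5% significance level is a common convention and is likewise an assumption here.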
5	Personalisierung im E-Commerce – zur Wirkung von E-Mail-Personalisierung auf ausgewählte ökonomische Kennzahlen des Konsumentenverhaltens / Personalization in e-commerce: on the effect of e-mail personalization on selected economic indicators of consumer behavior Fassauer, Roland 29 April 2016 Personalization is an important area of internet marketing for which there are few experimental studies with large numbers of participants. The successful use of recommendation methods requires extensive data on buyer behavior. This thesis addresses both problems. It examines the cross-shop individual buying behavior of up to 126,000 newsletter recipients of a German online bonus system, both with selected data mining methods and experimentally. To this end, prototypes of a data mining system, an A/B test software component, and a recommender system component are developed and evaluated through data mining and online field experiments. For this user group, one experiment shows, even with a simple recommendation method, that the cross-shop individual behavioral data of the online bonus system are suitable for generating recommendations, and that the recommendations generated in this way led to significantly more orders than the best recommendation based on average buyer behavior. Further experiments, conducted while evaluating the A/B test component, showed that absolute discount offers led to significantly more orders than relative discount offers only when they were combined with a call to action. The thesis thus belongs to the research on influencing buyer behavior through personalization and through different discount presentations, and contributes the results and artifacts named above.
Contents:
1 Introduction: 1.1 State of research; 1.2 Research needs; 1.3 Research concept; 1.4 Methods used; 1.5 Structure of the thesis
2 Theoretical and conceptual foundations: 2.1 Internet retail, e-commerce, and e-business; 2.2 Marketing, consumer and buyer behavior; 2.2.1 Buyer behavior with discount offers; 2.3 Internet marketing; 2.3.1 Performance measurement in internet marketing; 2.3.2 Selected disciplines of internet marketing; 2.3.2.1 Affiliate marketing; 2.3.2.2 Online cashback systems; 2.3.2.3 E-mail marketing; 2.4 Personalization in internet marketing; 2.4.1 Recommender systems; 2.4.2 Evaluation of recommender systems; 2.4.3 Architecture of recommender systems; 2.4.4 Recommender system categories; 2.4.4.1 Hybrid recommender systems; 2.4.5 Techniques for recommendation methods; 2.5 Knowledge preparation and discovery; 2.5.1 Data collection methods; 2.5.1.1 Data quality; 2.5.1.2 Data security and data protection; 2.5.2 Knowledge discovery and data mining; 2.5.2.1 The data mining process; 2.5.2.2 Data mining problem types; 2.5.2.3 The data mining system; 2.5.3 The experiment as a survey design; 2.5.3.1 Requirements and quality criteria; 2.5.3.2 Online field experiments in marketing; 2.5.3.3 Evaluation methods; 2.5.3.4 Theoretical foundations of the A/B testing method
3 Approach: 3.1 Research design; 3.1.1.1 Goals and requirements of Andasa GmbH; 3.1.1.2 Goals and requirements of the Institut für Angewandte Informatik; 3.1.2 Design of the information system; 3.1.2.1 The design process; 3.1.3 Conception of the software system; 3.1.4 Evaluation; 3.2 Data analysis; 3.2.1 Data acquisition; 3.2.2 Data preparation; 3.2.3 Selection of suitable data mining methods; 3.2.3.1 Selection criteria; 3.2.3.2 Method selection; 3.2.4 Explanation of selected data mining methods; 3.2.4.1 Bayesian networks; 3.2.4.2 Clustering; 3.2.4.3 Discriminant analysis; 3.2.4.4 Correlation analysis; 3.2.4.5 Online Analytical Processing (OLAP); 3.2.5 Selection of suitable data mining tools; 3.2.5.1 Selection process; 3.2.5.2 Criteria; 3.2.5.3 Tools for statistical analysis and visualization; 3.2.5.4 Tools for clustering and discriminant analysis; 3.2.5.5 Tools for Online Analytical Processing; 3.2.5.6 Tools for Bayesian networks; 3.3 Study design; 3.3.1 Online marketing instruments at Andasa; 3.3.2 Stimulus selection; 3.3.3 Design of the experiment
4 Implementation: 4.1 Architecture and prototype implementation; 4.1.1 The data mining system; 4.1.2 The ETL process; 4.1.2.1 Data collection; 4.1.2.2 Data cleansing; 4.1.3 The A/B test environment; 4.1.4 The recommender system; 4.1.5 Usability evaluation; 4.2 Data mining; 4.2.1 Statistical analysis; 4.2.2 Application of selected data mining methods; 4.2.2.1 Clustering; 4.2.2.2 Classification; 4.2.2.3 Modeling as Bayesian networks; 4.2.3 Results and evaluation; 4.3 Field experiments with newsletters; 4.3.1 Key data of the tests; 4.3.2 Example experiments; 4.3.3 A/B tests of discount presentations; 4.3.3.1 Open rate: percent vs. euro; 4.3.3.2 Click rate: percent vs. euro; 4.3.3.3 Conversion rate: percent vs. euro; 4.3.4 A/B test on personalization; 4.3.4.1 Selection of the recommendation method; 4.3.4.2 Definition of the control group; 4.3.4.3 Operational execution; 4.3.4.4 Analysis; 4.3.5 Results and evaluation
5 Summary and outlook
6 Appendix: 6.1 Appendix A: Usability evaluation; 6.1.1 Methods of usability evaluation; 6.1.1.1 Usability tests and thinking aloud; 6.1.1.2 User survey; 6.1.1.3 Field studies and participation; 6.1.1.4 Expert-oriented (inspection) methods; 6.1.1.5 Formal-analytical methods; 6.1.1.6 Quantitative questionnaires; 6.1.1.7 Process model; 6.1.1.8 Evaluation; 6.1.2 Questionnaires; 6.2 Appendix B: Time series analysis; 6.2.1 Classical component models; 6.2.2 Stochastic processes; 6.2.3 Fourier analysis methods (spectral analysis); 6.3 Appendix C: Data and programs; 6.3.1 Technical data; 6.3.1.1 Data warehouse / data mining server; 6.3.2 Program and script code; 6.3.2.1 R scripts; 6.3.2.2 SQL scripts; 6.3.2.3 C# code MostRecentLinkInvocationsShopRecommender.cs; 6.3.3 A/B test data; 6.3.3.1 Newsletter overview; 6.3.3.2 Mailing volumes; 6.3.3.3 Shop visits and ordering customers; 6.3.3.4 Depictions of the newsletter variants; 6.3.4 Personalization data; 6.4 List of figures; 6.5 List of tables; 6.6 Bibliography
info:eu-repo/classification/ddc/330 ddc:330
