Global ETD Search

11	Sentimental Bi-Partite Graph Of Political Blogs January 2012 (has links) abstract: Analysis of political texts, which contain a huge amount of personal political opinions, sentiments, and emotions towards powerful individuals, leaders, organizations, and a large number of people, is an interesting task that can lead to the discovery of interesting interactions between political parties and people. Recently, the political blogosphere has played an increasingly important role in politics as a forum for debating political issues. Most political weblogs are biased towards their political parties, and they generally express their sentiments towards their own issues (i.e., leaders, topics, etc.) and also towards the issues of the opposing parties. In this thesis, I have modeled the above interactions/debate as a sentimental bi-partite graph: a bi-partite graph with blogs forming the vertices of one disjoint set, the issues (i.e., leaders, topics, etc.) forming the other disjoint set, and the edges between the two sets representing the sentiment of the blogs towards the issues. I have used American political blog data to model the sentimental bi-partite graph, in particular a set of popular liberal and conservative political blogs that have clearly declared positions. These blogs contain discussion of social, political, and economic issues and related key individuals from their conservative/liberal point of view. To be more focused and more polarized, the 22 most popular liberal/conservative blogs of a particular time period, May 2008 - October 2008 (chosen because of the high intensity of debate and discussion just before the presidential elections), were considered, involving around 23,800 articles. This thesis addresses the questions: a) Which are the most liberal/conservative blogs on the web? b) Who is on which side of the debate, and what are the issues? c) Who are the important leaders? d) How do you model the relationship between the participants of the debate and the underlying issues?
/ Dissertation/Thesis / M.S. Computer Science 2012 Computer science Blogosphere DataMining Information Extraction Political blogs Sentimental bi-partite graph
12	Desarrollo de técnicas de computación evolutiva para soporte en minería de datos y texto Cecchini, Rocío L. 13 April 2010 (has links) La obtención de información a partir de un conjunto de datos o minería de datos es una tarea compleja que involucra varias etapas, tal como sucede en la minería de texto. Esta puede ser considerada como un caso particular de minería de datos donde los datos contemplan la incorporación de texto. Ambos procesos de minería se vuelven aun más complejos cuando nos encontramos ante grandes cúmulos de datos o texto. Es común encontrar conjuntos de datos grandes, complejos y ricos en información en áreas como medicina, comercio, ingeniería y ciencias de la computación. Simultáneamente, los avances tecnológicos han dado lugar a la acumulación de sustanciosas cantidades de documentos, artículos y texto; el ejemplo más contundente de esta clase de material es la Web, la cual se estima que alcanza más de 8.05 billones de páginas. La propuesta de esta tesis es el uso de herramientas evolutivas mono- y multi-objetivo como un soporte para algunas de las etapas de este proceso. En particular, las etapas que implican optimización y búsqueda dentro de estos grandes espacios en los cuales otros métodos serían inviables. A lo largo de la investigación se desarrollaron, evaluaron y compararon algoritmos evolutivos mono y multi-objetivo tanto para la rama de minería de datos como para la rama de minería de texto. Como caso particular dentro de minería de datos, se contempló el problema de encontrar las relaciones más relevantes entre variables dentro de distintos conjuntos de datos. Dichas relaciones, no son visibles para un experto cuando se encuentra frente a la base de datos original cruda, la cual puede contemplar miles de variables y miles de instancias. Para resolver este problema se propuso una metodología de dos fases. 
Los algoritmos desarrollados en este contexto se integraron a la primera fase de la arquitectura y fueron exitosamente utilizados como mecanismo de búsqueda masiva. Por otra parte, en el caso de minería de texto se abordó el problema de recuperar información relacionada y novedosa con respecto a un tópico de interés. Para este problema se propuso, implementó y evaluó una arquitectura que, partiendo de una descripción para el tópico de interés, evoluciona varios conjuntos de términos hacia conjuntos que logren obtener mejores documentos con respecto a dicho tema de interés y con respecto a los objetivos propuestos (por ejemplo: similitud, precisión, cobertura). Dentro de las técnicas evolutivas multi-objetivo propuestas, se diseñaron adaptaciones de los algoritmos basados en Pareto más prometedores reportados por la literatura y se propusieron versiones multi-objetivo agregativas. Ambos enfoques, los basados en Pareto y los agregativos, demostraron ser claramente competentes tanto para minería de datos como para minería de texto. / Data mining comprises the capture of information from data, which is a complex task that involves many stages. The same applies to text mining that can be considered as a special case of data mining where the data include text. As data and text sets increase, both mining processes become even more complicated. Large, complex and rich information data sets arise in many common research fields like medicine, commerce, engineering and computer science. Simultaneously, technological advances have led to the accumulation of substantial amounts of documents, articles and text; the clearest example of this kind of material is the Web, which is estimated to have reached more than 8.05 billion pages. This thesis proposes the use of mono- and multi-objective evolutionary tools as support in some of the stages of the data and text mining processes. 
In particular, it targets those stages that involve optimization and search in large search spaces where other methods could be infeasible. In this research work, several mono- and multi-objective evolutionary algorithms were developed, evaluated and compared for both the data mining and text mining research areas. As a particular case in data mining, the problem of finding the most relevant relationships among variables in the data was considered. These relations are not obvious to experts when they are faced with the original raw database, which can include thousands of variables and thousands of samples. In order to solve this problem, a two-phase methodology was proposed. In this context, the developed algorithms were integrated into the first phase and were successfully used as massive search mechanisms. On the other hand, as a particular case of the text mining research area, the problem of retrieving novel material that is related to a search context was considered. In order to overcome this problem, an architecture was proposed, implemented and evaluated. Starting from a description of the topic of interest, this architecture evolves several sets of terms towards sets that can obtain better documents with respect to both the topic of interest and the proposed objectives (e.g., similarity, precision, recall). Among the proposed multi-objective evolutionary techniques, adaptations of the most promising Pareto-based evolutionary algorithms reported in the literature were designed, and new multi-objective aggregative schemes were proposed. Both approaches, i.e., the Pareto-based strategy and the aggregative techniques, proved to be clearly competent for both research areas: data and text mining. computación evolutiva minería de datos minería de texto evolutionary computation datamining text mining
13	Clustering of Distributed Word Representations and its Applicability for Enterprise Search Korger, Christina 04 October 2016 (has links) (PDF) Machine learning of distributed word representations with neural embeddings is a state-of-the-art approach to modelling semantic relationships hidden in natural language. The thesis “Clustering of Distributed Word Representations and its Applicability for Enterprise Search” covers different aspects of how such a model can be applied to knowledge management in enterprises. A review of distributed word representations and related language modelling techniques, combined with an overview of applicable clustering algorithms, constitutes the basis for practical studies. The latter have two goals: firstly, they examine the quality of German embedding models trained with gensim and a selected choice of parameter configurations. Secondly, clusterings conducted on the resulting word representations are evaluated against the objective of retrieving immediate semantic relations for a given term. The application of the final results to company-wide knowledge management is subsequently outlined by the example of the platform intergator and conceptual extensions. maschinelles Lernen word2vec Datamining Unternehmenssuche Clustering verteilte Wortrepräsentationen Sprachmodelle machine learning word2vec datamining enterprise search distributed word representations neural word embeddings language models ddc:004 rvk:ST 237 rvk:ST 265 rvk:ST 270
14	Clustering of Distributed Word Representations and its Applicability for Enterprise Search Korger, Christina 18 August 2016 (has links) Machine learning of distributed word representations with neural embeddings is a state-of-the-art approach to modelling semantic relationships hidden in natural language. The thesis “Clustering of Distributed Word Representations and its Applicability for Enterprise Search” covers different aspects of how such a model can be applied to knowledge management in enterprises. A review of distributed word representations and related language modelling techniques, combined with an overview of applicable clustering algorithms, constitutes the basis for practical studies. The latter have two goals: firstly, they examine the quality of German embedding models trained with gensim and a selected choice of parameter configurations. Secondly, clusterings conducted on the resulting word representations are evaluated against the objective of retrieving immediate semantic relations for a given term. 
The application of the final results to company-wide knowledge management is subsequently outlined by the example of the platform intergator and conceptual extensions. Table of contents: 1 Introduction 1.1 Motivation 1.2 Thesis Structure 2 Related Work 3 Distributed Word Representations 3.1 History 3.2 Parallels to Biological Neurons 3.3 Feedforward and Recurrent Neural Networks 3.4 Learning Representations via Backpropagation and Stochastic Gradient Descent 3.5 Word2Vec 3.5.1 Neural Network Architectures and Update Frequency 3.5.2 Hierarchical Softmax 3.5.3 Negative Sampling 3.5.4 Parallelisation 3.5.5 Exploration of Linguistic Regularities 4 Clustering Techniques 4.1 Categorisation 4.2 The Curse of Dimensionality 5 Training and Evaluation of Neural Embedding Models 5.1 Technical Setup 5.2 Model Training 5.2.1 Corpus 5.2.2 Data Segmentation and Ordering 5.2.3 Stopword Removal 5.2.4 Morphological Reduction 5.2.5 Extraction of Multi-Word Concepts 5.2.6 Parameter Selection 5.3 Evaluation Datasets 5.3.1 Measurement Quality Concerns 5.3.2 Semantic Similarities 5.3.3 Regularities Expressed by Analogies 5.3.4 Construction of a Representative Test Set for Evaluation of Paradigmatic Relations 5.3.5 Metrics 5.4 Discussion 6 Evaluation of Semantic Clustering on Word Embeddings 6.1 Qualitative Evaluation 6.2 Discussion 6.3 Summary 7 Conceptual Integration with an Enterprise Search Platform 7.1 The intergator Search Platform 7.2 Deployment Concepts of Distributed Word Representations 7.2.1 Improved Document Retrieval 7.2.2 Improved Query Suggestions 7.2.3 Additional Support in Explorative Search 8 Conclusion 8.1 Summary 8.2 Further Work Bibliography List of Figures List of Tables Appendix info:eu-repo/classification/ddc/004 ddc:004
15	Image Classification for Remote Sensing Using Data-Mining Techniques Alam, Mohammad Tanveer 11 August 2011 (has links) No description available. Computer Science Geographic Information Science Remote Sensing Image Classification Remote Sensing Datamining unsupervised classification supervised classification LANDSAT IKONOS
16	Extraction automatique de connaissances pour la décision multicritère Plantié, Michel 29 September 2006 (has links) (PDF) This thesis, without taking sides, addresses the delicate subject of cognitive automation. It proposes a complete software chain to support each stage of decision-making. In particular, it deals with automating the learning phase by making actionable knowledge (knowledge that is useful for action) a computational entity that algorithms can manipulate. The model underlying our interactive group decision support system (SIADG) relies heavily on automatic knowledge processing. Data mining, multicriteria analysis, and optimization are complementary techniques that combine to build a decision artifact akin to a cybernetic interpretation of the economist Simon's decision model. The epistemic uncertainty inherent in a decision is measured by the decision risk, which analyzes the discriminating factors between the alternatives. Several attitudes towards controlling decision risk can be envisaged: the SIADG can be used to validate, verify, or refute a point of view. In every case, the control exerted over epistemic uncertainty is not neutral with respect to the dynamics of the decision process. Instrumenting the learning phase of the decision process thus leads to designing the actuator of a feedback loop intended to regulate the decision dynamics. Our model sheds formal light on the links between epistemic uncertainty, decision risk, and decision stability. The fundamental concepts of actionable knowledge (CA) and automatic indexing, on which our natural language processing (TALN) models and tools are based, are analyzed. 
In this cybernetic view of decision-making, the notion of actionable knowledge finds a new interpretation: it is the knowledge manipulated by the actuator of the SIADG to control the decision dynamics. A brief survey of the most proven learning techniques for automatic knowledge extraction in TALN is given. All these notions and techniques are then applied to the specific problem of automatically extracting CAs in a multicriteria evaluation process. Finally, the application example of a video rental store manager seeking to optimize his investments according to his customers' preferences revisits and illustrates the computerized process as a whole. Decision Decision Support System Knowledge Management Actionable Knowledge Information Fusion Explanation Argumentation Decision risk Text-Mining Datamining TALN Classification Automatic indexing
17	VISTREE: uma linguagem visual para análise de padrões arborescentes e para especificação de restrições em um ambiente de mineração de árvores Felício, Crícia Zilda 25 March 2008 (has links) Frequent pattern mining in data represented by more complex structures such as trees and graphs has been growing lately. Among the reasons for this growth is the fact that tree and graph patterns carry more information than sequential patterns, together with the possibility of applying this type of mining in several areas such as XML Mining, Web Mining and Bioinformatics. A problem that occurs in pattern mining in general is the large number of patterns generated, some of which are not interesting to users. The number of patterns generated can be reduced by restricting the types of patterns produced through user constraints. Even when constraints are incorporated in the mining process, the number of tree patterns mined is large, which makes a pattern analysis tool necessary, enabling the user to specify queries that extract, from the mass of mined patterns, those that satisfy the selection criteria of the query. Pattern mining with constraints aims to obtain, as the result of the mining process, only the patterns of real interest to the user. A constraint on patterns is represented according to their structure. For sequential pattern mining, one way to represent it would be through regular expressions; for tree pattern mining, through tree automata. The use of constraints solves the problem of generating a large number of patterns, but the mechanism used to represent the constraint constitutes another problem: the difficulty a user has in entering the constraint using this mechanism. Queries on frequent patterns are made according to the characteristics of the data. 
One way to extract specific patterns in data structured as trees is to store the specific patterns in an XML file and run queries using one of the query languages for XML files. Among the XML query languages, XQuery is widely used, mainly because it is semantically similar to SQL, the query language for databases. Queries on frequent patterns could be made using this language, but for this the user would have to know it and be able to express queries in it. This research presents the visual language VisTree, a visual tool to be used both in a preprocessing phase, to specify the user's preferences regarding the format of the tree patterns of interest, and in a postprocessing phase, to analyze the mined patterns. The VisTree syntax is based on a fragment of the Tree Pattern language [Chen et al. 2003, Che and Liu 2005], the core of XPath 1.0 [Clark and Derose 1999, Olteanu et al. 2002]. However, the semantics of VisTree differs from the semantics of these languages in the sense that VisTree queries return sets of tree patterns. VisTree uses the XQuery language [Chamberlin 2003, Katz et al. 2003] as its query processing mechanism: the visual queries specified in VisTree are mapped to XQuery queries and their responses are adapted to fit the format returned by VisTree. VisTree thus works as an XQuery front-end. A complete tree pattern mining system was developed to test and validate the use of the VisTree language in specific application contexts. The system was built in a modular way so that new applications can be incorporated easily. This research shows the application of tree mining with constraints in the areas of XML Mining and Web Mining through a case study. In both applications, the system uses the VisTree language in the preprocessing module (constraint input) and the pattern analysis module (query input). 
/ A mineração de padrões freqüentes em dados representados por estruturas mais complexas como árvores e grafos vem crescendo muito nos últimos tempos. Entre as razões para esse crescimento está o fato do padrão arborescente ou em forma de grafo possuir mais informações do que os padrões seqüenciais, e na possibilidade de aplicação desse tipo de mineração em várias áreas como XML Mining, Web Mining e Bioinformática. Um problema que ocorre na mineração de padrões em geral é a grande quantidade de padrões gerados; sendo que muitos deles nem são do interesse do usuário. A diminuição da quantidade de padrões gerados pode ser feita restringindo o tipo de padrão produzido através de especificações do usuário. Mesmo incorporando restrições no processo de mineração, a quantidade de padrões arborescentes minerados é grande, o que torna necessário uma ferramenta de análise dos padrões, possibilitando ao usuário especificar consultas para extrair da massa de padrões minerados aqueles que satisfazem os critérios de seleção da consulta. A mineração de padrões com restrição, visa obter como resultado de um processo de mineração apenas os padrões de real interesse do usuário. Uma restrição sobre padrões será representada de acordo com a estrutura dos mesmos. Para a mineração de padrões seqüenciais uma forma de representá-la seria através de expressões regulares, para a mineração de padrões arborescentes, os autômatos de árvore. O uso de restrições resolve o problema da geração de uma grande quantidade de padrões, mas o mecanismo usado para representar a restrição ainda se constitui em um outro problema que seria a dificuldade de um usuário em fazer a entrada da restrição utilizando esse mecanismo. As consultas sobre padrões freqüentes são feitas de acordo com as características dos dados. 
Uma forma de extrair padrões específicos em dados estruturados como árvores é armazenar os padrões freqüentes em um documento XML e efetuar uma consulta usando uma das linguagens de consulta a documentos XML. Dentre as linguagens de consulta XML, a linguagem XQuery é muito utilizada, principalmente pelo fato de ser similar semanticamente a SQL (linguagem de consulta a banco de dados). A consulta aos padrões freqüentes poderia então ser feita utilizando essa linguagem, mas para isso o usuário teria que conhecer e ser capaz de expressar sua consulta através dela. Nesse trabalho é apresentada a linguagem visual VisTree, que consiste em uma ferramenta visual a ser utilizada tanto numa fase de Pré-processamento para a especificação das preferências do usuário no que se refere ao formato dos padrões arborescentes que lhe interessa, quanto numa fase de pós-processamento para a análise dos padrões minerados. A sintaxe da VisTree se baseia na sintaxe de um fragmento simples da linguagem Tree Pattern [Miklau and Suciu 2004, Chen et al. 2003], na qual a linguagem XPath 1.0 [Clark and Derose 1999, Olteanu et al. 2002] também se baseou. Entretanto, a semântica de VisTree difere da semântica destas linguagens no sentido de que consultas de VisTree retornam conjuntos de padrões arborescentes. A VisTree utiliza a linguagem XQuery [Chamberlin 2003, Katz et al. 2003] como mecanismo de processamento de consultas: as consultas visuais especificadas em VisTree são mapeadas em consultas da XQuery e suas respostas adaptadas para se adequarem ao formato retornado por VisTree. Um sistema completo de mineração de padrões arborescentes foi desenvolvido para testar e validar o uso da linguagem VisTree em contextos específicos de aplicações. O sistema foi construído de forma modular para que novas aplicações possam ser incorporadas de maneira simples. A aplicação de mineração de árvores com restrição nas áreas de XML Mining e Web Mining foi feita através de um estudo de caso. 
Nas duas aplicações, o sistema utiliza a linguagem VisTree nos módulos que fazem a tarefa de Pré-Processamento (entrada da restrição) e de Análise de Padrões (entrada da consulta). / Mestre em Ciência da Computação Datamining Tree mining Constraint-based tree mining Web mining XML mining Mineração de árvores Mineração de árvores com restrição Banco de dados Mineração de dados (Computação)
18	Databázová nezávislost jádra systému pro dolování z dat FIT-Miner / Data Independency of the FIT-Miner Data Mining System Novák, Ondřej January 2013 (has links) The FIT-Miner data mining system currently depends on a single specific DBMS. This master’s thesis analyzes the parts of the implementation that work with the database, together with the modules and functions for data mining. It then presents the set of changes that will allow FIT-Miner to work with other DBMSs, and finally describes the implementation of these changes.
19	Systém pro testování obchodní strategie / System for Testing of Business Strategy Lanc, Martin January 2008 (has links) The aim of this thesis is to introduce the issues involved in trading stocks on global stock exchanges. It presents the basic ideas needed to understand how stock trading works, how to build a business strategy, and how to automate it with simple information technology techniques. It then describes the design and implementation of a business system for testing a trading strategy, based on the analysis of historical market data. The next part of the work focuses on demonstrating the system and its possibilities for expansion. The whole application is built with the scripting languages PHP and JavaScript and the markup language HTML, using the MySQL database system.
20	Privacy Within Photo-Sharing and Gaming Applications: Motivation and Opportunity and the Decision to Download Hopkins, Ashley R. 20 September 2019 (has links) No description available. Journalism mobile applications apps smartphone social media privacy data datamining data mine Android Apple privacy permissions terms and conditions motivation opportunity decision-making gaming photo-sharing Facebook MODE model

Search results