About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1. Actionable Knowledge Discovery using Multi-Step Mining

DharaniK, Kalpana Gudikandula, 01 December 2012
Data mining at the enterprise level operates on huge amounts of data from sources such as government transactions, banks, and insurance companies. Inevitably, these businesses produce complex data that may be distributed in nature. When such data is mined in a single step, the resulting business intelligence reflects only one particular aspect. This is not sufficient in an enterprise, where different aspects and standpoints must be considered before taking business decisions. Enterprises therefore need to mine based on multiple features, data sources, and methods, an approach known as combined mining. Combined mining can produce patterns that reflect all aspects of the enterprise, and the derived intelligence can be used to take business decisions that lead to profit. This kind of knowledge is known as actionable knowledge.

Data mining is the process of discovering trends or patterns in historical data. Such trends form business intelligence, which in turn supports well-informed decisions. However, data mining with a single technique does not yield actionable knowledge, because enterprise databases are huge, heterogeneous, and complex; mining such data requires multi-step mining rather than single-step mining. When multiple approaches are involved, they provide business intelligence covering all aspects, and that kind of information can lead to actionable knowledge. Data mining has recently seen tremendous use in the real world, but existing approaches yield insufficient business intelligence for large enterprises. This work presents a combination of existing works and algorithms, operating on multiple data sources, multiple methods, and multiple features. The combined patterns obtained from complex business data provide actionable knowledge. A prototype application has been built to test the efficiency of the proposed framework, which combines multiple data sources, methods, and features in the mining process. The empirical results show that the proposed approach is effective and can be used in the real world.
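As a rough sketch of the combined-mining idea described above, the fragment below mines frequent patterns separately from two data sources and keeps only those supported in both. The data, names, and support threshold are hypothetical and are not taken from the paper; intersection is only the simplest possible combination rule.

```python
# Minimal sketch of combined mining: patterns are mined per source and then
# merged, keeping only those frequent in every source. Illustrative only.
from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of size <= max_size and keep those meeting min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    n = len(transactions)
    return {p for p, c in counts.items() if c / n >= min_support}

def combined_patterns(sources, min_support=0.5):
    """Combine per-source patterns: keep patterns frequent in every source."""
    per_source = [frequent_itemsets(txns, min_support) for txns in sources]
    return set.intersection(*per_source) if per_source else set()

# Hypothetical data from two sources (e.g., transaction and claims records).
bank_txns   = [["loan", "default"], ["loan", "repaid"], ["loan", "default"]]
claims_txns = [["loan", "default"], ["claim", "loan", "default"], ["repaid"]]
print(combined_patterns([bank_txns, claims_txns]))
```

In the setting described by the abstract, the per-source miners and the combination rule would be chosen per aspect (features, sources, methods) rather than fixed to frequent-itemset intersection as in this sketch.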
2. Detecting Organizational Accounts from Twitter Based on Network and Behavioral Factors

January 2017
With the rise of Online Social Networks (OSN) in the last decade, social network analysis has become a crucial research topic. OSN graphs have unique properties that distinguish them from other types of graphs. In this thesis, a five-month Tweet corpus collected from Bangladesh between June 2016 and October 2016 is analyzed in order to detect accounts that belong to groups. These groups consist of official and non-official Twitter handles of political organizations and NGOs in Bangladesh. A set of network, temporal, spatial, and behavioral features is proposed to discriminate between accounts belonging to individual Twitter users, news outlets, groups, and organization leaders. Finally, the experimental results are presented and a subset of relevant features is identified that leads to a generalizable model. Detection of the small number of group accounts within the large network is achieved with 0.8 precision, 0.75 recall, and a 0.77 F1 score. The domain-independent network and behavioral features and models developed here are suitable for solving the Twitter account classification problem in other contexts.

Dissertation/Thesis. Masters Thesis, Computer Science, 2017.
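For concreteness, a minimal sketch of the evaluation metrics quoted above (precision, recall, and F1 for detecting the rare "group" class) is shown below. The labels are fabricated solely to reproduce those figures and are not the thesis data.

```python
# Precision, recall and F1 for detecting the rare "group" class among
# many individual accounts. Illustrative data, not from the thesis.

def precision_recall_f1(y_true, y_pred, positive="group"):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# With precision 0.8 and recall 0.75 the harmonic mean is about 0.77,
# matching the figures reported in the abstract.
y_true = ["group"] * 16 + ["individual"] * 84
y_pred = ["group"] * 12 + ["individual"] * 4 + ["group"] * 3 + ["individual"] * 81
print(precision_recall_f1(y_true, y_pred))  # (0.8, 0.75, ~0.774)
```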
3. Construction de lignes de produits logiciels par rétro-ingénierie de modèles de caractéristiques à partir de variantes de logiciels : l'approche REVPLINE / Reverse Engineering Feature Models from Software Variants to Build Software Product Lines: The REVPLINE Approach

Al-Msie' Deen, Ra'Fat, 24 June 2014
Software product lines are an approach for building and maintaining a family of similar software products by applying reuse principles. These principles reduce development and maintenance effort, shorten time to market, and improve overall software quality. Migrating similar software products to a product line requires understanding their similarities and their differences, which are expressed in terms of the features they offer. In this thesis, we address the problem of building a product line from the source code of its products and from complementary artifacts such as use-case diagrams, when they exist. We contribute to one of the main steps of this construction: extracting and organizing a feature model in an automated way. The first contribution extracts features from the source code of software variants written in the object-oriented paradigm. Three techniques are combined to this end: Formal Concept Analysis, Latent Semantic Indexing, and the analysis of structural dependencies in the code; they exploit the common and variable parts at the source-code level. The second contribution documents each extracted feature with a name and a description. It exploits the source code as well as the use-case diagrams, which contain, in addition to the logical organization of the external functionalities, textual descriptions of those same functionalities. Beyond the previous techniques, it relies on Relational Concept Analysis to form groups of entities according to their relations. In the third contribution, we propose an approach for organizing the documented features into a feature model. This feature model is a tree labeled with operators and equipped with logical expressions; it highlights mandatory features, optional features, feature groups (AND, OR, XOR groups), and additional textual constraints in the form of implications or mutual exclusions. The model is obtained by analyzing a structure produced by Formal Concept Analysis applied to the description of the variants by their features. The approach is validated on three main case studies: ArgoUML-SPL, Health complaint-SPL, and Mobile media. These case studies are already established product lines; we treat several of their products as if they were independent software variants, apply our approach, and evaluate its effectiveness by comparing the automatically extracted feature models with the original feature models designed by the developers of the analyzed product lines.

The idea of the Software Product Line (SPL) approach is to manage a family of similar software products in a reuse-based way. Reuse avoids repetition, which helps reduce development and maintenance effort, shorten time to market, and improve the overall quality of the software. To migrate from existing software product variants to an SPL, one has to understand how they are similar and how they differ from one another. Companies often develop a set of software variants that share some features and differ in others to meet specific requirements. To exploit existing software variants and build a software product line, a feature model must be built as a first step. To do so, it is necessary to extract mandatory and optional features and to associate each feature with a name; the mined and documented features must then be organized into a feature model. In this context, the thesis makes three contributions. The first is a new approach to mine features from the object-oriented source code of a set of software variants, based on Formal Concept Analysis, code dependencies, and Latent Semantic Indexing. Its novelty is that it exploits commonality and variability across software variants at the source-code level to run Information Retrieval methods efficiently. The second contribution documents the mined feature implementations based on Formal Concept Analysis, Latent Semantic Indexing, and Relational Concept Analysis. This complementary approach gives names and descriptions to the mined feature implementations, based on the feature implementations and the use-case diagrams of the software variants; its novelty is that it exploits commonality and variability across software variants at the levels of feature implementations and use cases to run Information Retrieval methods efficiently. The third contribution is an automatic approach to organize the mined, documented features into a feature model. Features are organized in a tree that highlights mandatory features, optional features, and feature groups (AND, OR, XOR groups), and the feature model is completed with requirement and mutual-exclusion constraints. We rely on Formal Concept Analysis and software configurations to mine a unique and consistent feature model. To validate the approach, we applied it to three case studies: ArgoUML-SPL, Health complaint-SPL, and Mobile media software product variants. The results of this evaluation confirm the relevance and the performance of the proposal, as most of the features and their constraints were correctly identified.
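As a rough illustration of the Formal Concept Analysis step common to these contributions, the sketch below builds a tiny formal context (software variants as objects, source-code elements as attributes) and enumerates its formal concepts. The variants and element names are invented for illustration; this is not the REVPLINE implementation.

```python
# Minimal Formal Concept Analysis sketch: each formal concept pairs a maximal
# set of variants with the code elements they all share. Illustrative context.
from itertools import combinations

context = {  # variant -> set of code elements it contains (hypothetical)
    "variant1": {"Draw", "Save", "Export"},
    "variant2": {"Draw", "Save"},
    "variant3": {"Draw", "Export", "Undo"},
}

def extent(attrs):
    """Variants that contain every attribute in attrs."""
    return {v for v, a in context.items() if attrs <= a}

def intent(variants):
    """Attributes shared by every variant in variants."""
    sets = [context[v] for v in variants]
    return set.intersection(*sets) if sets else set.union(*context.values())

def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every subset of variants."""
    concepts = set()
    variants = list(context)
    for size in range(len(variants) + 1):
        for subset in combinations(variants, size):
            att = intent(set(subset))
            ext = extent(att)
            concepts.add((frozenset(ext), frozenset(att)))
    return concepts

for ext, att in sorted(formal_concepts(), key=lambda c: -len(c[0])):
    print(sorted(ext), "->", sorted(att))
```

Read this way, the intent of each concept groups code elements that always occur together across a maximal set of variants, which is the kind of raw grouping from which candidate feature implementations (common blocks and variant-specific blocks) can be derived.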
