241

The role of process conformance and developers' skills in the context of test-driven development

Fucci, D. (Davide) 26 April 2016
Abstract Modern software development must adapt to demanding schedules while keeping the software at a high level of quality. Agile software development has been adopted in recent years to meet this need. Test-driven development (TDD) is one practice that has arisen within the agile software development movement; it leverages unit tests to develop software in incremental cycles. TDD supporters claim that the practice increases the productivity of the practitioners who employ it, as well as the internal and external quality of the software they develop. In order to validate or refute such claims, the software engineering research community has studied TDD for the last decade; the results of the empirical studies on the effects of TDD have been mostly inconclusive. This dissertation has studied two factors that may impact the manifestation of the claimed effects of TDD on software's external quality and developers' productivity: the developers' conformance to the process (i.e., their ability to follow TDD) and their skills. The research was performed in four phases. In the first phase, the literature was reviewed to identify a set of factors that have been considered to affect TDD. In the second phase, two experiments were executed within academia. A total of 77 students at the University of Oulu took part in the studies. These experiments investigated the quality of the software, as well as the subjects' productivity with respect to their programming and testing skills. A follow-up study, using data collected during the second experiment, explored the relation between quality, productivity and the subjects' process conformance. In the third phase, four industrial experiments, involving 30 professionals, were performed. Process conformance and skills were investigated in relation to TDD's effects on external quality and productivity. The fourth phase synthesized the evidence gathered in the two previous phases.
The results show that TDD is not associated with improvements in external quality or developers' productivity. Further, improvements in both external quality and productivity are associated with skills rather than with the process, at least in the case of professional developers. Hence, process conformance has a negligible impact. The productivity of novice developers, on the other hand, can benefit from the test-first approach promoted by TDD.
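The incremental test-first cycle described above can be sketched in a few lines of Python; the `add` function and its test are hypothetical examples, using only the standard `unittest` module:

```python
import unittest

# Red: the unit test is written first and would initially fail,
# because the production code (add) does not exist yet.
class TestAdd(unittest.TestCase):
    def test_adds_two_integers(self):
        self.assertEqual(add(2, 3), 5)

# Green: the simplest implementation that makes the test pass.
def add(a, b):
    return a + b

# Refactor: improve the code while the test stays green, then start the next cycle.
suite = unittest.TestLoader().loadTestsFromTestCase(TestAdd)
assert unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```

In practice the failing test is written and run before any implementation exists; here the red-green-refactor steps are collapsed into one file for brevity.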
242

Implementation and Evaluation of a Continuous Code Inspection Platform / Implementation och utvärdering av en kontinuerlig kodgranskningsplattform

Melin, Tomas January 2016
Establishing and preserving a high level of software quality is not a trivial task, although succeeding at it has proven profitable and advantageous. One approach to mitigating the decreasing quality of a project is to track metrics and certain properties of the project in order to view how those properties progress over time. This can be done by introducing continuous code inspection based on static code analysis. However, since the common initial opinion is that these types of tools produce too many false positives, there is a need to investigate what the actual case is. That is the origin of the investigation and case study performed in this paper. The case study was performed at Ida Infront AB in Linköping, Sweden and involves interviews with developers to determine the performance of the continuous inspection platform SonarQube, in addition to examining the general opinion among developers at the company. The author implemented and configured a continuous inspection environment to analyze a part of the company's product and to determine which rules are appropriate to apply in the company's context. The results from the investigation indicate the high quality and accuracy of the tool, as well as the advantage of continuously monitoring the code to observe trends and the progression of metrics such as cyclomatic complexity and duplicated code, with the goal of preventing the steady growth of complex and duplicated code. Combining this with features such as false-positive suppression, instant analysis feedback in pull requests and the possibility to break the build under specified conditions suggests that the implemented environment is a way to mitigate software quality difficulties.
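As an illustration of one metric such a platform tracks, the sketch below approximates McCabe's cyclomatic complexity for Python code using the standard `ast` module. This is not how SonarQube computes the metric, and counting each `BoolOp` as a single decision point is a simplifying assumption:

```python
import ast

# Branch-creating node types counted toward the complexity score.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                   ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = sum(isinstance(node, _DECISION_NODES)
                    for node in ast.walk(tree))
    return 1 + decisions

SNIPPET = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    return "positive"
"""
print(cyclomatic_complexity(SNIPPET))  # → 3 (the elif is a nested If node)
```

Tracking this number per function over successive commits is what turns a one-off measurement into the trend monitoring described above.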
243

Automatic non-functional testing and tuning of configurable generators / Une approche pour le test non-fonctionnel et la configuration automatique des générateurs

Boussaa, Mohamed 06 September 2017
Generative software development has paved the way for the creation of multiple generators (code generators and compilers) that serve as a basis for automatically producing code for a broad range of software and hardware platforms. With fully automatic code generation, users are able to rapidly synthesize software artifacts for various software platforms. In addition, they can easily customize the generated code for the target hardware platform, since modern generators (e.g., C compilers) have become highly configurable, offering numerous configuration options that the user can apply. Consequently, the quality of generated software becomes highly correlated with the configuration settings as well as with the generator itself. In this context, it is crucial to verify the correct behavior of generators. Numerous approaches have been proposed to verify the functional outcome of generated code, but few of them evaluate the non-functional properties of automatically generated code, namely performance and resource usage. This thesis addresses three problems: (1) Non-functional testing of generators: We benefit from the existence of multiple code generators with comparable functionality (i.e., code generator families) to automatically test the generated code. We leverage the metamorphic testing approach to detect non-functional inconsistencies in code generator families by defining metamorphic relations as test oracles. We define the metamorphic relation as a comparison between the variations of performance and resource usage of code generated from the same code generator family. We evaluate our approach by analyzing the performance of Haxe, a popular code generator family. Experimental results show that our approach is able to automatically detect several inconsistencies that reveal real issues in this family of code generators.
(2) Generator auto-tuning: We exploit recent advances in search-based software engineering in order to provide an effective approach to tune generators (i.e., through optimizations) according to the user's non-functional requirements (i.e., performance and resource usage). We also demonstrate that our approach can be used to automatically construct optimization levels that represent optimal trade-offs between multiple non-functional properties such as execution time and resource usage requirements. We evaluate our approach by verifying the optimizations performed by the GCC compiler. Our experimental results show that our approach is able to auto-tune compilers and construct optimizations that yield better performance than the standard optimization levels. (3) Handling the diversity of software and hardware platforms in software testing: Running tests and evaluating resource usage in heterogeneous environments is tedious. To handle this problem, we benefit from recent advances in lightweight system virtualization, in particular container-based virtualization, in order to offer effective support for automatically deploying, executing, and monitoring code in heterogeneous environments, and for collecting non-functional metrics (e.g., memory and CPU consumption). This testing infrastructure serves as a basis for evaluating the experiments conducted in the first two contributions.
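The metamorphic oracle described in contribution (1) can be sketched as follows: equivalent implementations from the same generator family should show comparable resource usage, so a variant whose measurement deviates far from the family median is flagged as an inconsistency. The measurements, target names and the factor-of-two tolerance below are illustrative assumptions, not values from the thesis:

```python
from statistics import median

def find_inconsistencies(measurements: dict, tolerance: float = 2.0) -> list:
    """Flag family members whose metric exceeds `tolerance` times the family median."""
    center = median(measurements.values())
    return sorted(target for target, value in measurements.items()
                  if value > tolerance * center)

# Hypothetical execution times (seconds) of the same benchmark, generated
# for several target platforms by one code generator family.
times = {"cpp": 1.1, "java": 1.4, "js": 1.2, "python": 9.8, "php": 1.3}
print(find_inconsistencies(times))  # → ['python']
```

The point of the metamorphic relation is that no reference output is needed: the family members act as each other's oracle.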
244

The role of assurance within project management standards / Role dohledu ve standardech projektového řízení

Koch, Ondřej January 2016
The thesis focuses on the role of assurance within project management standards. First, the theoretical role of assurance was established based on the performed research. Three main areas of interest were identified: assurance over the business, over the project itself and over the product. The role established in the theoretical part of the work was subsequently compared to information systems development methodologies and project management standards. The comparison with the AUP, Scrum and FDD methodologies suggests that the better assurance is defined, the longer the feedback cycle is. During the comparison with the three most widespread project management standards - IPMA, PRINCE2 and PMBOK - various areas were identified where these are not fully compliant with the theoretical role of assurance. Additions to IPMA and PMBOK were created to support compliance with the theoretically established role of assurance, fulfilling the objective set for the practical part of the work and providing benefits to IT project management professionals who struggle to deliver quality products while following one of the aforementioned standards.
245

Technical debt management in the context of agile methods in software development / Gerenciamento de dívida técnica no contexto de desenvolvimento de software ágil.

Graziela Simone Tonin 23 March 2018
The technical debt field covers a critical problem of software engineering, and this is one of the reasons why it has received significant attention in recent years. The technical debt metaphor helps developers to think about, and to monitor, software quality. The metaphor refers to flaws in software (usually caused by shortcuts to save time) that may affect future maintenance and evolution. It was created by Cunningham to improve the quality of software delivery. Technical debt items are often unknown, unmonitored and therefore not managed, resulting in high maintenance costs throughout the software life-cycle. We conducted an empirical study in an academic environment, during two offerings of a laboratory course on Extreme Programming (XP Lab) at the University of São Paulo, and in two Brazilian software companies (Company A and Company B). We analyzed thirteen teams, nine in academia and four in the companies. The teams had a comprehensive lecture about technical debt, and several ways to identify and manage technical debt were presented. We monitored the teams, performed interviews, made close observations and collected feedback. The obtained results show that awareness of technical debt influences team behavior. Team members reported thinking about and discussing software quality more after becoming aware of the technical debt in their projects. We identified some impacts on the teams and the projects after technical debt had been considered. A conceptual model for technical debt management was created, covering how to identify, monitor, categorize, measure, prioritize, and pay off technical debt. A few approaches and techniques for technical debt management, identification, monitoring, measurement, and payment are also suggested.
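One simple way to make the identification, monitoring and prioritization steps of such a management model concrete is a debt register. The sketch below is a hypothetical illustration: the `principal`/`interest` fields and the ratio-based ordering are assumptions for the example, not the conceptual model proposed in the thesis:

```python
from dataclasses import dataclass

@dataclass
class DebtItem:
    name: str         # short identifier of the flaw
    principal: float  # estimated hours to fix it now
    interest: float   # estimated extra hours per release if left unpaid

def prioritize(items: list) -> list:
    """Pay first the items whose interest is highest relative to the fix cost."""
    return sorted(items, key=lambda i: i.interest / i.principal, reverse=True)

backlog = [
    DebtItem("missing unit tests", principal=16.0, interest=4.0),
    DebtItem("duplicated parser code", principal=8.0, interest=6.0),
    DebtItem("outdated library", principal=40.0, interest=2.0),
]
for item in prioritize(backlog):
    print(item.name)  # duplicated parser code, missing unit tests, outdated library
```

Keeping such a register visible is the "monitoring" step; revisiting the ordering each iteration is the "prioritizing" step.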
246

[en] A FRAMEWORK FOR SOFTWARE ENGINEERING PROCESS REPRESENTATION AND ANALYSIS / [pt] UM FRAMEWORK PARA A REPRESENTAÇÃO E ANÁLISE DE PROCESSOS DE SOFTWARE

LEANDRO RIBEIRO DAFLON 16 August 2004
Many organizations search for standards and guidance to achieve a mature development process. However, change and evolution of business and technology imply constant change and evolution of development processes. In this dissertation we propose a framework that offers an infrastructure allowing organizations to define and analyze software engineering processes at the organization or project level. Besides that, it facilitates process integration, change and evolution. The definition of a process is based on the concept of Process Units, which represent building blocks for tailoring integrated development processes by reusing (or not) parts of existing process models. The process analysis is based on quality standards or maturity models, such as SW-CMM, CMMI and ISO 12207.
247

Certifikace CMMI ve vývoji software v agilním prostředí / CMMI Certification for Software Development in Agile Environment

Gajdušek, Radek January 2013
The goal of the master's thesis "CMMI Certification for Software Development in Agile Environment" is research on the CMMI quality model with a focus on software development in an agile environment at the Siemens company. In the beginning, the CMMI model and the Scrum methodology are introduced. The core of this thesis is a current-state analysis whose output is a list of areas that are currently not compatible with the quality model's requirements; these areas must be improved for the company to achieve the desired CMMI certification level. Possible improvements were presented to the consultant. During the implementation part, a web application was realized that helps remove most of the identified imperfections. The application's benefit was objectively evaluated by an internal audit. The work includes a discussion of possible further development of the application and of the quality model standard in this company.
248

Automotive Powertrain Software Evaluation Tool

Powale, Kalkin 08 February 2018
Software is a key differentiator and driver of innovation in the automotive industry. The major challenges for software development are increasing complexity, shorter time-to-market, rising development cost and the demand for quality assurance. Complexity is increasing due to emission legislation, product variants and new communication technologies being interfaced with the vehicle. Development time is shortening due to competition in the market, which requires faster feedback loops for the verification and validation of developed functionality. The increase in development cost has two factors: pre-launch cost, which involves the cost of error correction during development, and post-launch cost, which involves warranty and guarantee costs. As development time passes, the cost of error correction also increases, so it is important to detect errors as early as possible. All these factors affect software quality; there are several cases where Original Equipment Manufacturers (OEMs) have recalled their products because of quality defects. Hence, the demand for software quality assurance has increased. A solution to these challenges can be early quality evaluation in a continuous integration environment. The AUTomotive Open System ARchitecture (AUTOSAR), the most prominent reference architecture in today's automotive industry, is used to describe software components and interfaces, and provides the standardised software component architecture elements. It was created to address the issues of growing complexity. The existing AUTOSAR environment does have software quality measures, such as schema validation and protocols for acceptance tests; however, it lacks quality specifications for non-functional qualities such as maintainability and modularity.
A tool is therefore required that evaluates AUTOSAR-based software architectures and gives objective feedback regarding their quality. This thesis aims to provide such a quality measurement tool. The tool reads the architecture information from an AUTOSAR Extensible Markup Language (ARXML) file, and provides configurability, continuous evaluation and objective feedback regarding software quality characteristics. The tool was applied to a transmission control project, and the results were validated by industry experts.
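Extracting architecture information from an ARXML file can be sketched with Python's standard `xml.etree` module. Real ARXML documents are namespaced and far richer, so the fragment below and the choice of elements to count are simplifying assumptions for illustration only:

```python
import xml.etree.ElementTree as ET

# A simplified ARXML-like fragment; the element names follow AUTOSAR's
# conventions, but the fragment itself is an illustrative assumption.
ARXML = """
<AUTOSAR>
  <AR-PACKAGE>
    <APPLICATION-SW-COMPONENT-TYPE><SHORT-NAME>Transmission</SHORT-NAME></APPLICATION-SW-COMPONENT-TYPE>
    <APPLICATION-SW-COMPONENT-TYPE><SHORT-NAME>EngineControl</SHORT-NAME></APPLICATION-SW-COMPONENT-TYPE>
  </AR-PACKAGE>
</AUTOSAR>
"""

def component_names(arxml: str) -> list:
    """Collect the SHORT-NAMEs of application software components."""
    root = ET.fromstring(arxml)
    return [comp.findtext("SHORT-NAME")
            for comp in root.iter("APPLICATION-SW-COMPONENT-TYPE")]

print(component_names(ARXML))  # → ['Transmission', 'EngineControl']
```

Component and port inventories of this kind are the raw input from which structural metrics such as modularity can then be computed.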
249

Análise comparativa da acessibilidade para cegos de ambientes digitais para gerenciamento de aprendizagem para educação a distância / Comparative analysis of accessibility of e-learning management education environments for the blind

Silva, André Luiz da 30 May 2007
This study describes a comparative analysis of the accessibility for the blind of the distance learning management systems (SGEAD) TelEduc and WebCT, according to international criteria, recommendations and norms: heuristic evaluation techniques, automatic accessibility testing, inspection based on the checkpoints of the W3C Web Content Accessibility Guidelines 1.0, and task reports and questionnaires conducted with users. One of the goals of this research is to demonstrate the importance of standards and guides for Web accessibility as support for accessibility for the blind in the current scenario of digital inclusion. The quality of the interface is essential for the success of interactive distance learning systems. It is hoped that this study will contribute to the production of knowledge that will guide and support the professionals involved with e-learning.
250

Τεχνικές εξόρυξης δεδομένων και εφαρμογές σε προβλήματα διαχείρισης πληροφορίας και στην αξιολόγηση λογισμικού / Data mining techniques and their applications in data management problems and in software systems evaluation

Τσιράκης, Νικόλαος 20 April 2011
Τα τελευταία χρόνια όλο και πιο επιτακτική είναι η ανάγκη αξιοποίησης των ψηφιακών δεδομένων τα οποία συλλέγονται και αποθηκεύονται σε διάφορες βάσεις δεδομένων. Το γεγονός αυτό σε συνδυασμό με τη ραγδαία αύξηση του όγκου των δεδομένων αυτών επιβάλλει τη δημιουργία υπολογιστικών μεθόδων με απώτερο σκοπό τη βοήθεια του ανθρώπου στην εξόρυξη της χρήσιμης πληροφορίας και γνώσης από αυτά. Οι τεχνικές εξόρυξης δεδομένων παρουσιάζουν τα τελευταία χρόνια ιδιαίτερο ενδιαφέρον στις περιπτώσεις όπου η πηγή των δεδομένων είναι οι ροές δεδομένων ή άλλες μορφές όπως τα XML έγγραφα. Σύγχρονα συστήματα και εφαρμογές όπως είναι αυτά των κοινοτήτων πρακτικής έχουν ανάγκη χρήσης τέτοιων τεχνικών εξόρυξης για να βοηθήσουν τα μέλη τους. Τέλος ενδιαφέρον υπάρχει και κατά την αξιολόγηση λογισμικού όπου η πηγή δεδομένων είναι τα αρχεία πηγαίου κώδικα για σκοπούς καλύτερης συντηρησιμότητας τους. Από τη μια μεριά οι ροές δεδομένων είναι προσωρινά δεδομένα τα οποία περνούν από ένα σύστημα «παρατηρητή» συνεχώς και σε μεγάλο όγκο. Υπάρχουν πολλές εφαρμογές που χειρίζονται δεδομένα σε μορφή ροών, όπως δεδομένα αισθητήρων, ροές κίνησης δικτύων, χρηματιστηριακά δεδομένα και τηλεπικοινωνίες. Αντίθετα με τα στατικά δεδομένα σε βάσεις δεδομένων, οι ροές δεδομένων παρουσιάζουν μεγάλο όγκο και χαρακτηρίζονται από μια συνεχή ροή πληροφορίας που δεν έχει αρχή και τέλος. Αλλάζουν δυναμικά, και απαιτούν γρήγορες αντιδράσεις. Ίσως είναι η μοναδική πηγή γνώσης για εξόρυξη δεδομένων και ανάλυση στην περίπτωση όπου οι ανάγκες μιας εφαρμογής περιορίζονται από τον χρόνο απόκρισης και το χώρο αποθήκευσης. Αυτά τα μοναδικά χαρακτηριστικά κάνουν την ανάλυση των ροών δεδομένων πολύ ενδιαφέρουσα ιδιαίτερα στον Παγκόσμιο Ιστό. Ένας άλλος τομέας ενδιαφέροντος για τη χρήση νέων τεχνικών εξόρυξης δεδομένων είναι οι κοινότητες πρακτικής. Οι κοινότητες πρακτικής (Communities of Practice) είναι ομάδες ανθρώπων που συμμετέχουν σε μια διαδικασία συλλογικής εκμάθησης. 
Μοιράζονται ένα ενδιαφέρον ή μια ιδέα που έχουν και αλληλεπιδρούν για να μάθουν καλύτερα για αυτό. Οι κοινότητες αυτές είναι μικρές ή μεγάλες, τοπικές ή παγκόσμιες, face to face ή on line, επίσημα αναγνωρίσιμες, ανεπίσημες ή και αόρατες. Υπάρχουν δηλαδή παντού και σχεδόν όλοι συμμετέχουμε σε δεκάδες από αυτές. Ένα παράδειγμα αυτών είναι τα γνωστά forum συζητήσεων. Σκοπός μας ήταν ο σχεδιασμός νέων αλγορίθμων εξόρυξης δεδομένων από τις κοινότητες πρακτικής με τελικό σκοπό να βρεθούν οι σχέσεις των μελών τους και να γίνει ανάλυση των εξαγόμενων δεδομένων με μετρικές κοινωνικών δικτύων ώστε συνολικά να αποτελέσει μια μεθοδολογία ανάλυσης τέτοιων κοινοτήτων. Επίσης η eXtensible Markup Language (XML) είναι το πρότυπο για αναπαράσταση δεδομένων στον Παγκόσμιο Ιστό. Η ραγδαία αύξηση του όγκου των δεδομένων που αναπαρίστανται σε XML μορφή δημιούργησε την ανάγκη αναζήτησης μέσα στην δενδρική δομή ενός ΧΜL εγγράφου για κάποια συγκεκριμένη πληροφορία. Η ανάγκη αυτή ταυτόχρονα με την ανάγκη για γρήγορη πρόσβαση στους κόμβους του ΧΜL δέντρου, οδήγησε σε διάφορα εξειδικευμένα ευρετήρια. Για να μπορέσουν να ανταποκριθούν στη δυναμική αυτή των δεδομένων, τα ευρετήρια πρέπει να έχουν τη δυνατότητα να μεταβάλλονται δυναμικά. Ταυτόχρονα λόγο της απαίτησης για αναζήτηση συγκεκριμένης πληροφορίας πρέπει να γίνεται το φιλτράρισμα ενός συνόλου XML δεδομένων διαμέσου κάποιων προτύπων και κανόνων ώστε να βρεθούν εκείνα τα δεδομένα που ταιριάζουν με τα αποθηκευμένα πρότυπα και κανόνες. Από την άλλη μεριά οι διαστάσεις της εσωτερικής και εξωτερικής ποιότητας στη χρήση ενός προϊόντος λογισμικού αλλάζουν κατά τη διάρκεια ζωής του. Για παράδειγμα η ποιότητα όπως ορίζεται στην αρχή του κύκλου ζωής του λογισμικού δίνει πιο πολύ έμφαση στην εξωτερική ποιότητα και διαφέρει από την εσωτερική, όπως για παράδειγμα στη σχεδίαση η οποία αναφέρεται στην εσωτερική ποιότητα και αφορά τους μηχανικούς λογισμικού. 
The data mining techniques that can be used to achieve the required level of quality, such as defining and evaluating quality, must take these different dimensions into account at every stage of the product's life cycle. Within this doctoral thesis, in-depth research was carried out on data mining techniques and applications, both for the information management problem and for the software evaluation problem. / The World Wide Web has gradually transformed into a large data repository consisting of vast amounts of data of many different types. This data roughly doubles every year, yet useful information seems to be decreasing. The area of data mining has arisen over the last decade to address this problem. It has become not only an important research area, but also one with large potential in the real world. Data mining has many directions and handles various types of data. When the related data are, for example, data streams or XML data, the problems become very crucial and interesting. Contemporary systems and applications related to communities of practice also seek appropriate data mining techniques and algorithms in order to help their members. Finally, the field of software evaluation is of great interest, where data mining is used to facilitate the comprehension and maintainability evaluation of a software system's source code. Source code artifacts and measurement values can be used as input to data mining algorithms in order to provide insights into a system's structure or to create groups of artifacts with similar software measurements. First, data streams are large volumes of data arriving continuously. Data mining techniques have been proposed and studied to help users better understand and analyze this information. Clustering is a useful and ubiquitous tool in data analysis.
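Clustering a stream differs from clustering a static table in that each point can be seen only once and memory must stay bounded. A minimal sketch of this single-pass constraint is an online k-means variant (the sample points below are made up for illustration; this is one generic streaming approach, not the specific model developed in the thesis):

```python
import math

def online_kmeans(stream, k):
    """Single-pass k-means variant: the first k points seed the
    centroids; each later point pulls its nearest centroid toward
    it with a decaying learning rate, so memory stays O(k)
    regardless of stream length."""
    centroids, counts = [], []
    for x in stream:
        if len(centroids) < k:
            centroids.append(list(x))
            counts.append(1)
            continue
        # Assign the point to the nearest centroid (Euclidean).
        j = min(range(k), key=lambda i: math.dist(centroids[i], x))
        counts[j] += 1
        eta = 1.0 / counts[j]  # decaying learning rate
        centroids[j] = [c + eta * (xi - c) for c, xi in zip(centroids[j], x)]
    return centroids

points = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.1), (4.9, 5.0), (0.2, 0.1)]
print(online_kmeans(points, 2))
```

Unlike batch k-means, no point is revisited, which is exactly what the response-time and storage constraints of stream applications demand.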
With the rapid increase in web traffic and e-commerce, understanding user behavior based on interaction with a website is becoming more and more important for website owners, and clustering, in combination with personalization techniques for this information space, has become a necessity. The knowledge obtained by learning users' preferences can help improve web content, find usability issues related to this content and its structure, ensure the security of provided data, analyze the different groups of users that can be derived from web access logs, and extract patterns, profiles, and trends. This thesis investigates the application of a new model for clustering and analyzing click-stream data on the World Wide Web with two different approaches. The next part of the thesis deals with data mining techniques for communities of practice. These are groups of people taking part in a collaborative way of learning and exchanging ideas. Systems for supporting argumentative collaboration have become more and more popular in the digital world. There are many research attempts regarding collaborative filtering and recommendation systems. Depending on the system and its needs, different problems arise, and developers have to deal with special cases in order to provide a useful service to users. Data mining can play an important role in the area of collaboration systems that aim to provide decision support functionality. Data mining in these systems can be defined as the effort to generate actionable models through automated analysis of their databases. Data mining can only be deployed successfully when it generates insights that are substantially deeper than what a simple view of the data can give. This thesis introduces a framework that can be applied to a wide range of software platforms aiming at facilitating collaboration and learning among users.
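One elementary Social Network Analysis measure that such a framework can compute over mined member interactions is degree centrality. The sketch below assumes the mining step has already produced (author, replied-to) pairs from forum posts; the member names are invented for illustration:

```python
from collections import defaultdict

def degree_centrality(interactions):
    """Build an undirected interaction graph from (author, replied_to)
    pairs and return each member's normalized degree centrality:
    number of distinct contacts divided by (n - 1)."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-replies
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    return {m: len(nb) / (n - 1) for m, nb in neighbors.items()}

replies = [("ann", "bob"), ("bob", "carol"), ("dave", "ann"), ("carol", "ann")]
print(degree_centrality(replies))  # 'ann' is the most central member
```

Members with centrality near 1.0 interact with almost everyone and are candidate "hubs" of the community; richer measures (betweenness, closeness) follow the same graph-building pattern.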
More precisely, an approach that integrates techniques from the Data Mining and Social Network Analysis disciplines is presented. The next part of the thesis deals with XML data and ways to handle the huge volumes of data they may hold. Lately, data written in a sophisticated markup language such as XML has made great strides in many domains. Processing and management of XML documents have already become popular research issues, with the main problem in this area being the need to index them optimally for storage and retrieval purposes. This thesis first presents a unified clustering algorithm for both homogeneous and heterogeneous XML documents. Then, using this algorithm, it presents an XML P2P system that efficiently distributes a set of clustered XML documents in a P2P network in order to speed up user queries. Ultimately, data mining, with its ability to handle large amounts of data and uncover hidden patterns, has the potential to facilitate the comprehension and maintainability evaluation of a software system. This thesis investigates the applicability and suitability of data mining techniques for facilitating the comprehension and maintainability evaluation of a software system's source code. What is more, it focuses on the ability of data mining to produce either overviews of a software system (thus supporting a top-down approach) or to point out specific parts of the system that require further attention (thus supporting a bottom-up approach).
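Grouping source-code artifacts by similar measurement values, as described above, can be sketched as a simple distance-based grouping over metric vectors. The module names, the three metrics (size, complexity, coupling, each scaled to [0, 1]), and the threshold are all illustrative assumptions, not the thesis's actual algorithm:

```python
import math

def cluster_by_metrics(artifacts, threshold=0.5):
    """Greedy single-pass grouping: an artifact joins the first
    existing group whose seed member lies within `threshold`
    Euclidean distance of its metric vector; otherwise it seeds
    a new group."""
    groups = []
    for name, vec in artifacts:
        for g in groups:
            if math.dist(g[0][1], vec) < threshold:
                g.append((name, vec))
                break
        else:
            groups.append([(name, vec)])
    return [[name for name, _ in g] for g in groups]

modules = [
    ("Parser",  (0.9, 0.8, 0.7)),  # large, complex, highly coupled
    ("Lexer",   (0.8, 0.9, 0.6)),
    ("Config",  (0.1, 0.1, 0.2)),  # small and simple
    ("Logging", (0.2, 0.1, 0.1)),
]
print(cluster_by_metrics(modules))  # [['Parser', 'Lexer'], ['Config', 'Logging']]
```

The resulting groups give exactly the two views mentioned above: the group list is a top-down overview of the system, while an outlier group of high-metric modules points bottom-up at the parts needing maintenance attention.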
