  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
561

Knowledge discovery method for deriving conditional probabilities from large datasets

Elsilä, U. (Ulla) 04 December 2007
In today's world, enormous amounts of data are being collected every day. Thus, the problems of storing, handling, and utilizing the data are faced constantly. As the human mind itself can no longer interpret the vast datasets, methods for extracting useful and novel information from the data are needed and developed. These methods are collectively called knowledge discovery methods. In this thesis, a novel combination of feature selection and data modeling methods is presented in order to help with this task. This combination includes the methods of basic statistical analysis, linear correlation, self-organizing map, parallel coordinates, and k-means clustering. The presented method can be used, first, to select the most relevant features from even hundreds of them and, then, to model the complex inter-correlations within the selected ones. The capability to handle hundreds of features opens up the possibility to study more extensive processes instead of just looking at smaller parts of them. The results of a k-nearest-neighbors study show that the presented feature selection procedure is valid and appropriate. A second advantage of the presented method is the possibility to use thousands of samples. Whereas the current rules for selecting appropriate limits for utilizing the methods are theoretically proved only for small sample sizes, especially in the case of linear correlation, this thesis gives guidelines for feature selection with thousands of samples. A third positive aspect is the nature of the results: given that the outcome of the method is a set of conditional probabilities, the derived model is highly unrestrictive and rather easy to interpret. In order to test the presented method in practice, it was applied to study two different cases of steel manufacturing with hot strip rolling.
In the first case, the conditional probabilities for different types of retentions were derived and, in the second case, the rolling conditions for the occurrence of wedge were revealed. The results of both of these studies show that steel manufacturing processes are indeed very complex and highly dependent on the various stages of the manufacturing. This was further confirmed by the fact that with studies of k-nearest-neighbors and C4.5, it was impossible to derive useful models concerning the datasets as a whole. It is believed that the reason for this lies in the nature of these two methods, meaning that they are unable to grasp such manifold inter-correlations in the data. In contrast, the presented method of conditional probabilities allowed new knowledge to be gained of the studied processes, which will help to better understand these processes and to enhance them.
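The abstract states that the method's outcome is a set of conditional probabilities but does not give the procedure. As a minimal sketch only, assuming samples have already been assigned to clusters (for example by k-means) and each sample carries an observed outcome label, the conditional probability of an outcome given a cluster can be estimated by relative frequency. The function name and the data below are hypothetical and are not taken from the thesis.

```python
from collections import Counter, defaultdict


def conditional_probabilities(cluster_labels, outcomes):
    """Estimate P(outcome | cluster) by relative frequency."""
    counts = defaultdict(Counter)
    for cluster, outcome in zip(cluster_labels, outcomes):
        counts[cluster][outcome] += 1
    return {
        cluster: {o: n / sum(c.values()) for o, n in c.items()}
        for cluster, c in counts.items()
    }


# Hypothetical data: cluster index per sample and the observed defect type.
clusters = [0, 0, 0, 1, 1, 1, 1, 2]
defects = ["none", "wedge", "none", "wedge", "wedge", "wedge", "none", "none"]

probs = conditional_probabilities(clusters, defects)
for cluster in sorted(probs):
    print(cluster, probs[cluster])
```

A table of this kind is easy to read, which matches the abstract's remark that the derived model is rather easy to interpret.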
562

Discovery of temporal association rules in multivariate time series

Zhao, Yi January 2017
This thesis focuses on mining association rules on multivariate time series. Common association rule mining algorithms can usually only be applied to transactional data, and a typical application is market basket analysis. If we want to mine temporal association rules on time series data, changes need to be made. During temporal association rule mining, the temporal ordering of the data and the temporal interval between the left and right patterns of a rule need to be considered. This thesis reviews some methods for temporal association rule mining, and proposes two similar algorithms for the mining of frequent patterns in single and multivariate time series. Both algorithms are scalable and efficient. In addition, temporal association rules are generated from the patterns found. Finally, the usability and efficiency of the algorithms are demonstrated by evaluating the results.
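To illustrate what the temporal interval between the left and right patterns means in practice, here is a minimal sketch that counts how often a symbol B follows a symbol A within a maximum time gap. It is not one of the thesis's two algorithms; the function `rule_stats`, the parameter `max_gap`, and the event list are assumptions made for illustration.

```python
def rule_stats(events, antecedent, consequent, max_gap):
    """Return (hits, confidence) for the rule antecedent -> consequent.

    events is a list of (timestamp, symbol) pairs. An antecedent occurrence
    counts as a hit when a consequent occurs strictly later and no more
    than max_gap time units afterwards.
    """
    antecedent_times = [t for t, s in events if s == antecedent]
    consequent_times = [t for t, s in events if s == consequent]
    hits = sum(
        1
        for ta in antecedent_times
        if any(0 < tc - ta <= max_gap for tc in consequent_times)
    )
    confidence = hits / len(antecedent_times) if antecedent_times else 0.0
    return hits, confidence


# Hypothetical discretised time series.
events = [(1, "A"), (2, "B"), (5, "A"), (9, "B"), (10, "A"), (11, "B")]
hits, conf = rule_stats(events, "A", "B", max_gap=2)
print(hits, conf)
```

With transactional data the ordering and the gap would be ignored; here the occurrence of A at time 5 does not support the rule because the next B arrives four time units later.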
563

Investigation of the cancer testis antigen lactate dehydrogenase C as a CD8 T cell target

Neilson, David S 23 December 2016
The infrequency of known T cell targets in high grade serous ovarian carcinoma (HGSC) is a substantial barrier to the development of targeted immunotherapies. Due to their infrequency, antigen discovery is a crucial component of immunotherapeutic design. In our cohort of HGSC cases, the cancer testis (CT) antigen lactate dehydrogenase C (LDHC) is expressed in 76% of tumours (22/29). As LDHC presents with tumour specificity in women, I hypothesize that LDHC is an immunogenic target in HGSC patients, and that LDHC-specific T cells can be activated and expanded for therapeutic purposes. As such, I sought to examine whether endogenous LDHC-specific T cells were present in the ascites of HGSC patients. A standard Rapid Expansion Protocol was used to expand CD8 T cell cultures from patient ascites. These cultures were screened for reactivity to a peptide library encompassing all possible epitopes of the LDHC protein by interferon-γ ELISpot. With this approach, T cell clones from one of five patients were identified that were reactive to minimal peptides contained within LDHC. In this patient, the antigenic LDHC peptide differed from LDHA by a single amino acid at its C-terminus (YTSWAIGLSVM versus YTSWAIGLSVA). In recognition assays, tumour cell lines expressing endogenous LDHC, autologous ascites, or autologous B cells transfected with LDHC were unable to elicit T cell responses. Although this study suggests that LDHC is not immunogenic, continued screening of LDHC and other CT proteins will likely provide additional immunotherapeutic targets. / Graduate
564

The relationship between the use of ICT in discovering mathematics concepts and learning competencies

Mukendwa, Antoinette P January 2013
The aim of this study is to explore the perceptions of Mathematics teachers using Information and Communications Technology (ICT) as an educational tool in their classrooms. This study focuses on the Mathematics teachers’ 21st century-oriented pedagogical practices that propagate learning outcomes that are considered essential for all learners to prosper in this ever-changing and demanding information society. The learning competencies considered are termed lifelong competencies as they transcend the classroom and school environment and can thus be used to solve authentic problems in day-to-day life. The development of these learning competencies, especially by using ICT, has become vital in equipping learners with the necessary skills to become confident citizens in this globalised world. The role the teacher plays is increasingly acknowledged as having a major impact on this process. An essential assumption of this study is that learning activities facilitated by teachers utilising ICT efficiently and effectively as an educational tool have the potential of enhancing the quality of learning competencies. Moreover, as the role of the teacher in these activities is highly important, the teacher’s characteristics and background have the potential to determine the overall success of the learners. Using the underlying principles of Activity Theory and the conceptual framework of SITES 2006, this study investigates the relationship between these three components, i.e. ICT integration, learning competencies, and teacher background and characteristics. The intricate relationships that exist among these three components are investigated in this study in the context of Mathematics education. This is a secondary data analysis study that utilises data from the SITES 2006 South African Mathematics teachers’ questionnaire. Only Mathematics teachers who indicated using ICT as an educational tool in the discovery of Mathematics principles and concepts were considered.
Using Spearman’s correlation coefficient, the data was analysed to determine the strength of the relationships among the variables. Findings of the study suggest that certain teacher characteristics do indeed influence the probability of teachers developing certain learning competencies in learners. Moreover, the findings indicate that a number of the learning competencies investigated in this study are not as readily attained as others. / Dissertation (MEd)--University of Pretoria, 2013. / gm2013 / Science, Mathematics and Technology Education / unrestricted
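For readers unfamiliar with the statistic named above, Spearman's correlation coefficient is the Pearson correlation computed on the ranks of the two variables, with tied values sharing their average rank. The sketch below is a plain-Python illustration under that definition; the questionnaire scores are invented for the example and do not come from the SITES 2006 data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical ordinal responses: ICT use frequency vs. a competency rating.
ict_use = [1, 2, 2, 3, 4, 4, 5]
competency = [1, 1, 2, 3, 3, 4, 5]
print(round(spearman(ict_use, competency), 3))
```

Because it works on ranks, the coefficient suits ordinal questionnaire items where only the ordering of responses is meaningful.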
565

Découverte de services et collaboration au sein d'une flotte hétérogène et hautement dynamique d'objets mobiles communicants autonomes / Service Discovery and Collaboration in a Heterogeneous and Highly Dynamic Swarm of Mobile Communicating and Autonomous Objects

Autefage, Vincent 26 October 2015
Autonomous systems are mobile, communicating objects that are able to perform a number of tasks without human intervention. The overall cost (including price, weight and energy) of the payload required by some missions is sometimes too high to allow the entities to carry all the required capabilities (i.e. sensors and actuators). It is therefore more suitable to spread the capabilities among several entities. The team formed by those entities is called a swarm. It then becomes necessary to provide a discovery mechanism built into the swarm in order to enable its members to share their capabilities and to collaborate to achieve a global mission. This mechanism should perform task allocation as well as management of conflicts and failures, which can occur at any moment on any entity of the swarm. In this thesis, we present a novel collaborative system called AMiRALE for heterogeneous swarms of autonomous mobile robots. Our system is fully distributed and relies only on asynchronous communications. We also present a novel tool called NEmu, which makes it possible to create virtual mobile networks with complete control over the network topology and the properties of links and nodes. This tool is designed for performing realistic experiments on prototypes of network applications. Finally, we present experimental results on our collaborative system AMiRALE obtained through a park cleaning scenario which relies on an autonomous swarm of drones and specialized ground robots.
566

Data Masking, Encryption, and their Effect on Classification Performance: Trade-offs Between Data Security and Utility

Asenjo, Juan C. 01 January 2017
As data mining increasingly shapes organizational decision-making, the quality of its results must be questioned to ensure trust in the technology. Inaccuracies can mislead decision-makers and cause costly mistakes. With more data collected for analytical purposes, privacy is also a major concern. Data security policies and regulations are increasingly put in place to manage risks, but these policies and regulations often employ technologies that substitute and/or suppress sensitive details contained in the data sets being mined. Data masking and substitution and/or data encryption and suppression of sensitive attributes from data sets can limit access to important details. It is believed that the use of data masking and encryption can impact the quality of data mining results. This dissertation investigated and compared the causal effects of data masking and encryption on classification performance as a measure of the quality of knowledge discovery. A review of the literature found a gap in the body of knowledge, indicating that this problem had not been studied before in an experimental setting. The objective of this dissertation was to gain an understanding of the trade-offs between data security and utility in the field of analytics and data mining. The research used a nationally recognized cancer incidence database to show how masking and encryption of potentially sensitive demographic attributes such as patients’ marital status, race/ethnicity, origin, and year of birth, could have a statistically significant impact on the patients’ predicted survival. Performance parameters measured by four different classifiers delivered sizable variations in the range of 9% to 10% between a control group, where the select attributes were untouched, and two experimental groups where the attributes were substituted or suppressed to simulate the effects of the data protection techniques.
In practice, this represented a corroboration of the potential risk involved when basing medical treatment decisions on data mining applications where attributes in the data sets are masked or encrypted for patient privacy and security concerns.
567

Unsupervised discovery of relations for analysis of textual data in digital forensics

Louis, Anita Lily 23 August 2010
This dissertation addresses the problem of analysing digital data in digital forensics. It will be shown that text mining methods can be adapted and applied to digital forensics to help analysts analyse data more quickly, efficiently and accurately to reveal truly useful information. Investigators who wish to utilise digital evidence must examine and organise the data to piece together events and facts of a crime. The difficulty with finding relevant information quickly using the current tools and methods is that these tools rely very heavily on background knowledge for query terms and do not fully utilise the content of the data. A novel framework in which to perform evidence discovery is proposed in order to reduce the quantity of data to be analysed, aid the analysts' exploration of the data and enhance the intelligibility of the presentation of the data. The framework combines information extraction techniques with visual exploration techniques to provide a novel approach to performing evidence discovery, in the form of an evidence discovery system. By utilising unrestricted, unsupervised information extraction techniques, the investigator does not require input queries or keywords for searching, thus enabling the investigator to analyse portions of the data that may not have been identified by keyword searches. The evidence discovery system produces text graphs of the most important concepts and associations extracted from the full text to establish ties between the concepts and provide an overview and general representation of the text. Through an interactive visual interface the investigator can explore the data to identify suspects, events and the relations between suspects. Two models are proposed for performing the relation extraction process of the evidence discovery framework. The first model takes a statistical approach to discovering relations based on co-occurrences of complex concepts.
The second model utilises a linguistic approach using named entity extraction and information extraction patterns. A preliminary study was performed to assess the usefulness of a text mining approach to digital forensics compared with the traditional information retrieval approach. It was concluded that the novel approach to text analysis for evidence discovery presented in this dissertation is a viable and promising approach. The preliminary experiment showed that the results obtained from the evidence discovery system, using either of the relation extraction models, are sensible and useful. The approach advocated in this dissertation can therefore be successfully applied to the analysis of textual data for digital forensics. / Dissertation (MSc)--University of Pretoria, 2010. / Computer Science / unrestricted
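The first relation extraction model is described only as statistical and based on co-occurrences of concepts. As a rough sketch of that general idea, and not the dissertation's actual model, the code below counts how often two concepts appear in the same sentence and keeps the pairs seen at least a threshold number of times as candidate edges of a text graph. Concept extraction is assumed to have happened already; the function name, the `min_count` threshold, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_relations(sentences, min_count=2):
    """Count unordered concept pairs that share a sentence.

    sentences is a list of concept lists, one list per sentence.
    Returns the pairs seen at least min_count times, with their counts.
    """
    pair_counts = Counter()
    for concepts in sentences:
        # sorted(set(...)) makes each pair canonical and counted once per sentence
        for pair in combinations(sorted(set(concepts)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical concepts already extracted from four sentences.
sentences = [
    ["suspect_a", "bank", "transfer"],
    ["suspect_a", "bank"],
    ["suspect_b", "transfer"],
    ["bank", "transfer", "suspect_a"],
]
relations = cooccurrence_relations(sentences)
print(relations)
```

No query keywords are supplied anywhere, which reflects the abstract's point that the investigator does not need prior search terms.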
568

Link layer topology discovery in an uncooperative ethernet environment

Delport, Johannes Petrus 27 August 2008
Knowledge of a network’s entities and the physical connections between them, a network’s physical topology, can be useful in a variety of network scenarios and applications. Administrators can use topology information for fault-finding, inventorying and network planning. Topology information can also be used during protocol and routing algorithm development, for performance prediction and as a basis for accurate network simulations. Specifically, from a network security perspective, threat detection, network monitoring, network access control and forensic investigations can benefit from accurate network topology information. The dynamic nature of large networks has led to the development of various automatic topology discovery techniques, but these techniques have mainly focused on cooperative network environments where network elements can be queried for topology related information. The primary objective of this study is to develop techniques for discovering the physical topology of an Ethernet network without the assistance of the network’s elements. This dissertation describes the experiments performed and the techniques developed in order to identify network nodes and the connections between these nodes. The product of the investigation was the formulation of an algorithm and heuristic that, in combination with measurement techniques, can be used for inferring the physical topology of a target network. / Dissertation (MSc)--University of Pretoria, 2008. / Computer Science / unrestricted
569

Ranked Search on Data Graphs

Varadarajan, Ramakrishna R. 10 March 2009
Graph-structured databases are widely prevalent, and the problem of effective search and retrieval from such graphs has been receiving much attention recently. For example, the Web can be naturally viewed as a graph. Likewise, a relational database can be viewed as a graph where tuples are modeled as vertices connected via foreign-key relationships. Keyword search querying has emerged as one of the most effective paradigms for information discovery, especially over HTML documents in the World Wide Web. One of the key advantages of keyword search querying is its simplicity – users do not have to learn a complex query language, and can issue queries without any prior knowledge about the structure of the underlying data. The purpose of this dissertation was to develop techniques for user-friendly, high quality and efficient searching of graph structured databases. Several ranked search methods on data graphs have been studied in recent years. Given a top-k keyword search query on a graph and some ranking criteria, a keyword proximity search finds the top-k answers where each answer is a substructure of the graph containing all query keywords, which illustrates the relationship between the keywords present in the graph. We applied keyword proximity search on the web and the page graph of web documents to find top-k answers that satisfy the user’s information need and increase user satisfaction. Another effective ranking mechanism applied on data graphs is the authority flow based ranking mechanism. Given a top-k keyword search query on a graph, an authority-flow based search finds the top-k answers where each answer is a node in the graph ranked according to its relevance and importance to the query. We developed techniques that improved the authority flow based search on data graphs by creating a framework to explain and reformulate them, taking into consideration user preferences and feedback.
We also applied the proposed graph search techniques for Information Discovery over biological databases. Our algorithms were experimentally evaluated for performance and quality. The quality of our method was compared to current approaches by using user surveys.
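Authority-flow ranking is commonly formulated as a random walk that restarts at the nodes matching the query keywords, in the style of personalised PageRank. The sketch below shows that common formulation by power iteration; it is an assumption about the general technique, not the dissertation's specific algorithms or its explanation and reformulation framework. The graph, the `base_set` of keyword-matching nodes, and the parameter values are hypothetical.

```python
def authority_flow(graph, base_set, damping=0.85, iterations=50):
    """Rank nodes by authority flowing out of the keyword-matching base set.

    graph maps every node to the list of nodes it links to; each link
    target must itself be a key of graph.
    """
    nodes = list(graph)
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * jump[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if not out:
                continue  # dangling node: its authority is dropped in this sketch
            share = damping * score[n] / len(out)
            for m in out:
                nxt[m] += share
        score = nxt
    return score


# Hypothetical data graph; only node "a" contains the query keyword.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = authority_flow(graph, base_set={"a"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[:3])  # top-k answers with k = 3
```

Node "d" receives no score because no authority can reach it from the base set, so each answer is ranked by both its relevance to the query and its position in the graph.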
570

Drug repositioning and indication discovery using description logics

Croset, Samuel January 2014
Drug repositioning is the discovery of new indications for approved or failed drugs. This practice is commonly done within the drug discovery process in order to adjust or expand the application line of an active molecule. Nowadays, an increasing number of computational methodologies aim at predicting repositioning opportunities in an automated fashion. Some approaches rely on the direct physical interaction between molecules and protein targets (docking) and some methods consider more abstract descriptors, such as a gene expression signature, in order to characterise the potential pharmacological action of a drug (Chapter 1). On a fundamental level, repositioning opportunities exist because drugs perturb multiple biological entities (on- and off-targets), themselves involved in multiple biological processes. Therefore, a drug can play multiple roles or exhibit various modes of action responsible for its pharmacology. The work done for my thesis aims at characterising these various modes and mechanisms of action for approved drugs, using a mathematical framework called description logics. In this regard, I first specify how living organisms can be compared to complex black box machines and how this analogy can help to capture biomedical knowledge using description logics (Chapter 2). Secondly, the theory is implemented in the Functional Therapeutic Chemical Classification System (FTC - https://www.ebi.ac.uk/chembl/ftc/), a resource defining over 20,000 new categories representing the modes and mechanisms of action of approved drugs. The FTC also indexes over 1,000 approved drugs, which have been classified into the mode of action categories using automated reasoning. The FTC is evaluated against a gold standard, the Anatomical Therapeutic Chemical Classification System (ATC), in order to characterise its quality and content (Chapter 3).
Finally, from the information available in the FTC, a series of drug repositioning hypotheses were generated and made publicly available via a web application (https://www.ebi.ac.uk/chembl/research/ftc-hypotheses). A subset of the hypotheses related to cardiovascular hypertension and to Alzheimer’s disease is discussed in more detail, as an example of an application (Chapter 4). The work performed illustrates how new valuable biomedical knowledge can be automatically generated by integrating and leveraging the content of publicly available resources using description logics and automated reasoning. The newly created classification (FTC) is a first attempt to formally and systematically characterise the function or role of approved drugs using the concept of mode of action. The open hypotheses derived from the resource are available to the community to analyse and design further experiments.
