  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
381

Modelagem de processo de extração de conhecimento em banco de dados para sistemas de suporte à decisão. / Modeling of knowledge discovery in databases for decision systems.

Shiba, Sonia Kaoru 26 June 2008 (has links)
This work presents a model of a knowledge discovery in databases process in which the information for data analysis comes from transactional databases and a data warehouse. The data mining stage focused on generating descriptive models by means of classification techniques based on Bayes' theorem and a direct method for extracting classification rules, defining a methodology for building learning models. A knowledge extraction process was implemented to generate learning models for decision support, applying data mining techniques for descriptive models and classification rule generation. The work explored the possibility of transforming the learning models into knowledge bases stored in a relational database, accessible through an expert system for classifying new records, or of visualizing the results in spreadsheets. In the scenario described here, the organization of the pre-processing stage allowed additional attributes to be extracted, or data to be transformed, iteratively, without implementing new data extraction programs. All the essential pre-processing activities were thus defined, together with the sequence in which they should be performed, and the procedures can be repeated without losing the units coded for the data extraction process. An iterative and quantifiable knowledge extraction process model, in terms of its stages and procedures, was configured with a final product in view: the design of a knowledge base for customer retention actions and rules for targeted actions on customer segments.
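As an illustration of the classification technique this abstract mentions (Bayes' theorem over categorical records), here is a minimal naive Bayes sketch; the customer attributes, values and labels are hypothetical, not taken from the thesis:

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Estimate P(class) and P(attribute = value | class) from categorical data."""
    class_counts = Counter(labels)
    cond = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            cond[(i, label)][value] += 1

    def classify(row):
        best_class, best_p = None, 0.0
        total = len(labels)
        for label, count in class_counts.items():
            p = count / total
            for i, value in enumerate(row):
                seen = cond[(i, label)]
                # Laplace smoothing so unseen values never zero out the product
                p *= (seen[value] + 1) / (count + len(seen) + 1)
            if p > best_p:
                best_class, best_p = label, p
        return best_class

    return classify

# Hypothetical customer records: (segment, usage level) -> retention outcome.
rows = [("A", "low"), ("A", "high"), ("B", "low"), ("B", "high")]
labels = ["churn", "stay", "churn", "stay"]
classify = train_naive_bayes(rows, labels)
```

A rule-extraction step in the spirit of the thesis could then read conditional statements such as "usage = low implies churn" off the trained counts.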
382

Computação Evolutiva para a Construção de Regras de Conhecimento com Propriedades Específicas / Evolutionary Computing for Knowledge Rule Construction with Specific Properties

Pila, Adriano Donizete 12 April 2007 (has links)
Most symbolic machine learning approaches use if-then knowledge rules as the description language in which the learned knowledge is expressed. The aim of these learners is to find a set of classification rules that can be used to predict the class of new instances not previously seen by the learner. However, such learners deal with the rule interaction problem: they evaluate the quality of the induced rule set (the classifier) as a whole, rather than evaluating the quality of each rule independently. Thus, as classifiers aim at good accuracy on unseen instances, they tend to neglect other desirable properties of knowledge rules, such as the ability to surprise or bring new knowledge to the domain specialist. In this work, we are interested in building knowledge rules with specific properties in isolation, i.e. without considering the rule interaction problem. To this end, we propose an evolutionary approach in which each individual of the population represents a single rule, and the specific properties are encoded as rule quality measures that can be freely selected by the domain specialist to construct rules with the desired properties. The proposed evolutionary algorithm uses a rich structure for individual representation, which makes it possible to consider a great variety of evolutionary operators. The algorithm uses a ranking-based multi-objective fitness function that considers more than one rule evaluation measure concomitantly, folding them into a single objective. As experimentation plays an important role in this sort of work, to evaluate our proposal we implemented the Evolutionary Computing Learning Environment (ECLE), a class library for running and evaluating the evolutionary algorithm under different scenarios. Furthermore, ECLE was implemented with future development of new evolutionary operators in mind. ECLE is integrated into the DISCOVER project, a research project under development in our laboratory for automatic knowledge acquisition and analysis. Experimental analyses of the evolutionary algorithm for constructing knowledge rules with specific properties, which can also be considered a form of intelligent data analysis, were carried out using ECLE. The results show the suitability of our proposal.
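The ranking-based fitness idea described above (each individual is a single rule; several quality measures are folded into one objective via ranks) can be sketched as follows; the toy dataset, the two quality measures and the mutation scheme are simplified stand-ins, not ECLE's actual implementation:

```python
import random

# Toy dataset: each example is ({attribute: value}, class label); illustrative only.
DATA = [({"color": "red", "size": "big"}, "pos"),
        ({"color": "red", "size": "small"}, "pos"),
        ({"color": "blue", "size": "big"}, "neg"),
        ({"color": "blue", "size": "small"}, "neg")]
ATTRS = {"color": ["red", "blue"], "size": ["big", "small"]}
POS_TOTAL = sum(1 for _, label in DATA if label == "pos")

def measures(rule):
    """Two rule quality measures for 'attr = value -> pos': precision and recall."""
    attr, value = rule
    covered = [label for example, label in DATA if example[attr] == value]
    precision = covered.count("pos") / len(covered) if covered else 0.0
    recall = covered.count("pos") / POS_TOTAL
    return precision, recall

def rank_fitness(population):
    """Fold several measures into one objective by summing per-measure ranks."""
    scores = {rule: measures(rule) for rule in population}
    fitness = {rule: 0 for rule in population}
    for m in range(2):
        ordered = sorted(population, key=lambda rule: scores[rule][m])
        for rank, rule in enumerate(ordered):
            fitness[rule] += rank  # a higher rank on every measure means fitter
    return fitness

def evolve(generations=30, seed=0):
    """Evolve single rules in isolation, mutating the weakest individual."""
    rng = random.Random(seed)
    population = [(attr, rng.choice(values))
                  for attr, values in ATTRS.items() for _ in range(2)]
    for _ in range(generations):
        fitness = rank_fitness(population)
        population.sort(key=lambda rule: fitness[rule], reverse=True)
        attr, _ = population[-1]                          # weakest rule
        population[-1] = (attr, rng.choice(ATTRS[attr]))  # re-draw its value
    return population[0]
```

Each individual is evaluated on its own merits, never as part of a classifier, which is the point the abstract makes about avoiding the rule interaction problem.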
383

Hypothesis-free detection of genome-changing events in pedigree sequencing

Garimella, Kiran January 2016 (has links)
In high-diversity populations, a complete accounting of de novo mutations can be difficult to obtain. Most analyses involve identifying such mutations by sequencing pedigrees on second-generation sequencing platforms and aligning the short reads to a reference assembly, the genomic sequence of a canonical member (or members) of a species. Often, large regions of the genomes under study may be greatly diverged from the reference sequence, or not represented at all (e.g. the HLA, antigenic genes, or other regions under balancing selective pressure). If the haplotypic background upon which a mutation occurs is absent, events can easily be missed (as reads have nowhere to align) and false-positives may abound (as the software forces the reads to align elsewhere). This thesis presents a novel method for de novo mutation discovery and allele identification. Rather than relying on alignment, our method is based on the de novo assembly of short-read sequence data using a multi-color de Bruijn graph. In this data structure, each sample is assigned a unique index (or "color"), reads from each sample are decomposed into smaller subsequences of length k (or "kmers"), and color-specific adjacency information between kmers is recorded. Mutations can be discovered in the graph itself by searching for characteristic motifs (e.g. "bubble motifs", indicative of a SNP or indel, and "linear motifs", indicative of allelic and non-allelic recombination). De novo mutations differ from inherited mutations in that the kmers spanning the variant allele are absent in the parents; in a sense, they facilitate their own discovery by generating "novel" sequence. We exploit this fact to limit processing of the graph to only those regions containing these novel kmers. We verified our approach using simulations, validation, and visualization.
On the simulations, we developed genome and read generation software driven by empirical distributions computed from real data to emit genomes with realistic features: recombinations, de novo variants, read fragment sizes, sequencing errors, and coverage profiles. In 20 artificial samples, we determined our sensitivity and specificity for novel kmer recovery to be approximately 98% and 100% at worst, respectively. Not every novel stretch can be reconstituted as a variant, owing to errors and homology in the graph. In simulations, our false discovery rate was 10% for "bubble" events and 12% for "linear" events. On validation, we obtained a high-quality draft assembly for a single P. falciparum child using a third-generation sequencing platform. We discovered three de novo events in the draft assembly, all three of which are recapitulated in our calls on the second-generation sequencing data for the same sample; no false-positives are present. On visualization, we developed an interactive web application capable of rendering a multi-color subgraph that assists in visually distinguishing between true variation and sequencing artifacts. We applied our caller to real datasets: 115 progeny across four previously analyzed experimental crosses of Plasmodium falciparum. We demonstrate our ability to access subtelomeric compartments of the genome, regions harboring antigenic genes under tremendous selective pressure, thus highly divergent between geographically distinct isolates and routinely masked and ignored in reference-based analyses. We also show our caller's ability to recover an important form of structural de novo variation: non-allelic homologous recombination (NAHR) events, an important mechanism for the pathogen to diversify its own antigenic repertoire. We demonstrate our ability to recover the few events in these samples known to exist, and overturn some previous findings indicating exchanges between "core" (non-subtelomeric) genes.
We compute the SNP mutation rate to be approximately 2.91 per sample, insertion and deletion mutation rates to be 0.55 and 1.04 per sample, respectively, multi-nucleotide polymorphisms to be 0.72 per sample, and NAHR events to be 0.33 per sample. These findings are consistent across crosses. Finally, we investigated our method's scaling capabilities by processing a quintet of previously analyzed Pan troglodytes verus (western chimpanzee) samples. The genome of the chimpanzee is two orders of magnitude larger than the malaria parasite's (3,300 Mbp versus 23 Mbp), diploid rather than haploid, poorly assembled, and the read dataset is lower coverage (20x versus 120x). Comparing to Sequenom validation data as well as visual validation, our sensitivity is expectedly low. However, this can be attributed to overaggressiveness in data cleaning applied by the de novo assembler atop which our software is built. We discuss the precise changes that would likely need to be made in future work to adapt our method to low-coverage samples.
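The central observation in this abstract, that a de novo mutation generates kmers present in the child but absent from both parents, can be sketched as follows; the sequences and the choice of k are illustrative only, not data from the thesis:

```python
def kmers(seq, k):
    """Decompose a sequence into its set of overlapping length-k substrings."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def novel_kmers(child, parents, k):
    """Kmers in the child that no parent contains: candidate de novo signatures."""
    inherited = set().union(*(kmers(p, k) for p in parents))
    return kmers(child, k) - inherited

# Toy trio: the child carries a single-base change (G -> T) at position 6.
mother = "ACGTACGGACGT"
father = "ACGTACGGACGT"
child  = "ACGTACTGACGT"

print(sorted(novel_kmers(child, [mother, father], k=5)))
```

In a multi-color de Bruijn graph the same set difference is computed per color, so graph processing can be restricted to the regions containing these novel kmers, as the abstract describes.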
384

A computational model of Lakatos-style reasoning

Pease, Alison January 2007 (has links)
Lakatos outlined a theory of mathematical discovery and justification, which suggests ways in which concepts, conjectures and proofs gradually evolve via interaction between mathematicians. Different mathematicians may have different interpretations of a conjecture, examples or counterexamples of it, and beliefs regarding its value or theoremhood. Through discussion, concepts are refined and conjectures and proofs modified. We hypothesise that: (i) it is possible to computationally represent Lakatos's theory, and (ii) it is useful to do so. In order to test our hypotheses we have developed a computational model of his theory. Our model is a multiagent dialogue system. Each agent has a copy of a pre-existing theory formation system, which can form concepts and make conjectures which empirically hold for the objects of interest supplied. Distributing the objects of interest between agents means that they form different theories, which they communicate to each other. Agents then find counterexamples and use methods identified by Lakatos to suggest modifications to conjectures, concept definitions and proofs. Our main aim is to provide a computational reading of Lakatos's theory, by interpreting it as a series of algorithms and implementing these algorithms as a computer program. This is the first systematic automated realisation of Lakatos's theory. We contribute to the computational philosophy of science by interpreting, clarifying and extending his theory. We also contribute by evaluating his theory, using our model to test hypotheses about it, and evaluating our extended computational theory on the basis of criteria proposed by several theorists. A further contribution is to automated theory formation and automated theorem proving. The process of refining conjectures, proofs and concept definitions requires a flexibility which is inherently useful in fields which handle ill-specified problems, such as theory formation. 
Similarly, the ability to automatically modify an open conjecture into one which can be proved is a valuable contribution to automated theorem proving.
385

Mining Oncology Data: Knowledge Discovery in Clinical Performance of Cancer Patients

Hayward, John T 16 August 2006 (has links)
Our goal in this research is twofold: to develop clinical performance databases of cancer patients, and to conduct data mining and machine learning studies on collected patient records. We use these studies to develop models for predicting cancer patient medical outcomes. The clinical database is developed in conjunction with surgeons and oncologists at UMass Memorial Hospital. Aspects of the database design and representation of patient narrative are discussed here. Current predictive model design in medical literature is dominated by linear and logistic regression techniques. We seek to show that novel machine learning methods can perform as well or better than these traditional techniques. Our machine learning focus for this thesis is on pancreatic cancer patients. Classification and regression prediction targets include patient survival, wellbeing scores, and disease characteristics. Information research in oncology is often constrained by type variation, missing attributes, high dimensionality, skewed class distribution, and small data sets. We compensate for these difficulties using preprocessing, meta-learning, and other algorithmic methods during data analysis. The predictive accuracy and regression error of various machine learning models are presented as results, as are t-tests comparing these to the accuracy of traditional regression methods. In most cases, it is shown that the novel machine learning prediction methods offer comparable or superior performance. We conclude with an analysis of results and discussion of future research possibilities.
386

Flexibilizando graus de colaboração, segurança e privacidade na descoberta de serviços / Flexible collaboration, security and privacy in service discovery systems

Moschetta, Eduardo 28 February 2008 (has links)
This work presents Flexible Secure Service Discovery (FSSD), a protocol for service discovery in ubiquitous systems. Its design centres on the trade-off between the levels of collaboration, security and privacy that participants desire during discovery. The proposed approach provides trust management, in addition to decentralized mechanisms to control exposure of, and access to, service information. The protocol's properties were evaluated through simulation, varying the system's security and privacy levels to demonstrate that the proposed approach adequately handles the trade-off with respect to peer collaboration.
387

Establishing C. elegans as a high-throughput system for the identification of novel therapeutic strategies for Parkinson's disease

Perni, Michele January 2017 (has links)
No description available.
388

Fragment synthesis : pharmacophore and diversity oriented approaches

North, Andrew James Peter January 2019 (has links)
This thesis explores two approaches to fragment-based drug discovery. First, the protein target CK2 was chosen for its importance in the cancer phenotype. A literature fragment, NMR154L, proved to be a promising compound for fragment development because it binds at the interface site of the protein rather than the highly conserved ATP pocket. Analogues of this fragment were synthesised, leading to a candidate with an improved IC50. Additionally, computer modelling of the interface site suggested that a series of spirocyclic compounds would inhibit the protein. These were synthesised and tested in vitro. The results were analysed and, with the aid of crystal structures and computer modelling, informed the synthesis of new inhibitors. Secondly, to address the lack of spirocyclic scaffolds in fragment screening libraries, a number of diversity-oriented synthetic campaigns were undertaken. The first of these used glycine as the starting material: two terminal alkenes were installed, the alkenes were linked, and the amino and acidic residues were cyclised, allowing a diverse range of spirocyclic scaffolds to be formed from this one starting material. Having established chemistry for linking amino and acidic residues, a campaign with dehydroalanine was undertaken, allowing installation of the second ring by pericyclic chemistry as well as by the chemistry previously established. This pericyclic chemistry was also applied to synthesising spirocycles from rings with exocyclic double bonds; these are readily installed by Wittig chemistry, which allowed the use of starting materials containing a cyclic ketone. Of these, azetidinone was a good candidate: it is a commercially available building block and gives access to spirocycles containing a 4-membered ring, an underrepresented ring size. Finally, computational analysis was carried out on the library to assess its diversity and identify potential biological targets that these fragments may inhibit.
389

A visual analytics approach for passing strategies analysis in soccer using geometric features

Malqui, José Luis Sotomayor January 2017 (has links)
Passing strategy analysis has always been of interest in soccer research. Since the beginnings of the game, managers have used scouting, video footage, training drills and data feeds to collect information about tactics and player performance. However, the dynamic nature of passing strategies is complex enough to make it hard to understand what is happening on the pitch. Furthermore, there is a growing demand for pattern detection and passing sequence analysis, popularized by FC Barcelona's tiki-taka. We propose an approach that abstracts passing sequences and groups them based on the geometry of the ball trajectory. To analyse passing strategies, we introduce an interactive visualization scheme for exploring the frequency of use, spatial location and time of occurrence of the sequences. The Frequency Stripes visualization provides an overview of passing-group frequency in three pitch regions: defense, middle and attack. A trajectory heatmap coordinated with a passing timeline allows exploration of the most recurrent passing shapes in the spatial and temporal domains. The results show eight common ball trajectories for three-pass sequences, which depend on player positioning and on the angle of the pass. We demonstrate the potential of our approach with data from several matches of the Brazilian league under different case studies, and report feedback from a soccer expert.
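One simple geometric feature for grouping passing sequences by ball-trajectory shape, the turning angle between consecutive pass vectors, can be sketched as follows; the coordinates and the feature choice are illustrative, not the thesis's actual feature set:

```python
import math

def turning_angles(points):
    """Signed turning angle (degrees) between consecutive pass vectors."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a = math.atan2(y1 - y0, x1 - x0)  # heading of the first pass
        b = math.atan2(y2 - y1, x2 - x1)  # heading of the second pass
        d = math.degrees(b - a)
        angles.append((d + 180.0) % 360.0 - 180.0)  # normalize to [-180, 180)
    return angles

# A three-pass sequence is four ball positions; coordinates are hypothetical metres.
straight = [(0, 0), (10, 0), (20, 0), (30, 0)]    # passes along one line, no turns
zigzag   = [(0, 0), (10, 10), (20, 0), (30, 10)]  # alternating diagonal passes
```

Sequences with similar angle profiles (e.g. two near-zero turns versus alternating sharp turns) would then fall into the same trajectory group.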
390

Förmedling av discovery-verktyg vid högskole- och universitetsbibliotek : En enkätstudie om undervisande bibliotekariers inställningar till discovery-verktyg och hur de förmedlar dessa vid referenssamtal och användarundervisning / Mediating discovery tools in higher education libraries : A survey of instructing librarians' attitudes towards discovery tools and how they mediate these through reference interviews and user training

Hannerz, Einar, Wiborgh, Mika January 2014 (has links)
The purpose of this thesis is to examine how Swedish higher education librarians mediate discovery tools to their users. This study aims to investigate higher education librarians' general attitudes towards discovery tools, their perception of students' discovery tool usage, and how they mediate discovery tools to students through reference interviews and user training. The empirical ground of this study is a semi-structured survey that was answered by 115 instructing librarians. The study concludes that although librarians generally have a critical attitude towards discovery tools, they also think that the discovery tools serve a useful purpose, especially as a starting point in the information search process. The study also concludes that librarians generally perceive students' attitudes towards discovery tools as positive, although students do not always use the tools to their full potential. The librarians also raised the importance of user training that is less focused on teaching search techniques and more focused on information literacy. / Program: Bibliotekarie
