Spelling suggestions: "subject:"graphbased"" "subject:"graphsbased""
81 |
Agrupamento de sequências de miRNA utilizando aprendizado não-supervisionado baseado em grafosKasahara, Viviani Akemi 12 August 2016 (has links)
Submitted by Izabel Franco (izabel-franco@ufscar.br) on 2016-10-11T17:36:54Z
No. of bitstreams: 1
DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-21T13:03:21Z (GMT) No. of bitstreams: 1
DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-10-21T13:03:27Z (GMT) No. of bitstreams: 1
DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5) / Made available in DSpace on 2016-10-21T13:03:34Z (GMT). No. of bitstreams: 1
DissVAK.pdf: 4608619 bytes, checksum: 3022034b9035e4e8caf1195902d24581 (MD5)
Previous issue date: 2016-08-12 / Não recebi financiamento / Cluster analysis is the organization of a collection of patterns into clusters based on similarity
which is determined by using properties of data. Clustering techniques can be useful in a
variety of knowledge domains such as biotechnology, computer vision, document retrieval and
many others. An interesting area of biology involves the concept of microRNAs (miRNAs) that
are approximately 22 nucleotide-long non-coding RNA molecules that play important roles in
gene regulation. Clustering miRNA sequences can help to understand and explore sequences
belonging to the same cluster that has similar biological functions. This research work
investigates and explores seven unsupervised clustering algorithms based on graphs that can be
divided into three categories: algorithm based on region of influence, algorithm based on
minimum spanning tree and spectral algorithm. To assess the contribution of the proposed
algorithms, data from miRNA families stored in the online miRBase database were used in the
conducted experiments. The results of these experiments were presented, analysed and
evaluated using clustering validation indexes as well as visual analysis. / A análise de agrupamento é uma organização de coleção de padrões em grupos, baseando-se na
similaridade das propriedades pertencentes aos dados. A técnica de agrupamento pode ser
utilizado em muitas áreas de conhecimento como biotecnologia, visão computacional,
recuperação de documentos, entre outras. Uma área interessante da biologia envolve o conceito
de microRNAs (miRNAs), que são moléculas não-codificadas de RNA com aproximadamente
22 nucleotídeos e que desempenham um papel importante na regulação dos genes. O
agrupamento de sequências de miRNA podem ajudar em sua exploração e entendimento, pois
as sequências que pertencem ao mesmo grupo possuem uma função biológica similar. Esse
trabalho explora e investiga sete algoritmos de agrupamentos não-supervisionados baseados em
grafos que podem ser divididos em três categorias: algoritmos baseados em região de
influência, algoritmos baseados em árvore spanning minimal e algoritmo espectral. Para avaliar
a contribuição dos algoritmos propostos, os experimentos conduzidos utilizaram os dados das
famílias de miRNAs disponíveis no banco de dados denominado miRBase. Os resultados dos
experimentos foram apresentados, analisados e avaliados usando índices de validação de
agrupamento e análise visual.
|
82 |
A Framework for Secure Structural AdaptationSaman Nariman, Goran January 2018 (has links)
A (self-) adaptive system is a system that can dynamically adapt its behavior or structure during execution to "adapt" to changes to its environment or the system itself. From a security standpoint, there has been some research pertaining to (self-) adaptive systems in general but not enough care has been shown towards the adaptation itself. Security of systems can be reasoned about using threat models to discover security issues in the system. Essentially that entails abstracting away details not relevant to the security of the system in order to focus on the important aspects related to security. Threat models often enable us to reason about the security of a system quantitatively using security metrics. The structural adaptation process of a (self-) adaptive system occurs based on a reconfiguration plan, a set of steps to follow from the initial state (configuration) to the final state. Usually, the reconfiguration plan consists of multiple strategies for the structural adaptation process and each strategy consists of several steps steps with each step representing a specific configuration of the (self-) adaptive system. Different reconfiguration strategies have different security levels as each strategy consists of a different sequence configuration with different security levels. To the best of our knowledge, there exist no approaches which aim to guide the reconfiguration process in order to select the most secure available reconfiguration strategy, and the explicit security of the issues associated with the structural reconfiguration process itself has not been studied. In this work, based on an in-depth literature survey, we aim to propose several metrics to measure the security of configurations, reconfiguration strategies and reconfiguration plans based on graph-based threat models. Additionally, we have implemented a prototype to demonstrate our approach and automate the process. Finally, we have evaluated our approach based on a case study of our making. The preliminary results tend to expose certain security issues during the structural adaptation process and exhibit the effectiveness of our proposed metrics.
|
83 |
Software Fault Detection in Telecom Networks using Bi-level Federated Graph Neural Networks / Upptäckt av SW-fel i telekommunikationsnätverk med hjälp av federerade grafiska neurala nätverk på två nivåerBourgerie, Rémi January 2023 (has links)
The increasing complexity of telecom networks, induced by the recent development of 5G, is a challenge for detecting faults in the telecom network. In addition to the structural complexity of telecommunication systems, data accessibility has become an issue both in terms of privacy and access cost. We propose a method relying on bi-level Federated Graph Neural Networks to identify anomalies in the telecom network while ensuring reduced communication costs as well as data privacy. Our method considers telecom data as a bi-level graph, where the highest level graph represents the interaction between sites, and each site is further expanded to its software (SW) performance behaviour graph. We developed and compared 4G/5G SW Fault Detection models under 3 settings: (1) Centralized Temporal Graph Neural Networks model: we propose a model to detect anomalies in 4G/5G telecom data. (2) Federated Temporal Graph Neural Networks model: we propose Federated Learning (FL) as a mechanism for privacy-aware training of models for fault detection. (3) Personalized Federated Temporal Graph Neural Networks model: we propose a novel aggregation technique, referred to as FedGraph, leveraging both a graph and the similarities between sites for aggregating the models and proposing models more personalized to each site’s behaviour. We compare the benefits of Federated Learning (FL) models (2) and (3) with centralized training (1) in terms of SW performance data modelling, anomaly detection, and communication cost. The evaluation includes both a scenario with normal functioning sites and a scenario where only a subset of sites exhibit faulty behaviour. The combination of SW execution graphs with GNNs has shown improved modelling performance and minor gains in centralized settings (1). In a normal network context, FL models (2) and (3) perform comparably to centralized training (CL), with slight improvements observed when using the personalized strategy (3). However, in abnormal network scenarios, Federated Learning falls short of achieving comparable detection performance to centralized training. This is due to the unintended learning of abnormal site behaviour, particularly when employing the personalized model (3). These findings highlight the importance of carefully assessing and selecting suitable FL strategies for anomaly detection and model training on telecom network data. / Den ökande komplexiteten i telenäten, som är en följd av den senaste utvecklingen av 5G, är en utmaning när det gäller att upptäcka fel i telenäten. Förutom den strukturella komplexiteten i telekommunikationssystem har datatillgänglighet blivit ett problem både när det gäller integritet och åtkomstkostnader. Vi föreslår en metod som bygger på Federated Graph Neural Networks på två nivåer för att identifiera avvikelser i telenätet och samtidigt säkerställa minskade kommunikationskostnader samt dataintegritet. Vår metod betraktar telekomdata som en graf på två nivåer, där grafen på den högsta nivån representerar interaktionen mellan webbplatser, och varje webbplats utvidgas ytterligare till sin graf för programvarans (SW) prestandabeteende. Vi utvecklade och jämförde 4G/5G SW-feldetekteringsmodeller under 3 inställningar: (1) Central Temporal Graph Neural Networks-modell: vi föreslår en modell för att upptäcka avvikelser i 4G/5G-telekomdata. (2) Federated Temporal Graph Neural Networks-modell: vi föreslår Federated Learning (FL) som en mekanism för integritetsmedveten utbildning av modeller för feldetektering. I motsats till centraliserad inlärning aggregeras lokalt tränade modeller på serversidan och skickas tillbaka till klienterna utan att data läcker ut mellan klienterna och servern, vilket säkerställer integritetsskyddande samarbetsutbildning. (3) Personaliserad Federated Temporal Graph Neural Networks-modell: vi föreslår en ny aggregeringsteknik, kallad FedGraph, som utnyttjar både en graf och likheterna mellan webbplatser för att aggregera modellerna. Vi jämför fördelarna med modellerna Federated Learning (FL) (2) och (3) med centraliserad utbildning (1) när det gäller datamodellering av SW-prestanda, anomalidetektering och kommunikationskostnader. Utvärderingen omfattar både ett scenario med normalt fungerande anläggningar och ett scenario där endast en delmängd av anläggningarna uppvisar felaktigt beteende. Kombinationen av SW-exekveringsgrafer med GNN har visat förbättrad modelleringsprestanda och mindre vinster i centraliserade inställningar (1). I en normal nätverkskontext presterar FL-modellerna (2) och (3) jämförbart med centraliserad träning (CL), med små förbättringar observerade när den personliga strategin används (3). I onormala nätverksscenarier kan Federated Learning dock inte uppnå jämförbar detekteringsprestanda med centraliserad träning. Detta beror på oavsiktlig inlärning av onormalt beteende på webbplatsen, särskilt när man använder den personliga modellen (3). Dessa resultat belyser vikten av att noggrant bedöma och välja lämpliga FL-strategier för anomalidetektering och modellträning på telekomnätdata.
|
84 |
Prioritizing Causative Genomic Variants by Integrating Molecular and Functional Annotations from Multiple Biomedical OntologiesAlthagafi, Azza Th. 20 July 2023 (has links)
Whole-exome and genome sequencing are widely used to diagnose individual patients. However, despite its success, this approach leaves many patients undiagnosed. This could be due to the need to discover more disease genes and variants or because disease phenotypes are novel and arise from a combination of variants of multiple known genes related to the disease. Recent rapid increases in available genomic, biomedical, and phenotypic data enable computational analyses, reducing the search space for disease-causing genes or variants and facilitating the prediction of causal variants. Therefore, artificial intelligence, data mining, machine learning, and deep learning are essential tools that have been used to identify biological interactions, including protein-protein interactions, gene-disease predictions, and variant--disease associations. Predicting these biological associations is a critical step in diagnosing patients with rare or complex diseases.
In recent years, computational methods have emerged to improve gene-disease prioritization by incorporating phenotype information. These methods evaluate a patient's phenotype against a database of gene-phenotype associations to identify the closest match. However, inadequate knowledge of phenotypes linked with specific genes in humans and model organisms limits the effectiveness of the prediction. Information about gene product functions and anatomical locations of gene expression is accessible for many genes and can be associated with phenotypes through ontologies and machine-learning models. Incorporating this information can enhance gene-disease prioritization methods and more accurately identify potential disease-causing genes.
This dissertation aims to address key limitations in gene-disease prediction and variant prioritization by developing computational methods that systematically relate human phenotypes that arise as a consequence of the loss or change of gene function to gene functions and anatomical and cellular locations of activity. To achieve this objective, this work focuses on crucial problems in the causative variant prioritization pipeline and presents novel computational methods that significantly improve prediction performance by leveraging large background knowledge data and integrating multiple techniques.
Therefore, this dissertation presents novel approaches that utilize graph-based machine-learning techniques to leverage biomedical ontologies and linked biological data as background knowledge graphs. The methods employ representation learning with knowledge graphs and introduce generic models that address computational problems in gene-disease associations and variant prioritization. I demonstrate that my approach is capable of compensating for incomplete information in public databases and efficiently integrating with other biomedical data for similar prediction tasks. Moreover, my methods outperform other relevant approaches that rely on manually crafted features and laborious pre-processing. I systematically evaluate our methods and illustrate their potential applications for data analytics in biomedicine. Finally, I demonstrate how our prediction tools can be used in the clinic to assist geneticists in decision-making. In summary, this dissertation contributes to the development of more effective methods for predicting disease-causing variants and advancing precision medicine.
|
Page generated in 0.0688 seconds