Global ETD Search

1	Novel Semi-Supervised Learning Models to Balance Data Inclusivity and Usability in Healthcare Applications January 2019 (has links) abstract: Semi-supervised learning (SSL) is sub-field of statistical machine learning that is useful for problems that involve having only a few labeled instances with predictor (X) and target (Y) information, and abundance of unlabeled instances that only have predictor (X) information. SSL harnesses the target information available in the limited labeled data, as well as the information in the abundant unlabeled data to build strong predictive models. However, not all the included information is useful. For example, some features may correspond to noise and including them will hurt the predictive model performance. Additionally, some instances may not be as relevant to model building and their inclusion will increase training time and potentially hurt the model performance. The objective of this research is to develop novel SSL models to balance data inclusivity and usability. My dissertation research focuses on applications of SSL in healthcare, driven by problems in brain cancer radiomics, migraine imaging, and Parkinson’s Disease telemonitoring. The first topic introduces an integration of machine learning (ML) and a mechanistic model (PI) to develop an SSL model applied to predicting cell density of glioblastoma brain cancer using multi-parametric medical images. The proposed ML-PI hybrid model integrates imaging information from unbiopsied regions of the brain as well as underlying biological knowledge from the mechanistic model to predict spatial tumor density in the brain. The second topic develops a multi-modality imaging-based diagnostic decision support system (MMI-DDS). MMI-DDS consists of modality-wise principal components analysis to incorporate imaging features at different aggregation levels (e.g., voxel-wise, connectivity-based, etc.), a constrained particle swarm optimization (cPSO) feature selection algorithm, and a clinical utility engine that utilizes inverse operators on chosen principal components for white-box classification models. The final topic develops a new SSL regression model with integrated feature and instance selection called s2SSL (with “s2” referring to selection in two different ways: feature and instance). s2SSL integrates cPSO feature selection and graph-based instance selection to simultaneously choose the optimal features and instances and build accurate models for continuous prediction. s2SSL was applied to smartphone-based telemonitoring of Parkinson’s Disease patients. / Dissertation/Thesis / Doctoral Dissertation Industrial Engineering 2019 Industrial engineering Biomedical engineering Bioinformatics glioblastoma graph sampling migraine particle swarm optimization semi-supervised learning telemonitoring
2	Large-Scale Graph Visual Analytics Zhang, Fangyan 08 December 2017 (has links) Large-scale graph analysis and visualization is becoming a more challenging task, due to the increasing amount of graph data. This dissertation focuses on methods to ease the task of exploring large-scale graphs through graph sampling and visualization. Graph sampling aims to reduce the complexity of graph drawing, while preserving properties of the original graph, allowing analysis of the smaller sample which yields the characteristics similar to those of the original graph. Graph visualization is an effective and intuitive approach to observing structures within graph data. For large-scale graphs, graph sampling and visualization are straightforward strategies to gain insights into common issues that are often encountered. This dissertation evaluates commonly used graph sampling methods through a combined visual and statistical comparison of graphs sampled at various rates based on random graphs, small-world graphs, scaleree graphs, and real-world graphs. This benchmark study can be used as a guideline in choosing the appropriate method for a particular graph sampling task. In addition, this thesis proposes three types of distributed sampling algorithms and develops a sampling package on Spark. Compared with traditional/non-distributed graph sampling approaches, the scalable distributed sampling approaches are as reliable as the traditional/non-distributed graph sampling techniques, and they bring much needed improvement to sampling efficiency, especially with regards to topology-based sampling. This benchmark study in traditional/non-distributed graph sampling is also applicable to distributed graph sampling as well. A contribution to the area of graph visualization is also made through the presentation of a scalable graph visualization system-BGS (Big Graph Surfer) that creates hierarchical structure from an original graph and provides interactive navigation along the hierarchy by expanding or collapsing clusters when visualizing large-scale graphs. A distributed computing framework-Spark provides the backend for BGS on clustering and visualization. This architecture makes it capable of visualizing a graph up to 1 billion nodes or edges in real-time. In addition, BGS provides a series of hierarchy and graph exploration methods, such as hierarchy view, hierarchy navigation, hierarchy search, graph view, graph navigation, graph search, and other useful interactions. These functionalities facilitate the exploration of very large-scale graphs. Evaluation of BGS is performed through application to several representative of large-scale graph datasets and comparison with other existing graph visualization tools in scalability, usability, and flexibility. The dissertation concludes with a summarization of the contributions and their improvement on large-scale graph analysis and visualization, and a discussion about possible future work on this research field. large-scale graph graph sampling graph property graph visualization graph clustering graph hierarchy
3	Comparison study on graph sampling algorithms for interactive visualizations of large-scale networks Voroshilova, Alexandra January 2019 (has links) Networks are present in computer science, sociology, biology, and neuroscience as well as in applied fields such as transportation, communication, medical industries. The growing volumes of data collection are pushing scalability and performance requirements on graph algorithms, and at the same time, a need for a deeper understanding of these structures through visualization arises. Network diagrams or graph drawings can facilitate the understanding of data, making intuitive the identification of the largest clusters, the number of connected components, the overall structure, and detecting anomalies, which is not achievable through textual or matrix representations. The aim of this study was to evaluate approaches that would enable visualization of a large scale peer-to-peer video live streaming networks. The visualization of such large scale graphs has technical limitations which can be overcome by filtering important structural data from the networks. In this study, four sampling algorithms for graph reduction were applied to large overlay peer-to-peer network graphs and compared. The four algorithms cover different approaches: selecting links with the highest weight, selecting nodes with the highest cumulative weight, using betweenness centrality metrics, and constructing a focus-based tree. Through the evaluation process, it was discovered that the algorithm based on betweenness centrality approximation offers the best results. Finally, for each of the algorithms in comparison, their resulting sampled graphs were visualized using a forcedirected layout with a 2-step loading approach to depict their effect on the representation of the graphs. / Nätverk återfinns inom datavetenskap, sociologi, biologi och neurovetenskap samt inom tillämpade områden så som transport, kommunikation och inom medicinindustrin. Den växande mängden datainsamling pressar skalbarheten och prestandakraven på grafalgoritmer, samtidigt som det uppstår ett behov av en djupare förståelse av dessa strukturer genom visualisering. Nätverksdiagram eller grafritningar kan underlätta förståelsen av data, identifiera de största grupperna, ett antal anslutna komponenter, visa en övergripande struktur och upptäcka avvikelser, något som inte kan uppnås med texteller matrisrepresentationer. Syftet med denna studie var att utvärdera tillvägagångssätt som kunde möjliggöra visualisering av ett omfattande P2P (peer-to-peer) livestreamingnätverk. Visualiseringen av större grafer har tekniska begränsningar, något som kan lösas genom att samla viktiga strukturella data från nätverken. I den här studien applicerades fyra provtagningsalgoritmer för grafreduktion på stora överlagringar av P2P-nätverksgrafer för att sedan jämföras. De fyra algoritmerna är baserade på val av länkar med högsta vikt, av nodar med högsta kumulativa vikt, betweenness-centralitetsvärden för att konstruera ett fokusbaserat träd som har de längsta vägarna uteslutna. Under utvärderingsprocessen upptäcktes det att algoritmen baserad på betweenness-centralitetstillnärmning visade de bästa resultaten. Dessutom, för varje algoritm i jämförelsen, visualiserades deras slutliga samplade grafer genom att använda en kraftstyrd layout med ett 2-stegs laddningsinfart. Graph sampling graph filtering large graph visualization grafreduktion stor graf visualisering Computer and Information Sciences Data- och informationsvetenskap

1

Page generated in 0.0636 seconds