451

ANALYSIS OF CONTINUOUS LEARNING MODELS FOR TRAJECTORY REPRESENTATION

Kendal Graham Norman (15344170) 24 April 2023 (has links)
Trajectory planning is a field with widespread utility, and imitation learning pipelines show promise as an accessible training method for trajectory planning. MPNet is the state of the art for imitation learning with respect to success rates. MPNet has two general components at runtime: a neural network predicts the location of the next anchor point in a trajectory, and planning infrastructure then applies sampling-based techniques to produce near-optimal, collision-free paths. This distinction between the two parts of MPNet prompts investigation into the role of the neural architectures in the Neural Motion Planning pipeline, to discover where improvements can be made. This thesis explores the importance of neural architecture choice by removing the planning structures and comparing MPNet's feedforward anchor-point predictor with a continuous model trained to output a continuous trajectory from start to goal. A new state-of-the-art model in continuous learning is the Neural Flow model. As a continuous model, it possesses a runtime with low standard deviation, which can be properly leveraged in the absence of planning infrastructure. Neural Flows also output smooth, continuous trajectory curves that reduce noisy path outputs in the absence of lazy vertex contraction. This project analyzes the performance of MPNet, ResNet Flow, and Coupling Flow models when sampling-based planning tools such as dropout, lazy vertex contraction, and replanning are removed. For purposes of comparison, each neural planner is trained end-to-end in an imitation learning pipeline using a simple feedforward encoder, a CNN-based encoder, and a PointNet encoder to encode the environment. Results indicate that performance is competitive: Neural Flows slightly outperform MPNet's success rates on our reduced dataset in Simple2D, and are slightly outperformed by MPNet with respect to collision penetration distance in our UR5 Cubby test suite. These results indicate that continuous models can compete with anchor-point predictor models when sampling-based planning techniques are not applied. Neural Flow models also have benefits that anchor-point predictors do not, such as continuity guarantees, smoothness, and the ability to select a proportional location in the trajectory to output.
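
To make the architectural contrast concrete, here is a minimal sketch (PyTorch) of the two model families the abstract compares: a discrete anchor-point predictor queried one step at a time, and a continuous model queried at any proportional location t. All module names, sizes, and shapes are illustrative assumptions, not the thesis' code.

```python
# Minimal sketch: discrete anchor-point prediction vs. continuous trajectory
# querying. Sizes and architectures are toy stand-ins.
import torch
import torch.nn as nn

class AnchorPointPredictor(nn.Module):
    """MPNet-style: predicts the next anchor point from (current, goal)."""
    def __init__(self, state_dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, current, goal):
        return self.net(torch.cat([current, goal], dim=-1))

class ContinuousTrajectoryModel(nn.Module):
    """Flow-style: maps (start, goal, t) to a state, so the trajectory can
    be queried at any proportional location t without iterative rollout."""
    def __init__(self, state_dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, start, goal, t):
        # t has shape (batch, 1); t=0 corresponds to the start of the path.
        return self.net(torch.cat([start, goal, t], dim=-1))

start, goal = torch.zeros(1, 2), torch.ones(1, 2)
next_anchor = AnchorPointPredictor()(start, goal)  # one step at a time
midpoint = ContinuousTrajectoryModel()(start, goal, torch.tensor([[0.5]]))
```
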
452

Generative adversarial networks for single image super resolution in microscopy images

Gawande, Saurabh January 2018 (has links)
Image super-resolution is a widely studied problem in computer vision, where the objective is to convert a low-resolution image to a high-resolution image. Conventional methods for achieving super-resolution, such as image priors, interpolation, and sparse coding, require extensive pre-/post-processing and optimization. Recently, deep learning methods such as convolutional neural networks and generative adversarial networks have been used to perform super-resolution with results competitive with the state of the art, but none of them has been applied to microscopy images. In this thesis, a generative adversarial network, mSRGAN, is proposed for super-resolution with a perceptual loss function consisting of an adversarial loss, mean squared error, and content loss. The objective of our implementation is to learn an end-to-end mapping between the low- and high-resolution images and to optimize the upscaled image for quantitative metrics as well as perceptual quality. We then compare our results with current state-of-the-art methods in super-resolution, conduct a proof-of-concept segmentation study to show that super-resolved images can be used as an effective preprocessing step before segmentation, and validate the findings statistically.
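
As an illustration of the composite perceptual objective the abstract describes (adversarial loss, mean squared error, and content loss), here is a minimal PyTorch sketch. The loss weights and the stand-in feature extractor are assumptions; mSRGAN's actual values and content network may differ.

```python
# Minimal sketch of a perceptual generator loss: adversarial + pixel-wise
# MSE + content (feature-space) terms. Weights are illustrative only.
import torch
import torch.nn as nn

mse = nn.MSELoss()
bce = nn.BCEWithLogitsLoss()

def generator_loss(sr, hr, disc_logits_on_sr, feature_net,
                   w_adv=1e-3, w_mse=1.0, w_content=6e-3):
    # Adversarial term: the generator wants the discriminator to say "real".
    adv = bce(disc_logits_on_sr, torch.ones_like(disc_logits_on_sr))
    # Pixel-wise fidelity between super-resolved and ground-truth images.
    pixel = mse(sr, hr)
    # Content term: distance in a frozen feature space (e.g. VGG features).
    with torch.no_grad():
        target_feats = feature_net(hr)
    content = mse(feature_net(sr), target_feats)
    return w_adv * adv + w_mse * pixel + w_content * content

# Stand-in feature extractor (a pretrained VGG slice in practice).
feature_net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
sr = torch.rand(2, 1, 32, 32, requires_grad=True)
hr = torch.rand(2, 1, 32, 32)
loss = generator_loss(sr, hr, disc_logits_on_sr=torch.randn(2, 1),
                      feature_net=feature_net)
loss.backward()
```
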
453

Opto-Acoustic Slopping Prediction System in Basic Oxygen Furnace Converters

Ghosh, Binayak January 2017 (has links)
Today, everyday objects are becoming more and more intelligent and sometimes even have self-learning capabilities. These self-learning capabilities in particular also act as catalysts for new developments in the steel industry. Technical developments that enhance the sustainability and productivity of steel production are very much in demand in the long term. The methods of Industry 4.0 can support the steel production process in a way that enables steel to be produced in a more cost-effective and environmentally friendly manner. This thesis describes the development of an opto-acoustic system for the early detection of slag slopping in the BOF (Basic Oxygen Furnace) converter process. The prototype has been installed at Salzgitter Stahlwerks, a German steel plant, for initial testing. It consists of an image-monitoring camera at the converter mouth, a sound measurement system, and an oscillation measurement device installed at the blowing lance. The camera signals are processed by dedicated image-processing software and are used to rate the amount of spilled slag and to better interpret both the sound data and the oscillation data. A particular challenge for the opto-acoustic slopping-detection system is that all signals, i.e. optic, acoustic, and vibratory, are affected by process-related parameters that are not always relevant to the slopping event. These uncertainties affect the prediction of the slopping phenomenon and ultimately the reliability of the entire system. Machine learning algorithms have been applied to predict slopping based on the sensor data as well as the other process parameters.
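
A minimal sketch of the final prediction step, assuming the optical, acoustic, and vibration signals have already been reduced to per-heat feature vectors; the feature names, synthetic data, and choice of gradient boosting are illustrative assumptions only.

```python
# Minimal sketch: fusing optical, acoustic, and vibration features into one
# table and training a classifier to flag imminent slopping.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.normal(size=n),  # camera: spilled-slag area estimate
    rng.normal(size=n),  # microphone: band-limited sound level
    rng.normal(size=n),  # lance accelerometer: vibration RMS
    rng.normal(size=n),  # process parameter, e.g. oxygen blowing rate
])
# Synthetic label: here slopping correlates with slag area and vibration.
y = (0.8 * X[:, 0] + 0.6 * X[:, 2] + rng.normal(scale=0.5, size=n) > 1.0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```
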
454

Deep Brain Dynamics and Images Mining for Tumor Detection and Precision Medicine

Lakshmi Ramesh (16637316) 30 August 2023 (has links)
Automatic brain tumor segmentation in Magnetic Resonance Imaging (MRI) scans is essential for the diagnosis, treatment, and surgery of cancerous tumors. However, identifying these hard-to-detect tumors, which usually vary in size, have irregular shapes, and show vague invasion areas, poses a considerable challenge. Current advancements have not yet fully leveraged the dynamics in the multiple modalities of MRI: they usually treat multi-modality as multi-channel, and the early channel merging may not fully reveal inter-modal couplings and complementary patterns. In this thesis, we propose a novel deep cross-attention learning algorithm that maximizes the subtle dynamics mined from each of the input modalities and then boosts feature fusion capability. More specifically, we have designed a Multimodal Cross-Attention Module (MM-CAM), equipped with a 3D Multimodal Feature Rectification and Feature Fusion Module. Extensive experiments have shown that the proposed deep learning architecture, empowered by the innovative MM-CAM, produces higher-quality segmentation masks of the tumor subregions. Further, we have enhanced the algorithm with image-matting refinement techniques. We propose to integrate a Progressive Refinement Module (PRM) and perform Cross-Subregion Refinement (CSR) for the precise identification of tumor boundaries. A Multiscale Dice Loss was also successfully employed to enforce additional supervision for the auxiliary segmentation outputs. This enhancement will effectively facilitate matting-based refinement for medical image segmentation applications. Overall, this thesis, with deep learning, transformer-empowered pattern mining, and sophisticated architecture designs, will greatly advance deep brain dynamics and image mining for tumor detection and precision medicine.
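
The core idea of cross-attention between modalities, as opposed to early channel merging, can be sketched as follows (PyTorch). This is a toy single-layer stand-in for the thesis' MM-CAM; token shapes and dimensions are assumptions.

```python
# Minimal sketch: one MRI modality queries another for complementary cues
# instead of being merged channel-wise at the input.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats_a, feats_b):
        # feats_*: (batch, tokens, dim), flattened patch features of one
        # modality each; modality A attends over modality B.
        attended, _ = self.attn(query=feats_a, key=feats_b, value=feats_b)
        return self.norm(feats_a + attended)  # residual fusion

t1 = torch.randn(2, 128, 64)     # e.g. T1-weighted patch tokens
flair = torch.randn(2, 128, 64)  # e.g. FLAIR patch tokens
fused = CrossModalAttention()(t1, flair)
print(fused.shape)  # torch.Size([2, 128, 64])
```
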
455

Models and Representation Learning Mechanisms for Graph Data

Susheel Suresh (14228138) 15 December 2022 (has links)
Graph representation learning (GRL) has been increasingly used to model and understand data from a wide variety of complex systems spanning social, technological, bio-chemical, and physical domains. GRL consists of two main components: (1) a parametrized encoder that provides representations of graph data and (2) a learning process to train the encoder parameters. Designing flexible encoders that capture the underlying invariances and characteristics of graph data is crucial to the success of GRL. On the other hand, the learning process drives the quality of the encoder representations, and developing principled learning mechanisms is vital for a number of growing applications in self-supervised, transfer, and federated learning settings. To this end, we propose a suite of models and learning algorithms for GRL, which form the two main thrusts of this dissertation.

In Thrust I, we propose two novel encoders which build upon a widely popular GRL encoder class called graph neural networks (GNNs). First, we empirically study the prediction performance of current GNN-based encoders when applied to graphs with heterogeneous node mixing patterns, using our proposed notion of local assortativity. We find that GNN performance in node prediction tasks strongly correlates with our local assortativity metric, revealing a limitation of current encoders. We propose to transform the input graph into a computation graph with proximity and structural information as distinct types of edges, and then propose a novel GNN-based encoder that operates on this computation graph and adaptively chooses between structure and proximity information. Empirically, adopting our transformation and encoder framework leads to improved node classification performance compared to baselines in real-world graphs that exhibit diverse mixing.

Second, we study the trade-off between expressivity and efficiency of GNNs when applied to temporal graphs for the task of link ranking. We develop an encoder that incorporates a labeling approach designed to allow efficient joint inference over the candidate set while provably boosting expressivity. We also propose to optimize a list-wise loss for improved ranking. With extensive evaluation on real-world temporal graphs, we demonstrate its improved performance and efficiency compared to baselines.

In Thrust II, we propose two principled encoder learning mechanisms for challenging and realistic graph data settings. First, we consider a scenario where only limited or even no labelled data is available for GRL. Recent research has converged on graph contrastive learning (GCL), where GNNs are trained to maximize the correspondence between representations of the same graph in its different augmented forms. However, we find that GNNs trained by traditional GCL often risk capturing redundant graph features and thus may be brittle and provide sub-par performance in downstream tasks. We then propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during training by optimizing adversarial graph augmentation strategies used in GCL. We pair AD-GCL with theoretical explanations and design a practical instantiation based on trainable edge-dropping graph augmentation. We experimentally validate AD-GCL by comparing with state-of-the-art GCL methods and achieve performance gains in semi-supervised, unsupervised, and transfer learning settings using benchmark chemical and biological molecule datasets.

Second, we consider a scenario where graph data is siloed across clients. We focus on two unique challenges encountered when applying distributed training to GRL: (i) client task heterogeneity and (ii) label scarcity. We propose a novel learning framework called federated self-supervised graph learning (FedSGL), which first uses a self-supervised objective to train GNNs in a federated fashion across clients; each client then fine-tunes the obtained GNN based on its local task and available labels. Our framework enables the federated GNN model to extract patterns from the common feature (attribute and graph topology) space without needing labels or being biased by heterogeneous local tasks. An extensive empirical study of FedSGL on both node and graph classification tasks yields fruitful insights into how the level of feature/task heterogeneity, the adopted federated algorithm, and the level of label scarcity affect the clients' performance on their tasks.
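
A minimal sketch of the trainable edge-dropping idea behind AD-GCL, as described above: the encoder minimizes a contrastive loss while the augmenter is updated to maximize it, making the retained views less redundant. The toy mean-pooling "encoder", edge features, and optimizer settings are stand-ins for a real GNN pipeline.

```python
# Minimal sketch: adversarially trained edge-dropping augmentation for
# graph contrastive learning. Toy stand-in, not the AD-GCL codebase.
import torch
import torch.nn as nn
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Normalized-temperature cross-entropy contrastive loss."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    return F.cross_entropy(logits, torch.arange(z1.size(0)))

class EdgeDropAugmenter(nn.Module):
    """Learns per-edge keep probabilities from edge features."""
    def __init__(self, edge_dim=8):
        super().__init__()
        self.scorer = nn.Linear(edge_dim, 1)

    def forward(self, edge_feats):
        # Relaxed Bernoulli keep-weights in (0, 1), differentiable.
        logits = self.scorer(edge_feats).squeeze(-1)
        u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
        return torch.sigmoid((logits + torch.log(u) - torch.log1p(-u)) / 0.5)

# Toy "encoder": mean of kept-edge features per graph (stand-in for a GNN).
encoder = nn.Linear(8, 16)
augmenter = EdgeDropAugmenter()
opt_enc = torch.optim.Adam(encoder.parameters(), lr=1e-3)
opt_aug = torch.optim.Adam(augmenter.parameters(), lr=1e-3)

edge_feats = torch.randn(32, 20, 8)  # 32 graphs, 20 edges each
for step in range(2):
    # Encoder step: minimize the contrastive loss (maximize agreement).
    keep = augmenter(edge_feats).unsqueeze(-1)
    z_aug = encoder((keep * edge_feats).mean(dim=1))
    z_orig = encoder(edge_feats.mean(dim=1))
    opt_enc.zero_grad()
    nt_xent(z_orig, z_aug).backward()
    opt_enc.step()
    # Augmenter step: *maximize* the same loss, producing harder views
    # and discouraging the encoder from relying on redundant features.
    keep = augmenter(edge_feats).unsqueeze(-1)
    z_aug = encoder((keep * edge_feats).mean(dim=1))
    z_orig = encoder(edge_feats.mean(dim=1))
    opt_aug.zero_grad()
    (-nt_xent(z_orig, z_aug)).backward()
    opt_aug.step()
```
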
456

XAI-assisted Radio Resource Management: Feature selection and SHAP enhancement

Sibuet Ruiz, Nicolás January 2022 (has links)
With the fast development of radio technologies, wireless systems have become more complex. This complexity, accompanied by an increase in the number of connections, translates into more parameters to analyse and more decisions to take at each instant. AI comes into play by automating these processes, particularly with deep learning techniques, which often show the best accuracy. However, the high performance of these methods comes with the drawback that, from a human's point of view, they behave like a black box. eXplainable AI (XAI) serves as a technique to better understand the decision process of these algorithms. This thesis proposes an XAI framework to be used on reinforcement learning agents, particularly for the use case of antenna resource adaptation for network energy reduction. The framework puts a special emphasis on model adaptation/reduction, and therefore focuses on feature importance techniques. The proposed framework presents a pre-model block using Concrete Autoencoders for feature reduction and a post-model block using self-supervised learning to estimate feature importance. Either can be used alone or in combination with DeepSHAP, in order to mitigate some of that popular method's drawbacks. The explanations provided by the pipeline prove useful for reducing model complexity without loss of accuracy and for understanding how the AI model uses the input features.
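
For the pre-model block, a Concrete Autoencoder selects a small subset of input features by learning relaxed one-hot selection weights. Below is a minimal PyTorch sketch of that selector; the dimensions, fixed temperature, and toy data are assumptions (in practice the temperature is annealed toward zero during training).

```python
# Minimal sketch of a Concrete Autoencoder selector: k features are chosen
# via relaxed (Gumbel-softmax) selection; a decoder reconstructs the input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcreteSelector(nn.Module):
    def __init__(self, n_features=30, k=8):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(k, n_features))

    def forward(self, x, temperature=0.5):
        # One relaxed one-hot row per selected feature.
        weights = F.gumbel_softmax(self.logits, tau=temperature, dim=-1)
        return x @ weights.t()  # (batch, k) selected-feature values

n_features, k = 30, 8
selector = ConcreteSelector(n_features, k)
decoder = nn.Sequential(nn.Linear(k, 64), nn.ReLU(), nn.Linear(64, n_features))
opt = torch.optim.Adam([*selector.parameters(), *decoder.parameters()], lr=1e-3)

x = torch.randn(256, n_features)  # e.g. radio KPI vectors (synthetic here)
for step in range(3):
    recon = decoder(selector(x))
    loss = F.mse_loss(recon, x)  # good reconstruction => informative subset
    opt.zero_grad(); loss.backward(); opt.step()

# After training, the argmax of each row of selector.logits gives the
# indices of the k retained input features.
print(selector.logits.argmax(dim=-1))
```
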
457

Reinforcement Learning for Hydrobatic AUVs

Woźniak, Grzegorz January 2022 (has links)
This master thesis focuses on developing a Reinforcement Learning (RL) controller to successfully perform hydrobatic maneuvers on an Autonomous Underwater Vehicle (AUV). This work also aims to analyze the robustness of the RL controller and to provide a comparison between RL algorithms and Proportional Integral Derivative (PID) control. Training of the algorithms is initially conducted in a NumPy simulation in Python. We show how to model the equations of motion (EOM) of the AUV and how to use them to train the RL controllers. We use the stable-baselines3 RL framework and create a training environment with OpenAI Gym. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm offers good performance in simulation. The following maneuvers are studied: trim control, waypoint following, and an inverted pendulum. We test the maneuvers both in the NumPy simulation and in the Stonefish simulator. We also test the robustness of the RL trim controller by simulating noise in the state feedback. Lastly, we run the RL trim controller on real AUV hardware called SAM. We show that the RL algorithm trained in the NumPy simulation can achieve performance similar to the PID controller's in the Stonefish simulator. We generate a policy that can perform trim control and the inverted pendulum maneuver in the NumPy simulation, and we show that we can generate a robust policy that executes other types of maneuvers by providing a parameterized cost function to the RL algorithm. We discuss the results of every maneuver we perform with the SAM AUV and the advantages and disadvantages of this control method as applied to underwater robotics. We conclude that RL can be used to create policies that perform hydrobatic maneuvers, and this data-driven approach can be applied in the future to more complex problems in underwater robotics.
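
Since the abstract names the exact tooling, a minimal end-to-end sketch is possible: a custom Gym environment wrapping toy trim dynamics, trained with TD3 from stable-baselines3. The one-dimensional dynamics, reward shaping, and episode length here are stand-in assumptions, not the thesis' equations of motion.

```python
# Minimal sketch: toy trim-control environment + TD3 from stable-baselines3.
import numpy as np
import gymnasium as gym
from stable_baselines3 import TD3

class TrimEnv(gym.Env):
    """Toy trim control: drive pitch angle to zero with one actuator."""
    def __init__(self):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(2,))
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(1,))

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.uniform(-0.5, 0.5, size=2).astype(np.float32)
        self.t = 0
        return self.state, {}

    def step(self, action):
        pitch, rate = self.state
        # Toy stand-in for the equations of motion, with light damping.
        rate = np.clip(rate + 0.05 * float(action[0]) - 0.01 * rate, -1.0, 1.0)
        pitch = np.clip(pitch + 0.1 * rate, -1.0, 1.0)
        self.state = np.array([pitch, rate], dtype=np.float32)
        self.t += 1
        reward = -(pitch ** 2) - 0.01 * float(action[0]) ** 2
        return self.state, reward, False, self.t >= 200, {}

model = TD3("MlpPolicy", TrimEnv(), buffer_size=10_000, verbose=0)
model.learn(total_timesteps=5_000)  # short run; real training is far longer
```
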
458

Detecting Security Patches in Java OSS Projects Using NLP

Stefanoni, Andrea January 2022 (has links)
The use of Open Source Software is becoming more and more popular, but it comes with the risk of importing vulnerabilities into private codebases. Security patches, which provide fixes for detected vulnerabilities, are vital in protecting against cyber attacks, so being able to apply all security patches as soon as they are released is key. Even though there is a public database of vulnerability fixes, the majority remain undisclosed to the public; we therefore propose a machine learning algorithm using NLP to detect security patches in Java Open Source Software. To train the model, we preprocessed and extracted patches from the commits present in two databases: one provided by Debricked and a public one released by Ponta et al. [57]. Two experiments were conducted: one performing binary classification, and the other aiming for higher granularity by classifying the macro-type of the vulnerability. The proposed models leverage the structure of the input to obtain a better patch representation and are based on RNNs, Transformers, and CodeBERT [22], with the best-performing model being the Transformer, which surprisingly outperformed CodeBERT. The results show that it is possible to classify security patches, and that using more relevant pre-training techniques or tree-based representations of the code might improve performance further.
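
A minimal sketch of a Transformer-based patch classifier in the spirit of the best-performing model: token IDs from a commit diff are encoded and mean-pooled into a binary security/non-security prediction. The whitespace tokenizer, hashing vocabulary, and model sizes are toy assumptions, not the thesis' setup.

```python
# Minimal sketch: Transformer encoder classifying a commit diff.
import torch
import torch.nn as nn

VOCAB, DIM, MAX_LEN = 5000, 64, 128

def tokenize(diff: str) -> torch.Tensor:
    # Toy whitespace tokenizer with hashing; pads/truncates to MAX_LEN.
    ids = [hash(tok) % VOCAB for tok in diff.split()][:MAX_LEN]
    return torch.tensor(ids + [0] * (MAX_LEN - len(ids)))

class PatchClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.pos = nn.Parameter(torch.zeros(1, MAX_LEN, DIM))
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, 2)  # security patch vs. other

    def forward(self, ids):
        h = self.encoder(self.embed(ids) + self.pos)
        return self.head(h.mean(dim=1))  # mean-pool token representations

diff = "- String q = base + input;\n+ PreparedStatement ps = conn.prepareStatement(base);"
logits = PatchClassifier()(tokenize(diff).unsqueeze(0))
print(logits.softmax(dim=-1))  # untrained, so probabilities are arbitrary
```
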
459

PREDICTION OF MULTI-PHASE LIVER CT VOLUMES USING DEEP NEURAL NETWORK

Afroza Haque (17544888) 04 December 2023 (has links)
<p dir="ltr">Progress in deep learning methodologies has transformed the landscape of medical image analysis, opening fresh pathways for precise and effective diagnostics. Currently, multi-phase liver CT scans follow a four-stage process, commencing with an initial scan carried out before the administration of <a href="" target="_blank">intravenous (IV) contrast-enhancing material</a>. Subsequently, three additional scans are performed following the contrast injection. The primary objective of this research is to automate the analysis and prediction of 50% of liver CT scans. It concentrates on discerning patterns of intensity change during the second, third, and fourth phases concerning the initial phase. The thesis comprises two key sections. The first section employs the non-contrast phase (first scan), late hepatic arterial phase (second scan), and portal venous phase (third scan) to predict the delayed phase (fourth scan). In the second section, the non-contrast phase and late hepatic arterial phase are utilized to predict both the portal venous and delayed phases. The study evaluates the performance of two deep learning models, U-Net and U²-Net. The process involves preprocessing steps like subtraction and normalization to compute contrast difference images, followed by post-processing techniques to generate the predicted 2D CT scans. Post-processing steps have similar techniques as in preprocessing but are performed in reverse order. Four fundamental evaluation metrics, including <a href="" target="_blank">Mean Absolute Error (MAE), Signal-to-Reconstruction Error Ratio (SRE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index Measure (SSIM), </a>are employed for assessment. Based on these evaluation metrics, U²-Net performed better than U-Net for the prediction of both portal venous (third) and delayed (fourth) phases. Specifically, U²-Net exhibited superior MAE and PSNR results for the predicted third and fourth scans. However, U-Net did show slightly better SRE and SSIM performance in the predicted scans. On the other hand, for the exclusive prediction of the fourth scan, U-Net outperforms U²-Net in all four evaluation metrics. This implementation shows promising results which will eliminate the need for additional CT scans and reduce patients’ exposure to harmful radiation. Predicting 50% of liver CT volumes will reduce exposure to harmful radiation by half. The proposed method is not limited to liver CT scans and can be applied to various other multi-phase medical imaging techniques, including multi-phase CT angiography, multi-phase renal CT, contrast-enhanced breast MRI, and more.</p>
460

Deep Image Processing with Spatial Adaptation and Boosted Efficiency & Supervision for Accurate Human Keypoint Detection and Movement Dynamics Tracking

Chao Yang Dai (14709547) 31 May 2023 (has links)
This thesis aims to design and develop a spatial adaptation approach using spatial transformers to improve the accuracy of human keypoint recognition models. We have studied different model types and design choices to gain an accuracy increase over models without spatial transformers, and analyzed how spatial transformers increase prediction accuracy. A neural network called Widenet has been leveraged as a specialized network for providing the parameters of the spatial transformer. Further, we have evaluated methods to reduce the model parameters, as well as strategies to enhance the learning supervision for further improving the performance of the model. Our experiments and results show that the proposed deep learning framework can effectively detect human keypoints compared with the baseline methods. We have also reduced the model size without significantly impacting performance, and the enhanced supervision has improved performance. This study is expected to greatly advance the deep learning of human keypoints and movement dynamics.
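
A minimal PyTorch sketch of the spatial-transformer mechanism the thesis builds on: a small localization network (standing in for Widenet) predicts an affine transform that re-samples the input before a keypoint head. The network sizes and the 17-keypoint head are illustrative assumptions.

```python
# Minimal sketch of a spatial transformer preceding keypoint prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(  # localization network -> 6 affine params
            nn.Conv2d(3, 8, 7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, 6),
        )
        # Initialize to the identity transform so training starts stable.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

stn = SpatialTransformer()
keypoint_head = nn.Sequential(  # toy head: 17 keypoints as (x, y) pairs
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 17 * 2),
)
images = torch.rand(2, 3, 128, 128)
keypoints = keypoint_head(stn(images)).view(2, 17, 2)
print(keypoints.shape)  # torch.Size([2, 17, 2])
```
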
