81 |
Data Synthesis in Deep Learning for Object Detection / Syntetiskt Data i Djupinlärning för Objektdetektion. Haddad, Josef, January 2021.
Deep neural networks typically require large amounts of labeled data for training, but collecting such data can be expensive. Our study aims at revealing how training with synthetic data affects performance on real-world object detection tasks. We achieve this by synthesising annotated image data in the automotive domain using a car simulator, for the task of detecting cars in real-world images. We furthermore perform experiments in the aviation domain, where we incorporate synthetic images extracted from an airplane simulator together with real-world data for detecting runways. In our experiments, the synthetic data sets are leveraged by pre-training a deep-learning-based object detector, which is then fine-tuned and evaluated on real-world data. We evaluate this approach on three real-world data sets across the two domains, and we furthermore evaluate how performance scales as the amounts of synthetic and real-world data vary in the automotive domain. In the automotive domain, we additionally perform image-to-image translation in both directions, from the synthetic domain to the real-world domain and vice versa, as a means of domain adaptation, to assess whether it further improves performance. The results show that adding synthetic data improves performance in the automotive domain and that pre-training with more synthetic data yields further improvements, but that the performance boost from adding more real-world data exceeds that from adding more synthetic data. We cannot conclude that using CycleGAN for domain adaptation further improves performance.
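As a rough sketch of the pre-train-then-fine-tune recipe described above, the following Python outline pre-trains a torchvision Faster R-CNN on simulator images and then fine-tunes it on real images. The loader names (synthetic_loader, real_loader), the epoch counts, and the learning rates are illustrative assumptions, not details taken from the thesis.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

def run_epochs(model, loader, optimizer, epochs, device):
    # Standard torchvision detection loop: in training mode the model returns
    # a dict of losses when given images and their box/label targets.
    model.train()
    for _ in range(epochs):
        for images, targets in loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss = sum(model(images, targets).values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = fasterrcnn_resnet50_fpn(weights=None, num_classes=2).to(device)  # background + car

# Stage 1: pre-train on simulator images (hypothetical loader yielding torchvision-style targets).
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
run_epochs(model, synthetic_loader, optimizer, epochs=10, device=device)

# Stage 2: fine-tune on the (smaller) real-world set with a lower learning rate.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
run_epochs(model, real_loader, optimizer, epochs=5, device=device)
```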
|
82 |
Action Recognition with Knowledge Transfer. Choi, Jin-Woo, 07 January 2021.
Recent progress on deep neural networks has shown remarkable action recognition performance from videos. The remarkable performance is often achieved by transfer learning: training a model on a large-scale labeled dataset (source) and then fine-tuning the model on small-scale labeled datasets (targets). However, existing action recognition models do not always generalize well to new tasks or datasets, for two reasons. i) Current action recognition datasets have a spurious correlation between action types and background scene types. Models trained on these datasets are biased towards the scene instead of focusing on the actual action, and this scene bias leads to poor generalization performance. ii) Directly testing a model trained on the source data on the target data leads to poor performance because the source and target distributions are different. Fine-tuning the model on the target data can mitigate this issue, but manually labeling small-scale target videos is labor-intensive. In this dissertation, I propose solutions to these two problems. For the first problem, I propose to learn scene-invariant action representations to mitigate the scene bias in action recognition models. Specifically, I augment the standard cross-entropy loss for action classification with 1) an adversarial loss for the scene types and 2) a human mask confusion loss for videos in which the human actors are masked out. These two losses encourage learning representations that are unsuitable for predicting 1) the correct scene types and 2) the correct action types when there is no evidence. I validate the efficacy of the proposed method with transfer learning experiments, transferring the pre-trained model to three different tasks: action classification, temporal action localization, and spatio-temporal action detection. The results show consistent improvement over the baselines for every task and dataset. To handle the second problem, I formulate human action recognition as an unsupervised domain adaptation (UDA) problem. In the UDA setting, we have many labeled videos as source data and unlabeled videos as target data, so we can use already existing labeled video datasets as source data. The task is to align the source and target feature distributions so that the learned model generalizes well to the target data. I propose 1) aligning the more important temporal parts of each video and 2) encouraging the model to focus on the action, not the background scene, to learn domain-invariant action representations. The proposed method is simple and intuitive while achieving state-of-the-art performance without training on many labeled target videos. I then relax the unsupervised target data setting to a sparsely labeled target data setting and explore semi-supervised video action recognition, where we have many labeled videos as source data and sparsely labeled videos as target data. The semi-supervised setting is practical, as sometimes we can afford a modest cost for labeling target data. In this setting, I propose multiple video data augmentation methods to inject photometric, geometric, temporal, and scene invariances into the action recognition model. The resulting method shows favorable performance on public benchmarks. / Doctor of Philosophy / Recent progress on deep learning has shown remarkable action recognition performance. The remarkable performance is often achieved by transferring the knowledge learned from existing large-scale data to small-scale data specific to applications. However, existing action recognition models do not always work well on new tasks and datasets, for two reasons. i) Current action recognition datasets have a spurious correlation between action types and background scene types, so models trained on these datasets are biased towards the scene instead of focusing on the actual action; this scene bias leads to poor performance on new datasets and tasks. ii) Directly testing a model trained on the source data on the target data leads to poor performance because the source and target distributions are different. Fine-tuning the model on the target data can mitigate this issue, but manually labeling small-scale target videos is labor-intensive. In this dissertation, I propose solutions to these two problems. To tackle the first problem, I propose to learn scene-invariant action representations that mitigate the scene bias in human action recognition models. Specifically, the proposed method learns representations that cannot predict the scene types, and that cannot predict the correct actions when there is no evidence. I validate the proposed method's effectiveness by transferring the pre-trained model to multiple action understanding tasks; the results show consistent improvement over the baselines for every task and dataset. To handle the second problem, I formulate human action recognition as an unsupervised learning problem on the target data. In this setting, we have many labeled videos as source data and unlabeled videos as target data, and we can use already existing labeled video datasets as source data. The task is to align the source and target feature distributions so that the learned model generalizes well to the target data. I propose 1) aligning the more important temporal parts of each video and 2) encouraging the model to focus on the action, not the background scene. The proposed method is simple and intuitive while achieving state-of-the-art performance without training on many labeled target videos. I then relax the unsupervised target data setting to a sparsely labeled target data setting, where we have many labeled videos as source data and sparsely labeled videos as target data. This setting is practical, as sometimes we can afford a modest cost for labeling target data. I propose multiple video data augmentation methods to inject color, spatial, temporal, and scene invariances into the action recognition model in this setting. The resulting method shows favorable performance on public benchmarks.
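A rough sketch of how the two debiasing losses described above could be combined with the standard action classification loss in PyTorch. This is an illustrative reconstruction, not the dissertation's implementation: the loss weights, the gradient reversal layer, and the choice of a uniform-distribution target for the human mask confusion loss are assumptions.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def scene_adversarial_loss(features, scene_labels, scene_head, lam=1.0):
    # The scene classifier sees gradient-reversed features, so the backbone is
    # pushed to discard scene-predictive information.
    logits = scene_head(GradReverse.apply(features, lam))
    return F.cross_entropy(logits, scene_labels)

def human_mask_confusion_loss(masked_features, action_head):
    # For clips with the human actor masked out, push the action prediction
    # toward a uniform distribution: no evidence, no confident action.
    logits = action_head(masked_features)
    log_probs = F.log_softmax(logits, dim=1)
    uniform = torch.full_like(log_probs, 1.0 / logits.size(1))
    return F.kl_div(log_probs, uniform, reduction="batchmean")

def total_loss(feats, masked_feats, action_labels, scene_labels,
               action_head, scene_head, w_scene=0.5, w_conf=0.5):
    # Standard action cross-entropy plus the two debiasing terms.
    ce = F.cross_entropy(action_head(feats), action_labels)
    adv = scene_adversarial_loss(feats, scene_labels, scene_head)
    conf = human_mask_confusion_loss(masked_feats, action_head)
    return ce + w_scene * adv + w_conf * conf
```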
|
83 |
Handling Domain Shift in 3D Point Cloud Perception. Saltori, Cristiano, 10 April 2024.
This thesis addresses the problem of domain shift in 3D point cloud perception. Over the last decades, there has been tremendous progress in within-domain training and testing. However, the performance of perception models degrades when they are trained on a source domain and tested on a target domain sampled from a different data distribution. As a result, a change in sensor or geo-location can lead to a harmful drop in model performance. While solutions exist for image perception, the problem remains largely unresolved for point clouds. The focus of this thesis is the study and design of solutions for mitigating domain shift in 3D point cloud perception. We identify several settings that differ in the level of target supervision and the availability of source data. We conduct a thorough study of each setting and introduce a new method to solve domain shift in each configuration. In particular, we study three novel settings in domain adaptation and domain generalization and propose five new methods for mitigating domain shift in 3D point cloud perception. Our methods are used by the research community, and at the time of writing, some of the proposed approaches represent the state of the art. In conclusion, this thesis provides a valuable contribution to the computer vision community, setting the groundwork for the development of future work in cross-domain conditions.
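As a toy illustration of the sensor-induced domain shift mentioned above, and not a method from the thesis, the sketch below mimics moving from a denser to a sparser spinning LiDAR by binning points into beams by elevation angle and keeping only every second beam.

```python
import numpy as np

def subsample_beams(points, n_beams=64, keep_every=2):
    """Simulate a sparser LiDAR by dropping every other elevation beam.

    points: (N, 3) array of x, y, z coordinates from a spinning LiDAR.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    elevation = np.arctan2(z, np.sqrt(x**2 + y**2))            # vertical angle per point
    # Quantize elevation into n_beams bins spanning the observed angular range.
    edges = np.linspace(elevation.min(), elevation.max(), n_beams + 1)
    beam_idx = np.clip(np.digitize(elevation, edges) - 1, 0, n_beams - 1)
    keep = beam_idx % keep_every == 0                          # keep every k-th beam
    return points[keep]

# Example: a random cloud standing in for a 64-beam scan; roughly half the beams survive.
cloud = np.random.randn(100000, 3) * np.array([20.0, 20.0, 2.0])
sparse_cloud = subsample_beams(cloud, n_beams=64, keep_every=2)
print(cloud.shape, "->", sparse_cloud.shape)
```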
|
84 |
Advanced Deep Learning Approaches for Automated Assessment of Ultrasound Imaging (Lungs and Heart). Fatima, Noreen, 21 January 2025.
Ultrasound is a widely used imaging modality for evaluating various anatomical structures in the human body, such as the lungs and heart, due to its non-invasive nature and real-time imaging capabilities. However, the effectiveness of these evaluations depends heavily on the manual annotation/segmentation of visual patterns. In lung assessments, patterns such as hyper-echoic horizontal and vertical artifacts and hypo-echoic consolidations are crucial for diagnosing both adult and neonatal respiratory conditions. Similarly, in cardiac evaluations, ultrasound imaging facilitates the visualization of anatomical structures, including the endocardium, myocardium, and atrium of the left ventricle, which are essential for measuring clinical indices such as the ejection fraction (EF). Despite advances in ultrasound technology, manual annotation remains the standard practice in clinical routine due to the lack of fully automated AI solutions. Manual annotation/segmentation of ultrasound patterns is not only time-consuming but also prone to inter-observer variability (IOV). Given these challenges, evaluating inter-rater reliability across multiple operators, medical centers, and diverse patient populations is crucial for advancing lung ultrasound (LUS) diagnostics and improving patient outcomes. Moreover, the integration of AI to reduce the impact of IOV in ultrasound pattern interpretation has not been fully explored in the literature. Many state-of-the-art studies have primarily focused on the interpretation of LUS patterns in adults, leaving a notable research gap concerning neonates. This thesis addresses this gap by introducing an advanced methodology aimed at standardizing and automating the interpretation of LUS patterns in neonates. To achieve this, various deep learning approaches, including classical neural networks and advanced transformer-based models, are employed. Domain adaptation techniques are also introduced to transfer knowledge from adult LUS assessments to neonates. Furthermore, IOV contributes to inconsistencies in data distribution, leading to an unequal representation of different classes within the dataset. To address these challenges, this thesis explores the application of generative AI, emphasizing techniques that can effectively balance the data distributions. Building on this foundation, the thesis examines the use of generative AI models for the automated segmentation of left ventricle (LV) regions to mitigate the effects of IOV. The proposed segmentation method was rigorously evaluated through qualitative and quantitative analyses, setting a new benchmark for future studies by demonstrating improved EF estimation over state-of-the-art techniques. Lastly, this thesis introduces a novel approach that leverages generative AI models for automated labeling of LV regions using adjacent anatomical structures, for example using the myocardium to segment the endocardium region or vice versa. This approach significantly reduces the need for manual labeling, ultimately minimizing IOV and saving time in clinical practice.
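Since ejection fraction estimation from LV segmentations is a central output above, the sketch below shows one common way to derive it: approximate the end-diastolic and end-systolic LV volumes from binary masks with the single-plane area-length formula and compute EF = (EDV - ESV) / EDV. The formula choice, the pixel spacing, and the toy masks are assumptions for illustration, not the pipeline used in the thesis.

```python
import numpy as np

def lv_volume_area_length(mask, pixel_spacing_cm):
    """Approximate LV volume (mL) from a binary mask with the single-plane
    area-length formula V = 8 * A^2 / (3 * pi * L)."""
    area_cm2 = mask.sum() * pixel_spacing_cm ** 2            # cavity area in cm^2
    ys, xs = np.nonzero(mask)
    # Long axis crudely approximated by the largest extent of the mask, in cm.
    length_cm = max(ys.max() - ys.min(), xs.max() - xs.min()) * pixel_spacing_cm
    return 8.0 * area_cm2 ** 2 / (3.0 * np.pi * length_cm)

def ejection_fraction(ed_mask, es_mask, pixel_spacing_cm=0.05):
    edv = lv_volume_area_length(ed_mask, pixel_spacing_cm)   # end-diastolic volume
    esv = lv_volume_area_length(es_mask, pixel_spacing_cm)   # end-systolic volume
    return 100.0 * (edv - esv) / edv

# Toy example with two elliptical masks standing in for predicted segmentations.
yy, xx = np.mgrid[0:256, 0:256]
ed = ((xx - 128) ** 2 / 60 ** 2 + (yy - 128) ** 2 / 90 ** 2) <= 1.0
es = ((xx - 128) ** 2 / 45 ** 2 + (yy - 128) ** 2 / 70 ** 2) <= 1.0
print(f"EF ~ {ejection_fraction(ed, es):.1f}%")
```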
|
85 |
Recurrent neural network language generation for dialogue systems. Wen, Tsung-Hsien, January 2018.
Language is the principal medium for ideas, while dialogue is the most natural and effective way for humans to interact with and access information from machines. Natural language generation (NLG) is a critical component of spoken dialogue and has a significant impact on usability and perceived quality. Many commonly used NLG systems employ rules and heuristics, which tend to generate inflexible and stylised responses without the natural variation of human language. The frequent repetition of identical output forms can quickly make dialogue tedious for most real-world users, and such rules and heuristics are not scalable and hence not trivially extensible to other domains or languages. A statistical approach to language generation can learn language decisions directly from data without relying on hand-coded rules or heuristics, which brings scalability and flexibility to NLG. Statistical models also provide an opportunity to learn in-domain human colloquialisms and cross-domain model adaptations. A robust and quasi-supervised NLG model is proposed in this thesis. The model leverages a Recurrent Neural Network (RNN)-based surface realiser and a gating mechanism applied to the input semantics, and is motivated by the Long Short-Term Memory (LSTM) network. The surface realiser and gating mechanism learn end-to-end language generation decisions from pairs of input dialogue acts and sentences, integrating sentence planning and surface realisation into a single optimisation problem. This single optimisation not only bypasses costly intermediate linguistic annotations but also generates more natural and human-like responses. Furthermore, a domain adaptation study shows that the proposed model can be readily adapted and extended to new dialogue domains via a proposed recipe. Continuing the success of end-to-end learning, the second part of the thesis explores building an end-to-end dialogue system by framing it as a conditional generation problem. The proposed model encapsulates a belief tracker with a minimal state representation and a generator that takes the dialogue context to produce responses; these features support comprehension and fast learning. The model is capable of understanding requests and accomplishing tasks after training on only a few hundred human-human dialogues. A complementary Wizard-of-Oz data collection method is also introduced to facilitate the collection of human-human conversations from online workers. The results demonstrate that the proposed model can talk to human judges naturally, without any difficulty, for a sample application domain. In addition, the results suggest that the introduction of a stochastic latent variable helps the system model intrinsic variation in communicative intention much better.
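A rough sketch of the kind of gated RNN surface realiser described above: a hand-rolled LSTM step extended with a reading gate that gradually consumes a dialogue-act (DA) vector and feeds it into the cell state. Dimensions and naming are assumed for illustration; this is not the thesis implementation.

```python
import torch
import torch.nn as nn

class DAGatedLSTMCell(nn.Module):
    """LSTM step with a gated dialogue-act (DA) vector feeding the cell state."""
    def __init__(self, input_size, hidden_size, da_size):
        super().__init__()
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)  # i, f, o, candidate
        self.read_gate = nn.Linear(input_size + hidden_size, da_size)      # how much DA to keep
        self.da_proj = nn.Linear(da_size, hidden_size, bias=False)         # DA -> cell space

    def forward(self, x, h, c, d):
        z = torch.cat([x, h], dim=1)
        i, f, o, g = self.gates(z).chunk(4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        r = self.read_gate(z).sigmoid()
        d = r * d                                    # retain only the unexpressed part of the DA
        c = f * c + i * g + self.da_proj(d).tanh()   # DA contribution injected into the cell
        h = o * c.tanh()
        return h, c, d

# One decoding step: batch of 2, word embedding dim 32, hidden size 64, 10 DA slots.
cell = DAGatedLSTMCell(32, 64, 10)
x = torch.randn(2, 32)
h = c = torch.zeros(2, 64)
d = torch.ones(2, 10)        # dialogue-act vector, e.g. 1s for slots still to be realised
h, c, d = cell(x, h, c, d)
```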
|
86 |
Neural Methods Towards Concept Discovery from Text via Knowledge Transfer. Das, Manirupa, January 2019.
No description available.
|
87 |
Machine Learning personalization for hypotension prediction / Personalisering av maskininlärning för hypotoniförutsägelse. Escorihuela Altaba, Clara, January 2022.
Perioperative hypotension (PH), commonly a side effect of anesthesia, is one of the main causes of mortality during the 30 days following a surgical procedure. Novel research lines propose combining machine learning algorithms with the arterial blood pressure (ABP) waveform to notify healthcare professionals about the onset of a hypotensive event in advance and prevent its occurrence. Nevertheless, ABP waveforms are heterogeneous among patients; consequently, a general model may show different predictive capabilities for different individuals. This project aims at improving the performance of an artificial neural network (ANN) that predicts hypotensive events in advance by applying personalized machine learning techniques, such as data grouping and domain adaptation. We hypothesize that this will allow us to cluster patients with similar demographic and ABP discriminative characteristics and tailor the model to each specific group, resulting in a worse overall but better individual performance. The results show a slight but not clinically significant improvement when comparing AUROC values between the group-specific and the general model. This suggests that even though personalization could be a good approach to dealing with patient heterogeneity, the clustering algorithm presented in this thesis is not sufficient to make the ANN clinically feasible.
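A minimal sketch of the grouping idea described above: cluster patients on summary features, train one classifier per cluster, and compare AUROC against a single general model. The synthetic features, the k-means clustering, and the gradient boosting classifier are stand-ins chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                     # stand-in for demographic + ABP summary features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000) > 0).astype(int)  # stand-in hypotension label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# General model trained on all patients.
general = GradientBoostingClassifier().fit(X_tr, y_tr)
print("general AUROC:", roc_auc_score(y_te, general.predict_proba(X_te)[:, 1]))

# Group-specific models: cluster patients, then fit one model per cluster.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_tr)
models = {k: GradientBoostingClassifier().fit(X_tr[kmeans.labels_ == k], y_tr[kmeans.labels_ == k])
          for k in range(3)}

# Route each test patient to its cluster's model and score the combined predictions.
test_clusters = kmeans.predict(X_te)
probs = np.empty(len(X_te))
for k, model in models.items():
    idx = test_clusters == k
    probs[idx] = model.predict_proba(X_te[idx])[:, 1]
print("group-specific AUROC:", roc_auc_score(y_te, probs))
```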
|
88 |
Unsupervised Domain Adaptation for Regressive Annotation: Using Domain-Adversarial Training on Eye Image Data for Pupil Detection / Oövervakad domänadaptering för regressionsannotering: Användning av domänmotstående träning på ögonbilder för pupilldetektion. Zetterström, Erik, January 2023.
Machine learning has seen rapid progress over the last couple of decades, with ever more powerful neural network models continuously being presented. These neural networks require large amounts of data for training. Labelled data is in especially great demand, but because data labelling is time-consuming and costly, labelled data is scarce whereas unlabelled data is usually abundant. In some cases, data from a certain distribution, or domain, is labelled, whereas the data we actually want to optimise our model on is unlabelled and from another domain. This falls under the umbrella of domain adaptation, and the purpose of this thesis is to train a network using domain-adversarial training on eye image datasets consisting of a labelled source domain and an unlabelled target domain, with the goal of performing well on target data, i.e., overcoming the domain gap. This was done on two different datasets: a proprietary dataset from Tobii with real images and the public U2Eyes dataset with synthetic data. When comparing domain-adversarial training to a baseline model trained conventionally on source data and an oracle model trained conventionally on target data, the proposed DAT-ResNet model outperformed the baseline on both datasets. For the Tobii dataset, DAT-ResNet improved the Huber loss by 22.9% and the Intersection over Union (IoU) by 7.6%, and for the U2Eyes dataset, DAT-ResNet improved the Huber loss by 67.4% and the IoU by 37.6%. Furthermore, the IoU measures were extended to also include the portion of predicted ellipses with no intersection with the corresponding ground-truth ellipses, referred to as zero-IoUs. By this metric, the proposed model improves the percentage of zero-IoUs by 34.9% on the Tobii dataset and by 90.7% on the U2Eyes dataset.
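A small sketch of how the ellipse IoU and the zero-IoU fraction described above can be computed by rasterising each ellipse on a pixel grid. The grid resolution and the (centre, semi-axes, rotation) parameterisation are assumptions for illustration.

```python
import numpy as np

def ellipse_mask(cx, cy, a, b, theta, size=200):
    """Boolean mask of an ellipse with centre (cx, cy), semi-axes (a, b), rotation theta."""
    yy, xx = np.mgrid[0:size, 0:size].astype(float)
    x, y = xx - cx, yy - cy
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (xr / a) ** 2 + (yr / b) ** 2 <= 1.0

def ellipse_iou(pred, gt, size=200):
    p, g = ellipse_mask(*pred, size=size), ellipse_mask(*gt, size=size)
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    return inter / union if union > 0 else 0.0

# Each ellipse is (cx, cy, a, b, theta); zero-IoU rate = share of pairs with no overlap at all.
preds = [(100, 100, 30, 20, 0.1), (60, 150, 25, 15, 0.0)]
gts   = [(105, 98, 28, 22, 0.0), (160, 40, 25, 15, 0.0)]
ious = [ellipse_iou(p, g) for p, g in zip(preds, gts)]
print("mean IoU:", np.mean(ious), "zero-IoU rate:", np.mean([iou == 0.0 for iou in ious]))
```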
|
89 |
Unsupervised Domain Adaptation for 3D Object Detection Using Adversarial Adaptation: Learning Transferable LiDAR Features for a Delivery Robot / Icke-vägledd Domänanpassning för 3D-Objektigenkänning Genom Motspelaranpassning: Inlärning av Överförbara LiDAR-Drag för en Leveransrobot. Hansson, Mattias, January 2023.
3D object detection is the task of detecting the full 3D pose of objects relative to an autonomous platform. It is an important perception capability that can be used to plan actions according to the behavior of other dynamic objects in an environment. Because object detectors trained and tested on different datasets generalize poorly, this thesis concerns the use of unsupervised domain adaptation to train object detectors fit for mobile robotics without any labeled training data. To tackle the problem, a novel approach, Unsupervised Adversarial Domain Adaptation 3D (UADA3D), is presented for adapting LiDAR-based detectors, drawing inspiration from the success of adversarial adaptation for 2D object detection in RGB images. The method adds learnable discriminator layers that discriminate between the features and bounding box predictions of the labeled source and unlabeled target data. The gradients are then reversed through gradient reversal layers during backpropagation into the base detector, which in turn learns to extract features that are similar between the domains in order to fool the discriminator. The method supports multi-class detection by adapting all classes simultaneously in an end-to-end trainable network, and it works for both point-based and voxel-based single-stage detectors. The results show that the proposed method increases detection scores for adaptation from dense to sparse point clouds and from simulated data toward the data of a mobile delivery robot, successfully handling the two relevant domain gaps given by differences in marginal and conditional probability distributions.
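A compressed sketch of the adversarial adaptation idea described above: a small discriminator receives gradient-reversed detector features concatenated with box predictions and is trained to tell source from target, so the reversed gradients push the detector toward domain-invariant features. The tensor shapes, discriminator architecture, and loss weighting are assumptions, not the UADA3D implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity forward, negated and scaled gradient backward.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

class DomainDiscriminator(nn.Module):
    """Per-object domain classifier over detector features concatenated with box predictions."""
    def __init__(self, feat_dim, box_dim=7):    # 7-DoF box: x, y, z, l, w, h, yaw
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim + box_dim, 256), nn.ReLU(), nn.Linear(256, 1))
    def forward(self, feats, boxes, lam=1.0):
        x = GradReverse.apply(torch.cat([feats, boxes], dim=1), lam)
        return self.net(x).squeeze(1)            # one domain logit per object

def domain_loss(disc, src_feats, src_boxes, tgt_feats, tgt_boxes, lam=1.0):
    # Source objects labelled 0, target objects labelled 1; BCE on the logits.
    src_logits = disc(src_feats, src_boxes, lam)
    tgt_logits = disc(tgt_feats, tgt_boxes, lam)
    # The total training loss would combine the detection loss on labelled
    # source data with a weighted version of this domain loss.
    return (F.binary_cross_entropy_with_logits(src_logits, torch.zeros_like(src_logits))
            + F.binary_cross_entropy_with_logits(tgt_logits, torch.ones_like(tgt_logits)))

# Toy shapes: 32 source and 32 target objects with 128-d features and 7-DoF boxes.
disc = DomainDiscriminator(feat_dim=128)
loss = domain_loss(disc, torch.randn(32, 128), torch.randn(32, 7),
                   torch.randn(32, 128), torch.randn(32, 7))
loss.backward()
```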
|
90 |
Adaptive Model-Based Temperature Monitoring for Electric Powertrains: Investigation and Comparative Analysis of Transfer Learning Approaches / Adaptiv modellbaserad temperaturövervakning för elektriska drivlinor: Undersökning och jämförande analys av metoder för överföring av lärande. Huang, Chenzhou, January 2023.
In recent years, deep learning has been widely used in industry to solve complex problems such as condition monitoring and fault diagnosis. Powertrain condition monitoring is one of the most vital and complicated problems in the automation industry, since the condition of the drive affects its health, performance, and reliability. Traditional methods based on thermal modeling require expertise in drive geometry, heat transfer, and system identification. Although data-driven deep learning methods can avoid physical modeling, they commonly face another predicament: models trained and tested on the same dataset often cannot be applied to different situations. In real applications, where the monitoring devices differ and the working environment changes constantly, poor model generalization leads to unreliable predictions. Transfer learning, which adapts a model from the source domain to the target domain, can improve model generalization and enhance the reliability and accuracy of predictions in real-world scenarios. This thesis investigates the applicability of mainstream transfer learning approaches in the context of drive condition monitoring, using multiple datasets with different probability distributions. Through a comparison and discussion of models and results, their scope of application as well as their advantages and disadvantages are laid out. Finally, it is concluded that for drive condition monitoring in an industrial setting, where the target-domain data has enough labels and it is not necessary to maintain the model's performance on the source domain, fine-tuning the model trained on the source domain is the best method for this scenario.
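A small sketch of the fine-tuning recipe that the comparison above favours: start from a model trained on the source domain, optionally freeze the early feature-extraction layers, and continue training on labelled target data with a reduced learning rate. The toy architecture, checkpoint path, and hyperparameters are placeholders, not details from the thesis.

```python
import torch
import torch.nn as nn

class TempEstimator(nn.Module):
    """Toy temperature regressor over a window of drive signals (currents, speed, voltages)."""
    def __init__(self, n_signals=6, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(n_signals, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)         # predicted winding temperature
    def forward(self, x):                         # x: (batch, time, n_signals)
        _, h = self.encoder(x)
        return self.head(h[-1]).squeeze(1)

model = TempEstimator()
# model.load_state_dict(torch.load("source_domain_checkpoint.pt"))  # hypothetical source-trained weights

# Freeze the encoder so only the regression head adapts to the new drive and environment.
for p in model.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-4)
loss_fn = nn.MSELoss()

# One fine-tuning step on a toy labelled target batch (replace with the real target loader).
x = torch.randn(16, 100, 6)                       # 16 windows, 100 time steps, 6 signals
y = torch.randn(16)                               # measured temperatures
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```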
|