Global ETD Search

241	Efficient Multispeaker Speech Synthesis and Voice Cloning McHargue, James 26 May 2023 No description available. Artificial Intelligence Computer Science
242	Explaining rifle shooting factors through multi-sensor body tracking : Using transformers and attention to mine actionable patterns from skeleton graphs Andersson, Filip, Flyckt, Jonatan January 2021 There is a lack of data-driven training instructions for sports shooters, as instruction has commonly been based on subjective assessments. Many studies have correlated body posture and balance to shooting performance in rifle shooting tasks, but most of them have focused on single aspects of postural control. This thesis has focused on finding relevant rifle shooting factors by examining the entire body over sequences of time. We performed a data collection with 13 human participants who carried out live rifle shooting scenarios while being recorded with multiple biometric sensors, including several body trackers. An experiment was conducted to identify what aspects of rifle shooting could be predicted and explained using these data. We employed a pre-processing pipeline to produce a novel skeleton sequence representation, and used it to train a transformer model. The predictions from this model could be explained on a per-sample basis using the attention mechanism, and visualised in an interactive format for humans to interpret. It was possible to separate the different phases of a shooting scenario from body posture with a high classification accuracy (81%). However, no correlation could be shown between shooting performance and body posture from our data. Future work could focus on novel feature engineering, and on examining alternative machine learning approaches. The dataset and pre-processing pipeline, as well as the techniques for generating explainable predictions presented in this thesis, have laid the groundwork for future research in the sports shooting domain. Machine learning Explainable AI Transformers Skeleton graphs Rifle shooting Computer Systems
243	Socially Connected Internet-of-things Devices for Crowd Management Systems Hamrouni, Aymen 04 May 2023 Autonomously monitoring and analyzing the behavior of the crowd is an open research topic in the transportation field because of its criticality to the safety of people. Real-time identification, tracking, and prediction of crowd behavior are essential to ensure smooth crowd management operations and the welfare of the public in many public areas, such as public transport stations and streets. That said, enabling such systems is not a straightforward procedure. First, the complexity brought by the interaction and fusion from individual to group needs to be assessed and analyzed. Second, the classification of these actions might be useful in identifying danger and avoiding any undesirable consequences. The adoption of the Internet-of-things (IoT) in such systems has made it possible to gather a large amount of data. However, it raises diverse compatibility and trustworthiness challenges, among others, hindering the use of conventional service discovery and network navigability processes for enabling crowd management systems. In fact, as the IoT network is known for its highly dynamic topology and frequently changing characteristics (e.g., the devices' status, such as availability, battery capacity, and memory usage), traditional methods fail to learn and understand the evolving behavior of the network so as to enable real-time and context-aware service discovery to assign and select relevant IoT devices for monitoring and managing the crowd. In large-scale IoT networks, crowd management systems usually collect large data streams of images from different heterogeneous sources (e.g., CCTVs, IoT devices, or people with their smartphones) in an inadvertent way.
Due to the limitations and challenges related to communication bandwidth, storage, and processing capabilities, it is unwise to transfer unselectively all the collected images since some of these images either contain duplicate information, are inaccurate, or might be falsely submitted by end-users; hence, a filtering and quality check mechanism must be put in place. As images can only provide limited information about the crowd by capturing only a snapshot of the scene at a specific point in time with limited context, an extension to deal with videos to enable efficient analysis such as crowd tracking and identification is essential for the success of crowd management systems. In this thesis, we propose to design a smart image enhancement and quality control system for resource pooling and allocation in the Internet-of-Things applied to crowd management systems. We first rely on the Social IoT (SIoT) concept, which defines the relationships among the connected objects, to extract accurate information about the network and enable trustworthy and context-aware service exchange and resource allocation. We investigate the service discovery process in SIoT networks and essentially focus on graph-based techniques while overviewing their utilization in SIoT and discussing their advantages. We also propose an alternative to these scalable methods by introducing a low-complexity context-aware Graph Neural Network (GNN) approach to enable rapid and dynamic service discovery in a large-scale heterogeneous IoT network to enable efficient crowd management systems. Secondly, we propose to design a smart image selection procedure using an asymmetric multi-modal neural network autoencoder to select a subset of photos with high utility coverage for multiple incoming streams in the IoT network. 
The proposed architecture enables the selection of high-context data from an evolving picture stream and ensures relevance while discarding images that are irrelevant or falsely submitted by smartphones, for example. The approach uses the photo's metadata, such as geolocation and timestamps, along with the pictures' semantics to decide which photos can be submitted and which ones must be discarded. To extend our framework beyond just images and deal with real-time videos, we propose a transformer-based crowd management monitoring framework called V3Trans-Crowd that captures information from video data and extracts meaningful output to categorize the crowd's behavior. The proposed 3D Video Transformer is inspired by Video Swin-Transformer/ViViT and provides an improved hierarchical transformer for multi-modal tasks with spatial and temporal fusion layers. Our simulations show that due to its ability to embed the devices' features and relations, the GNN is capable of providing more concise clusters compared to traditional techniques, allowing for better IoT network learning and understanding. Moreover, we show that the GNN approach speeds up the service lookup and outperforms the traditional graph-based techniques to select suitable IoT devices for reporting and monitoring. Simulation results for three different multi-modal autoencoder architectures indicate that a hierarchical asymmetric autoencoder approach can yield better results, outperforming the mixed asymmetric autoencoder and a concatenated input autoencoder, while leveraging user-side rendering to reduce bandwidth consumption and computational overhead. Also, performance evaluation for the proposed V3Trans-Crowd model has shown great results in terms of accuracy for crowd behavior classification compared to state-of-the-art methods such as C3D pre-trained, I3D pre-trained, and ResNet 3D pre-trained on the Crowd-11 and MED datasets.
crowd management service discovery graph neural networks optimization transformers autoencoders smart cities
244	The Effect of Winding Curvature and Core Permeability on the Power Losses and Leakage Inductance of High-Frequency Transformers Whitman, Daniel J. 13 August 2021 No description available. Electrical Engineering Electromagnetism Electromagnetics transformers skin effect permeability proximity effect Dowell's equation high-frequency
245	Applicability of GPT models to high-performance compute languages Icimpaye, Urlich January 2023 This thesis aims to investigate the feasibility of generating code in high-performance computing languages such as C++ with neural networks. This has been investigated by transfer learning publicly available pretrained transformers on C++ code. The models chosen for transfer learning are CodeT5, an encoder-decoder model with 770 million parameters, and two decoder-only models called CodeGen, one with 350 million parameters and the other with one billion parameters. All models were trained on a labeled dataset where each sample had a prompt in natural language and an answer in C++ code. However, the CodeT5 model was also trained on an unlabelled dataset of C++ code since that model did not come pretrained on C++ code. The models were evaluated using the CodeBERTScore, which measures the cosine similarity of model-generated code with the reference code. The CodeT5 model gave the best score. However, looking at the types of programming tasks the models solved, the results indicate that they can only solve trivial programming tasks. This is likely due to the training corpus size and the models' size. Because of the limited computing resources available during the thesis, training larger models on a more extensive training corpus, especially of labeled data, was not feasible, although it would have improved performance. Additional computing resources would be required to train larger models on larger datasets to improve performance. Data Science Machine Learning Artificial Intelligence Transformers Engineering and Technology
246	Improving Relation Extraction from Unstructured Genealogical Texts Using Fine-Tuned Transformers Parrolivelli, Carloangello 01 June 2022 Though exploring one’s family lineage through genealogical family trees can be insightful to developing one’s identity, this knowledge is typically held behind closed doors by private companies or requires expensive technologies, such as DNA testing, to uncover. With the ever-booming explosion of data on the world wide web, many unstructured text documents, both old and new, are being discovered, written, and processed which contain rich genealogical information. Access to this immense amount of data, however, entails a costly process whereby people, typically volunteers, have to read large amounts of text to find relationships between people. This delays making genealogical information open and accessible to all. This thesis explores state-of-the-art methods for relation extraction across the genealogical and biomedical domains and bridges new and old research by proposing an updated three-tier system for parsing unstructured documents. This system makes use of recently developed and massively pretrained transformers and fine-tuning techniques to take advantage of these deep neural models’ inherent understanding of English syntax and semantics for classification. With only a fraction of labeled data typically needed to train large models, fine-tuning a LUKE relation classification model with minimal added features can identify genealogical relationships with macro precision, recall, and F1 scores of 0.880, 0.867, and 0.871, respectively, in data sets with scarce (∼10%) positive relations.
Furthermore, with the advent of a modern coreference resolution system utilizing SpanBERT embeddings and a modern named entity parser, our end-to-end pipeline can extract and correctly classify relationships within unstructured documents with macro precision, recall, and F1 scores of 0.794, 0.616, and 0.676, respectively. This thesis also evaluates individual components of the system and discusses future improvements to be made. Relation Extraction Genealogical Relations Transformers Unstructured Text Machine Learning Natural Language Processing Other Computer Engineering
247	Machine Learning Algorithms for Pattern Discovery in Spatio-temporal Data With Application to Brain Imaging Analysis Asadi, Nima, 0000-0002-5102-6927 January 2022 Temporal networks have become increasingly pervasive in many real-world applications. Due to the existence of diverse and evolving entities in such networks, understanding the structure and characterizing patterns in them is a complex task. A prime real-world example of such networks is the functional connectivity of the brain. These networks are commonly generated by measuring the statistical relationship between the oxygenation level-dependent signal of spatially separate regions of the brain over the time of an experiment involving a task being performed or at rest in an MRI scanner. Due to certain characteristics of fMRI data, such as high dimensionality and high noise level, extracting spatio-temporal patterns in such networks is a complicated task. Therefore, it is necessary for state-of-the-art data-driven analytical methods to be developed and employed for this domain. In this thesis, we suggest methodological tools within the area of spatio-temporal pattern discovery to explore and address several questions in the domain of computational neuroscience. One of the important objectives in neuroimaging research is the detection of informative brain regions for characterizing the distinction between the activation patterns of the brains among groups with different cognitive conditions. Popular approaches for achieving this goal include multivariate pattern analysis (MVPA), regularization-based methods, and other machine learning based approaches. However, these approaches suffer from a number of limitations, such as the need for manual parameter tuning as well as incorrect identification of truly informative regions in certain cases.
We therefore propose a maximum relevance minimum redundancy search algorithm to alleviate these limitations while increasing the precision of detection of informative activation clusters. The second question that this thesis work addresses is how to detect the temporal ties in a dynamic connectivity network that are not formed at random or due to local properties of the nodes. To explore the solution to this problem, a null model is proposed that estimates the latent characteristics of the distributions of the temporal links through optimization, followed by a statistical test to filter the links whose formation can be reduced to the local properties of their interacting nodes. We demonstrate the benefits of this approach by applying it to a real resting state fMRI dataset, and provide further discussion on various aspects and advantages of it. Lastly, this dissertation delves into the task of learning a spatio-temporal representation to discover contextual patterns in evolutionary structured data. For this purpose, a representation learning approach is proposed based on the transformer model to extract the spatio-temporal contextual information from the fMRI data. Representation learning is a core component in data-driven modeling of various complex phenomena. Learning a contextually informative set of features can especially benefit the analysis of fMRI data due to the complexities and dynamic dependencies present in such datasets. The proposed framework takes the multivariate BOLD time series of the regions of the brain as well as their functional connectivity network simultaneously as the input to create a set of meaningful features which can in turn be used in various downstream tasks such as classification, feature extraction, and statistical analysis.
This architecture uses the attention mechanism as well as the graph convolution neural network to jointly inject the contextual information regarding the dynamics in time series data and their connectivity into the representation. The benefits of this framework are demonstrated by applying it to two resting state fMRI datasets, and further discussion is provided on various aspects and advantages of it over a number of commonly adopted architectures. / Computer and Information Science Computer science Algorithms Computational neuroscience Deep learning fMRI data Machine learning Transformers
248	Online power transformer diagnostics using multiple modes of microwave radiation Dalarsson, Mariana January 2013 In the present thesis, we propose and investigate a new approach to diagnose the effects of the various degradation mechanisms, including thermal degradation at hot spots, winding deformations due to the mechanical forces from short circuit currents, partial discharges due to local electric field surges, and increased moisture levels in the cellulose insulation due to decomposition, that affect electric power transformers during their normal operation in an electric power grid. Although the proposed diagnostics method can in principle be used to detect various degradation mechanisms mentioned above, we focus in the present thesis on mechanical deformations of transformer winding structures. Such mechanical deformations are most often caused by mechanical forces from short circuit currents, but they may also be caused by initial manufacturing errors and inconsistencies not detected by the power transformer suppliers’ quality assurance processes. We model a transformer winding surrounded by the transformer-tank wall and the magnetic core as a two-dimensional parallel plate waveguide or as a three-dimensional coaxial waveguide, where one metallic boundary (plate or cylinder) represents the wall of the transformer tank and the other metallic boundary (plate or cylinder) represents the iron core that conducts the magnetic flux. In between there is a set of parallel or coaxial conductors representing the winding segments. The new principle proposed in the present thesis is to insert a number of antennas into a transformer tank to radiate and measure microwave fields interacting with metallic structures and insulation. The responses from the emitted microwave radiation are expected to be sensitive to material properties that reflect the changes caused by any harmful deterioration processes mentioned above.
Specifically, we investigate the mechanical deformations of transformer winding structures by determining the locations of the individual winding segments or turns, using measurements of the scattered fields at both ends of the winding structure. We solve the propagation problem using conventional waveguide theory, including mode-matching and cascading techniques. The inverse problem is solved using modified steepest-descent optimization methods. The optimization model is tested by comparing our calculated scattering data with synthetic measurement data generated by the commercial program HFSS. A good agreement is obtained between the calculated and measured positions of winding segments for a number of studied cases, which indicates that the diagnostics method proposed in the present thesis could be potentially useful as a basis for the design of a future commercial on-line winding monitoring device. However, further development of the theoretical analysis of a number of typical winding deformations, improvements of the optimization algorithms and a practical study with measurements on an actual power transformer structure are all needed to make an attempt to design a commercial winding monitoring device feasible. / QC 20131007 Other Electrical Engineering and Electronics
249	Question Answering auf dem Lehrbuch 'Health Information Systems' mit Hilfe von unüberwachtem Training eines Pretrained Transformers [Question Answering on the Textbook 'Health Information Systems' Using Unsupervised Training of a Pretrained Transformer] Keller, Paul 27 November 2023 Extracting knowledge from books is essential and complex. In medical informatics in particular, simple and complete access to knowledge is important. In this thesis, a pretrained language model was used to make the content of the book Health Information Systems by Winter et al. (2023) accessible more efficiently and easily. During training, the quality of the model was evaluated at various points in time. To do so, the model answered exam questions from the book and from modules at Leipzig University whose content builds on the book. Finally, a comparison was made between the training checkpoints, the model without further training, and the state-of-the-art model GPT4. With a macro-F1 score of 0.7, GPT4 achieved the highest correctness in answering the exam questions. The other models could not match this performance. However, continued training raised performance from an initial macro-F1 score of 0.13 to 0.33. The results show a clear performance gain from this approach and provide a basis for future extensions.
This demonstrates the feasibility of answering questions about health information systems and of solving a sample exam with the help of further-trained language models. These models do not reach practical applicability, however, because their performance is below the current state of the art and because the models presented here cannot answer a large share of the questions fully correctly.
Contents: 1 Introduction 1.1 Subject 1.2 Problem Statement 1.3 Motivation 1.4 Objectives 1.5 Relation to the GMDS Ethical Guidelines 1.6 Tasks 1.7 Structure of the Thesis 2 Fundamentals 2.1 Language Models 2.1.1 Transformer Models 2.1.2 Transformer-Specific Architectures 2.1.3 Characteristics of Transformer Models 2.1.4 Inputs of Transformer Models 2.2 Neural Networks 2.2.1 Architecture 2.2.2 Mode of Operation 2.2.3 Training 2.3 Data Processing 2.3.1 Glossary of the Data 3 State of Research 3.1 Continual Pretraining 3.2 Current Models and Their Usability 3.3 Research on and Problems of Models 4 Approach 4.1 Selection of Language Models 4.2 Data Curation 4.2.1 Text Extraction 4.2.2 Unintelligible Formats 4.2.3 Text Passages without Knowledge or Context 4.2.4 Optional Text Removals 4.2.5 Retained Texts 4.2.6 Text Formatting 4.2.7 Potential Extraction of Questions 4.3 Unsupervised Continued Training 4.3.1 Running the Training Programs 4.4 Exam Questions 4.5 Model Evaluation 5 Implementation of the Solution 5.1 Downloading the Model 5.2 Training the Model 5.2.1 Model Configuration 5.2.2 Training Data Configuration 5.2.3 Training Configuration 5.2.4 DeepSpeed Training Configuration 5.2.5 Libraries Used for Training 5.2.6 Training on a GPU Computing Cluster 5.2.7 Problems during Training 5.3 Generating Answers 5.3.1 Creating the Evaluation Dataset 5.4 Scoring the Generated Answers 5.5 Evaluation of the Models 5.5.1 Criterion: Correctness 5.5.2 Criterion: Explainability 5.5.3 Criterion: Question Comprehension 5.5.4 Criterion: Robustness 6 Results 6.1 Correctness Analysis 6.1.1 Comparison of Total Counts 6.1.2 Strengths and Weaknesses of the Models 6.1.3 Improvements through Training 6.1.4 Macro-F1 Comparison 6.1.5 Summary 6.2 Explainability Analysis 6.3 Question Comprehension Analysis 6.4 Robustness Analysis 6.5 Summary 7 Discussion 7.1 Limits of the Models 7.2 Problems with Core Questions 7.3 Scoring the Questions with Exam Points 7.4 Solution of the Problem 8 Outlook 8.1 Model Scaling 8.1.1 Training through Quantization 8.2 Human Reinforcement Learning 8.3 Dataset Enlargement 8.4 Domain-Specific Models 8.5 Adapter-Based Training 8.6 Text Extraction from Context 8.7 Retrieval Augmented Generation 8.8 Summary Summary info:eu-repo/classification/ddc/004 ddc:004
250	Deep Image Processing with Spatial Adaptation and Boosted Efficiency & Supervision for Accurate Human Keypoint Detection and Movement Dynamics Tracking Dai, Chao Yang 05 1900 Indiana University-Purdue University Indianapolis (IUPUI) / This thesis aims to design and develop the spatial adaptation approach through spatial transformers to improve the accuracy of human keypoint recognition models. We have studied different model types and design choices to gain an accuracy increase over models without spatial transformers and analyzed how spatial transformers increase the accuracy of predictions. A neural network called Widenet has been leveraged as a specialized network for providing the parameters for the spatial transformer. Further, we have evaluated methods to reduce the model parameters, as well as the strategy to enhance the learning supervision for further improving the performance of the model. Our experiments and results have shown that the proposed deep learning framework detects human keypoints effectively compared with the baseline methods. Also, we have reduced the model size without significantly impacting the performance, and the enhanced supervision has improved the performance. This study is expected to greatly advance the deep learning of human keypoints and movement dynamics. Computer Vision Artificial Intelligence Human Keypoint Estimation Deep Learning Spatial Transformers
