Global ETD Search

51	Club Head Tracking : Visualizing the Golf Swing with Machine Learning Herbai, Fredrik January 2023 During the broadcast of a golf tournament, a way to show the audience what a player's swing looks like would be to draw a trace following the movement of the club head. A computer vision model can be trained to identify the position of the club head in an image, but due to the high speed at which professional players swing their clubs coupled with the low frame rate of a typical broadcast camera, the club head is not discernible in most frames. This means that the computer vision model is only able to deliver a few sparse detections of the club head. This thesis project aims to develop a machine learning model that can predict the complete motion of the club head, in the form of a swing trace, based on the sparse club head detections. Slow motion videos of golf swings are collected, and the club head's position is annotated manually in each frame. From these annotations, relevant data to describe the club head's motion, such as position and time parameters, is extracted and used to train the machine learning models. The dataset contains 256 annotated swings of professional and competent amateur golfers. The two models that are implemented in this project are XGBoost and a feedforward neural network. The input given to the models only contains information in specific parts of the swing to mimic the pattern of the sparse detections. Both models learned the underlying physics of the golf swing, and the quality of the predicted traces depends heavily on the amount of information provided in the input. In order to produce good predictions with only the amount of input information that can be expected from the computer vision model, a lot more training data is required. The traces predicted by the neural network are significantly smoother and thus look more realistic than the predictions made by the XGBoost model.
Golf Machine learning Neural network XGBoost Interpolation Deep learning Data collection Data augmentation Computer Sciences Datavetenskap (datalogi)
52	Techniques for Multilingual Document Retrieval for Open-Domain Question Answering : Using hard negatives filtering, binary retrieval and data augmentation / Tekniker för flerspråkig dokumenthämtning för OpenQA : Använder hård negativ filtrering, binär sökning och dataförstärkning Lago Solas, Carlos January 2022 Open Domain Question Answering (OpenQA) systems find an answer to a question from a large collection of unstructured documents. In this information era, we have an immense amount of data at our disposal. However, filtering all the content and trying to find the answers to our questions can be too time-consuming and difficult. In addition, in such a globalised world, the information we look for to answer a question may be in a different language. Current research is focused on improving monolingual (English) OpenQA performance. This creates a disparity between the tools accessible to English and non-English speakers. The techniques explored in this study involve the combination of different methods, such as data augmentation and hard negative filtering for performance increase, and binary embeddings for improving the efficiency, with multilingual Transformers. The downstream performance is evaluated using sentiment multilingual datasets covering Cross-Lingual Transfer (XLT), question and answer in the same language, and Generalised Cross-Lingual Transfer (G-XLT), different languages for question and answer. The results show that data augmentation increased Recall by 37.0% and Mean Average Precision (MAP) by 67.0% using languages absent from the test set for XLT. Combining binary embeddings and hard negatives can reduce inference time and index size to 12.5% and 3.1% of the original, retaining 97.1% of the original Recall and 94.8% of MAP (averages of XLT and MAP).
OpenQA Multilingual Transformers Document retrieval Data augmentation. OpenQA Flerspråkiga Transformatorer Dokumenthämtning Dataförstärkning. Computer Sciences Datavetenskap (datalogi) Computer Engineering Datorteknik
53	Data augmentation for latent variables in marketing Kao, Ling-Jing 13 September 2006 No description available. Business Administration, Marketing marketing Bayesian statistics data augmentation state-space models choice models consumer preference change media effects
54	Explainable AI in Eye Tracking / Förklarbar AI inom ögonspårning Liu, Yuru January 2024 This thesis delves into eye tracking, a technique for estimating an individual’s point of gaze and understanding human interactions with the environment. A blossoming area within eye tracking is appearance-based eye tracking, which leverages deep neural networks to predict gaze positions from eye images. Despite its efficacy, the decision-making processes inherent in deep neural networks remain ‘black boxes’ to humans. This lack of transparency challenges the trust human professionals place in the predictions of appearance-based eye tracking models. To address this issue, explainable AI is introduced, aiming to unveil the decision-making processes of deep neural networks and render them comprehensible to humans. This thesis employs various post-hoc explainable AI methods, including saliency maps, gradient-weighted class activation mapping, and guided backpropagation, to generate heat maps of eye images. These heat maps reveal discriminative areas pivotal to the model’s gaze predictions, and glints emerge as the most important of them. To explore additional features in gaze estimation, a glint-free dataset is derived from the original glint-preserved dataset by employing blob detection to eliminate glints from each eye image. A corresponding glint-free model is trained on this dataset. Cross-evaluations of the two datasets and models show that the glint-free model extracts features (pupil, iris, and eyelids) complementary to those of the glint-preserved model (glints), with both feature sets exhibiting comparable intensities in heat maps. To make use of all the features, an augmented dataset is constructed, incorporating selected samples from both glint-preserved and glint-free datasets. An augmented model is then trained on this dataset and outperforms both the glint-preserved and glint-free models. The augmented model excels because it is trained on a diverse set of glint-preserved and glint-free samples: it prioritizes glints when they are of high quality, and shifts its focus to the entire eye when glint quality is poor. This exploration enhances the understanding of the critical factors influencing gaze prediction and contributes to the development of more robust and interpretable appearance-based eye tracking models.
Eye Tracking Explainable AI Post-hoc Explanation Data Augmentation Ögonspårning Förklarbar AI Efterhandsmetoder Datatillväxt Computer and Information Sciences Data- och informationsvetenskap
55	A comparative study of the effect of different data augmentation methods on the accuracy of a CNN model to detect Pneumothorax of the lungs / En komparativ studie om påverkan av olika dataförstärkningsmetoder på noggrannheten hos en CNN-modell för att detektera Pneumothorax i lungorna Staifo, Gabriel, Hanna, Rabi January 2024 The use of AI in the medical field is becoming more widespread, and research on its various applications is very popular. In biomedical image analysis, Convolutional Neural Networks (CNN), which are specialized in image processing, can analyze X-rays and detect signs of different diseases. However, to achieve that, CNNs require vast amounts of X-ray images with labels specifying the disease (labeled training data), which are not always available. One method to overcome this obstacle is the use of data augmentation. Data augmentation manipulates images by flipping, rotating, or changing the saturation or brightness, among other methods. The purpose is to increase and diversify the training data to make the CNN model more robust. Our study aims to investigate the effects of different data augmentation techniques on the performance of a CNN model in detecting Pneumothorax. After fine-tuning our CNN model’s hyper-parameters, three data augmentation methods (color, geometric, and noise) and their combinations were applied to our model. We then tested and compared the effects of each data augmentation method on the accuracy of our model. Our study concluded that color augmentation performed the best compared to the other augmentation methods, while geometric augmentation had the worst performance. However, none of the augmentation methods significantly improved the original model’s performance, which can be attributed to the model’s configuration of hyper-parameters, leaving no room for improvement.
Data augmentation Pneumothorax CNN VGG-16 Chest X-RAY Dataförstärkning Pneumothorax CNN VGG-16 Bröströntgen Computer and Information Sciences Data- och informationsvetenskap
56	Monitoring von ökologischen und biometrischen Prozessen mit statistischen Filtern / Monitoring of ecological and biometric processes with statistical filters Frühwirth-Schnatter, Sylvia January 1991 This work is an overview of the ideas and methods of dynamic stochastic modelling of normally distributed and non-normally distributed processes. After an introduction to the general model form, inference options such as filtering, smoothing, and forecasting are discussed, and the problem of identifying unknown hyperparameters is addressed. The general discussion is illustrated with two case studies: a time series of the mean annual groundwater level and a time series of daily mean values of SO2 emissions. (Author's abstract) / Series: Forschungsberichte / Institut für Statistik
57	Trénovatelné metody pro automatické zpracování biomedicínských obrazů / Trainable Methods for Automatic Biomedical Image Processing Uher, Václav January 2018 This thesis deals with the possibilities of automatic segmentation of biomedical images. A deep learning method is proposed for 3D image segmentation. The work addresses network design, memory optimization, and the subsequent composition of the resulting image. The uniqueness of the method lies in 3D image processing on a GPU combined with augmentation of the training data and an output that keeps the size of the original image. This is achieved by dividing the image into smaller overlapping parts and then reassembling them to the original size. The method is verified on the segmentation of human brain tissue in magnetic resonance images, where it exceeds human accuracy as measured by a specialist-versus-specialist comparison, and on cell segmentation in slices of the Drosophila brain from an electron microscope, where it surpasses the results published in an impact-factor journal paper.
58	Detekce a rozměření elektronového svazku v obrazech z TEM / Detection and measurement of electron beam in TEM images Polcer, Simon January 2020 This diploma thesis deals with automatic detection and measurement of the electron beam in images from a transmission electron microscope (TEM). The introduction describes the construction and the main parts of the electron microscope. The theoretical part summarizes the modes of illumination from the fluorescent screen. Machine learning, specifically the convolutional neural network U-Net, is used for automatic detection of the electron beam in the image. The measurement of the beam is based on ellipse approximation, which defines the size and dimension of the beam. Neural network training requires an extensive database of images. For this purpose, a custom augmentation approach is proposed, which applies a specific combination of geometric transformations for each mode of illumination. In the conclusion of this thesis, the results are evaluated and summarized. The proposed algorithm achieves a Dice coefficient of 0.815, a measure of the overlap between two sets. The work was implemented in the Python programming language.
59	Data Quality Evaluation and Improvement for Machine Learning Chen, Haihua 05 1900 This research focuses on data-centric AI, with a specific concentration on data quality evaluation and improvement for machine learning. We first present a practical framework for data quality evaluation and improvement, using the legal domain as a case study, and build a corpus for legal argument mining. We created an initial corpus with 4,937 instances that were manually labeled. We define five data quality evaluation dimensions: comprehensiveness, correctness, variety, class imbalance, and duplication, and conduct a quantitative evaluation on these dimensions for the legal dataset and two existing datasets in the medical domain for medical concept normalization. The first group of experiments showed that class imbalance and insufficient training data are the two major data quality issues that negatively impacted the quality of the system that was built on the legal corpus. The second group of experiments showed that the overlap between the test datasets and the training datasets, which we defined as "duplication," is the major data quality issue for the two medical corpora. We explore several widely used machine learning methods for data quality improvement. Compared to pseudo-labeling, co-training, and expectation-maximization (EM), a generative adversarial network (GAN) is more effective for automated data augmentation, especially when a small portion of labeled data and a large amount of unlabeled data are available. The data validation process, the performance improvement strategy, and the machine learning framework for data evaluation and improvement discussed in this dissertation can be used by machine learning researchers and practitioners to build high-performance machine learning systems. All the materials including the data, code, and results will be released at: https://github.com/haihua0913/dissertation-dqei.
Data quality data-centric AI machine learning data augmentation transfer learning semi-supervised learning medical concept normalization legal text classification Information Science
60	Advanced Data Augmentation : With Generative Adversarial Networks and Computer-Aided Design Thaung, Ludwig January 2020 CNN-based (Convolutional Neural Network) visual object detectors often reach human-level accuracy but need to be trained with large amounts of manually annotated data. Collecting and annotating this data can frequently be time-consuming and financially expensive. Using generative models to augment the data can help minimize the amount of data required and increase detection performance. Many state-of-the-art generative models are Generative Adversarial Networks (GANs). This thesis investigates if and how one can utilize image data to generate new data through GANs to train a YOLO-based (You Only Look Once) object detector, and how CAD (Computer-Aided Design) models can aid in this process. In the experiments, different models of GANs are trained and evaluated by visual inspection or with the Fréchet Inception Distance (FID) metric. The data provided by Ericsson Research consists of images of antenna and baseband equipment along with annotations and segmentations. Ericsson Research supplied the YOLO detector, and no modifications are made to this detector. Finally, the YOLO detector is trained on data generated by the chosen model and evaluated by the Average Precision (AP). The results show that the generative models designed in this work can produce RGB images of high quality. However, the quality decreases if binary segmentation masks are to be generated as well. The experiments with CAD input data did not result in images that could be used for the training of the detector. The GAN designed in this work is able to successfully replace objects in images with the style of other objects. The results show that training the YOLO detector with GAN-modified data leads to the same detection performance as training with real data. The results also show that the shapes and backgrounds of the antennas contributed more to detection performance than their style and colour.
computer vision machine learning YOLO GANs data augmentation object recognition object detection CAD
