Global ETD Search

451	Reducing Training Time in Text Visual Question Answering Behboud, Ghazale 15 July 2022 (has links) Artificial Intelligence (AI) and Computer Vision (CV) have brought the promise of many applications along with many challenges to solve. The majority of current AI research has been dedicated to single-modal data processing, meaning that it uses only one modality, such as visual recognition or text recognition. However, real-world challenges often combine different modalities of data such as text, audio and images. This thesis focuses on the Visual Question Answering (VQA) problem, which is a significant multi-modal challenge. A VQA system is a computer vision system that, when given a question about an image, answers based on an understanding of both the question and the image. The goal is to reduce the training time of VQA models. In this thesis, Look, Read, Reason and Answer (LoRRA), which is a state-of-the-art architecture, is used as the base model. Then, Reduce Uni-modal Biases (RUBi) is applied to this model to reduce the importance of uni-modal biases in training. Finally, an early stopping strategy is employed to stop the training process once the model accuracy has converged, which prevents the model from overfitting. Numerical results are presented which show that training LoRRA with RUBi and early stopping can converge in less than 5 hours. The impact of the batch size, learning rate and warm-up hyperparameters is also investigated and experimental results are presented. / Graduate AI ML Deep Learning Machine Learning Visual Question Answering Convolutional Neural Network Recurrent Neural Network Long Short Term Memory Early Stopping
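The early stopping strategy mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the rule "stop after `patience` epochs without improvement in validation accuracy" are assumptions chosen for illustration.

```python
from typing import Iterable, Optional


def early_stopping_epoch(val_accuracies: Iterable[float],
                         patience: int = 3,
                         min_delta: float = 0.0) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None.

    Hypothetical rule: stop once validation accuracy has failed to improve
    by more than `min_delta` for `patience` consecutive epochs.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies):
        if acc > best + min_delta:
            # New best accuracy: remember it and reset the counter.
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # The accuracy never stalled for `patience` epochs in a row.
    return None
```

In a real training loop the same counter would be updated after each validation pass, and the best checkpoint would typically be restored when the loop exits.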
452	Klassificering av kvitton med hjälp av maskininlärning Enerstrand, Simon January 2019 (has links) Maskininlärning nyttjas inom fler och fler områden. Det har potential att ersätta många repetitiva arbetsuppgifter, eller åtminstone förenkla dem. Dokumenthantering inom ekonomisystem är ett område maskininlärning kan hjälpa till med. Det behövs ofta mycket manuell input i olika fält genom att avläsa fakturor eller kvitton. Målet med projektet är att skapa en applikation som nyttjar maskininlärning åt företaget Centsoft AB. Applikationen ska ta emot OCR-tolkad textmassa från en bild på ett kvitto och sedan, med hög säkerhet, kunna avgöra vilken kategori kvittot tillhör. Den här rapporten syftar till att visa utvecklingen av maskininlärningsmodellen i applikationen. Rapporten svarar på frågeställningen: ”Hur kan kvitton klassificeras med hjälp av maskininlärning?”. Undersökningsmetoden fallstudie och projektmetoden MoSCoW tillämpas i projektet. Projektet tar även hänsyn till åtagandetriangeln. Maskininlärningsramverk används för att utvärdera den upptränade modellen. Den tränade modellen klarar av att, med hög säkerhet, tolka kvitton den inte stött på tidigare. För att få en meningsfull tolkning måste kvitton ha i avsikt att tillhöra någon av de åtta tränade kategorierna. Valet av metoder passade bra till projektet för att besvara frågeställningen. Applikationen kan utvecklas vidare och implementeras i fakturahanteringssystemet. Genomförandet av projektet ger kunskap att arbeta med maskininlärningslösningar. Tekniken kan i framtiden appliceras på flera områden. / Machine learning is used in more and more areas. It has the potential to replace many repetitive tasks, or at least simplify them. Document management within financial systems is an area machine learning can help with. A lot of manual input into various fields is often needed when reading invoices or receipts.
The goal of the project is to create an application that uses machine learning for the company Centsoft AB. The application should receive OCR-interpreted text from an image of a receipt and then, with high certainty, be able to determine which category the receipt belongs to. This report aims to show the development of the machine learning model in the application. The report answers the question: "How can receipts be classified using machine learning?". The case study research method and the MoSCoW project method are applied in the project. The project also considers the triangle method described by Eklund. Machine learning frameworks are used to evaluate the trained model. The trained model can, with high certainty, interpret receipts it has not encountered before. To get a meaningful interpretation, receipts must be intended to belong to one of the eight trained categories. The choice of methods suited the project well for answering the question. The application can be developed further and implemented in the invoice management system. Carrying out the project gives knowledge about how to work with machine learning solutions. In the future, the technology can be applied in several areas. Computer and Information Sciences Data- och informationsvetenskap
453	Quality inspection of multiple product variants using neural network modules Vuoluterä, Fredrik January 2022 (has links) Maintaining quality outcomes is an essential task for any manufacturing organization. Visual inspections have long been an avenue to detect defects in manufactured products, and recent advances within the field of deep learning have led to a surge of research in how technologies like convolutional neural networks can be used to perform these quality inspections automatically. An alternative to these often large and deep network structures is the modular neural network, which can instead divide a classification task into several sub-tasks to decrease the overall complexity of a problem. To investigate how these two approaches to image classification compare in a quality inspection task, a case study was performed at AR Packaging, a manufacturer of food containers. The many different colors, prints and geometries present in the AR Packaging product family served as a naturally occurring source of complexity for the quality classification task. A modular network was designed, consisting of one routing module that classifies the variant type; this classification is then used to delegate the quality classification to an expert module trained for that specific variant. An image dataset was manually generated from within the production environment, portraying a range of product variants in both defective and non-defective form. An image processing algorithm was developed to minimize image background and align the products in the pictures. To evaluate the adaptability of the two approaches, the networks were initially trained on the same data from five variants, and then retrained with added data from a sixth variant. The modular networks were found to be overall less accurate and slower in their classification than the conventional single networks.
However, the modular networks were more than six times smaller and required less time to train initially, though the retraining times were roughly equivalent in both approaches. The retraining of the single network also caused some fluctuation in the predictive accuracy, something which was not noted in the modular network. / There is additional digital material (e.g. film, image or audio files) or models/artefacts belonging to the thesis that is to be sent to the archive. quality inspection defect detection variants modular neural network convolutional neural network case study
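The routing-plus-expert structure described in the abstract above can be sketched as follows. This is an illustrative toy, not the thesis implementation: `make_modular_classifier`, `toy_router`, `toy_experts`, the variant names and the thresholds are all hypothetical, and simple functions stand in for the trained neural network modules.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]  # stand-in for a preprocessed product image


def make_modular_classifier(
    router: Callable[[Features], str],
    experts: Dict[str, Callable[[Features], bool]],
) -> Callable[[Features], Tuple[str, bool]]:
    """Combine a routing module with one expert module per product variant."""

    def classify(x: Features) -> Tuple[str, bool]:
        variant = router(x)                # step 1: identify the variant
        is_defective = experts[variant](x)  # step 2: delegate to its expert
        return variant, is_defective

    return classify


# Toy stand-ins for trained modules (hypothetical rules and thresholds).
def toy_router(x: Features) -> str:
    return "variant_a" if x[0] < 0.5 else "variant_b"


toy_experts = {
    "variant_a": lambda x: x[1] > 0.8,
    "variant_b": lambda x: x[1] > 0.3,
}

classify = make_modular_classifier(toy_router, toy_experts)
```

Under this layout, supporting an additional variant would mean adding one entry to the expert dictionary and updating the router, while the existing experts stay untouched.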
454	Monitoring Bicycle Safety through GPS data and Deep Learning Anomaly Detection Yaqoob, Shumayla, Cafiso, Salvatore, Morabito, Giacomo, Pappalardo, Giuseppina 02 January 2023 (has links) Cycling has always been considered a sustainable and healthy mode of transport. Moreover, during the Covid-19 period, cycling was further appreciated by citizens as an individual mobility option. As a counterpart to the growth in the number of bicyclists and of kilometres ridden, bicyclist safety has become a challenge, as cycling is the only road transport mode with an increasing trend in crash fatalities in the EU (Figure 1). Compared with traditional road safety network screening, suitable data for crashes involving bicyclists are more difficult to obtain because of underreporting and traffic flow issues. In this framework, new technologies and the digital transformation in smart cities and communities are offering new opportunities for data availability, which also require different approaches to collection and analysis. An experimental test was carried out to collect data from different users with an instrumented bicycle equipped with Global Navigation Satellite Systems (GNSS) and cameras. A panel of experts was asked to review the collected data to identify and score the severity of the safety critical events (CSE), reaching a good consensus. However, manual observation and classification of CSE is a time-consuming and impractical approach when large amounts of data must be analysed. Moreover, due to the complex correlation between pre-crash driving behaviour and due to the high dimensionality of the data, traditional statistical methods might not be appropriate in this context. Deep learning-based models have recently gained significant attention in the literature for time series data analysis and for anomaly detection, but they have generally been applied to vehicle mobility and not to micro-mobility.
We present and discuss the data requirements and treatment needed to get suitable information from the GNSS devices, and the development of an experimental framework in which convolutional neural networks (CNN) are applied to integrate multiple GPS data streams of bicycle kinematics to detect the occurrence of a CSE.
455	[pt] DESENVOLVIMENTO DE PIV ULTRA PRECISO PARA BAIXOS GRADIENTES USANDO ABORDAGEM HÍBRIDA DE CORRELAÇÃO CRUZADA E CASCATA DE REDE NEURAIS CONVOLUCIONAIS / [en] DEVELOPMENT OF ULTRA PRECISE PIV FOR LOW GRADIENTS USING HYBRID CROSS-CORRELATION AND CASCADING NEURAL NETWORK CONVOLUTIONAL APPROACH CARLOS EDUARDO RODRIGUES CORREIA 31 January 2022 (has links) [pt] Ao longo da história a engenharia de fluidos vem se mostrado como uma das áreas mais importantes da engenharia devido ao seu impacto nas áreas de transporte, energia e militar. A medição de campos de velocidade, por sua vez, é muito importante para estudos nas áreas de aerodinâmica e hidrodinâmica. As técnicas de medição de campo de velocidade em sua maioria são técnicas ópticas, se destacando a técnica de Particle Image Velocimetry (PIV). Por outro lado, nos últimos anos importantes avanços na área de visão computacional, baseados em redes neurais convolucionais, se mostram promissores para a melhoria do processamento das técnicas ópticas. Nesta dissertação, foi utilizada uma abordagem híbrida entre correlação cruzada e cascata de redes neurais convolucionais, para desenvolver uma nova técnica de PIV. O projeto se baseou nos últimos trabalhos de PIV com redes neurais artificiais para desenvolver a arquitetura das redes e sua forma de treinamento. Diversos formatos de cascata de redes neurais foram testados até se chegar a um formato que permitiu reduzir o erro em uma ordem de grandeza para escoamento uniforme. Além do desenvolvimento da cascata para escoamento uniforme, gerou-se conhecimento para fazer cascatas para outros tipos de escoamentos. / [en] Throughout history, fluid engineering is one of the most important areas of engineering due to its impact in the areas of transportation, energy and the military. The measurement of velocity fields is important for studies in aerodynamics and hydrodynamics. 
The techniques for measuring the velocity field are mostly optical techniques, with emphasis on the Particle Image Velocimetry (PIV) technique. On the other hand, in recent years, important advances in computer vision, based on convolutional neural networks, have shown promise for improving the processing of optical techniques. In this work, a hybrid approach combining cross-correlation and a cascade of convolutional neural networks was used to develop a new PIV technique. The project built on recent work on PIV with artificial neural networks to develop the architecture of the networks and their form of training. Several cascade formats of neural networks were tested until a format was reached that allowed the error to be reduced by an order of magnitude for uniform flow. In addition to the development of the cascade for uniform flow, knowledge was generated to make cascades for other types of flows. [pt] PIV [pt] MONTAGEM DE REDES NEURAIS [pt] CNN [pt] REDE NEURAL CONVOLUCIONAL [en] PIV [en] CONVOLUTIONAL NEURAL NETWORK [en] PARTICLE IMAGE VELOCIMETRY [en] CNN
456	Jämförelse av artificiella neurala nätverksalgoritmer för klassificering av omdömen / Comparing artificial neural network algorithms for classification of reviews Gilljam, Daniel, Youssef, Mario January 2018 (has links) Vid stor mängd data i form av kundomdömen kan det vara ett relativt tidskrävande arbete att bedöma varje omdömes sentiment manuellt, om det är positivt eller negativt laddat. Denna avhandling har utförts för att automatiskt kunna klassificera kundomdömen efter positiva eller negativa omdömen vilket hanterades med hjälp av maskininlärning. Tre olika djupa neurala nätverk testades och jämfördes med hjälp av två olika ramverk, TensorFlow och Keras, på både större och mindre datamängder. Även olika inbäddningsmetoder testades med de neurala nätverken. Den bästa kombination av neuralt nätverk, ramverk och inbäddningsmetod var ett Convolutional Neural Network (CNN) som använde ordinbäddningsmetoden Word2Vec, var skriven i ramverket Keras och gav en träffsäkerhet på ca 88.87% med en avvikelse på ca 0.4%. CNN gav bäst resultat i alla olika tester framför de andra två neurala nätverken, Recurrent Neural Network (RNN) och Convolutional Recurrent Neural Network (CRNN). / With a large amount of data in the form of customer reviews, it can be time-consuming to manually go through each review and decide whether its sentiment is positive or negative. This thesis was carried out to automatically classify customer reviews as positive or negative using machine learning. Three different deep neural networks were tested and compared using two different frameworks, TensorFlow and Keras, on both larger and smaller datasets. Different embedding methods were also tested with the neural networks.
The best combination of neural network, framework and embedding method was a Convolutional Neural Network (CNN) that used the word embedding method Word2Vec, was written in the Keras framework and gave an accuracy of approximately 88.87% with a deviation of approximately 0.4%. The CNN gave the best results in all tests compared with the other two neural networks, the Recurrent Neural Network (RNN) and the Convolutional Recurrent Neural Network (CRNN). Machine learning neural networks reviews Convolutional Neural Network TensorFlow Keras language embedding methods Maskininlärning neurala nätverk omdömen Convolutional Neural Network TensorFlow Keras inbäddningsmetoder Computer Engineering Datorteknik
457	Impact of data augmentations when training the Inception model for image classification Barai, Milad, Heikkinen, Anthony January 2017 (has links) Image classification is the process of identifying which class a previously unobserved object belongs to. Classifying images is a commonly occurring task in companies. Currently many of these companies perform this classification manually. Automated classification, however, has a lower expected accuracy. This thesis examines how automated classification could be improved by adding augmented data to the learning process of the classifier. We conduct a quantitative empirical study on the effects of two image augmentations, random horizontal/vertical flips and random rotations (<180°). The data set used comes from an auction house search engine operating under the commercial name Barnebys. The data sets contain 700 000, 50 000 and 28 000 images, with each set containing 28 classes. In this bachelor’s thesis, we re-trained a convolutional neural network model called the Inception-v3 model with the two larger data sets. The remaining set is used to obtain more class-specific accuracies. To get a more accurate value of the effects, we used a tenfold cross-validation method. The results of our quantitative study show that the Inception-v3 model can reach a baseline mean accuracy of 64.5% (700 000 data set) and a mean accuracy of 51.1% (50 000 data set). The overall accuracy decreased with augmentations on our data sets. However, our results display an increase in accuracy for some classes. The highest flat accuracy increase observed is in the class "Whine & Spirits" in the small data set, where it went from 42.3% correctly classified images to 72.7% correctly classified images of the specific class. / Bildklassificering är uppgiften att identifiera vilken klass ett tidigare osett objekt tillhör. Att klassificera bilder är en vanligt förekommande uppgift hos företag.
För närvarande utför många av dessa företag klassificering manuellt. Automatiserade klassificerare har en lägre förväntad noggrannhet. I detta examensarbete studerades hur en maskinklassificerare kan förbättras genom att lägga till ytterligare förändrad data i inlärningsprocessen av klassificeraren. Vi genomför en kvantitativ empirisk studie om effekterna av två bildförändringar, slumpmässiga horisontella/vertikala speglingar och slumpmässiga rotationer (<180°). Bilddatasetet som används är från ett auktionshus sökmotor under det kommersiella namnet Barnebys. De dataseten som används består av tre separata dataset, 700 000, 50 000 och 28 000 bilder. Var och en av dataseten innehåller 28 klasser vilka mappas till verksamheten. I det här examensarbetet har vi tränat Inception-v3-modellen med dataset av storlek 700 000 och 50 000. Vi utvärderade sedan noggrannhet av de tränade modellerna genom att klassificera 28 000-datasetet. För att få ett mer exakt värde av effekterna använde vi en tiofaldig korsvalideringsmetod. Resultatet av vår kvantitativa studie visar att Inception-v3-modellen kan nå en genomsnittlig noggrannhet på 64,5% (700 000 dataset) och en genomsnittlig noggrannhet på 51,1% (50 000 dataset). Den övergripande noggrannheten minskade med förändringar på vårat dataset. Dock visar våra resultat en ökad noggrannhet i vissa klasser. Den observerade högsta noggrannhetsökningen var i klassen "Whine & Spirits", där vi gick från 42,3 % korrekt klassificerade bilder till 72,7 % korrekt klassificerade bilder i det lilla datasetet med förändringar. Image Classification Image Recognition Inception Data Augmentation Convolutional Neural Network Machine Learning Bildklassificering Bildigenkänning Inception Data förändring Convolutional Neural Network Maskininlärning Computer and Information Sciences Data- och informationsvetenskap
458	Prediction of securities' behavior using a multi-level artificial neural network with extra inputs between layers / Förutsägelse av värdepapperens beteende med hjälp av ett artificiellt neuralt nätverk med flera nivåer med extra ingångar mellan skikten Törnqvist, Eric, Guan, Xing January 2017 (has links) This paper discusses the possibilities of predicting changes in stock pricing at a high frequency by applying a multi-level neural network without the use of recurrent neurons or any other time series analysis, as suggested in a paper by Chen et al. [2017]. The paper tries to adapt the model presented by Chen et al. [2017] by making the network deeper, feeding it data of higher resolution and changing the activation functions. While the resulting accuracy is not as high as that of other models, this paper might prove useful for those interested in further developing neural networks using data with high resolution and to the fintech business as a whole. high frequency neural network computer science stock market finance fintech machine learning yield prediction forecast deep neural network algo trading financial instruments correlation Computer Sciences Datavetenskap (datalogi)
459	Spatio-temporal prediction of residential burglaries using convolutional LSTM neural networks Holm, Noah, Plynning, Emil January 2018 (has links) The low number of solved residential burglaries calls for new and innovative methods in the prevention and investigation of these cases. There were 22 600 reported residential burglaries in Sweden in 2017, but only four to five percent of these will ever be solved. There are many initiatives in both Sweden and abroad for decreasing the number of residential burglaries, and one of the areas being tested is the use of prediction methods for more efficient preventive actions. This thesis investigates a potential prediction method that uses neural networks to identify areas that have a higher risk of burglaries on a daily basis. The model uses reported burglaries to learn patterns in both space and time. The rationale for the existence of patterns is based on near repeat theories in criminology, which state that after a burglary both the burgled victim and an area around that victim have an increased risk of additional burglaries. The work has been conducted in cooperation with the Swedish Police Authority. The machine learning is implemented with convolutional long short-term memory (LSTM) neural networks with max pooling in three dimensions that learn from ten years of residential burglary data (2007-2016) in a study area in Stockholm, Sweden. The model's accuracy is measured by performing predictions of burglaries during 2017 on a daily basis. It classifies cells in a 36x36 grid of 600-meter square cells as areas with elevated risk or not. By classifying 4% of all grid cells during the year as risk areas, 43% of all burglaries are correctly predicted. The performance of the model could potentially be improved by further tuning of the parameters of the neural network, along with the use of more data on factors that are correlated with burglaries, for instance weather.
Consequently, further work in these areas could increase the accuracy. The conclusion is that neural networks, or machine learning in general, could be a powerful and innovative tool for the Swedish Police Authority to predict and, moreover, prevent certain crimes. This thesis serves as a first prototype of how such a system could be implemented and used. crime prediction crime forecasting residential burglary deep convolutional neural network CNN long short-term memory LSTM recurrent neural network Other Civil Engineering Annan samhällsbyggnadsteknik
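The headline figure in the abstract above (a small share of grid cells flagged, a larger share of burglaries captured) can be computed with a metric along the following lines. This is a sketch under stated assumptions, not the evaluation code of the thesis: the function name `hit_rate_at_coverage`, the flat per-day list layout, and the rule of flagging the top-scoring cells each day are all hypothetical; the thesis may select risk cells differently.

```python
from typing import Sequence


def hit_rate_at_coverage(risk_by_day: Sequence[Sequence[float]],
                         counts_by_day: Sequence[Sequence[int]],
                         coverage: float = 0.04) -> float:
    """Share of burglaries that fall in the cells flagged as risk areas.

    risk_by_day[d][i]   : predicted risk score of grid cell i on day d
    counts_by_day[d][i] : observed burglaries in grid cell i on day d
    coverage            : fraction of cells flagged per day (assumed rule)
    """
    hits = 0
    total = 0
    for risk, counts in zip(risk_by_day, counts_by_day):
        # Number of cells to flag on this day, at least one.
        k = max(1, round(coverage * len(risk)))
        # Indices of the k highest-scoring cells.
        flagged = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)[:k]
        hits += sum(counts[i] for i in flagged)
        total += sum(counts)
    return hits / total if total else 0.0
```

For a 36x36 grid there are 1296 cells, so a 4% coverage corresponds to roughly 52 flagged cells per day under this rule.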
460	ISAR Imaging Enhancement Without High-Resolution Ground Truth Enåkander, Moltas January 2023 (has links) In synthetic aperture radar (SAR) and inverse synthetic aperture radar (ISAR), an imaging radar emits electromagnetic waves of varying frequencies towards a target and the backscattered waves are collected. By either moving the radar antenna or rotating the target and combining the collected waves, a much longer synthetic aperture can be created. These radar measurements can be used to determine the radar cross-section (RCS) of the target and to reconstruct an estimate of the target. However, the reconstructed images will suffer from spectral leakage effects and are limited in resolution. Many methods of enhancing the images exist and some are based on deep learning. Most commonly the deep learning methods rely on high-resolution ground truth data of the scene to train a neural network to enhance the radar images. In this thesis, a method that does not rely on any high-resolution ground truth data is applied to train a convolutional neural network to enhance radar images. The network takes a conventional ISAR image subject to spectral leakage effects as input and outputs an enhanced ISAR image which contains much more defined features. New RCS measurements are created from the enhanced ISAR image and the network is trained to minimise the difference between the original RCS measurements and the new RCS measurements. A sparsity constraint is added to ensure that the proposed enhanced ISAR image is sparse. The synthetic training data consists of scenes containing point scatterers that are either individual or grouped together to form shapes. The scenes are used to create synthetic radar measurements which are then used to reconstruct ISAR images of the scenes. The network is tested using both synthetic data and measurement data from a cylinder and two aeroplane models. 
The network manages to minimise spectral leakage and increase the resolution of the ISAR images created from both synthetic and measured RCSs, especially on measured data from target models which have similar features to the synthetic training data. The contributions of this thesis work are firstly a convolutional neural network that enhances ISAR images affected by spectral leakage. The neural network handles complex-valued signals as a single channel and does not perform any rescaling of the input. Secondly, it is shown that it is sufficient to calculate the new RCS for much fewer frequency samples and angular positions and compare those measurements to the corresponding frequency samples and angular positions in the original RCS to train the neural network. SAR SAR Imaging ISAR ISAR Imaging Machine learning Convolutional neural network CNN neural network Super resolution Unsupervised learning Signal Processing Signalbehandling Computer Systems Datorsystem
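The training principle described in the abstract above (compare RCS measurements regenerated from the enhanced image with the original measurements, plus a sparsity constraint) can be summarised as a loss function. The formulation below is one plausible reading, not the loss stated in the thesis: the symbols $g_\theta$ (the network), $A$ (the forward operator from an image to RCS samples), $S$ (the subset of frequency samples and angular positions used), $\lambda$ (the sparsity weight), and the choice of a squared error with an $\ell_1$ penalty are all assumptions introduced for illustration.

```latex
\[
\hat{x}_\theta = g_\theta\bigl(x_{\mathrm{ISAR}}\bigr), \qquad
\mathcal{L}(\theta) =
\sum_{(f_k,\varphi_k)\in S}
\bigl|\,[A\hat{x}_\theta](f_k,\varphi_k) - y(f_k,\varphi_k)\,\bigr|^{2}
\;+\; \lambda\,\lVert \hat{x}_\theta \rVert_{1}
\]
```

Here $x_{\mathrm{ISAR}}$ is the conventional ISAR image, $y$ is the original RCS measurement, and restricting the sum to a subset $S$ reflects the abstract's observation that far fewer frequency samples and angular positions suffice for training. No high-resolution ground truth appears in the expression.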

Search results