191

Experiments of Federated Learning on Raspberry Pi Boards

Sondén, Simon, Madadzade, Farhad January 2022 (has links)
In recent years, companies of all sizes have become increasingly dependent on customer user data and on processing it with machine learning (ML) methods. These methods, however, require the raw user data to be stored centrally on a server or cloud service, raising privacy concerns. Hence, the purpose of this paper is to analyze an alternative ML method called federated learning (FL). FL allows the data to remain on each device while still creating a global model by averaging the local models trained on each client device. The analysis in this report is based on two types of simulations: the first in a virtual environment, where a large number of devices can be included, and the second on a physical testbed of Raspberry Pi (RPI) single-board computers. Different parameters are varied to find the optimal performance, accuracy, and loss in each case. The results of all simulations show that fewer clients and more training epochs increase the accuracy when using independent and identically distributed (IID) data. However, when using non-IID data, the accuracy is not dependent on the number of epochs, and it becomes chaotic when the number of clients sampled each round is decreased. Furthermore, the tests on the RPIs show results that agree with the virtual simulations. / Bachelor's degree project in electrical engineering 2022, KTH, Stockholm
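The federated averaging step described above is simple enough to sketch directly; the following illustrative Python snippet (parameter shapes and client sizes are hypothetical) shows one aggregation round in which only model weights, never raw data, leave the clients.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """One FedAvg-style round: size-weighted average of the clients'
    local model parameters. A minimal sketch; a real FL system adds
    client sampling, local training epochs, and communication."""
    total = sum(client_sizes)
    return [
        sum(w[i] * n / total for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Three hypothetical clients, each holding two parameter arrays.
clients = [[np.ones(4) * k, np.ones(2) * k] for k in (1.0, 2.0, 3.0)]
global_model = fed_avg(clients, client_sizes=[100, 200, 300])
```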
192

Robustness of Image Classification Using CNNs in Adverse Conditions

Ingelstam, Theo, Skåntorp, Johanna January 2022 (has links)
The usage of convolutional neural networks (CNNs) has revolutionized the field of computer vision. Though the algorithms used in image recognition have improved significantly in the past decade, they are still limited by the availability of training data. This paper aims to build a better understanding of how limitations in the training data might affect the performance of the system. A robustness study was conducted using three image datasets: CNN models were pre-trained on ImageNet or CIFAR-10 and then trained on the MAdWeather dataset, whose main characteristic is images with differing levels of obscuration in front of the objects. The MAdWeather dataset is then used to test how accurately a model can identify images that differ from its training data. The study shows that CNN performance on one condition does not translate well to other conditions. / Bachelor's degree project in electrical engineering 2022, KTH, Stockholm
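The pre-train/fine-tune setup the study describes follows a standard transfer-learning pattern; the sketch below loads ImageNet weights and swaps in a new classification head (the backbone choice, class count, and frozen-backbone strategy are assumptions for illustration, not the thesis's exact configuration).

```python
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pre-trained backbone ...
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False            # ... and freeze its features.
model.fc = nn.Linear(model.fc.in_features, 10)  # new head; 10 classes assumed
# The new head would then be trained on the target (e.g. MAdWeather) images.
```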
193

Investigation of 8-bit Floating-Point Formats for Machine Learning

Lindberg, Theodor January 2023 (has links)
Applying machine learning to various applications has gained significant momentum in recent years. However, the increasing complexity of networks introduces challenges such as a larger memory footprint and decreased throughput. This thesis aims to address these challenges by exploring the use of 8-bit floating-point numbers for machine learning. The numerical accuracy was evaluated empirically by implementing software models of the arithmetic and running experiments on a neural network provided by MediaTek. While the initial findings revealed poor accuracy when performing computations solely with 8-bit floating-point arithmetic, a significant improvement could be achieved by using a higher-precision accumulator register. The hardware cost was evaluated using a synthesis tool by measuring the increase in silicon area and the impact on clock frequency after four new vector instructions had been implemented. A large increase in area was measured for the functional blocks, but the hardware costs for interconnect and instruction decoding were negligible. Only a marginal decrease in system clock frequency was observed. Ideas that could likely improve the accuracy of inference calculations and decrease the hardware cost are proposed in the section on future work.
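The value of a wider accumulator can be demonstrated with a quick software model; the sketch below rounds values to an assumed E4M3-like 8-bit format and compares a dot product accumulated in full precision against one that re-quantizes after every addition (the rounding and clamping here are simplifications, not the arithmetic implemented in the thesis).

```python
import numpy as np

def quantize_fp8(x, mant_bits=3, exp_min=-6, exp_max=8):
    # Crude E4M3-like rounding: keep a few mantissa bits and clamp the
    # exponent (real formats also handle subnormals and saturation).
    m, e = np.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    m = np.round(m * 2 ** (mant_bits + 1)) / 2 ** (mant_bits + 1)
    return np.ldexp(m, np.clip(e, exp_min, exp_max))

def dot_wide_acc(a, b):
    # 8-bit operands, full-precision accumulator
    return np.sum(quantize_fp8(a) * quantize_fp8(b))

def dot_fp8_acc(a, b):
    # 8-bit operands AND an 8-bit accumulator: error grows with every add
    acc = 0.0
    for p in quantize_fp8(a) * quantize_fp8(b):
        acc = float(quantize_fp8(acc + p))
    return acc

a, b = np.random.randn(256), np.random.randn(256)
print(a @ b, dot_wide_acc(a, b), dot_fp8_acc(a, b))
```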
194

Learning a Grasp Prediction Model for Forestry Applications

Olofsson, Elias January 2024 (has links)
Since the advent of machine learning and machine vision methods, progress has been made in tackling the long-standing research question of autonomous grasping of arbitrary objects using robotic end-effectors. Building on these efforts, we focus on a subset of the general grasping problem concerning the automation of a forwarder. This forestry vehicle collects and transports felled and cut tree logs in a forest environment to a nearby roadside landing. The forwarder must safely and energy-efficiently grip logs to minimize fuel consumption and reduce loading times. In this thesis project, we develop a data-driven model for predicting the expected outcome of grasping attempts made by the forwarder's crane. For a given pile of logs, such a model can estimate the optimal horizontal location and angle for applying the claw grapple, enabling effective grasp planning. We utilize physics-based simulations to create a ground truth dataset of 12 500 000 simulated grasps distributed across 5000 randomly generated log piles. Our semi-generative, supervised model is a fully convolutional network that inputs the orthographic depth image of a pile and returns images predicting the corresponding grasps' initial grapple angle and outcome metrics as a function of position. Over five folds of cross-validation, our model predicted the number of grasped logs and the initial grapple angle with a normalized root mean squared error of 15.77(2)% and 2.64(4)%, respectively. The grasps' energy efficiency and energy waste were similarly predicted with a relative error of 14.43(2)% and 21.06(3)%.
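A fully convolutional predictor of this kind maps a depth image to same-sized output maps; the PyTorch sketch below is a minimal stand-in (the layer sizes, the four output channels, and the input resolution are assumptions, not the thesis's architecture).

```python
import torch
import torch.nn as nn

class GraspFCN(nn.Module):
    """Minimal fully convolutional sketch: orthographic depth image in,
    per-position grasp-angle and outcome maps out."""
    def __init__(self, out_maps=4):    # e.g. angle, logs, efficiency, waste
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, out_maps, 4, stride=2, padding=1),
        )

    def forward(self, depth):          # depth: (B, 1, H, W)
        return self.decoder(self.encoder(depth))  # (B, out_maps, H, W)

maps = GraspFCN()(torch.randn(2, 1, 128, 128))    # -> (2, 4, 128, 128)
```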
195

Predicting Location and Training Effectiveness (PLATE)

Bruenner, Erik Rolf 01 June 2023 (has links) (PDF)
Physical activity and exercise have been shown to have an enormous impact on many areas of human health and can reduce the risk of many chronic diseases. In order to better understand how exercise may affect the body, current kinesiology studies are designed to track human movements over large intervals of time. Procedures used in these studies provide a way for researchers to quantify an individual's activity level over time, along with tracking the various types of activities that individuals may engage in. Movement data of research subjects is often collected through sensors such as accelerometers. Data from these specialized sensors may be fed into a deep learning model that can accurately predict what movements a person is making based on aggregated sensor data. However, in order for prediction models to produce accurate classifications of activities, they must be trained through supervised learning on large amounts of data where the movements are already known. These training data sets are also known as 'validation' data or 'ground truth'. Currently, generation of these ground truth sets is very labor-intensive: research assistants must analyze many hours of video footage of research subjects, painstakingly categorizing each video, second by second, with a description of the activity the subject was engaging in. Using only labeled video, the PLATE project facilitates the generation of ground truth data by developing an artificial intelligence (AI) that predicts video quality labels, along with labels that denote the physical location where these activities occurred. The PLATE project builds on previous work by a former graduate student, Roxanne Miller, who developed a classification system to categorize subject activities into groups such as 'Stand', 'Sit', 'Walk', and 'Run'. The PLATE project focuses instead on developing AI to generate ground truth training data that accurately identifies the quality and the location of the video data. In the context of the PLATE project, video quality refers to whether or not a test subject is visible in the frame. Location classifications include 'indoors', 'outdoors', and 'traveling'. Indoor categories are further identified as 'house', 'office', 'school', 'store', or 'commercial' space; outdoor locations are further classified as 'commercial space', 'park/greenspace', 'residential', or 'neighborhood'. The nature of this location classification problem lends itself particularly well to a hierarchical approach, where the general indoor, outdoor, or travel category is predicted first and separate models then predict the subclassifications of these categories. The PLATE project uses three convolutional neural networks in its hierarchical location prediction pipeline, and one convolutional neural network to predict whether video frames are high or low quality. Results demonstrate that quality can be predicted with an accuracy of 96%, general location with an accuracy of 75%, and specific locations with an accuracy of 31%. The findings and model produced by the PLATE project are utilized in the PathML project as part of ground truth prediction software for activity monitoring studies. PathML is a project funded by the NIH as part of a Small Business Research Initiative.
Cal Poly partnered with Sentimetrix Inc, a data analytics/machine learning company, to build a methodology for automated labeling of human physical activity. The partnership aims to use this methodology to develop a software tool that performs automatic labeling and facilitates subsequent human inspection. Phase I (proof of concept) of the project took place from September 2021 to August 2022; Phase II (final software production) is pending. This thesis is part of the research that took place during the Phase I lifetime and continues to support Phase II development.
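The hierarchical pipeline can be pictured as one general classifier that routes each frame to a category-specific sub-model, as in the sketch below (the ResNet-18 backbone, label strings, and routing logic are illustrative assumptions, not the PLATE implementation).

```python
import torch.nn as nn
from torchvision.models import resnet18

GENERAL = ["indoor", "outdoor", "travel"]
SUBLABELS = {
    "indoor":  ["house", "office", "school", "store", "commercial"],
    "outdoor": ["commercial space", "park/greenspace", "residential", "neighborhood"],
}

def make_classifier(n_classes):
    m = resnet18(weights=None)         # backbone choice is an assumption
    m.fc = nn.Linear(m.fc.in_features, n_classes)
    return m

general_net = make_classifier(len(GENERAL))
sub_nets = {k: make_classifier(len(v)) for k, v in SUBLABELS.items()}

def predict_location(frame):           # frame: (1, 3, H, W) image tensor
    g = GENERAL[general_net(frame).argmax(dim=1).item()]
    if g == "travel":                  # no subclassification for travel
        return g, None
    s = SUBLABELS[g][sub_nets[g](frame).argmax(dim=1).item()]
    return g, s
```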
196

Development and Application of Novel Computer Vision and Machine Learning Techniques

Depoian, Arthur Charles, II 08 1900 (has links)
The following thesis proposes solutions to problems in two main areas of focus: computer vision and machine learning. Chapter 2 utilizes traditional computer vision methods, implemented in a novel manner, to successfully identify overlays contained in broadcast footage. The remaining chapters explore machine learning algorithms and apply them in various manners to big data, multi-channel image data, and ECG data. L1 and L2 principal component analysis (PCA) algorithms are implemented and tested against each other in Python, providing a benchmark for future implementations. Selected algorithms from this set are then applied in conjunction with other methods to solve three distinct problems. The first problem is big data error detection, where PCA is effectively paired with statistical signal processing methods to create a weighted control algorithm. The second is an implementation of image fusion built to detect and remove noise from multispectral satellite imagery, which performs at a high level. The final problem examines ECG medical data classification; PCA is integrated into a neural network solution that incurs only a small performance degradation while requiring less than 20% of the full data size.
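As a small illustration of pairing PCA with a statistical decision rule for error detection, the sketch below projects data onto its top-k L2 principal components and flags samples whose reconstruction error is an outlier; the three-sigma threshold and synthetic data are assumptions, not the thesis's weighted algorithm.

```python
import numpy as np

def pca_reconstruction_error(X, k):
    """Project onto the top-k L2-PCA components (via SVD) and return
    each sample's reconstruction error; gross errors stand out."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                       # top-k principal directions
    return np.linalg.norm(Xc - Xc @ W @ W.T, axis=1)

X = np.random.randn(500, 20)
X[::50] += 8.0                         # inject synthetic gross errors
err = pca_reconstruction_error(X, k=5)
flagged = np.where(err > err.mean() + 3 * err.std())[0]
```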
197

Deep multi-modal U-net fusion methodology of infrared and ultrasonic images for porosity detection in additive manufacturing

Zamiela, Christian E 10 December 2021 (has links)
We developed a deep fusion methodology of non-destructive testing (NDT) in-situ infrared and ex-situ ultrasonic images for localizing porosity without compromising the integrity of printed components, with the aim of improving the laser-based additive manufacturing (LBAM) process. A core challenge with LBAM is that a lack of fusion between successive layers of printed metal can lead to porosity and abnormalities in the printed component. We developed a sensor-fusion U-Net methodology that fills the gap in fusing in-situ thermal images with ex-situ ultrasonic images by employing a U-Net convolutional neural network (CNN) for feature extraction and two-dimensional object localization. We modify the U-Net framework with inception and LSTM block layers. We validate the models by comparing our single-modality models and fusion models with ground truth X-ray computed tomography images. The inception U-Net fusion model localized porosity with the highest mean intersection-over-union score, 0.557.
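One way to picture the fusion is a network with one encoder per modality feeding a shared decoder, as in the minimal sketch below; it omits the skip connections of a full U-Net as well as the inception and LSTM blocks, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class FusionNet(nn.Module):
    """Two-branch sketch: infrared and ultrasonic encoders whose
    features are concatenated before a shared decoder."""
    def __init__(self):
        super().__init__()
        self.ir_enc = nn.Sequential(block(1, 16), nn.MaxPool2d(2), block(16, 32))
        self.us_enc = nn.Sequential(block(1, 16), nn.MaxPool2d(2), block(16, 32))
        self.decoder = nn.Sequential(
            block(64, 32),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 1, 1),       # per-pixel porosity logit
        )

    def forward(self, ir, us):
        z = torch.cat([self.ir_enc(ir), self.us_enc(us)], dim=1)
        return self.decoder(z)

logits = FusionNet()(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))
```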
198

Event Detection and Extraction from News Articles

Wang, Wei 21 February 2018 (has links)
Event extraction is a type of information extraction (IE) that extracts specific knowledge of certain incidents from texts. Nowadays the amount of available information (such as news, blogs, and social media) grows exponentially, so it becomes imperative to develop algorithms that automatically extract machine-readable information from large volumes of text data. In this dissertation, we focus on three problems in obtaining event-related information from news articles: (1) comprehensively analyzing the performance and challenges of current large-scale event encoding systems; (2) event detection and critical information extraction from news articles; and (3) event encoding, which aims to extract event extent and arguments from texts. We start by investigating two large-scale event extraction systems (ICEWS and GDELT) in the political science domain. We design a set of experiments to evaluate the quality of the extracted events from the two target systems in terms of reliability and correctness. The results show that significant discrepancies exist between the outputs of the automated systems and the hand-coded system, and that the accuracy of both systems is far from satisfactory. These findings provide preliminary background and set the foundation for using advanced machine learning algorithms for event-related information extraction. Inspired by the successful application of deep learning in natural language processing (NLP), we propose a Multi-Instance Convolutional Neural Network (MI-CNN) model for event detection and critical sentence extraction without sentence-level labels. To evaluate the model, we run a set of experiments on a real-world protest event dataset. The results show that our model outperforms strong baseline models and extracts meaningful key sentences without domain knowledge or manually designed features. We also extend the MI-CNN model and propose an MIMTRNN model for event extraction with distant supervision, to overcome the lack of fine-grained labels and the small size of the training data. The proposed MIMTRNN model systematically integrates RNNs, Multi-Instance Learning, and Multi-Task Learning into a unified framework. The RNN module encodes into the representations of entity mentions the sequential information as well as the dependencies between event arguments, which are very useful in the event extraction task. The Multi-Instance Learning paradigm means the system does not require precise labels at the entity-mention level, making it well suited to distant supervision for event extraction. The Multi-Task Learning module is designed to alleviate the potential overfitting caused by the relatively small training set. The results of experiments on two real-world datasets (Cyber-Attack and Civil Unrest) show that our model benefits from the advantages of each component and significantly outperforms other baseline methods. / Ph. D.
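The multi-instance idea of training with only article-level labels can be sketched by scoring each sentence and max-pooling the scores, as below (the tiny text CNN, vocabulary size, and pooling choice are illustrative assumptions rather than the dissertation's MI-CNN).

```python
import torch
import torch.nn as nn

class MISentenceCNN(nn.Module):
    """Sketch of multi-instance event detection: score every sentence
    with a small text CNN, then max-pool so only an article-level label
    is needed for training; high-scoring sentences double as the
    extracted key sentences."""
    def __init__(self, vocab=10000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.conv = nn.Conv1d(dim, 32, kernel_size=3, padding=1)
        self.score = nn.Linear(32, 1)

    def forward(self, sents):              # (n_sent, n_tok) token ids
        h = self.emb(sents).transpose(1, 2)        # (n_sent, dim, n_tok)
        h = torch.relu(self.conv(h)).max(dim=2).values
        s = self.score(h).squeeze(1)               # per-sentence logits
        return s.max(), s                  # article logit, sentence scores

doc = torch.randint(0, 10000, (12, 30))    # 12 sentences of 30 tokens
article_logit, sentence_scores = MISentenceCNN()(doc)
```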
199

COCO-Bridge: Common Objects in Context Dataset and Benchmark for Structural Detail Detection of Bridges

Bianchi, Eric Loran 14 February 2019 (has links)
Common Objects in Context for bridge inspection (COCO-Bridge) was introduced for use by unmanned aircraft systems (UAS) to assist in GPS-denied environments, flight planning, and detail identification and contextualization, but it has far-reaching applications in augmented reality (AR) and other artificial intelligence (AI) platforms. COCO-Bridge is an annotated dataset on which a convolutional neural network (CNN) can be trained to identify specific structural details. Many annotated datasets have been developed to detect regions of interest in images for a wide variety of applications and industries. While some annotated datasets of structural defects (primarily cracks) have been developed, most efforts are individualized and focus on a small niche of the industry. This effort initiated a benchmark dataset with a focus on structural details. This research investigated the parameters required for detail identification and evaluated performance enhancements to the annotation process. The image dataset consists of four structural details which are commonly reviewed and rated during bridge inspections: bearings, cover plate terminations, gusset plate connections, and out-of-plane stiffeners. This initial version of COCO-Bridge includes a total of 774 images: 10% for evaluation and 90% for training. Several models were used with the dataset to evaluate model overfitting and the performance enhancements from augmentation and the number of iteration steps. Methods to economize the predictive capabilities of the model without the addition of unique data were investigated to reduce the required number of training images. Results from the model tests indicated the following: additional images, mirrored along the vertical axis, provided precision and accuracy enhancements; increasing computational step iterations improved predictive precision and accuracy; and the optimal confidence threshold for operation was 25%. Annotation recommendations and improvements were also discovered and documented as a result of the research. / MS / Common Objects in Context for bridge inspection (COCO-Bridge) was introduced to improve a drone-conducted bridge inspection process. Drones are a great tool for bridge inspectors because they bring flexibility and access to the inspection. However, drones have a notoriously difficult time operating near bridges, because the signal can be lost between the operator and the drone. COCO-Bridge is an image-based dataset that uses artificial intelligence (AI) as a solution to this particular problem, but it has applications in other facets of the inspection as well. This effort initiated a dataset with a focus on identifying specific parts of a bridge, or structural bridge elements. This would allow a drone to fly without explicit direction if the signal was lost, and it also has the potential to extend flight time. Extending flight time and operating autonomously are great advantages for drone operators and bridge inspectors. The output from COCO-Bridge would also help inspectors identify areas that are prone to defects by highlighting regions that require inspection. The image dataset consists of 774 images covering four structural bridge elements which are commonly reviewed and rated during bridge inspections. The goal is to continue to increase the number of images and encompass more structural bridge elements in the dataset so that it may be used for all types of bridges.
Methods to reduce the required number of images were investigated, because gathering images of structural bridge elements is challenging. The results from the model tests helped build a roadmap for the expansion of, and best practices for developing, a dataset of this type.
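Two of the reported findings, the benefit of mirrored training images and the 25% operating threshold, translate directly into short utilities; the sketch below is illustrative (the tensor layout and box format are assumptions), not the thesis's training pipeline.

```python
import torch
from torchvision.transforms import functional as F

CONF_THRESHOLD = 0.25                      # optimal threshold reported above

def mirror_augment(image, boxes):
    """Mirror an image along the vertical axis and remap its
    bounding boxes (boxes given as (N, 4) x1, y1, x2, y2)."""
    _, _, w = image.shape                  # image: (C, H, W) tensor
    x1, y1, x2, y2 = boxes.unbind(dim=1)
    return F.hflip(image), torch.stack([w - x2, y1, w - x1, y2], dim=1)

def keep_detections(scores):
    return scores >= CONF_THRESHOLD        # boolean mask over detections
```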
200

Semi-Supervised Deep Learning Approach for Transportation Mode Identification Using GPS Trajectory Data

Dabiri, Sina 11 December 2018 (has links)
Identification of travelers' transportation modes is a fundamental step for various problems that arise in the transportation domain, such as travel demand analysis, transport planning, and traffic management. This thesis aims to identify travelers' transportation modes purely based on their GPS trajectories. First, a segmentation process is developed to partition a user's trip into GPS segments containing only one transportation mode. A majority of studies have proposed mode inference models based on hand-crafted features, which might be vulnerable to traffic and environmental conditions. Furthermore, the classification task in almost all models has been performed in a supervised fashion, while a large amount of unlabeled GPS trajectories has remained unused. Accordingly, a deep SEmi-Supervised Convolutional Autoencoder (SECA) architecture is proposed to not only automatically extract relevant features from GPS segments but also exploit useful information in unlabeled data. The SECA integrates a convolutional-deconvolutional autoencoder and a convolutional neural network into a unified framework that concurrently performs supervised and unsupervised learning. The two components are simultaneously trained using both labeled and unlabeled GPS segments, which have already been converted into an efficient representation for the convolutional operation. An optimal schedule for varying the balancing parameters between reconstruction and classification errors is also implemented. The performance of the proposed SECA model, the trip segmentation, the method for converting a raw trajectory into a new representation, the hyperparameter schedule, and the model configuration are evaluated by comparison with several baselines and alternatives for various amounts of labeled and unlabeled data. The experimental results demonstrate the superiority of the proposed model over state-of-the-art semi-supervised and supervised methods with respect to metrics such as accuracy and F-measure. / Master of Science / Identifying users' transportation modes (e.g., bike, bus, train, and car) is a key step towards many transportation-related problems, including (but not limited to) transport planning, transit demand analysis, auto ownership, and transportation emissions analysis. Traditionally, the information for analyzing travelers' mode choices was obtained through travel surveys. High cost, low response rate, time-consuming manual data collection, and misreporting are the main demerits of survey-based approaches. With the rapid growth of ubiquitous GPS-enabled devices (e.g., smartphones), a constant stream of users' trajectory data can be recorded. A user's GPS trajectory is a sequence of GPS points, recorded by means of a GPS-enabled device, in which each GPS point contains the device's geographic location at a particular moment. In this research, users' GPS trajectories, rather than traditional resources, are harnessed to predict their transportation mode by means of statistical models. With respect to the statistical models, a wide range of studies have developed travel mode detection models using hand-designed attributes and classical learning techniques. Nonetheless, hand-crafted features have some major shortcomings, including vulnerability to traffic uncertainties and biased engineering justification in generating effective features.
A potential solution to these issues is to leverage deep learning frameworks that are capable of capturing abstract features from the raw input in an automated fashion. Thus, in this thesis, deep learning architectures are exploited to identify transport modes based only on raw GPS tracks. It is worth noting that a significant portion of trajectories in GPS data might not be annotated with a transport mode, and the acquisition of labeled data is more expensive and labor-intensive than collecting unlabeled data. Utilizing unlabeled GPS trajectories (i.e., GPS trajectories that have not been annotated with a transport mode) is therefore a cost-effective approach for improving the prediction quality of the travel mode detection model. Accordingly, the unlabeled GPS data are also leveraged by developing a novel deep learning architecture that is capable of extracting information from both labeled and unlabeled data. The experimental results demonstrate the superiority of the proposed models over state-of-the-art methods in the literature with respect to several performance metrics.
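The heart of the SECA idea, one encoder serving both a reconstruction decoder and a classifier, can be captured in a few lines; in the sketch below the layer sizes, segment representation, and fixed balancing weight are assumptions, whereas the thesis schedules the balance over training.

```python
import torch
import torch.nn as nn

# Shared encoder feeds an unsupervised decoder and a supervised classifier.
encoder = nn.Sequential(nn.Conv1d(1, 16, 3, padding=1), nn.ReLU(), nn.Flatten())
decoder = nn.Sequential(nn.Linear(16 * 100, 100), nn.Unflatten(1, (1, 100)))
classifier = nn.Linear(16 * 100, 5)        # e.g. walk/bike/bus/car/train

recon_loss, cls_loss = nn.MSELoss(), nn.CrossEntropyLoss()

def seca_loss(x_all, x_labeled, y_labeled, alpha=0.5):
    L_recon = recon_loss(decoder(encoder(x_all)), x_all)          # all segments
    L_cls = cls_loss(classifier(encoder(x_labeled)), y_labeled)   # labeled only
    return L_recon + alpha * L_cls

x_all = torch.randn(32, 1, 100)            # 32 GPS segments of length 100
x_lab, y_lab = x_all[:8], torch.randint(0, 5, (8,))
loss = seca_loss(x_all, x_lab, y_lab)      # backpropagate to train jointly
```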
