  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Sécurité des applications Web : Analyse, modélisation et détection des attaques par apprentissage automatique / Web application security : analysis, modeling and attacks detection using machine learning

Makiou, Abdelhamid 16 December 2016
Web applications are the backbone of modern information systems. Their exposure on the Internet continually generates new forms of threats that can jeopardize the security of the entire information system. Robust, feature-rich solutions exist to counter these threats. They are based on well-proven attack detection models, each with its own advantages and limitations. Our work integrates features of several models into a single solution in order to increase detection capability. To this end, our first contribution defines a classification of threats adapted to the context of Web applications. This classification also helps solve certain problems of ordering the analysis operations during the attack detection phase. In a second contribution, we propose an attack-filtering (Web application firewall) architecture based on two analysis models. The first is a behavioral analysis module, and the second uses signature inspection. The main challenge with this architecture is adapting the behavioral analysis model to the context of Web applications. We address this challenge by modeling malicious behavior, so that each attack class can have its own model of abnormal behavior. To build these models, we use classifiers based on supervised machine learning, which learn the deviant behaviors of each attack class from training datasets. This raised a second obstacle, the availability of training data, which we also overcame: in a final contribution, we defined and designed a platform for the automatic generation of training data. The data generated by this platform is normalized and categorized for each attack class. The data generation model we developed is able to learn "from its own errors" continuously in order to produce higher-quality training datasets.
12

Effects of Transfer Learning on Data Augmentation with Generative Adversarial Networks / Effekten av transferlärande på datautökning med generativt adversarialt nätverk

Berglöf, Olle, Jacobs, Adam January 2019
Data augmentation is a technique that acquires more training data by augmenting available samples, where the training data is used to fit model parameters. Data augmentation is used because training data is scarce in certain domains and because it reduces overfitting. Augmenting a training dataset for image classification with a Generative Adversarial Network (GAN) has been shown to increase classification accuracy. This report investigates whether transfer learning within a GAN can further increase classification accuracy when the augmented training dataset is used. The method section describes a specific GAN architecture for the experiments that includes a label condition. When transfer learning is used within this GAN architecture, a statistical analysis shows a statistically significant increase in classification accuracy for a classification problem on the EMNIST dataset, which consists of images of handwritten alphanumeric characters. In the discussion section, the authors analyze the results and motivate other use cases for the proposed GAN architecture.
13

Improving classification accuracy for machine learning / 機械学習における分類精度の向上 / キカイ ガクシュウ ニオケル ブンルイ セイド ノ コウジョウ

鄭 弯弯, Wanwan Zheng 22 March 2021
This thesis consists of five chapters. Chapter 1 describes the current state, applications, and structure of machine learning, and then lists the three problems addressed in this study. Chapter 2 proposes a feature selection method for low-sample-size data. Chapter 3 empirically studies the effects of class imbalance and training data size on classifier accuracy. Chapter 4 addresses the problem that noise hinders classifier learning, together with the over-cleansing, large computational complexity, and long response time of existing noise detection algorithms, and proposes a fast class-noise detection method. Chapter 5 summarizes the main results and discusses future work and prospects. / Doctor of Culture and Information Science / Doshisha University
14

[pt] AVALIAÇÃO DE AUMENTO DE DADOS VIA GERAÇÃO DE IMAGENS SINTÉTICAS PARA SEGMENTAÇÃO E DETECÇÃO DE PÓLIPOS EM IMAGENS DE COLONOSCOPIA UTILIZANDO APRENDIZADO DE MÁQUINA / [en] EVALUATION OF DATA AUGMENTATION THROUGH SYNTHETIC IMAGES GENERATION FOR SEGMENTATION AND DETECTION OF POLYPS IN COLONOSCOPY IMAGES USING MACHINE LEARNING

VICTOR DE ALMEIDA THOMAZ 17 August 2020
Colorectal cancer is currently the second-leading cause of cancer death worldwide. In recent years there has been growing interest in research aimed at developing automatic methods for polyp detection, and the most relevant results have been achieved with deep learning techniques. However, the performance of these approaches is strongly tied to the use of large and varied datasets. Samples of colonoscopy images are publicly available, but their limited quantity and variation may be insufficient for successful training. Based on this observation, this thesis describes a new approach that aims to increase the quantity and variation of colonoscopy images, improving the results of polyp segmentation and detection. Unlike other works in the literature, which use traditional data augmentation approaches and combine images from other exam modalities, the proposed methodology emphasizes the creation of new samples by inserting polyps into publicly available colonoscopy images. The insertion strategy uses synthetically generated polyps as well as real polyps, and applies processing techniques to preserve the realistic appearance of the images while automatically creating more diverse samples with appropriate labels for training. Convolutional neural networks trained with these improved datasets have shown promising results in segmentation and detection. The improvements obtained indicate that new methods for the automatic improvement of samples in medical image datasets have the potential to positively affect the training of convolutional networks.
15

Generating Synthetic Training Data with Stable Diffusion

Rynell, Rasmus, Melin, Oscar January 2023
The usage of image classification in various industries has grown significantly in recent years. There are however challenges concerning the data used to train such models. In many cases the data used in training is difficult and expensive to obtain. Furthermore, dealing with image data may come with additional problems such as privacy concerns. In recent years, synthetic image generation models such as Stable Diffusion have seen significant improvement. Solely using a textual description, Stable Diffusion is able to generate a wide variety of photorealistic images. In addition to textual descriptions, other conditioning models such as ControlNet have enabled the possibility of additional grounding information, such as canny edge and segmentation images. This thesis investigates if synthetic images generated by Stable Diffusion can be used effectively in training an image classifier. To find the most effective method for generating training data, multiple conditioning methods are investigated and evaluated. The results show that it is possible to generate high-quality training data using several conditioning techniques. The best performing method was using canny edge grounded images to augment already existing data. Extending two classes with additional synthetic data generated by the best performing method achieved the highest average F1-score increase of 0.85 percentage points compared with a baseline solely trained on real images.
16

Generic instance segmentation for object-oriented bin-picking / Segmentation en instances génériques pour le dévracage orienté objet

Grard, Matthieu 20 May 2019
Robotic random bin-picking is a fast-expanding industrial task that consists in robotizing the unloading, one at a time, of many object instances piled up in bulk, for further processing such as kitting or part assembly. However, explicit object models are often unavailable in bin-picking applications, especially in the food and automotive industries. Furthermore, object instances are often subject to intra-class variations, for example due to elastic deformations. Object pose estimation techniques, which require an explicit model and assume rigid transformations, are therefore not suitable in such contexts. The alternative approach, which consists in detecting grasps without an explicit notion of object, is inefficient when the object geometry makes bulk instances prone to occlusion and entanglement. These approaches also typically rely on a multi-view scene reconstruction that may be infeasible due to transparent and shiny textures, or that critically reduces the time left for image processing in high-throughput robotic applications. In collaboration with Siléane, a French industrial robotics company, we thus aim to develop a learning-based solution for localizing the most affordable (that is, most easily graspable) instances of a pile from a single image, in open loop, without explicit object models. In the context of industrial bin-picking, our contribution is two-fold. First, we propose a novel fully convolutional network (FCN) for jointly delineating instances and inferring the spatial layout at their boundaries. State-of-the-art methods for this task rely on two independent streams, for boundaries and occlusions respectively, whereas occlusions often cause boundaries. Specifically, the mainstream approach, which consists in isolating instances in boxes before detecting boundaries and occlusions, fails in bin-picking scenarios because a rectangular region often includes several instances. By contrast, our box proposal-free architecture recovers fine instance boundaries, augmented with their occluding side, from a unified scene representation. As a result, the proposed network outperforms the two-stream baselines on synthetic data and public real-world datasets. Second, as FCNs require large training datasets that are not available in bin-picking applications, we propose a simulation-based pipeline for generating training images using physics and rendering engines. Specifically, piles of instances are simulated and rendered with their ground-truth annotations from sets of texture images and meshes to which multiple random deformations are applied. We show that the proposed synthetic data is plausible for real-world applications in the sense that it enables the learning of deep representations transferable to real data. Through extensive experiments on a real-world robotic setup, our synthetically trained network outperforms the industrial baseline while achieving real-time performance. The proposed approach thus establishes a new baseline for model-free object-oriented bin-picking.
17

Rozpoznávání obličejů v zabezpečovacích a dohledových kamerových systémech / Face Recognition in Security and Surveillance Camera Systems

Malach, Tobiáš January 2020
This thesis deals with increasing the success rate of face recognition in CCTV surveillance systems and access control systems. To achieve this goal, a new approach is used: optimization of face templates. Optimizing template creation makes it possible to create templates that increase the recognition success rate. Measuring and further improving face recognition success requires meeting the following sub-goals of this thesis. The first goal is to design and assemble a representative face database that makes it possible to obtain credible and statistically reliable face recognition results in CCTV surveillance systems and access control systems. The second goal is to create a methodology for statistically reliable comparison of results, which allows relevant conclusions to be drawn. The third goal is research into template creation and its optimization. The results show that optimizing template creation increases the recognition success rate in these demanding applications typically by 4-8%, and in some cases by as much as 15%. Optimizing template creation thus contributes to the usability of face recognition in these applications.
18

Využití neuronových sítí pro predikaci síťového provozu / Neural network utilization for network traffic predictions

Pavela, Radek January 2009
This master’s thesis discusses the static properties of a network traffic trace. It also addresses the possibility of prediction, with a focus on neural networks, specifically recurrent neural networks. The training data were downloaded from a freely accessible link on the internet: a captured packet trace of LAN traffic from 2001. The data are not the most recent, but they are sufficient to achieve the objectives of the work. The input data had to be processed into an acceptable form, so a program was created in Visual Studio 2005 to aggregate the traffic intensities. Aggregation over 100 ms intervals proved best. This produced the input vector, which was divided into training and testing parts as required by the networks. The various types of networks operate on the same input data, which makes the results more objective. In practical terms, two principles had to be verified: the principle of training and the principle of generalization. The first requires training on the training patterns and verifying the training by means of the gradient and the mean squared error. The second consists of applying unknown patterns to the neural network and monitoring its response to these inputs. The best model appeared to be the layer recurrent neural network (LRN), so the solution was developed in this direction, followed by a search for a suitable variant of the recurrent network and its optimal configuration. The topology found is 10-10-1. Matlab 7.6 was used, with the Neural Network Toolbox 6 extension. The results are presented as graphs and a final evaluation. All successful models and network topologies are on the enclosed CD. However, the Neural Network Toolbox reported some problems when importing networks: a network can be imported, but most appear to be untrained, so the import function was not used in practice in this work. Unsuccessful network models are not presented in this master’s thesis, because they would reduce its clarity.
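The preprocessing described in this abstract (aggregating a packet trace into 100 ms intensity values, then splitting the resulting series into training and testing parts) can be sketched as follows. This is only an illustration in Python, not the thesis's Visual Studio program: the function names, the use of byte counts as "intensity", the window of 10 past values (suggested by the 10-10-1 topology), and the 80/20 split are all assumptions.

```python
from typing import List, Tuple

def aggregate_intensity(packets: List[Tuple[int, int]], bin_ms: int = 100) -> List[int]:
    """Sum packet sizes (bytes) into consecutive bins of bin_ms milliseconds.

    packets holds (timestamp_ms, size_bytes) pairs, with timestamps measured
    from the start of the trace. Both the layout and the name are hypothetical.
    """
    if not packets:
        return []
    n_bins = max(t for t, _ in packets) // bin_ms + 1
    series = [0] * n_bins
    for t, size in packets:
        series[t // bin_ms] += size
    return series

def make_windows(series: List[int], n_in: int = 10) -> List[Tuple[List[int], int]]:
    """Pair each run of n_in past values with the value that follows it."""
    return [(series[i:i + n_in], series[i + n_in])
            for i in range(len(series) - n_in)]

def split_train_test(pairs: list, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Chronological split; the 0.8 fraction is an assumed value."""
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]

# Tiny made-up trace: four packets spread over 350 ms.
trace = [(0, 60), (40, 1500), (120, 60), (350, 500)]
print(aggregate_intensity(trace))
```

The split is kept chronological rather than shuffled so that the test part contains only intervals later than the training part, which matches the idea of checking generalization on unseen patterns.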
19

Separating Tweets from Croaks : Detecting Automated Twitter Accounts with Supervised Learning and Synthetically Constructed Training Data / : Automationsdetektion av Twitter-konton med övervakad inlärning och syntetiskt konstruerad träningsmängd

Teljstedt, Erik Christopher January 2016
In this thesis, we have studied the problem of detecting automated Twitter accounts related to the Ukraine conflict using supervised learning. A striking problem with the collected data set is that it initially lacked a ground truth. Traditionally, supervised learning approaches rely on manual annotation of training sets, but this involves tedious work and becomes expensive for large and constantly changing collections. We present a novel approach that uses a rule-based classifier to synthetically generate large amounts of labeled Twitter accounts for the detection of automation. It significantly reduces the effort and resources needed and speeds up the process of adapting classifiers to changes in the Twitter domain. The classifiers were evaluated on a manually annotated test set of 1,000 Twitter accounts. The results show that the rule-based classifier by itself achieves a precision of 94.6% and a recall of 52.9%. Furthermore, the results show that classifiers based on supervised learning could learn from the synthetically generated labels. At best, these machine learning based classifiers achieved a slightly lower precision of 94.1% compared to the rule-based classifier, but a significantly better recall of 93.9%.
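To put the reported precision and recall figures on a single scale, one can compute the F1 score (the harmonic mean of precision and recall). The abstract itself does not report F1, so the values below are derived here from its quoted percentages, and the function name is chosen only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Figures quoted in the abstract, converted from percentages to fractions.
rule_based_f1 = f1_score(0.946, 0.529)   # rule-based classifier
learned_f1 = f1_score(0.941, 0.939)      # best supervised classifier

print(f"rule-based F1: {rule_based_f1:.3f}")
print(f"learned F1:    {learned_f1:.3f}")
```

Under this derived measure the rule-based classifier comes out at roughly 0.68 and the best learned classifier at roughly 0.94, which reflects the large gain in recall at almost unchanged precision.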
20

A Deep-Learning Approach to Evaluating the Navigability of Off-Road Terrain from 3-D Imaging

Pech, Thomas Joel 30 August 2017
No description available.
