161 |
Characterization of the structure, stratigraphy and CO2 storage potential of the Swedish sector of the Baltic and Hanö Bay basins using seismic reflection methods. Sopher, Daniel. January 2016
An extensive multi-channel seismic dataset acquired between 1970 and 1990 by Oljeprospektering AB (OPAB) has recently been made available by the Geological Survey of Sweden (SGU). This thesis summarizes four papers, which utilize this largely unpublished dataset to improve our understanding of the geology and CO2 storage capacity of the Baltic and Hanö Bay basins in southern Sweden. A range of new processing workflows were developed, which typically provide an improvement in the final stacked seismic image compared to the result obtained with the original processing. A method was developed to convert scanned images of seismic sections into SEGY files, which allows large amounts of the OPAB dataset to be imported and interpreted using modern software. A new method for joint imaging of multiples and primaries was developed, which is shown to improve the signal-to-noise ratio for some of the seismic lines within the OPAB dataset. For the first time, five interpreted regional seismic profiles detailing the entire sedimentary sequence within these basins are presented. Depth structure maps detailing the Outer Hanö Bay area and the deeper parts of the Baltic Basin were also generated. Although the overall structure and stratigraphy of the basins inferred from the reprocessed OPAB dataset are consistent with previous studies, some new observations have been made, which improve the understanding of the tectonic history of these basins and provide insight into how the depositional environments have changed over time. The effective CO2 storage potential within structural and stratigraphic traps is assessed for the Cambrian Viklau, När and Faludden sandstone reservoirs. A probabilistic methodology is utilized, which allows a robust assessment of the storage capacity as well as the associated uncertainty. The most favourable storage option in the Swedish sector of the Baltic Basin is assessed to be the Faludden stratigraphic trap, which is estimated to have a mid-case (P50) storage capacity of 3390 Mt in the deeper part of the basin, where CO2 can be stored in a supercritical phase.
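As an illustration of the probabilistic storage-capacity assessment the abstract refers to, the following is a minimal Monte Carlo sketch of a volumetric capacity estimate with P10/P50/P90 percentiles. All parameter distributions and values are hypothetical placeholders, not the inputs used in the thesis.

```python
import numpy as np

def storage_capacity_mc(n=100_000, seed=0):
    """Monte Carlo estimate of effective CO2 storage capacity (Mt).

    Illustrative volumetric formula: M = A * h * phi * rho_CO2 * E,
    with area A (m^2), net thickness h (m), porosity phi, CO2 density
    rho_CO2 (kg/m^3) and storage efficiency E. All distributions and
    values below are hypothetical, not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    A   = rng.uniform(1.0e9, 2.0e9, n)                  # trap area, m^2
    h   = rng.triangular(10.0, 25.0, 50.0, n)           # net reservoir thickness, m
    phi = rng.normal(0.20, 0.03, n).clip(0.05, 0.35)    # porosity
    rho = rng.uniform(600.0, 750.0, n)                  # supercritical CO2 density, kg/m^3
    E   = rng.uniform(0.02, 0.06, n)                    # storage efficiency factor
    mass_kg = A * h * phi * rho * E
    mass_mt = mass_kg / 1.0e9                           # kg -> Mt
    p10, p50, p90 = np.percentile(mass_mt, [10, 50, 90])
    return p10, p50, p90

p10, p50, p90 = storage_capacity_mc()
print(f"P10={p10:.0f} Mt  P50={p50:.0f} Mt  P90={p90:.0f} Mt")
```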
|
162 |
Development of artificial intelligence-based in-silico toxicity models : data quality analysis and model performance enhancement through data generation. Malazizi, Ladan. January 2008
Toxic compounds, such as pesticides, are routinely tested against a range of aquatic, avian and mammalian species as part of the registration process. The need to reduce dependence on animal testing has led to an increasing interest in alternative methods such as in silico modelling. QSAR (Quantitative Structure Activity Relationship)-based models are already in use for predicting physicochemical properties, environmental fate, eco-toxicological effects, and specific biological endpoints for a wide range of chemicals. Data plays an important role in modelling QSARs and also in analysing the results of toxicity testing processes. This research addresses a number of issues in predictive toxicology. One issue is the problem of data quality. Although a large amount of toxicity data is available from online sources, this data may contain unreliable samples and may be of low quality. Its presentation may also be inconsistent across different sources, which makes the access, interpretation and comparison of the information difficult. To address this issue we started with a detailed investigation and experimental work on the DEMETRA data. The DEMETRA datasets were produced by the EC-funded project DEMETRA. Based on the investigation, experiments and the results obtained, the author identified a number of data quality criteria in order to provide a solution for data evaluation in the toxicology domain. An algorithm has also been proposed to assess data quality before modelling. Another issue considered in the thesis is missing values in toxicology datasets. The Least Squares Method for a paired dataset and Serial Correlation for a single-version dataset provide solutions to this problem in two different situations. A procedural algorithm using these two methods has been proposed in order to overcome the problem of missing values. A further issue addressed in this thesis is the modelling of multi-class datasets in which the class distribution is severely imbalanced. Imbalanced data affects the performance of classifiers during the classification process. We have shown that, as long as we understand how class members are constructed in the dimensional space of each cluster, we can reform the distribution and provide more domain knowledge for the classifier.
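The abstract mentions a Least Squares Method for imputing missing values in a paired dataset. The following is a hedged sketch of one way such an imputation could look: fit a least-squares line between two paired endpoints and fill the gaps with fitted values. The variable names and example data are hypothetical, and the thesis's actual procedure may differ.

```python
import numpy as np

def impute_paired_least_squares(x, y):
    """Fill missing values (NaN) in y using a least-squares fit y ~ a*x + b
    learned from the rows where both x and y are observed.

    A simple illustration of least-squares imputation for a paired
    toxicity dataset (e.g. two related endpoints measured on the same
    compounds); not the exact procedure proposed in the thesis.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    both = ~np.isnan(x) & ~np.isnan(y)
    a, b = np.polyfit(x[both], y[both], deg=1)     # least-squares line
    filled = y.copy()
    missing = np.isnan(y) & ~np.isnan(x)
    filled[missing] = a * x[missing] + b
    return filled

# Example: toxicity values for the same compounds against two species,
# with one missing measurement in the second endpoint.
x = [1.2, 2.0, 3.1, 4.5, 5.0]
y = [1.0, 1.9, np.nan, 4.2, 4.8]
print(impute_paired_least_squares(x, y))
```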
|
163 |
Non-parametric workspace modelling for mobile robots using push broom lasers. Smith, Michael. January 2011
This thesis is about the intelligent compression of large 3D point cloud datasets. The non-parametric method that we describe simultaneously generates a continuous representation of the workspace surfaces from discrete laser samples and decimates the dataset, retaining only locally salient samples. Our framework attains decimation factors in excess of two orders of magnitude without significant degradation in fidelity. The work presented here has a specific focus on gathering and processing laser measurements taken from a moving platform in outdoor workspaces. We introduce a somewhat unusual parameterisation of the problem and look to Gaussian Processes as the fundamental machinery in our processing pipeline. Our system compresses laser data in a fashion that is naturally sympathetic to the underlying structure and complexity of the workspace. In geometrically complex areas, compression is lower than that in geometrically bland areas. We focus on this property in detail and it leads us well beyond a simple application of non-parametric techniques. Indeed, towards the end of the thesis we develop a non-stationary GP framework whereby our regression model adapts to the local workspace complexity. Throughout we construct our algorithms so that they may be efficiently implemented. In addition, we present a detailed analysis of the proposed system and investigate model parameters, metric errors and data compression rates. Finally, we note that this work is predicated on a substantial amount of robotics engineering which has allowed us to produce a high quality, peer reviewed, dataset - the first of its kind.
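For readers unfamiliar with the machinery mentioned above, the following is a minimal sketch of standard Gaussian Process regression in one dimension with a squared-exponential kernel, of the kind used to turn discrete laser samples into a continuous surface estimate with uncertainty. The kernel choice and hyperparameters here are arbitrary assumptions, not those of the thesis.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=0.5, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_regress(x_train, y_train, x_test, noise_var=0.01):
    """Standard GP regression posterior mean and variance (1-D).

    Sketch of the kind of non-parametric surface model the thesis builds
    from laser range samples; hyperparameters here are arbitrary.
    """
    K = sq_exp_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = sq_exp_kernel(x_train, x_test)
    K_ss = sq_exp_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

x = np.linspace(0, 5, 40)                   # laser sample positions
y = np.sin(x) + 0.1 * np.random.randn(40)   # noisy range measurements
xq = np.linspace(0, 5, 200)                 # query points on the surface
mu, var = gp_regress(x, y, xq)
```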
|
164 |
Webový vyhledávací systém / Web Search Engine. Tamáš, Miroslav. January 2014
The academic full-text search engine Egothor has recently become the starting point for several theses focused on search. Until now, however, no solution was available that provides a robust set of web-content processing tools. This master's thesis aims at the design and implementation of a distributed search system working primarily with internet sources. We analyze first-generation components for processing web content and summarize their primary features. We use those features to propose the architecture of a distributed web search engine, focusing mainly on the phases of data fetching, processing and indexing. We also describe the final implementation of the system and propose a few ideas for future extensions.
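As a toy illustration of the indexing phase discussed above, the following sketch builds an inverted index over already-fetched pages and answers conjunctive queries. It is not Egothor's API; the page contents and URLs are made up, and the real system distributes and persists this work.

```python
from collections import defaultdict
import re

def tokenize(text):
    """Lowercase word tokens; a stand-in for the processing phase."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(pages):
    """Map each term to the set of document IDs containing it.

    A toy stand-in for the indexing phase of a web search engine; the
    system described in the thesis distributes this work and persists
    the index, which this sketch does not attempt.
    """
    index = defaultdict(set)
    for doc_id, text in pages.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Conjunctive (AND) query over the inverted index."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

pages = {
    "http://example.org/a": "Distributed web search engines index crawled pages.",
    "http://example.org/b": "Fetching, processing and indexing are separate phases.",
}
idx = build_inverted_index(pages)
print(search(idx, "indexing phases"))
```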
|
165 |
Automatic prediction of emotions induced by movies / Reconnaissance automatique des émotions induites par les films. Baveye, Yoann. 12 November 2015
Jamais les films n’ont été aussi facilement accessibles aux spectateurs qui peuvent profiter de leur potentiel presque sans limite à susciter des émotions. Savoir à l’avance les émotions qu’un film est susceptible d’induire à ses spectateurs pourrait donc aider à améliorer la précision des systèmes de distribution de contenus, d’indexation ou même de synthèse des vidéos. Cependant, le transfert de cette expertise aux ordinateurs est une tâche complexe, en partie due à la nature subjective des émotions. Cette thèse est donc dédiée à la détection automatique des émotions induites par les films, basée sur les propriétés intrinsèques du signal audiovisuel. Pour s’atteler à cette tâche, une base de données de vidéos annotées selon les émotions induites aux spectateurs est nécessaire. Cependant, les bases de données existantes ne sont pas publiques à cause de problèmes de droit d’auteur ou sont de taille restreinte. Pour répondre à ce besoin spécifique, cette thèse présente le développement de la base de données LIRIS-ACCEDE. Cette base a trois avantages principaux: (1) elle utilise des films sous licence Creative Commons et peut donc être partagée sans enfreindre le droit d’auteur, (2) elle est composée de 9800 extraits vidéos de bonne qualité qui proviennent de 160 films et courts métrages, et (3) les 9800 extraits ont été classés selon les axes de “valence” et “arousal” induits grâce à un protocole de comparaisons par paires mis en place sur un site de crowdsourcing. L’accord inter-annotateurs élevé reflète la cohérence des annotations malgré la forte différence culturelle parmi les annotateurs. Trois autres expériences sont également présentées dans cette thèse. Premièrement, des scores émotionnels ont été collectés pour un sous-ensemble de vidéos de la base LIRIS-ACCEDE dans le but de faire une validation croisée des classements obtenus via crowdsourcing. Les scores émotionnels ont aussi rendu possible l’apprentissage d’un processus gaussien par régression, modélisant le bruit lié aux annotations, afin de convertir tous les rangs liés aux vidéos de la base LIRIS-ACCEDE en scores émotionnels définis dans l’espace 2D valence-arousal. Deuxièmement, des annotations continues pour 30 films ont été collectées dans le but de créer des modèles algorithmiques temporellement fiables. Enfin, une dernière expérience a été réalisée dans le but de mesurer de façon continue des données physiologiques sur des participants regardant les 30 films utilisés lors de l’expérience précédente. La corrélation entre les annotations physiologiques et les scores continus renforce la validité des résultats de ces expériences. Equipée d’une base de données, cette thèse présente un modèle algorithmique afin d’estimer les émotions induites par les films. Le système utilise à son avantage les récentes avancées dans le domaine de l’apprentissage profond et prend en compte la relation entre des scènes consécutives. Le système est composé de deux réseaux de neurones convolutionnels ajustés. L’un est dédié à la modalité visuelle et utilise en entrée des versions recadrées des principales frames des segments vidéos, alors que l’autre est dédié à la modalité audio grâce à l’utilisation de spectrogrammes audio. Les activations de la dernière couche entièrement connectée de chaque réseau sont concaténées pour nourrir un réseau de neurones récurrent utilisant des neurones spécifiques appelés “Long Short-Term Memory” qui permettent l’apprentissage des dépendances temporelles entre des segments vidéo successifs.
La performance obtenue par le modèle est comparée à celle d’un modèle basique similaire à l’état de l’art et montre des résultats très prometteurs mais qui reflètent la complexité de telles tâches. En effet, la prédiction automatique des émotions induites par les films est donc toujours une tâche très difficile qui est loin d’être complètement résolue. / Never before have movies been as easily accessible to viewers, who can enjoy anywhere the almost unlimited potential of movies for inducing emotions. Thus, knowing in advance the emotions that a movie is likely to elicit in its viewers could help to improve the accuracy of content delivery, video indexing or even summarization. However, transferring this expertise to computers is a complex task due in part to the subjective nature of emotions. The present thesis work is dedicated to the automatic prediction of emotions induced by movies based on the intrinsic properties of the audiovisual signal. To computationally deal with this problem, a video dataset annotated along the emotions induced in viewers is needed. However, existing datasets are not public due to copyright issues or are of a very limited size and content diversity. To answer this specific need, this thesis addresses the development of the LIRIS-ACCEDE dataset. The advantages of this dataset are threefold: (1) it is based on movies under Creative Commons licenses and thus can be shared without infringing copyright, (2) it is composed of 9,800 good quality video excerpts with a large content diversity extracted from 160 feature films and short films, and (3) the 9,800 excerpts have been ranked through a pair-wise video comparison protocol along the induced valence and arousal axes using crowdsourcing. The high inter-annotator agreement reflects that annotations are fully consistent, despite the large diversity of raters’ cultural backgrounds. Three other experiments are also introduced in this thesis. First, affective ratings were collected for a subset of the LIRIS-ACCEDE dataset in order to cross-validate the crowdsourced annotations. The affective ratings also made possible the learning of Gaussian Processes for Regression, modeling the noisiness from measurements, to map the whole ranked LIRIS-ACCEDE dataset into the 2D valence-arousal affective space. Second, continuous ratings for 30 movies were collected in order to develop temporally relevant computational models. Finally, a last experiment was performed in order to collect continuous physiological measurements for the 30 movies used in the second experiment. The correlation between both modalities strengthens the validity of the results of the experiments. Armed with a dataset, this thesis presents a computational model to infer the emotions induced by movies. The framework builds on the recent advances in deep learning and takes into account the relationship between consecutive scenes. It is composed of two fine-tuned Convolutional Neural Networks. One is dedicated to the visual modality and uses as input crops of key frames extracted from video segments, while the second one is dedicated to the audio modality through the use of audio spectrograms. The activations of the last fully connected layer of both networks are concatenated to feed a Long Short-Term Memory Recurrent Neural Network to learn the dependencies between the consecutive video segments.
The performance obtained by the model is compared to the performance of a baseline similar to previous work and shows very promising results but reflects the complexity of such tasks. Indeed, the automatic prediction of emotions induced by movies is still a very challenging task which is far from being solved.
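The following is a self-contained sketch (in PyTorch) of the kind of two-stream CNN plus LSTM architecture described above: one branch for key-frame crops, one for audio spectrograms, with concatenated features fed to an LSTM that predicts valence and arousal per segment. Layer sizes are arbitrary and the CNNs are not pretrained, unlike the fine-tuned networks used in the thesis.

```python
import torch
import torch.nn as nn

class AffectiveModel(nn.Module):
    """Sketch of a two-stream CNN + LSTM architecture of the kind the
    thesis describes: one CNN branch for key-frame crops, one for audio
    spectrograms, whose features are concatenated per video segment and
    fed to an LSTM that predicts valence and arousal over time.

    The layer sizes are arbitrary; the thesis fine-tunes pretrained CNNs,
    which this self-contained sketch does not do.
    """
    def __init__(self, feat_dim=64, hidden_dim=128):
        super().__init__()
        def small_cnn(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, feat_dim), nn.ReLU(),
            )
        self.visual = small_cnn(3)   # RGB key-frame crops
        self.audio = small_cnn(1)    # mono audio spectrograms
        self.lstm = nn.LSTM(2 * feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 2)  # valence, arousal

    def forward(self, frames, spectrograms):
        # frames: (B, T, 3, H, W), spectrograms: (B, T, 1, H, W)
        B, T = frames.shape[:2]
        v = self.visual(frames.flatten(0, 1)).view(B, T, -1)
        a = self.audio(spectrograms.flatten(0, 1)).view(B, T, -1)
        seq, _ = self.lstm(torch.cat([v, a], dim=-1))
        return self.head(seq)        # (B, T, 2) predictions per segment

model = AffectiveModel()
frames = torch.randn(2, 5, 3, 64, 64)
specs = torch.randn(2, 5, 1, 64, 64)
print(model(frames, specs).shape)    # torch.Size([2, 5, 2])
```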
|
166 |
Generation of synthetic plant images using deep learning architecture. Kola, Ramya Sree. January 2019
Background: Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) are the current state-of-the-art machine learning systems for data generation. The initial architecture proposal consists of two neural networks, a generator and a discriminator, which compete in a zero-sum game to generate data with realistic properties that are hard to distinguish from those of the original datasets. GANs have interesting applications in various domains, such as image synthesis, 3D object generation in the gaming industry, fake music generation (Dong et al.), text-to-image synthesis and many more. Despite this wide range of application domains, GANs are most popular for image data synthesis, and various architectures have been developed for it, evolving from fuzzy images of digits to photorealistic images. Objectives: In this research work, we study the literature on different GAN architectures to understand the significant work done to improve them. The primary objective of this research is the synthesis of plant images using Style GAN (Karras, Laine and Aila, 2018), a GAN variant based on style transfer. The research also focuses on identifying machine learning performance evaluation metrics that can be used to assess the Style GAN model on the generated image datasets. Methods: A mixed-method approach is used in this research. We review the literature on GANs and elaborate in detail how each GAN network is designed and how it evolved from the base architecture. We then study the Style GAN (Karras, Laine and Aila, 2018a) design in detail, and review related work on evaluating GAN model performance and measuring the quality of generated image datasets. We conduct an experiment implementing the style-based GAN on a leaf dataset (Kumar et al., 2012) to generate leaf images similar to the ground truth, and describe in detail the various steps of the experiment, such as data collection, preprocessing, training and configuration. We also evaluate the performance of the Style GAN training model on the leaf dataset. Results: We present the results of the literature review and of the conducted experiment to address the research questions. We review various GAN architectures and their key contributions, as well as numerous qualitative and quantitative evaluation metrics for measuring the performance of a GAN architecture. We then present the synthetic data samples generated by the style-based GAN learning model at various training GPU hours, together with the latest synthetic data sample obtained after training for around 8 GPU days on the Leafsnap dataset (Kumar et al., 2012). For most of the tested samples, the generated results are of sufficient quality to expand the dataset. We visualize the model performance with TensorBoard graphs and an overall computational graph of the learning model. We calculate the Fréchet Inception Distance (FID) score for our leaf Style GAN, which is observed to be 26.4268 (the lower the better). Conclusion: We conclude the research work with an overall review of the sections in the paper. The generated fake samples closely resemble the input ground truth and appear convincingly realistic to human visual judgement. However, the FID score calculated for the leaf Style GAN is large compared to that of the original Style GAN on its high-definition celebrity face dataset. We attempt to analyze the reasons for this large score.
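The abstract reports a Fréchet Inception Distance of 26.4268. For reference, the following is a minimal sketch of how FID is typically computed from precomputed Inception feature activations; the toy inputs below are random arrays standing in for real activations, and the feature extraction itself is omitted.

```python
import numpy as np
from scipy import linalg

def frechet_inception_distance(feats_real, feats_fake):
    """Fréchet Inception Distance between two sets of feature activations.

    FID = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^{1/2}).
    The inputs are assumed to be Inception-network activations already
    extracted for real and generated images.
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r @ cov_f, disp=False)
    if np.iscomplexobj(covmean):   # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))

# Toy example with random "activations"; real use would pass Inception features.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 64))
fake = rng.normal(0.1, 1.1, size=(500, 64))
print(frechet_inception_distance(real, fake))
```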
|
167 |
Détection des fraudes : de l’image à la sémantique du contenu : application à la vérification des informations extraites d’un corpus de tickets de caisse / Fraud detection : from image to semantics of content. Artaud, Chloé. 06 February 2019
Les entreprises, les administrations, et parfois les particuliers, doivent faire face à de nombreuses fraudes sur les documents qu’ils reçoivent de l’extérieur ou qu’ils traitent en interne. Les factures, les notes de frais, les justificatifs... tout document servant de preuve peut être falsifié dans le but de gagner plus d’argent ou de ne pas en perdre. En France, on estime les pertes dues aux fraudes à plusieurs milliards d’euros par an. Étant donné que le flux de documents échangés, numériques ou papiers, est très important, il serait extrêmement coûteux en temps et en argent de les faire tous vérifier par des experts de la détection des fraudes. C’est pourquoi nous proposons dans notre thèse un système de détection automatique des faux documents. Si la plupart des travaux en détection automatique des faux documents se concentrent sur des indices graphiques, nous cherchons quant à nous à vérifier les informations textuelles du document afin de détecter des incohérences ou des invraisemblances. Pour cela, nous avons tout d’abord constitué un corpus de tickets de caisse que nous avons numérisés et dont nous avons extrait le texte. Après avoir corrigé les sorties de l’OCR et fait falsifier une partie des documents, nous en avons extrait les informations et nous les avons modélisées dans une ontologie, afin de garder les liens sémantiques entre elles. Les informations ainsi extraites, et augmentées de leurs possibles désambiguïsations, peuvent être vérifiées les unes par rapport aux autres au sein du document et à travers la base de connaissances constituée. Les liens sémantiques de l’ontologie permettent également de chercher l’information dans d’autres sources de connaissances, et notamment sur Internet. / Companies, administrations, and sometimes individuals, have to face many frauds on documents they receive from outside or process internally. Invoices, expense reports, receipts... any document used as proof can be falsified in order to earn more money or not to lose it. In France, losses due to fraud are estimated at several billion euros per year. Since the flow of documents exchanged, whether digital or paper, is very large, it would be extremely costly and time-consuming to have them all checked by fraud detection experts. That is why we propose in our thesis a system for automatic detection of false documents. While most of the work on automatic detection of false documents focuses on graphic clues, we seek to verify the textual information in the document in order to detect inconsistencies or implausibilities. To do this, we first compiled a corpus of receipts, which we digitized and from which we extracted the text. After correcting the character recognition (OCR) outputs and having part of the documents falsified, we extracted the information and modelled it in an ontology, in order to keep the semantic links between the pieces of information. The information thus extracted, augmented with its possible disambiguations, can be verified against itself within the document and against the knowledge base thus built. The semantic links of the ontology also make it possible to search for information in other knowledge sources, particularly on the Internet.
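As a simple illustration of the kind of internal consistency check described above, the following sketch flags a receipt whose extracted line items do not sum to the printed total. It is a deliberate simplification of the ontology-based verification in the thesis, and the receipt data is hypothetical.

```python
from decimal import Decimal

def check_receipt_consistency(items, printed_total, tolerance=Decimal("0.01")):
    """Flag a receipt whose line items do not sum to the printed total.

    A deliberately simple stand-in for the ontology-based cross-checks
    described in the thesis (which also verify products, prices and shop
    data against external knowledge sources).
    """
    computed = sum((qty * unit_price for _, qty, unit_price in items), Decimal("0"))
    discrepancy = abs(computed - printed_total)
    return {
        "computed_total": computed,
        "printed_total": printed_total,
        "suspicious": discrepancy > tolerance,
    }

# Hypothetical extracted receipt content.
items = [
    ("coffee", 2, Decimal("1.50")),
    ("bread", 1, Decimal("2.20")),
]
print(check_receipt_consistency(items, printed_total=Decimal("7.20")))
# -> computed 5.20 vs printed 7.20: flagged as suspicious
```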
|
168 |
Large planetary data visualization using ROAM 2.0. Persson, Anders. January 2005
The problem of estimating an adequate level of detail for an object in a specific view is one of the important problems in computer 3D graphics and is especially important in real-time applications. The well-known continuous level-of-detail technique, Real-time Optimally Adapting Meshes (ROAM), has been employed with success for almost 10 years but has at present, due to the rapid development of graphics hardware, been found to be inadequate. Compared to many other level-of-detail techniques, it cannot benefit from the higher triangle throughput available on today's graphics cards.

This thesis describes the implementation of the new version of ROAM (informally known as ROAM 2.0) for the purpose of massive planetary data visualization. It shows how the problems of the old technique can be bridged to adapt to newer graphics cards while still benefiting from the advantages of ROAM. The resulting implementation presented here is specialized for spherical objects and handles both texture and geometry data of arbitrarily large sizes in an efficient way.
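As background to the level-of-detail decision at the heart of ROAM-style methods, the following sketch shows a common screen-space error test: project a triangle's geometric error into pixels and split when it exceeds a pixel budget. The projection formula and thresholds are illustrative assumptions, not necessarily those used in this implementation.

```python
import math

def screen_space_error(geometric_error, distance, fov_y_rad, viewport_height_px):
    """Approximate size in pixels of a triangle's geometric error when
    viewed from `distance`; a common simplification used by ROAM-style
    level-of-detail schemes (not necessarily the exact metric in the thesis)."""
    pixels_per_unit = viewport_height_px / (2.0 * distance * math.tan(fov_y_rad / 2.0))
    return geometric_error * pixels_per_unit

def should_split(geometric_error, distance, fov_y_rad=math.radians(60),
                 viewport_height_px=1080, threshold_px=2.0):
    """Split the triangle if its projected error exceeds the pixel budget."""
    return screen_space_error(geometric_error, distance, fov_y_rad,
                              viewport_height_px) > threshold_px

# A 5 m error is visible up close but negligible far away on a planet-scale mesh.
print(should_split(geometric_error=5.0, distance=1_000.0))     # True
print(should_split(geometric_error=5.0, distance=500_000.0))   # False
```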
|
169 |
Influence Of Filtering On Linear And Nonlinear Single Degree Of Freedom Demands. Ozen, Onder Garip. 01 November 2006
Ground-motion data processing is a necessity for most earthquake engineering related studies. Important engineering parameters such as the peak values of ground motion and the ordinates of the response spectra are determined from the strong ground-motion data recorded by accelerometers. However, the raw data needs to be processed since the recorded data always contains high- and low-frequency noise from different sources.
Low-cut filters are the most popular ground-motion data processing scheme for removing long-period noise. Removing long-period noise from the raw accelerogram is important since the displacement spectrum, which provides primary information about deformation demands on structural systems, is highly sensitive to long-period noise.
The objective of this study is to investigate the effect of low-cut filtering period on linear and nonlinear deformation demands. A large number of strong ground motions from Europe and the Middle East representing different site classes as well as different magnitude and distance ranges are used to conduct statistical analysis. The statistical results are used to investigate the influence of low-cut filter period on spectral displacements.
The results of the study are believed to be useful for future generations of ground-motion prediction equations for deformation demands, which are of great importance in performance-based earthquake engineering.
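As an illustration of the processing step under study, the following sketch applies a Butterworth low-cut (high-pass) filter to an accelerogram and double-integrates to displacement. The corner period, filter order and integration scheme are common choices, not necessarily those of this study, and the input record is synthetic.

```python
import numpy as np
from scipy import signal

def lowcut_filter_and_integrate(acc, dt, corner_period_s=20.0, order=4):
    """Apply an acausal Butterworth low-cut (high-pass) filter to an
    accelerogram and double-integrate to ground displacement.

    Illustrative only: the corner period, filter order and trapezoidal
    integration are common choices, not necessarily those of the study.
    """
    fs = 1.0 / dt
    fc = 1.0 / corner_period_s                              # corner frequency, Hz
    b, a = signal.butter(order, fc / (0.5 * fs), btype="highpass")
    acc_f = signal.filtfilt(b, a, acc)                      # zero-phase filtering
    vel = np.cumsum((acc_f[:-1] + acc_f[1:]) * 0.5) * dt    # trapezoidal rule
    disp = np.cumsum((vel[:-1] + vel[1:]) * 0.5) * dt
    return acc_f, vel, disp

# Synthetic record: 1 Hz pulse plus a slow drift standing in for long-period noise.
dt = 0.01
t = np.arange(0, 60, dt)
acc = np.sin(2 * np.pi * 1.0 * t) * np.exp(-0.1 * t) + 0.002 * t
acc_f, vel, disp = lowcut_filter_and_integrate(acc, dt)
```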
|
170 |
Dynamic and Static Approaches for Glyph-Based Visualization of Software Metrics. Majid, Raja. January 2008
This project presents research on software visualization techniques. We introduce the concepts of software visualization and software metrics, together with our proposed visualization techniques: Static Visualization (glyph objects with static textures) and Dynamic Visualization (glyph objects with moving elements). Our intent is to study the existing techniques for visualizing software metrics and then propose a new visualization approach that is more time-efficient and easier for the viewer to perceive. In this project, we focus on the practical aspects of visualizing multivariate datasets. The project also provides an implementation of the proposed visualization techniques for software metrics. In this research-based work, we compare the proposed visualization approaches in practice. We discuss the software development life cycle of our proposed visualization system and describe its complete software implementation.
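As a rough illustration of the static-glyph idea described above, the following sketch maps a few hypothetical per-class software metrics onto glyph position, size and colour with matplotlib. The metric names, values and mappings are invented for illustration and are not taken from the project.

```python
import matplotlib.pyplot as plt

# Hypothetical per-class software metrics: (name, lines_of_code,
# cyclomatic_complexity, number_of_methods, coupling).
metrics = [
    ("Parser",    1200, 35, 42, 7),
    ("Lexer",      400, 12, 18, 3),
    ("CodeGen",   2100, 55, 60, 11),
    ("Optimizer",  900, 28, 25, 6),
]

fig, ax = plt.subplots()
for name, loc, complexity, methods, coupling in metrics:
    # Static glyph: position encodes methods vs. coupling, size encodes
    # lines of code, colour encodes cyclomatic complexity.
    ax.scatter(methods, coupling, s=loc / 4.0, c=[complexity],
               cmap="viridis", vmin=0, vmax=60, alpha=0.8, edgecolors="black")
    ax.annotate(name, (methods, coupling), textcoords="offset points", xytext=(5, 5))

ax.set_xlabel("number of methods")
ax.set_ylabel("coupling (dependencies)")
ax.set_title("Static glyph view of software metrics (illustrative data)")
plt.show()
```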
|