331 |
Estrogens Rapidly Enhance Neural Plasticity and Learning. Phan, Anna. 24 July 2013
This thesis examines the rapid, non-genomic effects of estrogens on neural plasticity and learning. Estrogens are classically known to affect gene transcription; however, they have more recently been found to also activate second messenger systems rapidly, within 1 h of administration. We therefore first examined the rapid effects of 17β-estradiol and of agonists of estrogen receptor (ER) α and ERβ on three learning paradigms: object placement, object recognition, and social recognition. We found that both systemic injection and intrahippocampal delivery of 17β-estradiol and the ERα agonist improved performance on all three learning paradigms within 40 min of hormone administration. However, the ERβ agonist, administered systemically or intrahippocampally, improved performance only on the object placement paradigm, had no effect on object recognition, and impaired social recognition at high doses. To elucidate how estrogens might rapidly affect learning, we examined how they rapidly affect the neural plasticity of CA1 hippocampal neurons. We found that 17β-estradiol and the ERα agonist increased dendritic spine density in CA1 hippocampal neurons within 40 min of administration, suggesting that estrogens rapidly increase the density of synapses within this brain region. In contrast, the ERβ agonist either did not affect or decreased spine density. In addition, using whole-cell patch clamp recordings of CA1 pyramidal neurons, we determined that 17β-estradiol and the ERα agonist rapidly reduced AMPA receptor (but not NMDA receptor) mediated membrane depolarizations after 15 min of hormone application. Similarly, the ERβ agonist had no effect on AMPA or NMDA receptor mediated membrane depolarizations. These data suggest that estrogens rapidly promote the development of immature synapses (which contain low levels of synaptic AMPA receptors) within the CA1 hippocampus.
Immature spines provide synaptic sites at which new memories can be stored and are thought of as “learning spines” (Kasai et al., 2003). Therefore, estrogens (through ERα) may rapidly induce the formation of immature hippocampal spines to promote learning. / Funded by NSERC
|
332 |
Learning descriptive models of objects and activities from egocentric video. Fathi, Alireza. 29 August 2013
Recent advances in camera technology have made it possible to build a comfortable, wearable system which can capture the scene in front of the user throughout the day. Products based on this technology, such as GoPro and Google Glass, have generated substantial interest. In this thesis, I present my work on egocentric vision, which leverages wearable camera technology and provides a new line of attack on classical computer vision problems such as object categorization and activity recognition.
The dominant paradigm for object and activity recognition over the last decade has been based on using the web. In this paradigm, in order to learn a model for an object category like coffee jar, various images of that object type are fetched from the web (e.g. through Google image search), features are extracted and then classifiers are learned. This paradigm has led to great advances in the field and has produced state-of-the-art results for object recognition. However, it has two main shortcomings: a) objects on the web appear in isolation and they miss the context of daily usage; and b) web data does not represent what we see every day.
In this thesis, I demonstrate that egocentric vision can address these limitations as an alternative paradigm. I will demonstrate that contextual cues and the actions of a user can be exploited in an egocentric vision system to learn models of objects under very weak supervision. In addition, I will show that measurements of a subject's gaze during object manipulation tasks can provide novel feature representations to support activity recognition. Moving beyond surface-level categorization, I will showcase a method for automatically discovering object state changes during actions, and an approach to building descriptive models of social interactions between groups of individuals. These new capabilities for egocentric video analysis will enable new applications in life logging, elder care, human-robot interaction, developmental screening, augmented reality and social media.
|
333 |
基於多元編碼機制之區域特徵描述子 / Local Descriptors Based on Multi-level Encoding Scheme. 翁苡甄. Unknown Date
Efficient and robust image recognition is an important yet challenging task in computer vision. With the spread of mobile devices and digital cameras, the demand for accurate and efficient recognition has become increasingly pressing. Over the past decade, local feature descriptors based on local gradient distributions and histogram representations, such as SIFT and SURF, have been the mainstream algorithms in object recognition. However, these descriptors are high-dimensional, so they require large amounts of storage and costly distance computations. Local binary descriptors (LBD) have therefore been proposed in recent years; by building the descriptor from binary tests, an LBD achieves comparable recognition rates at dramatically lower storage cost.
In this thesis, we propose to replace the binary encoding of LBD with a multi-level encoding scheme while keeping the basic LBD architecture. The resulting descriptor is named the local multi-level encoding descriptor (LMLED). LMLED uses multiple decision intervals separated by buffer zones and can thus achieve better robustness to noise. Dimension reduction methods have been devised to retain the low storage cost of binary encoding. Extensive experiments validate that LMLED achieves superior recognition performance under noisy conditions while maintaining comparable matching efficacy and storage requirements.
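The contrast between binary and multi-level encoding can be illustrated with a minimal sketch. It assumes the descriptor is built from intensity differences between sampled point pairs and that the buffer is a symmetric dead zone of half-width `t` around zero. The function names, the three-level code and the threshold value are illustrative assumptions, not the design used in the thesis.

```python
import numpy as np

def binary_encode(diffs):
    """LBD-style test: 1 if the intensity difference is positive, else 0."""
    return (np.asarray(diffs) > 0).astype(np.int8)

def ternary_encode(diffs, t):
    """Three-level code with a buffer zone of half-width t around zero.

    Differences inside [-t, t] map to 0, so small noise-induced sign flips
    do not change the code. (Hypothetical three-level variant.)
    """
    diffs = np.asarray(diffs)
    codes = np.zeros(diffs.shape, dtype=np.int8)
    codes[diffs > t] = 1
    codes[diffs < -t] = -1
    return codes

def code_distance(a, b):
    """Sum of absolute level differences (equals Hamming distance for binary codes)."""
    a = np.asarray(a, dtype=np.int16)
    b = np.asarray(b, dtype=np.int16)
    return int(np.abs(a - b).sum())

# Hypothetical pairwise intensity differences, before and after mild noise.
clean = np.array([12.0, -9.0, 1.0, -0.5, 20.0])
noisy = clean + np.array([0.5, -0.5, -2.0, 1.5, 0.0])

binary_dist = code_distance(binary_encode(clean), binary_encode(noisy))
ternary_dist = code_distance(ternary_encode(clean, 3.0), ternary_encode(noisy, 3.0))
print(binary_dist, ternary_dist)
```

In this toy input, the two near-zero differences change sign under noise, which alters the binary code but leaves the buffered code unchanged. The price is one extra level per test, which is why the abstract pairs the scheme with dimension reduction.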
|
334 |
Mathematical imaging tools in cancer research: from mitosis analysis to sparse regularisation. Grah, Joana Sarah. January 2018
This dissertation deals with customised image analysis tools in cancer research. In the biomedical sciences, mathematical imaging has become crucial: advances in technical equipment and data storage call for sound mathematical methods that can process and analyse imaging data in an automated way. This thesis contributes to the development of such mathematically sound imaging models in four ways. (i) Automated cell segmentation and tracking. In cancer drug development, time-lapse light microscopy experiments are conducted for performance validation. The aim is to monitor the behaviour of cells in cultures that have previously been treated with chemotherapy drugs, since atypical duration and outcome of mitosis, the process of cell division, can indicate that a drug is working. As an imaging modality we focus on phase contrast microscopy, which avoids phototoxicity and influence on cell behaviour. As a drawback, the common halo and shade-off effects impede image analysis. We present a novel workflow that combines automated mitotic cell detection using the circular Hough transform with subsequent cell tracking by a tailor-made level-set method, in order to obtain statistics on the length of mitosis and on cell fates. The proposed image analysis pipeline is deployed in a MATLAB software package called MitosisAnalyser. (ii) The circular Hough transform is investigated further in the framework of image regularisation for imaging inverse problems in which circular objects should be enhanced, exploiting sparsity of first-order derivatives in combination with the linear circular Hough transform operation. (iii) We present a new unified higher-order derivative-type regularisation functional that enforces sparsity of a vector field related to the image to be reconstructed, using curl, divergence and shear operators.
The model is able to interpolate between well-known regularisers such as total generalised variation and infimal convolution total variation. (iv) Finally, we demonstrate how sparsity-promoting parametrised regularisers can be learned via quotient minimisation, which can be motivated by generalised eigenproblems. Learning approaches have recently become very popular in the field of inverse problems. However, the majority aim at fitting models to favourable training data, whereas we incorporate knowledge about both fit and misfit data. We present results resembling the behaviour of well-established derivative-based sparse regularisers, introduce novel families of non-derivative-based regularisers and extend this framework to classification problems.
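The circular Hough transform used for mitotic cell detection can be sketched as follows. This is a minimal Python illustration, not code from the MATLAB package MitosisAnalyser. It assumes a binary edge image and a single known radius; the function names and the synthetic test circle are hypothetical.

```python
import numpy as np

def circular_hough(edges, radius, n_angles=360):
    """Vote for circle centres of a known radius; returns the accumulator."""
    h, w = edges.shape
    acc = np.zeros((h, w), dtype=np.int32)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    ys, xs = np.nonzero(edges)
    for y, x in zip(ys, xs):
        # Each edge pixel votes for every centre lying at distance `radius` from it.
        cy = np.rint(y - radius * np.sin(thetas)).astype(int)
        cx = np.rint(x - radius * np.cos(thetas)).astype(int)
        ok = (cy >= 0) & (cy < h) & (cx >= 0) & (cx < w)
        if not ok.any():
            continue
        # Deduplicate so one edge pixel contributes at most one vote per cell.
        cells = np.unique(np.stack([cy[ok], cx[ok]], axis=1), axis=0)
        acc[cells[:, 0], cells[:, 1]] += 1
    return acc

def synthetic_circle(shape, centre, radius, n_points=720):
    """Binary image containing a thin circle outline (test data only)."""
    img = np.zeros(shape, dtype=bool)
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    yy = np.rint(centre[0] + radius * np.sin(t)).astype(int)
    xx = np.rint(centre[1] + radius * np.cos(t)).astype(int)
    img[yy, xx] = True
    return img

edges = synthetic_circle((64, 64), centre=(30, 34), radius=10)
acc = circular_hough(edges, radius=10)
peak = np.unravel_index(np.argmax(acc), acc.shape)
print(peak)
```

The accumulator peak marks the most likely circle centre. In practice the radius is usually unknown, so the vote is repeated over a range of radii, which adds a third accumulator dimension.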
|
335 |
Modulação do sistema das poliaminas e bloqueio seletivo de correntes de K+ do tipo A reverte o dano cognitivo induzido por peptídeo β-amiloide25-35 / Modulation of polyamine system and blockade of A-Type K+ currents counteracts β-Amyloid25-35-induced cognitive deficits. Gomes, Guilherme Monteiro. 18 November 2013
Conselho Nacional de Desenvolvimento Científico e Tecnológico / In Alzheimer's disease (AD), β-amyloid peptide (Aβ) has been linked with synaptic loss and cognitive dysfunction, although the precise mechanism remains unknown. An involvement of N-methyl-D-aspartate receptors (NMDAR) in AD has been proposed, since their inhibition attenuates some aspects of AD neuropathology. In this regard, polyamines such as spermidine and spermine, which are positive modulators of NMDARs, show increased levels and synthesis both in the brains of AD patients and in in vitro preparations exposed to Aβ. Using the novel object recognition task, we showed that negative modulation of the polyamine system, either through blockade of its binding site at the NMDAR by arcaine (0.02 nmol/site) or traxoprodil (0.002 nmol/site), or through inhibition of polyamine synthesis by DFMO (2.7 nmol/site), reverses Aβ25-35-induced memory impairment in mice. Activation of the polyamine binding site at extrasynaptic NMDARs might underlie the cognitive deficits of Aβ25-35-treated mice, since incubation of primary hippocampal neuron cultures with spermidine (400 μM), NMDA (200 μM) or Aβ25-35 (10 μM) significantly increased nuclear accumulation of jacob protein, a marker of extrasynaptic NMDAR activation. Moreover, traxoprodil (4 nM), arcaine (4 μM) or DFMO (5 μM) blocked the Aβ-induced nuclear translocation of jacob. Activation of extrasynaptic NMDARs in neurons leads to stripping of synaptic contacts and simplification of neuronal cytoarchitecture. Incubation of hippocampal neuron cultures with traxoprodil (4 nM), arcaine (4 μM) or DFMO (5 μM) reversed the deleterious effects of Aβ25-35 on dendritic spine number and morphology. We also evaluated the involvement of A-type K+ currents in the memory impairment induced by i.c.v. injection of Aβ25-35.
Administration of Tx3-1 (3 100 pmol/site), a selective IA blocker, restored the memory of mice injected with Aβ25-35 and tested on the novel object recognition task. The reversal of memory impairment and the protective effect on dendritic spine alterations exerted by modulators of the polyamine system suggest that the polyamine binding site at the NMDAR, possibly extrasynaptic, is a potential player in the Aβ-induced cognitive deficit.
|
336 |
Transformations polynomiales, applications à l'estimation de mouvements et la classification / Polynomial transformations, applications to motion estimation and classification. Moubtahij, Redouane El. 11 June 2016
This research concerns the modelling of the dynamic functional information carried by apparent displacement fields, using bases of orthogonal polynomials. The goal is to model the extracted motion and texture for use in the automatic analysis and recognition of images and videos. We are interested in both human movements and dynamic textures. Bases of orthogonal polynomials were studied; this approach is particularly interesting because it offers both a multi-resolution and a multi-scale decomposition.
The first contribution of this thesis is a spatial image decomposition method: the image is projected and partially reconstructed, with an appropriate choice of the degree of anisotropy associated with the decomposition equation based on polynomial transformations. This spatial approach is extended to three dimensions in order to extract dynamic texture from videos. The second contribution is to use the image sequences that represent the geometric parts as initial images for extracting colour optical flow. Two action descriptors, one spatial and one spatio-temporal, based on the combination of motion and texture information, are then extracted. It is thus possible to define a system that recognises a complex action (composed of a sequence of displacement fields and polynomial textures) in a video.
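The project-and-partially-reconstruct step can be sketched in a few lines. The abstract does not say which polynomial family is used, so this sketch assumes a separable Legendre basis, re-orthonormalised on the pixel grid, with the same maximum degree in both directions. It omits the anisotropy parameter, and all names are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre

def orthonormal_basis(n_samples, max_degree):
    """Columns are discrete-orthonormal polynomials of degree 0..max_degree."""
    x = np.linspace(-1.0, 1.0, n_samples)
    vander = legendre.legvander(x, max_degree)  # shape (n_samples, max_degree + 1)
    q, _ = np.linalg.qr(vander)                 # orthonormalise on the sample grid
    return q

def project_and_reconstruct(image, max_degree):
    """Project an image onto a separable polynomial basis and rebuild it."""
    by = orthonormal_basis(image.shape[0], max_degree)
    bx = orthonormal_basis(image.shape[1], max_degree)
    coeffs = by.T @ image @ bx      # polynomial coefficients
    smooth = by @ coeffs @ bx.T     # low-degree part captured by the basis
    residual = image - smooth       # remainder the basis cannot represent
    return coeffs, smooth, residual

# Test image that is itself a low-degree polynomial in x and y.
yy, xx = np.mgrid[-1:1:32j, -1:1:48j]
image = 1.0 + 2.0 * xx - 0.5 * yy**2 + 0.3 * xx * yy
coeffs, smooth, residual = project_and_reconstruct(image, max_degree=2)
print(np.abs(residual).max())
```

For an image that is a polynomial of degree at most 2 in each variable, the residual vanishes up to rounding. On real images, the low-degree reconstruction plays the role of a smooth geometric component and the residual carries the finer, texture-like detail.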
|
337 |
3D Semantic SLAM of Indoor Environment with Single Depth Sensor / SLAM sémantique 3D de l'environnement intérieur avec capteur de profondeur simple. Ghorpade, Vijaya Kumar. 20 December 2017
Intelligent autonomous action by a mobile robot in an ordinary environment requires maps. A map holds the spatial information about the environment and gives the robot the 3D geometry of its surroundings, which is used not only to avoid collisions with obstacles but also for self-localization and task planning. In the future, however, service and personal robots will prevail, and they will need to interact with the environment in addition to localizing and navigating. This interaction requires next-generation robots to understand and interpret their environment and to perform tasks in a human-centric way. A simple map of the environment is far from sufficient for robots to co-exist with and assist humans. Human beings build maps and interact with their environment effortlessly; for robots, these seemingly trivial tasks are complex problems. Layering semantic information on regular geometric maps is the leap that turns an ordinary mobile robot into a more intelligent autonomous system. A semantic map augments a general map with information about entities (objects, functionalities, or events) located in space. Including semantics in the map enhances the robot's spatial knowledge representation and improves its performance in managing complex tasks and human interaction. Several approaches have been proposed to address the semantic SLAM problem with laser scanners and RGB-D time-of-flight sensors, but the topic is still in its nascent phase. In this thesis, an attempt to solve semantic SLAM using a time-of-flight sensor that provides only depth information is proposed. Time-of-flight cameras have dramatically changed the field of range imaging and have surpassed traditional scanners in speed of data acquisition, simplicity and price. These depth sensors are expected to become ubiquitous in future robotic applications. Starting with a brief motivation for adding semantics to ordinary maps in the first chapter, the state-of-the-art methods are discussed in the second chapter.
Before the camera was used for data acquisition, its noise characteristics were studied meticulously and the camera was properly calibrated. The novel noise filtering algorithm developed in the process helps to obtain clean data for better scan matching and SLAM. The quality of the SLAM process is evaluated using a context-based similarity score metric, which was designed specifically for the acquisition parameters and data used. Planar surfaces are extracted from the reconstructed point-cloud map using the Hough transform. The semantic layer is abstracted from the reconstructed point cloud in two stages. In large-scale, higher-level semantic interpretation, the prominent surfaces in the indoor environment are extracted and recognized; they include walls, doors, ceilings and clutter. In indoor single-scene, object-level semantic interpretation, a single 2.5D scene from the camera is parsed and its objects and surfaces are recognized. Object recognition is achieved using a novel pose-invariant shape signature based on the probability distribution of the most stable and repeatable 3D keypoints, which allows scenes containing known and unknown objects, with or without occlusions, to be interpreted. The classification of prominent surfaces and the single-scene semantic interpretation are done using supervised machine learning and deep learning systems. The object dataset and SLAM data have been made publicly available for academic research.
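The abstract mentions extracting planar surfaces from the reconstructed point cloud with the Hough transform. A minimal sketch of a 3D Hough transform for a single dominant plane is given below. It assumes planes are written as n · p = rho with the normal restricted to the upper hemisphere. The grid resolutions, function name and synthetic cloud are illustrative assumptions, not the thesis's pipeline.

```python
import numpy as np

def hough_plane(points, n_theta=30, n_phi=15, rho_step=0.05):
    """Return (normal, rho) of the dominant plane n . p = rho in a point cloud."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)  # azimuth
    phis = np.linspace(0.0, np.pi / 2.0, n_phi)                      # polar angle
    t, p = np.meshgrid(thetas, phis, indexing="ij")
    normals = np.stack(
        [np.sin(p) * np.cos(t), np.sin(p) * np.sin(t), np.cos(p)], axis=-1
    ).reshape(-1, 3)
    rho = points @ normals.T                 # offset of every point along every normal
    rho_max = np.abs(rho).max() + rho_step
    bins = np.floor((rho + rho_max) / rho_step).astype(int)
    n_bins = int(bins.max()) + 1
    acc = np.zeros((normals.shape[0], n_bins), dtype=np.int64)
    for k in range(normals.shape[0]):
        # Histogram of offsets for candidate normal k: one vote per point.
        acc[k] = np.bincount(bins[:, k], minlength=n_bins)
    k_best, b_best = np.unravel_index(np.argmax(acc), acc.shape)
    rho_best = (b_best + 0.5) * rho_step - rho_max   # centre of the winning bin
    return normals[k_best], rho_best

# Synthetic cloud: a horizontal plane at z = 0.5 plus uniformly scattered outliers.
rng = np.random.default_rng(0)
plane_pts = np.column_stack(
    [rng.uniform(-1, 1, 400), rng.uniform(-1, 1, 400), np.full(400, 0.5)]
)
outliers = rng.uniform(-1, 1, size=(100, 3))
cloud = np.vstack([plane_pts, outliers])
normal, rho = hough_plane(cloud)
print(normal, rho)
```

To extract several surfaces, one would typically remove the inliers of the winning plane and repeat the vote. Labelling each plane as wall, door or ceiling is a separate classification step that this sketch does not attempt.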
|
338 |
Rôle(s) du récepteur aux cannabinoïdes mitochondrial de type 1 dans le cerveau / Role(s) of the mitochondrial type-1 cannabinoid receptor in the brain. Desprez, Tifany. 13 May 2015
Type-1 cannabinoid receptor (CB1) is a G protein-coupled receptor (GPCR) that is widely expressed in the brain and regulates numerous physiological processes. However, the cellular mechanisms of CB1-mediated control of these functions are poorly understood.
Although CB1 receptors are known to signal at the plasma membrane, a portion of them are also present and functional in mitochondria (mtCB1), where their activation decreases mitochondrial respiration. The goal of this thesis was to dissect the impact of brain mtCB1 signaling on known behavioral effects of cannabinoids. To distinguish the functions of mtCB1 from those of other receptor pools, we developed tools based on the characterization of the intra-mitochondrial molecular cascade induced by mtCB1 receptors. In isolated brain mitochondria, we found that an intra-mitochondrial decrease in soluble adenylyl cyclase (sAC) activity links mtCB1-dependent activation of Gαi/o proteins to decreased cellular respiration. Local inhibition of sAC activity in the brain blocks cannabinoid-induced amnesia and catalepsy, and partially prevents cannabinoid-induced hypolocomotion. In addition, we generated a functional mutant CB1 protein (DN22-CB1) that lacks the first 22 amino acids of CB1 and its mitochondrial localization. Unlike CB1, activation of DN22-CB1 does not affect mitochondrial activity. In vivo hippocampal expression of DN22-CB1 abolished both the cannabinoid-induced impairment of synaptic transmission and amnesia in mice. Together, these studies couple mitochondrial activity to behavioral performance. The involvement of mtCB1 in the effects of cannabinoids on memory and motor control highlights the key role of bioenergetic processes as regulators of brain functions.
|
339 |
Recalage hétérogène pour la reconstruction 3D de scènes sous-marines / Heterogeneous Registration for 3D Reconstruction of Underwater Scene. Mahiddine, Amine. 30 June 2015
The survey and 3D reconstruction of underwater scenes are becoming increasingly indispensable given our growing interest in studying the seabed. Most existing work in this area relies on acoustic sensors, with images often serving only as illustration. The objective of this thesis is to develop techniques for fusing heterogeneous data from a photogrammetric system and an acoustic system.
The work presented is organized in three parts. The first is devoted to processing 2D data: the colors of the underwater images are improved in order to increase the repeatability of the feature descriptors at each 2D point. We then propose a system for visualizing the scene in 2D as a mosaic. In the second part, a 3D reconstruction method from an unordered set of images is proposed. The resulting 3D data are merged with data from the acoustic system in order to reconstruct the underwater site. In the last part, we propose an original 3D registration method, distinguished by the nature of the descriptor extracted at each point. The proposed descriptor is invariant to isometric transformations (rotation, translation) and overcomes the multi-resolution problem. We validate the approach with a study on synthetic and real data, in which we show the limits of existing registration methods from the literature. Finally, we propose an application of our method to 3D object recognition.
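Once descriptors have been matched between two point sets, a rigid registration still has to recover the rotation and translation that align them. The sketch below uses the standard Kabsch (SVD-based) least-squares solution. It is not the descriptor or registration method proposed in the thesis, and it assumes point correspondences are already known. The function name and the synthetic data are illustrative.

```python
import numpy as np

def rigid_align(source, target):
    """Least-squares rotation r and translation t with r @ source_i + t ~= target_i."""
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    h = (source - src_c).T @ (target - tgt_c)   # 3x3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))      # guard against a reflection
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = tgt_c - r @ src_c
    return r, t

# Synthetic check: rotate a random cloud by 30 degrees about z and shift it.
rng = np.random.default_rng(0)
source = rng.normal(size=(20, 3))
angle = np.deg2rad(30.0)
r_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([0.5, -1.0, 2.0])
target = source @ r_true.T + t_true
r_est, t_est = rigid_align(source, target)
print(np.round(r_est, 3), np.round(t_est, 3))
```

Because rotations and translations preserve pairwise distances, a descriptor built only from such distances takes the same value before and after the motion. That is the sense in which a descriptor can be invariant to isometric transformations.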
|
340 |
Produktmatchning EfficientNet vs. ResNet : En jämförelse / Product matching EfficientNet vs. ResNet. Malmgren, Emil; Järdemar, Elin. January 2021
E-commerce is growing steadily: between 2010 and 2014, the share of consumers shopping online rose from 28.9% to 34.2%. Insufficient information about a product's price forces buyers to search across several retailers for the best price. There are different ways to obtain the information needed to compare prices, and one of them is automated product matching. This method uses image recognition algorithms, whose purpose is to detect, locate, and recognize objects in images. Image recognition algorithms often struggle to find objects in images because of external factors such as lighting, viewing angle, and images that contain a lot of unnecessary information. In the past, algorithms such as artificial neural networks (ANN), random forest classifiers, and support vector machines have been used, but more recent studies have shown that convolutional neural networks (CNN) are better at finding important features of objects, which makes them less sensitive to these external factors.
Two CNN architectures that have emerged are EfficientNet and ResNet. Both have shown good results in previous studies, but there is little research to help one choose which CNN architecture leads to the best possible result. Our research question is therefore: which of the EfficientNet and ResNet architectures gives the highest result on product matching, measured by f1-score, precision, and recall? The results show that EfficientNet is the best architecture overall for product matching on the study's dataset. The results also show that ResNet was better than EfficientNet at proposing correct matches: the matches ResNet makes are correct more often than those EfficientNet suggests, since ResNet achieved a higher precision than EfficientNet. EfficientNet, however, achieves a better recall, which shows that it is better than ResNet at finding more or all of the correct matches among its potential matches. The difference in recall between the models is greater than the difference in precision, which means that EfficientNet gets a higher f1-score and is better than ResNet overall, but which property matters most is open to discussion. Is it more important that the suggested matches are correct, or that all the correct matches are found? If correct suggestions matter most, ResNet has the advantage; if finding all correct matches matters most, EfficientNet has the advantage. Which architecture gives the best results therefore depends on what is considered most important.
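To illustrate how a larger gap in recall can outweigh a smaller gap in precision in the f1-score, here is a minimal Python sketch of the three evaluation measures. The function name and the match counts are hypothetical and chosen only for illustration; they are not the thesis's data or results, and the two models are deliberately labeled generically.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from counts of true positives,
    false positives and false negatives. Undefined ratios fall back to 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # f1 is the harmonic mean of precision and recall
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts (assumptions, not taken from the thesis):
# model A proposes fewer matches, and most of them are correct;
# model B finds more of the true matches but with more wrong proposals.
model_a = precision_recall_f1(tp=60, fp=10, fn=40)
model_b = precision_recall_f1(tp=85, fp=25, fn=15)

for name, (p, r, f1) in (("A", model_a), ("B", model_b)):
    print(f"model {name}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With these made-up counts, model A has the higher precision and model B the higher recall. Because the recall gap is larger than the precision gap, model B ends up with the higher f1-score, which is the same pattern the abstract describes for ResNet and EfficientNet.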