171 |
On singular estimation problems in sensor localization systems
Ash, Joshua N. 10 December 2007 (has links)
No description available.
|
172 |
Non-linguistic Notions in Language Modeling: Learning, Retention, and Applications
Sharma, Mandar 11 September 2024 (has links)
Language modeling, especially through the use of transformer-based large language models (LLMs), has drastically changed how we view and use artificial intelligence (AI) and machine learning (ML) in our daily lives. Although LLMs have showcased remarkable linguistic proficiency in their abilities to write, summarize, and phrase, these models have yet to achieve the same proficiency in their ability to reason quantitatively. This deficiency is especially apparent in smaller models (less than 1 billion parameters) that can run natively on-device. Between the complementary capabilities of qualitative and quantitative reasoning, this thesis focuses on the latter, where the goal is to devise mechanisms that instill quantitative reasoning capabilities into these models. However, instilling this notion is not as straightforward as traditional end-to-end learning. Learning quantitative notions includes the ability of the model to discern between regular linguistic tokens and magnitude/scale-oriented non-linguistic tokens. Learning these notions, especially after pre-training, comes at a cost for these models: catastrophic forgetting. Thus, learning needs to be followed with retention - making sure these models do not forget what they have learned. We first motivate the need for numeracy-enhanced models via their potential applications in the field of data-to-text generation (D2T), showcasing how these models behave as quantitative reasoners as-is. Then, we devise both token-level training interventions and information-theoretic training interventions to numerically enhance these models, with the latter specifically focused on combating catastrophic forgetting. Our information-theoretic interventions not only lead to numerically-enhanced models but lend us critical insights into the learning behavior of these models, especially when it comes to adapting them from their pretraining distribution to the target task distribution.
Finally, we extrapolate these insights to devise more effective strategies for transfer learning and unlearning in language modeling. / Doctor of Philosophy / Language modeling, especially through the use of transformer-based large language models (LLMs), has drastically changed how we view and use artificial intelligence (AI) and machine learning (ML) in our daily lives. Although LLMs have showcased remarkable linguistic proficiency in their abilities to write, summarize, and phrase, these models have yet to achieve the same proficiency in their ability to reason quantitatively. This deficiency is especially apparent in smaller models that can run natively on-device. This thesis focuses on instilling within these models the ability to perform quantitative reasoning - the ability to differentiate between words and numbers and understand the notions of magnitude tied to said numbers - while retaining their linguistic skills. The insights learned from our experiments are further used to devise models that better adapt to target tasks.
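Token-level interventions of the kind mentioned above hinge on how a tokenizer treats magnitude-bearing symbols. A minimal, purely illustrative sketch of digit-level tokenization, a common ingredient in the numeracy literature (not the thesis's actual intervention; the function name is made up here):

```python
import re

def digit_tokenize(text):
    """Split whitespace-separated tokens; numbers are broken into
    single-character tokens so magnitude is exposed symbol by symbol,
    while ordinary words are kept whole."""
    tokens = []
    for word in text.split():
        if re.fullmatch(r"\d+(\.\d+)?", word):
            tokens.extend(word)          # '1234' -> '1','2','3','4'
        else:
            tokens.append(word)
    return tokens

print(digit_tokenize("revenue grew 1234 percent"))
# -> ['revenue', 'grew', '1', '2', '3', '4', 'percent']
```

The design intuition is that a model seeing `1`, `2`, `3`, `4` as separate tokens can tie digit position to scale, whereas a single opaque token for `1234` carries no compositional magnitude signal.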
|
173 |
Sensitivity Analysis and Material Parameter Estimation using Electromagnetic Modelling / Känslighetsanalys och estimering av materialparametrar med elektromagnetisk modellering
Sjödén, Therese January 2012 (has links)
Estimating parameters is the problem of finding their values from measurements and modelling. Parameters describe properties of a system; materials, for instance, are defined by mechanical, electrical, and chemical parameters. Fisher information is an information measure, giving information about how changes in the parameter affect the estimation. The Fisher information includes the physical model of the problem and the statistical model of the noise. The Cramér-Rao bound is the inverse of the Fisher information and gives the best possible variance for any unbiased estimator. This thesis considers aspects of sensitivity analysis in two applied material parameter estimation problems. Sensitivity analysis with the Fisher information and the Cramér-Rao bound is used as a tool for evaluating measurement feasibility, comparing measurement set-ups, and as a quantitative measure of the trade-off between accuracy and resolution in inverse imaging. The first application is the estimation of the wood grain angle parameter in trees and logs. The grain angle is the angle between the direction of the wood fibres and the direction of growth; a large grain angle strongly correlates with twist in sawn timber. In the thesis, microwave measurements are put forward as a fast and robust measurement technique, and electromagnetic modelling is applied, exploiting the anisotropic properties of wood. Both two-dimensional and three-dimensional modelling are considered. Mathematical modelling is essential, lowering the complexity and speeding up the computations. According to a sensitivity analysis with the Cramér-Rao bound, estimation of the wood grain angle with microwaves is feasible. The second application is electrical impedance tomography, where the conductivity of an object is estimated from surface measurements. Electrical impedance tomography has applications in, for example, medical imaging, geological surveillance, and wood evaluation.
Different configurations and noise models are evaluated with sensitivity analysis for a two-dimensional electrical impedance tomography problem. The relation between accuracy and resolution is also analysed using the Fisher information. To conclude, sensitivity analysis is employed in this thesis as a method to enhance material parameter estimation. The sensitivity analysis methods are general and applicable to other parameter estimation problems as well. / Parameter estimation is the problem of finding parameter values from measurements and modelling. Parameters describe properties of systems; materials, for example, can be defined by mechanical, electrical, and chemical parameters. Fisher information is an information measure that tells how changes in a parameter affect the estimation. The Fisher information is given by a physical model of the problem and a statistical model of the measurement noise. The Cramér-Rao bound is the inverse of the Fisher information and gives the best possible variance for any unbiased estimator. This thesis treats aspects of sensitivity analysis in two applied material parameter estimation problems. Sensitivity analysis with the Fisher information and the Cramér-Rao bound is used as a tool for evaluating measurement feasibility and comparing measurement set-ups, and as a quantitative measure of the trade-off between accuracy and resolution in inverse imaging. The first application is estimation of the grain angle of trees and logs. The grain angle is the angle between the direction of growth and the direction of the wood fibres, and a large grain angle is related to problems with twist in sawn boards. Microwave measurements of the grain angle are presented as a fast and robust measurement technique. The thesis describes two- and three-dimensional electromagnetic models that exploit the anisotropy of wood. Since mathematical modelling lowers the complexity and computation time, it is an essential part of the estimation. According to a sensitivity analysis with the Cramér-Rao bound, estimation of the wood grain angle is feasible. The second application is electrical impedance tomography, where the conductivity of an object is determined from measurements on its surface. Electrical impedance tomography has applications in, for example, medical imaging, geological surveillance, and wood evaluation. Different measurement configurations and noise models are evaluated with sensitivity analysis for a two-dimensional electrical impedance tomography example. The relation between accuracy and resolution is analysed with the Fisher information. To summarize, sensitivity analysis is employed as a method to improve material parameter estimation. The sensitivity analysis methods are general and can also be applied to other parameter estimation problems.
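To make the Fisher information and the Cramér-Rao bound concrete, here is a minimal sketch for the textbook case of estimating a Gaussian mean with known variance (an illustration only; the thesis works with electromagnetic models, not this toy problem):

```python
import numpy as np

def fisher_information(n, sigma):
    """For n i.i.d. draws from N(mu, sigma^2) with sigma known, each
    observation contributes 1/sigma^2 of Fisher information about mu."""
    return n / sigma**2

def cramer_rao_bound(n, sigma):
    """Best possible variance of any unbiased estimator of mu."""
    return 1.0 / fisher_information(n, sigma)

# the sample mean attains the bound; check empirically
rng = np.random.default_rng(0)
n, sigma, mu = 50, 2.0, 1.0
means = rng.normal(mu, sigma, size=(20000, n)).mean(axis=1)
print(cramer_rao_bound(n, sigma))       # 0.08
print(round(float(means.var()), 3))     # close to 0.08
```

The same recipe carries over to the thesis's setting: replace the Gaussian likelihood with the physical measurement model, and the bound quantifies how well a set-up can possibly estimate the material parameter.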
|
174 |
Free entropies, free Fisher information, free stochastic differential equations, with applications to Von Neumann algebras / Sur quelques propriétés des entropies libres, de l'Information de Fisher libre et des équations différentielles stochastiques libres avec des applications aux algèbres de Von Neumann
Dabrowski, Yoann 01 December 2010 (has links)
This work extends our knowledge of free entropies and free stochastic differential equations (SDEs) in three directions. First, we show that the von Neumann algebra generated by at least two self-adjoint elements with finite free Fisher information does not have the property $\Gamma$ of Murray and von Neumann. This is an analogue of a result of Voiculescu for microstates free entropy. Second, we study free SDEs with unbounded operator coefficients (in other words, a kind of free stochastic PDE). We show stationarity of the solutions in particular cases, and deduce a computation of the microstates free entropy dimension in the case of Lipschitz Fisher information. Third and last, we introduce a general method for solving stationary free SDEs, relying on a non-commutative analogue of a path space. By defining tracial states on this analogue, we construct Markov dilations of many completely Markovian semigroups on a finite von Neumann algebra, in particular of all symmetric semigroups. For particular semigroups, for instance as soon as the generator can be written in divergence form for a derivation with values in the coarse correspondence, these dilations solve free SDEs. Among other applications, we deduce a Talagrand inequality for the non-microstates free entropy (relative to a subalgebra and a completely positive map). We also use these deformations in the framework of Popa's deformation/rigidity techniques. / This work extends our knowledge of free entropies, free Fisher information and free stochastic differential equations in three directions.
First, we prove that the von Neumann algebra generated by more than two self-adjoint elements with finite non-microstates free Fisher information does not have property $\Gamma$ of Murray and von Neumann (in particular, it is not amenable). This is an analogue of a well-known result of Voiculescu for microstates free entropy. We also prove factoriality under finite non-microstates free entropy. Second, we study a general free stochastic differential equation with unbounded coefficients (a ``stochastic PDE''), and prove stationarity of solutions in well-chosen cases. This leads to a computation of the microstates free entropy dimension in the case of a Lipschitz conjugate variable. Finally, we introduce a non-commutative path-space approach to solve general stationary free stochastic differential equations. By defining tracial states on a non-commutative analogue of a path space, we construct Markov dilations for a class of conservative completely Markov semigroups on finite von Neumann algebras. This class includes all symmetric semigroups. For well-chosen semigroups (for instance, with generator any divergence-form operator associated to a derivation valued in the coarse correspondence), those dilations give rise to stationary solutions of certain free SDEs. Among applications, we prove a non-commutative Talagrand inequality for non-microstates free entropy (relative to a subalgebra $B$ and a completely positive map $\eta: B \to B$). We also use those new deformations in conjunction with Popa's deformation/rigidity techniques to obtain absence-of-Cartan-subalgebra results.
|
175 |
Classification d'images RSO polarimétriques à haute résolution spatiale sur site urbain / High-Resolution Polarimetric SAR image classification on urban areas
Soheili Majd, Maryam 28 April 2014 (links)
Our research aims to assess the contribution of a single high-spatial-resolution polarimetric SAR (Synthetic Aperture Radar) image for classifying urban surfaces. To that end, we define several types of roofs, ground surfaces, and objects. First, we propose an inventory of statistical, textural, and polarimetric attributes that can be used in a classification algorithm. We study the statistical laws of the descriptors and show that the Fisher distribution is well suited to most of them. Second, several supervised vector classification algorithms are tested and compared, notably maximum-likelihood classification based on a Gaussian distribution, classification based on the Wishart distribution as the statistical model of the polarimetric coherency matrix, and the SVM approach. We then propose a variant of the maximum-likelihood algorithm based on a Fisher distribution, whose fit to our full attribute set we have studied. We obtain a clear improvement in our results with this new algorithm, but a limitation appears in recognizing certain roofs. The shape of rectangular buildings is therefore recognized by morphological operations on the radar amplitude image. This spatial information is introduced into the classification process as a constraint. We show the value of this information, since it prevents classification confusion between pixels on flat roofs and tree pixels. In addition, we propose a method for selecting the attributes most relevant to classification, based on mutual information and a genetic-algorithm search. Our experiments are carried out on a polarimetric image with a 35 cm pixel, acquired in 2006 by ONERA's airborne RAMSES sensor.
/ In this research, our aim is to assess the potential of a single-look, high-spatial-resolution polarimetric radar image for the classification of urban areas. For that purpose, we concentrate on classes corresponding to different kinds of roofs, objects, and ground surfaces. At first, we propose a univariate statistical analysis of polarimetric and texture attributes that can be used in a classification algorithm. We perform a statistical analysis of the descriptors and show that the Fisher distribution is suitable for most of them. We then propose a modification of the maximum-likelihood algorithm based on a Fisher distribution and train it with all of our attributes. We obtain a significant improvement in our results with the new algorithm, but a limitation appears in recognizing some roofs. The shape of rectangular buildings is then recognized by morphological operations on the radar amplitude image. This spatial information is introduced in the Fisher-based classification process as a constraint term, and we show that classification results are improved. In particular, it overcomes classification ambiguities between flat-roof pixels and tree pixels. In a second step, some well-known algorithms for supervised classification are used. We deal with maximum likelihood based on the complex Gaussian distribution (univariate) and the multivariate complex Gaussian using the coherency matrix. In addition, the support vector machine, a non-parametric method, is used as a classification algorithm. Moreover, a feature selection based on a genetic algorithm using mutual information (GA-MI) is adapted to provide an optimal subset to the classification method. To illustrate the efficiency of subset selection based on GA-MI, we perform a comparison experiment of the optimal subset with different target decompositions based on different scattering mechanisms, including the Pauli, Krogager, Freeman, Yamaguchi, Barnes, Holm, Huynen, and Cloude decompositions.
Our experiments are based on an image of a suburban area, acquired by the airborne RAMSES SAR sensor of ONERA, in 2006, with a spatial spacing of 35 cm. The results highlight the potential of such data to discriminate some urban land cover types.
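For orientation, a maximum-likelihood decision rule under a Fisher distribution of the kind used above can be sketched in a few lines. The density follows the parameterization common in the SAR literature; the per-class parameter values below are purely illustrative, not estimates from the thesis:

```python
import math

def fisher_log_pdf(x, L, M, m):
    """Log-density of the Fisher distribution F[L, M, m], a heavy-tailed
    model often used for high-resolution SAR data
    (L, M: shape parameters; m: scale), evaluated at x > 0."""
    r = L * x / (m * M)
    return (math.log(L / (m * M))
            + math.lgamma(L + M) - math.lgamma(L) - math.lgamma(M)
            + (L - 1.0) * math.log(r)
            - (L + M) * math.log(1.0 + r))

def ml_classify(x, class_params):
    """Assign x to the class with the highest Fisher log-likelihood."""
    return max(class_params, key=lambda c: fisher_log_pdf(x, *class_params[c]))

# illustrative per-class parameters (made up for this sketch)
params = {"flat roof": (3.0, 4.0, 0.5), "tree": (2.0, 5.0, 2.0)}
print(ml_classify(0.4, params), ml_classify(3.0, params))
```

In the thesis's pipeline this per-attribute likelihood would be combined across the selected attributes, with an added constraint term for the building-shape information.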
|
176 |
線性羅吉斯迴歸模型的最佳D型逐次設計 / The D-optimal sequential design for linear logistic regression model
藍旭傑, Lan, Shiuh Jay Unknown Date (has links)
Assume the binary response curve follows a simple linear logistic regression model. Given an even sample size, the D-optimal design allocates half of the sample points at the 17.6th percentile of the response curve and the other half at the 82.4th percentile. Unfortunately, these two locations cannot be determined when the parameters are unknown, so sequential experimental designs become necessary in practice. Under large sample sizes, the sequential designs discussed here enjoy good asymptotic D-optimality properties in theory. Importantly, these properties do not vanish even when the initial-stage allocation is less than ideal; only the speed of convergence is affected. In practical applications, however, these large-sample properties are not our main concern: with small samples, the speed at which the experimental procedure converges is of decisive importance. With this in mind, this thesis proposes three initial-stage designs and compares their relative merits through simulation. / The D-optimal design is well known to be a two-point design for the simple linear logistic regression model. Specifically, one half of the design points are allocated at the 17.6th percentile, and the other half at the 82.4th percentile. Since the locations of the two design points depend on the unknown parameters, the actual two locations cannot be obtained. To resolve this dilemma, a sequential design is necessary in practice. The sequential designs discussed in this context have good asymptotic properties that would not disappear even if the initial stage is not good enough, given a large sample size. Under small sample sizes, the speed of convergence of the sequential designs is influenced by the initial stage. Based on this, three initial-stage designs are provided in this study and compared through simulations written in C++.
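Once the parameters are known, the two D-optimal design points described above have a closed form: they are the covariate values where the response probability equals 0.176 and 0.824 (logit values of roughly ±1.5437). A small sketch (the helper name is illustrative, and this is exactly the computation that a sequential design must approximate when α and β are unknown):

```python
import math

def d_optimal_points(alpha, beta):
    """Two-point D-optimal design for logit(p) = alpha + beta*x: half
    of the observations go to the x where p = 0.176 and half to the x
    where p = 0.824."""
    return [(math.log(p / (1.0 - p)) - alpha) / beta for p in (0.176, 0.824)]

lo, hi = d_optimal_points(alpha=0.0, beta=1.0)
print(round(lo, 3), round(hi, 3))   # about -1.544 and 1.544
```

A sequential scheme plugs the current parameter estimates into this formula at each stage, so a poor initial stage only delays, rather than prevents, convergence to the optimal two points.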
|
177 |
Στατιστική ανακάλυψη και πειραματική επιβεβαίωση μεταγραφικών παραγόντων που ελέγχουν την ενεργοποίηση των λεμφοκυττάρων σε άνοσες καταστάσεις / Statistical discovery and experimental validation of transcription factors controlling activation in immune-related states
Αργυρόπουλος, Χρήστος 27 June 2007 (has links)
The aim of this dissertation is to propose a formal framework for an analysis methodology that integrates quantitative and functional data, aimed at designing experiments to validate transcription factors that bind to a functionally active motif in gene promoters. This framework, based on Bayesian probability theory as grounded in decision theory, is applied to a problem from the research field of immunobiology. Specifically, we study the detection of transcription factors involved in the negative regulation of gene expression during T-lymphocyte activation. Adopting the proposed framework shapes a strict sequence of in vitro and in silico experiments that starts from high-throughput gene-expression techniques, passes through transcription-factor databases and electrophoretic mobility shift assays, and culminates in the validation offered by transfection and reporter assays. In formalizing this plausible approach, one of the best-known problems of applied statistics arises: the "two means" or Behrens-Fisher problem. To solve this problem, new mathematical tools are proposed, which were used to build corresponding software. Applying these tools to the applied immunobiological problem revealed an unexpected relationship between two seemingly unconnected systems: cytokine genes and the HIV virus. Through the described methodology, a hypothesis-driven approach became feasible for an important problem that could not be solved with classical biochemical techniques due to technical difficulties. / The current dissertation describes a formal framework and an analytic methodology that aims to validate transcription factors controlling gene expression through functional and quantitative data. This Bayesian-decision-theory-inspired framework is applied to a specific immunobiological problem: the discovery of transcriptional repressors implicated in the negative control of T-cell activation. Adopting the proposed framework leads one to a staged experimental strategy which starts from high-throughput gene expression data and transcription factor databases and, through electrophoretic mobility shift assays, targets the design of transfection and reporter gene assays. The formalization of the proposed approach led to one of the famous applied-statistics problems, i.e., the two-means or Behrens-Fisher problem. In order to deal with the computational aspects of this problem, we applied novel integral transformations and ported them to software. The application of these tools to the immunobiological problem led to an unexpected connection between two seemingly unrelated systems: cytokine gene promoters and the HIV LTR. The proposed methodology enabled a hypothesis-driven approach to an important basic immunobiological problem which could not be solved by standard biochemical techniques.
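The Behrens-Fisher (two-means) problem mentioned above concerns comparing two means when the variances are unequal and unknown. The dissertation develops Bayesian tools for it; for orientation only, the classical frequentist approximation (Welch's t with Satterthwaite degrees of freedom, not the thesis's method) can be sketched as:

```python
import math

def welch_t(x, y):
    """Welch's approximate solution to the two-means (Behrens-Fisher)
    problem: t statistic and Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((a - mx) ** 2 for a in x) / (nx - 1)   # unbiased variances
    vy = sum((b - my) ** 2 for b in y) / (ny - 1)
    se2 = vx / nx + vy / ny
    t = (mx - my) / math.sqrt(se2)
    dof = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, dof

# illustrative samples with visibly unequal variances
t, dof = welch_t([10.1, 10.5, 9.8, 10.2], [8.0, 9.5, 7.1, 8.4, 9.0])
print(round(t, 2), round(dof, 1))   # t is about 4.0 on about 4.9 dof
```

The difficulty the dissertation addresses is that, unlike this approximation, an exact treatment of the problem requires integrals with no simple closed form, hence the new integral transformations and software.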
|
178 |
Explorando caminhos de mínima informação em grafos para problemas de classificação supervisionada / Exploring minimum-information paths in graphs for supervised classification problems
Hiraga, Alan Kazuo 05 May 2014 (has links)
Previous issue date: 2014-05-05 / Financiadora de Estudos e Projetos / Classification is a very important step in pattern recognition, as it aims to categorize objects from a set of inherent features by labeling them. This process can be supervised, when there is a set of labeled training samples, semi-supervised, when the number of labeled samples is limited or nearly nonexistent, or unsupervised, when there are no labeled samples. This project proposes to explore minimum-information paths in graphs for classification problems, through the definition of a supervised, non-parametric, graph-based classification method that follows a contextual approach. The method constructs a graph from a set of training samples, where the samples are represented by vertices and the edges link samples that belong to a neighborhood system. From the graph construction, the method calculates the local observed Fisher information, a measure based on the Potts model, for all vertices, identifying the amount of information that each sample carries. Generally, vertices of different classes connected by an edge have a high information level. After that, the edges are weighted by a function that penalizes connections between vertices with high information. During this process, it is possible to identify and select high-information vertices, which are chosen as prototype vertices, namely, the nodes that define the class boundaries. After this definition, each prototype sample conquers the remaining samples by offering the shortest path in terms of information, so that when a sample is conquered it receives the label of the winning prototype, completing the classification. To evaluate the proposed method, statistical methods to estimate the error rates, such as Hold-out, K-fold, and Leave-One-Out Cross-Validation, are considered.
The obtained results indicate that the method can be a viable alternative to existing classification techniques. / Classification is a very important step in pattern recognition, since its goal is to categorize objects from a set of features inherent to them, assigning each object a label. This classification process can be supervised, when there is a set of labeled training samples that satisfactorily represent the classes, semi-supervised, when the set of labeled samples is limited or nearly nonexistent, or unsupervised, when there are no labeled samples. This work proposes to explore minimum-information paths in graphs for classification problems by creating a supervised, non-parametric, graph-based classification method that follows a contextual approach. The method builds a graph from the training-sample set, where the samples are represented by vertices and the edges link samples that belong to an adjacency relation. From the constructed graph, the method computes the local observed Fisher information, a measure based on the Potts model, for every vertex, identifying how much information each one carries. Vertices of distinct classes connected by an edge generally carry high information (borders). Once the information has been computed, the edges are weighted by a function that penalizes connections between high-information vertices. While the edges are being weighted, highly informative vertices can be identified and selected as prototype vertices, that is, the vertices that define the border region. After the edges are weighted and the prototypes defined, the method has each prototype conquer the samples by offering the shortest path to itself, so that when a sample is conquered it receives the label of the conquering prototype, completing the classification. To evaluate the method, statistical methods for estimating accuracy rates, such as K-fold, Hold-out, and Leave-One-Out Cross-Validation, are used. The results obtained indicate that the method can be a viable alternative to existing classification techniques.
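Structurally, the "conquest" step described above is a multi-source shortest-path computation: every prototype competes to reach each remaining vertex along a minimum-cost path. A minimal sketch with a toy graph (the weights stand in for the information-based penalty; none of this is the thesis's actual code):

```python
import heapq

def conquer(graph, prototypes):
    """Multi-source Dijkstra: each prototype 'conquers' the remaining
    vertices along minimum-cost paths; a vertex takes the label of the
    first prototype to reach it."""
    dist, label = {}, {}
    heap = [(0.0, v, lab) for v, lab in prototypes.items()]
    heapq.heapify(heap)
    while heap:
        d, v, lab = heapq.heappop(heap)
        if v in dist:                       # already conquered
            continue
        dist[v], label[v] = d, lab
        for w, cost in graph[v]:
            if w not in dist:
                heapq.heappush(heap, (d + cost, w, lab))
    return label

# toy weighted graph; edge weights mimic an information penalty
g = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.5)],
    "c": [("a", 4.0), ("b", 2.5), ("d", 1.0)],
    "d": [("c", 1.0)],
}
print(conquer(g, {"a": 0, "d": 1}))
# -> {'a': 0, 'd': 1, 'b': 0, 'c': 1}
```

Here vertex `b` is cheaper to reach from prototype `a` and vertex `c` from prototype `d`, so the labels split along the high-cost edge, mirroring how the penalized information weighting keeps conquests from crossing class borders.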
|
179 |
Les causes des variations du taux d’évolution moléculaire entre lignées / The causes of molecular evolutionary rate variations among lineages
Dos Santos Lourenço, João 08 December 2011 (has links)
This thesis deals with deciphering the causes of variation in molecular substitution rates among lineages. From a theoretical standpoint, different hypotheses are often based on rather simplistic distributions of the fitness effects of mutations. Using Fisher's geometric model, we derived expressions for this distribution and highlighted the importance of phenotypic complexity and of the pleiotropy of mutations. Variations among species in the proportion of amino-acid changes that are adaptive are often interpreted as a consequence of differences in population size. Through simulations, we showed that the effective population size has only a weak influence on the variation of these rates, and that environmental changes and phenotypic complexity can have a larger effect. Regarding synonymous substitution rates, an inverse relationship with body mass is often described in endothermic vertebrates. To determine whether this relationship also holds in ectothermic vertebrates, we followed a comparative approach focused on turtles. We estimated synonymous substitution rates in 224 species, which we then compared with body mass (and other life-history traits) and with an environmental variable (latitude). Our results show that molecular evolutionary rates are strongly correlated with environmental conditions and not with life-history traits. / The main objective of the present thesis is to elucidate the causes of variations in rates of molecular evolution among lineages, and in particular, to understand how factors connected to mutation, selection and genetic drift can influence these variations.
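Fisher's geometric model used in the first part can be simulated directly: a mutation is a random displacement in an n-dimensional phenotype space, and its fitness effect depends on the resulting distance to the optimum. A hedged sketch with illustrative parameter values (a toy version, not the thesis's derivations or simulations):

```python
import math
import random

def fitness_effect(n, r, rng):
    """One mutation under Fisher's geometric model: displace the
    phenotype by a vector of length r in a uniformly random direction
    of n-dimensional trait space. The parent sits at distance 1 from
    the optimum and fitness is exp(-distance^2)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    z = [1.0] + [0.0] * (n - 1)                        # parent phenotype
    z = [zi + r * vi / norm for zi, vi in zip(z, v)]   # mutant phenotype
    d2 = sum(x * x for x in z)
    return math.exp(-d2) - math.exp(-1.0)              # fitness change

rng = random.Random(42)
effects = [fitness_effect(n=10, r=0.3, rng=rng) for _ in range(5000)]
frac_beneficial = sum(e > 0.0 for e in effects) / len(effects)
print(frac_beneficial < 0.5)   # True: most mutations are deleterious
```

Raising the dimensionality `n` (phenotypic complexity) shrinks the beneficial fraction, which is exactly the kind of dependence the thesis exploits when relating complexity to the proportion of adaptive substitutions.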
|
180 |
New methods for image classification, image retrieval and semantic correspondence / Nouvelles méthodes pour classification d'image, recherche d'image et correspondence sémantique
Sampaio de Rezende, Rafael 19 December 2017 (has links)
The problem of image representation is at the heart of computer vision. The choice of representation for an image changes according to the task we want to study. Searching for images in large databases requires a compressed global representation, whereas a semantic segmentation problem requires a partition map of its pixels. Statistical learning techniques are the main tool for building these representations. In this manuscript, we address the learning of visual representations in three different problems: image retrieval, semantic correspondence, and image classification. First, we study the Fisher vector representation and its dependence on the Gaussian mixture model employed. We introduce the use of several Gaussian mixture models for different types of backgrounds, e.g., different scene categories, and analyse the performance of these representations for object classification and the impact of the scene category as a latent variable. Our second approach proposes an extension of the exemplar-SVM pipeline. We first show that, by replacing the SVM hinge loss with the square loss, we obtain similar results at a fraction of the computational cost. We call this model the square-loss exemplar machine, or SLEM. We introduce a kernelized variant of SLEM that enjoys the same computational advantages but displays improved performance. We present experiments that establish the performance and efficiency of our methods using a wide variety of base representations and image-retrieval datasets. Finally, we propose a deep neural network for the problem of establishing semantic correspondence. We use object-proposal boxes as matching elements to build an architecture that simultaneously learns appearance and geometric consistency. We propose new geometric consistency scores adapted to the neural-network architecture. Our model is trained on image pairs obtained from the keypoints of a benchmark dataset and evaluated on several datasets, outperforming recent deep-learning architectures and earlier methods based on hand-crafted features. We conclude the thesis by highlighting our contributions and suggesting possible directions for future research. / The problem of image representation is at the heart of computer vision. The choice of features extracted from an image changes according to the task we want to study. Large image retrieval databases demand a compressed global vector representing each image, whereas a semantic segmentation problem requires a clustering map of its pixels. The techniques of machine learning are the main tool used for the construction of these representations. In this manuscript, we address the learning of visual features for three distinct problems: image retrieval, semantic correspondence, and image classification. First, we study the dependency of a Fisher vector representation on the Gaussian mixture model used as its codewords. We introduce the use of multiple Gaussian mixture models for different backgrounds, e.g. different scene categories, and analyze the performance of these representations for object classification and the impact of scene category as a latent variable. Our second approach proposes an extension to the exemplar-SVM feature encoding pipeline. We first show that, by replacing the hinge loss by the square loss in the ESVM cost function, similar results in image retrieval can be obtained at a fraction of the computational cost.
We call this model square-loss exemplar machine, or SLEM. Secondly, we introduce a kernelized SLEM variant which benefits from the same computational advantages but displays improved performance. We present experiments that establish the performance and efficiency of our methods using a large array of base feature representations and standard image retrieval datasets. Finally, we propose a deep neural network for the problem of establishing semantic correspondence. We employ object proposal boxes as elements for matching and construct an architecture that simultaneously learns the appearance representation and geometric consistency. We propose new geometrical consistency scores tailored to the neural network’s architecture. Our model is trained on image pairs obtained from keypoints of a benchmark dataset and evaluated on several standard datasets, outperforming both recent deep learning architectures and previous methods based on hand-crafted features. We conclude the thesis by highlighting our contributions and suggesting possible future research directions.
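The computational advantage behind SLEM is that the square loss admits a closed-form solution: training one exemplar against a negative pool reduces to a small ridge regression. A minimal sketch (ignoring the bias term and the kernelization treated in the thesis; data and regularization value are illustrative):

```python
import numpy as np

def slem(x0, negatives, lam=1.0):
    """Square-loss exemplar machine sketch: one positive exemplar x0
    against a pool of negatives. With the square loss, training is a
    ridge regression solved in closed form via the normal equations."""
    X = np.vstack([x0, negatives])
    y = np.array([1.0] + [-1.0] * len(negatives))
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
negatives = rng.normal(0.0, 1.0, size=(100, 5))
x0 = np.full(5, 3.0)                    # illustrative exemplar
w = slem(x0, negatives, lam=0.1)

# the returned weights satisfy the ridge normal equations exactly
X = np.vstack([x0, negatives])
y = np.array([1.0] + [-1.0] * 100)
grad = X.T @ (X @ w) + 0.1 * w - X.T @ y
print(float(np.abs(grad).max()) < 1e-8)   # True
```

Because the negative pool is shared by all exemplars, the matrix being inverted can be factored once and reused, which is where the speed-up over hinge-loss exemplar SVMs comes from.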
|