1 |
Annotation, Enrichment and Fusion of Multiscale Data: Identifying High Risk Prostate Cancer
Singanamalli, Asha 21 February 2014
No description available.
|
2 |
Form data enriching using a post OCR clustering process : Measuring accuracy of field names and field values clustering
Aboulkacim, Adil January 2022
OCR technologies can read the text in a form and extract the position and content of each word, but they cannot capture the relations between the words. This thesis aims to solve the problem of enriching data from a structured form, without any preset configuration, by using clustering. The approach is evaluated quantitatively, by counting the text boxes that a developed prototype clusters correctly, and qualitatively. The prototype feeds an image of an unfilled form and an image of a filled form, which contains the data to be enriched, to an OCR engine. The OCR engine extracts the text and its positions, which are then run through a post-processing step that, together with a modified Euclidean algorithm and a fuzzy string search algorithm, clusters field names and field values in the filled-in form image. For three different form structures with 15 images each, the prototype achieved an accuracy between 92% and 100%, depending on form structure. The thesis thus shows that field names and field values in a form can be clustered together, i.e., that data from the form can be enriched.
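The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the thesis prototype: it assumes OCR output is already available as hypothetical `(text, x, y)` tuples, uses the standard-library `difflib.SequenceMatcher` as a stand-in for the fuzzy string search, and uses plain (unmodified) Euclidean distance. The function names and the `name_threshold` parameter are illustrative assumptions.

```python
import math
from difflib import SequenceMatcher


def similarity(a, b):
    """Fuzzy string similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_fields(template_boxes, filled_boxes, name_threshold=0.8):
    """Group field values under their nearest field name.

    Both arguments are lists of hypothetical (text, x, y) tuples, as an OCR
    engine might return them for the unfilled and the filled form image.
    """
    field_names = []  # boxes of the filled form that match a template label
    values = []       # everything else is treated as a filled-in value

    for text, x, y in filled_boxes:
        # Find the template label that looks most like this text box.
        best = max(template_boxes, key=lambda t: similarity(text, t[0]), default=None)
        if best is not None and similarity(text, best[0]) >= name_threshold:
            field_names.append((best[0], x, y))
        else:
            values.append((text, x, y))

    clusters = {name: [] for name, _, _ in field_names}
    if not field_names:
        return clusters

    for text, x, y in values:
        # Plain Euclidean distance; the thesis uses a modified variant.
        nearest = min(field_names, key=lambda f: math.hypot(x - f[1], y - f[2]))
        clusters[nearest[0]].append(text)
    return clusters


template = [("Name", 10, 10), ("City", 10, 50)]
filled = [("Name:", 11, 10), ("Alice", 60, 11), ("City", 10, 51), ("Oslo", 60, 50)]
result = cluster_fields(template, filled)
```

Matching against the unfilled form first is what lets the sketch separate labels from values without any per-form configuration; the distance step then only has to decide which label each value belongs to.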
|
3 |
Traitement joint de nuage de points et d'images pour l'analyse et la visualisation des formes 3D / Joint point clouds and images processing for the analysis and visualization of 3D models
Guislain, Maximilien 19 October 2017
The past decade has seen rapid development of technologies for digitizing urban spaces. Acquisition campaigns covering entire cities are now performed using LiDAR (Light Detection And Ranging) scanners mounted on mobile vehicles. These campaigns yield point clouds of millions of points representing the buildings and streets, and may also include a set of photographs of the scene. This thesis addresses the improvement of the point cloud using the information contained in these images, and makes several contributions. The position and orientation of the acquired images are usually known from devices embedded with the LiDAR scanner, although this positioning information is sometimes inaccurate. To register an image precisely on a point cloud, we propose a two-step algorithm that uses normalized mutual information and histograms of oriented gradients. The method yields an accurate camera pose even when the initial estimates are far from the true position and orientation. Once the images have been registered, they can be used to infer the color of each point of the cloud while accounting for the variability of the viewpoints. This is done by minimizing an energy that takes into account the different colors that can be associated with a point and the colors present in its spatial neighborhood. Illumination differences during acquisition can also alter the color assigned to a point. In particular, this color can be affected by cast shadows, which change with the position of the sun, so it is necessary to detect and correct them. We propose a new method based on the joint analysis of variations in the reflectance measured by the LiDAR and in the color of the points. By detecting enough shadow/light interfaces, we can characterize the luminance of the scene and correct it to obtain scenes without cast shadows. The last problem addressed is the densification of the point cloud, whose local density varies and is sometimes insufficient in certain areas. We propose a directly applicable approach, based on a joint bilateral filter, that densifies the point cloud using the image data.
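The registration step relies on normalized mutual information as a similarity score between two images. The sketch below shows one common definition, (H(A) + H(B)) / H(A, B), computed from binned intensity histograms. It is an assumption that this is the variant used in the thesis; the function names, the flat-list image representation, and the `bins` and `max_value` parameters are illustrative only.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a histogram given as raw counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def normalized_mutual_information(a, b, bins=8, max_value=255):
    """NMI of two equally sized grayscale images given as flat lists of ints.

    Returns (H(A) + H(B)) / H(A, B): 1.0 for independent images,
    2.0 for images whose binned intensities determine each other.
    """
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")

    def bin_of(v):
        # Map an intensity in [0, max_value] to a bin index in [0, bins - 1].
        return min(v * bins // (max_value + 1), bins - 1)

    qa = [bin_of(v) for v in a]
    qb = [bin_of(v) for v in b]
    n = len(a)

    h_a = entropy(Counter(qa).values(), n)
    h_b = entropy(Counter(qb).values(), n)
    h_ab = entropy(Counter(zip(qa, qb)).values(), n)

    if h_ab == 0:
        # Both images are constant; 1.0 is a convention chosen for this sketch.
        return 1.0
    return (h_a + h_b) / h_ab


image = [0, 64, 128, 192] * 4
same = normalized_mutual_information(image, image)
independent = normalized_mutual_information([0, 0, 255, 255], [0, 255, 0, 255])
```

In a registration loop, a score like this would be evaluated for candidate camera poses and the pose with the highest score kept; how the thesis combines it with histograms of oriented gradients is not specified in the abstract.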
|
4 |
Efficient Partially Observable Markov Decision Process Based Formulation Of Gene Regulatory Network Control Problem
Erdogdu, Utku 01 April 2012
The need to analyze and closely study gene-related mechanisms has motivated
research on the modeling and control of gene regulatory networks (GRNs). Different
approaches exist to model GRNs; they are mostly simulated as mathematical models
that represent relationships between genes. Though it turns into a more challenging
problem, we argue that partial observability would be a more natural and realistic
method for handling the control of GRNs. Partial observability is a fundamental
aspect of the problem; it is mostly ignored and substituted by the assumption that
the states of the GRN are known precisely, referred to as full observability. On the other hand,
current works addressing partial observability focus on formulating algorithms for
the finite horizon GRN control problem. So, in this work we explore the feasibility of
realizing the problem in a partially observable setting, mainly with Partially Observable
Markov Decision Processes (POMDPs). We propose a POMDP formulation for
the infinite horizon version of the problem. Since POMDP problems
suffer from the curse of dimensionality, we also propose a POMDP solution method
that automatically decomposes the problem by isolating different unrelated parts of
the problem, and then solves the reduced subproblems. We also propose a method
to enrich the gene expression data sets given as input to the POMDP control task, because
the available data sets contain thousands of genes but only tens or rarely hundreds of
samples. The method is based on the idea of generating more than one model using
the available data sets, then sampling data from each of the models, and finally
filtering the generated samples with the help of metrics that measure the compatibility,
diversity and coverage of the newly generated samples.
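Under partial observability, a controller tracks a belief (a probability distribution over hidden states) instead of a known state. The sketch below shows the textbook POMDP belief update, b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). It is not the formulation or solver from the thesis; the function name and the nested-list layout of the `transition` and `observation` tables are assumptions made for illustration.

```python
def belief_update(belief, transition, observation, action, obs):
    """Return the posterior belief after taking `action` and seeing `obs`.

    belief[s]                    : prior probability of hidden state s
    transition[action][s][s2]    : P(s2 | s, action)
    observation[action][s2][obs] : P(obs | s2, action)
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Prediction step: probability of landing in s2 before observing.
        predicted = sum(transition[action][s][s2] * belief[s] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        unnormalized.append(observation[action][s2][obs] * predicted)

    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Toy two-state example (e.g. a gene being "off" or "on"), one action.
T = [[[1.0, 0.0], [0.0, 1.0]]]   # action 0 leaves the state unchanged
O = [[[0.9, 0.1], [0.2, 0.8]]]   # noisy sensor for each resulting state
posterior = belief_update([0.5, 0.5], T, O, action=0, obs=0)
```

The belief has one entry per hidden state, so its size grows exponentially with the number of genes in a Boolean-style network, which is one way to see why the abstract stresses the curse of dimensionality and decomposition into subproblems.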
|