1

Multiview Video Compression

Bai, Baochun Unknown Date
No description available.
2

Multiview Video Compression

Bai, Baochun 11 1900
With the progress of computer graphics and computer vision technologies, 3D/multiview video applications such as 3D-TV and tele-immersive conferencing are becoming increasingly popular and are very likely to emerge as a prime application in the near future. A successful 3D/multiview video system needs synergistic integration of various technologies such as 3D/multiview video acquisition, compression, transmission and rendering. In this thesis, we focus on addressing the challenges of multiview video compression. In particular, we make five major contributions: (1) We propose a novel neighbor-based multiview video compression system that helps remove the inter-view redundancies among multiple video streams and improves performance. An optimal stream encoding order algorithm is designed to enable the encoder to automatically decide the stream encoding order and find the best reference streams. (2) A novel multiview video transcoder is designed and implemented. The proposed transcoder can be used to encode multiple compressed video streams and reduce the cost of a multiview video acquisition system. (3) A learning-based multiview video compression scheme is introduced. The new compression algorithms build on recent advances in semi-supervised learning and achieve compression by finding a sparse representation of images. (4) Two novel distributed source coding algorithms, EETG and SNS-SWC, are put forward. Both are syndrome-based schemes capable of achieving the whole Slepian-Wolf rate region. EETG simplifies the code construction algorithm for distributed source coding schemes using an extended Tanner graph and is able to handle mismatched bits at the encoder. SNS-SWC has two independent decoders and thus simplifies the decoding process. (5) We propose a novel distributed multiview video coding scheme that allows flexible rate allocation between two distributed multiview video encoders. SNS-SWC is used as the underlying Slepian-Wolf coding scheme. It is the first work to realize simultaneous Slepian-Wolf coding of stereo videos with the help of a distributed source code that achieves the whole Slepian-Wolf rate region. The proposed scheme has better rate-distortion performance than the separate H.264 coding scheme in the high-rate case. / Computer Networks and Multimedia Systems
3

Using Multiview Annotation to Annotate Multiple Images Simultaneously

Price, Timothy C. 01 June 2017
For a system to learn an object recognition model, it needs many positive images to learn from. For this reason, datasets of similar objects are built to train the model. Such datasets are most useful when they are large, diverse, and annotated. However, obtaining the images and creating the annotations is often slow and costly. We use a method that very quickly captures many images of the same object from different angles and then reconstructs a 3D model from those images. We then use the 3D reconstruction to link information across the different images of the same object, and use that information to annotate all of the captured images quickly and cheaply. These annotated images are then used to train the model.
4

Boosting a Biologically Inspired Local Descriptor for Geometry-free Face and Full Multi-view 3D Object Recognition

Yokono, Jerry Jun, Poggio, Tomaso 07 July 2005
Object recognition systems relying on local descriptors are increasingly used because of their perceived robustness with respect to occlusions and to global geometrical deformations. Descriptors of this type, based on a set of oriented Gaussian derivative filters, are used in our recognition system. In this paper, we explore a multi-view 3D object recognition system that does not use explicit geometrical information. The basic idea is to find discriminant features that describe an object across different views. A boosting procedure is used to select features from a large pool of local features collected from the positive training examples. We describe experiments on face images that achieve an excellent recognition rate.
5

Handling domain knowledge in system design models. An ontology based approach.

Hacid, Kahina 06 March 2018
Models of complex systems are designed in heterogeneous domains, and this heterogeneity is rarely considered explicitly when describing and validating processes. Moreover, these systems usually involve several domain experts and several design models corresponding to different analyses (views) of the same system. However, no explicit information is given about the characteristics of either the domain or the system analyses performed. In this thesis, we propose a general framework offering, first, the formalization of domain knowledge using ontologies and, second, the capability to strengthen design models by making explicit references to the domain knowledge formalized in these ontologies. This framework also provides resources for making the features of an analysis explicit by formalizing them within models qualified as “points of view”. We have set up two deployments of our approach: one based on Model Driven Engineering (MDE) and one based on formal methods using proof and refinement. This general framework has been validated on several non-trivial case studies drawn from system engineering.
6

Harnessing Transfer Learning and Image Analysis Techniques for Enhanced Biological Insights: Multifaceted Approaches to Diagnosis and Prognosis of Diseases

Ziyu Liu (18410397) 22 April 2024
Despite the remarkable advancements of machine learning (ML) technologies in biomedical research, especially in tackling complex human diseases such as cancer and Alzheimer's disease, a considerable gap persists between promising theoretical results and dependable clinical applications in diagnosis, prognosis, and therapeutic decision-making. One of the primary challenges stems from the absence of large high-quality patient datasets, which arises from the cost and human labor required for collecting such datasets and the scarcity of patient samples. Moreover, the inherent complexity of the data often leads to a feature space dimension that is large compared with the sample size, potentially causing instability during training and unreliability in inference. To address these challenges, the transfer learning (TL) approach has been embraced in biomedical ML applications to facilitate knowledge transfer across diverse and related biological contexts. Leveraging this principle, we introduce an unsupervised multi-view TL algorithm, named MVTOT [1], which enables the analysis of various biomarkers across different cancer types. Specifically, we compress high-dimensional biomarkers from different cancer types into a low-dimensional feature space via nonnegative matrix factorization and distill common information shared by various cancer types using the Wasserstein distance defined by Optimal Transport theory. We evaluate the stratification performance on three early-stage cancers from the Cancer Genome Atlas (TCGA) project. Our framework, compared with other benchmark methods, demonstrates superior accuracy in patient survival outcome stratification.
Additionally, while patient-level stratification has enhanced clinical decision-making, our understanding of diseases at the single-cell (SC) level remains limited, even though it is crucial for deciphering disease progression mechanisms, monitoring drug responses, and prioritizing drug targets. It is essential to associate each SC with patient-level clinical traits such as survival hazard, drug response, and disease subtypes. However, SC samples often lack direct labeling with these traits, and the significant statistical gap between patient and SC-level gene expressions impedes the transfer of well-annotated patient-level disease attributes to SCs. Domain adaptation (DA), a TL subfield, addresses this challenge by training a domain-invariant feature extractor for both patient and SC gene expression matrices, facilitating the successful application of ML models trained on patient-level data to SC samples. Expanding upon an established deep-learning-based DA model, DEGAS [2], we substitute its computationally inefficient maximum mean discrepancy loss with the Wasserstein distance as the metric for domain discrepancy. This substitution facilitates the embedding of both SC and patient inputs into a common latent feature space. Subsequently, employing the model trained on patient-level disease attributes, we predict SC-level survival hazard, disease status, and drug response for prostate cancer, Alzheimer's SC data, and multiple myeloma data, respectively. Our approach outperforms benchmark studies, uncovering clinically significant cell subgroups and revealing the correlation between survival hazard and drug response at the SC level.
Furthermore, in addition to these approaches, we acknowledge the effectiveness of TL and image analysis in stratifying patients with early and late-stage Mild Cognitive Impairment based on neuroimaging, as well as predicting survival and metastasis in melanoma based on histological images. These applications underscore the potential of employing ML methods, especially TL algorithms, in addressing biomedical issues from various angles, thereby enhancing our understanding of disease mechanisms and developing new biomarkers predicting patient outcomes.
7

Sistema de microscopia com multi-pontas : força atômica e campo próximo / Multiprobe microscopy system : atomic force and near field optics

Suárez, Vanessa Isabel Tardillo 16 February 2012
In this work we review how a multi-probe microscope that combines Atomic Force Microscopy and Near-Field Scanning Microscopy works. Currently, the Nanonics Multiview 4000 installed at the Materials Characterization and Microscopy Laboratory (LCMMAT) is not completely operational. At present, we are able to perform Atomic Force Microscopy (AFM) measurements and Scanning Near-Field Optical Microscopy (SNOM) measurements in reflection and transmission modes. This kind of microscope has three probes that can make simultaneous AFM, C-AFM, SNOM, Raman microscopy and nanolithography measurements. It is the first multi-probe microscope to be installed in Latin America. This work consists of studying the structure of this kind of microscope, how it makes AFM and SNOM measurements, and how to analyse them. We study the different electronic circuits used in this kind of microscope and compare optical and tuning-fork feedback. We explain step by step how to perform AFM and SNOM measurements, and we study the processing and analysis of these measurements. Finally, we made several measurements using these techniques. Some of these measurements were compared with results found in the literature in order to identify possible applications that could be useful for future research at our laboratory. / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / In this work we review the operation of the Multiview 4000 multi-probe confocal Raman microscope with near-field and atomic force capabilities, made by Nanonics. Currently, the Multiview 4000 microscope at the Materials Characterization and Microscopy Laboratory (LCMMAT) is not yet fully operational. It is still being assembled, and at present only the Atomic Force Microscope (AFM) and the Near-Field Microscope (SNOM) in reflection and transmission modes are available for use. This microscope model, which has three probes capable of making simultaneous AFM, C-AFM, SNOM and confocal Raman microscopy measurements, and which can also perform nanolithography, is the first to be installed in Latin America. In the course of this work, we studied the structure of the microscope, how it performs measurements with these two techniques, and how they are carried out. The structural study of the microscope describes the physical principles used for image formation, as well as the different types of electronic circuits used in equipment of this type. We explain step by step how AFM and SNOM measurements are made. We also studied how the images are analysed and processed. Finally, we show some images taken with the microscope and compare them with results found in the literature in order to identify possible applications for each of the samples shown here.
8

Shape estimation of specular objects from multiview images / Estimation de la forme d'objets spéculaires à partir d'un système multi-vues

Chari, Visesh 20 November 2012
One of the simplest models of a refractive surface is a planar surface. Although such surfaces are ubiquitous in our world in the form of transparent panes, windows, or the surface of still water, very little is known about the multi-view geometry caused by refraction at such a surface. In the first part of this thesis, we analyze the multiple-view geometry of a refractive surface. We consider the case where one or more cameras in one medium (e.g. air) look at a scene in another medium (e.g. water), with a planar interface between the two media. Underwater photography, for example, fits this description. Since the perspective projection model does not match this scenario, we derive the camera model and its associated projection matrix. We show that 3D lines in the scene correspond to quartic curves in the images. An interesting point about this configuration is that, if a homogeneous refractive index is assumed, there is a unique curve in the image for each 3D line in the world. We then describe and develop elements of multi-view geometry such as fundamental matrices and scene-related homographies, and provide elements for estimating camera pose from several viewpoints. We also show that when the medium is denser, the horizon line corresponds to a conic that can be decomposed to deduce the parameters of the interface. Next, we extend our approach by proposing algorithms to estimate the geometry of several planar refractive surfaces from a single image. A typical example of such a scenario is looking through an aquarium. We propose a simple method to compute the normals of such surfaces in various scenarios, by restricting the system to an axial camera. This allows us to use RANSAC-based approaches such as the “8-point” algorithm for fundamental matrix computation, in a manner similar to the estimation of axial distortion in the computer vision literature. We also show that the same model can be directly adapted to reconstruct reflective surfaces under the assumption that the surfaces are piecewise planar. We present encouraging 3D reconstruction results and analyze their accuracy. While the two preceding approaches focus only on reconstructing one or more planar refractive surfaces using geometric information alone, specular surfaces also change the way light energy at the surface is redistributed. The corresponding underlying model can be explained by the Fresnel equations. By exploiting both this geometric and photometric information, we propose a method to reconstruct the shape of arbitrary specular surfaces. We show that our approach involves a simple acquisition setup. First, we analyze several minimal cases for shape reconstruction and derive a new constraint that combines geometry and Fresnel theory for transparent surfaces. Then, we illustrate the complementary nature of these cues, which help us obtain additional information about the object that is otherwise difficult to get. Finally, we discuss practical aspects of our reconstruction algorithm and present results on difficult, non-trivial data. / Understanding, reconstructing in 3D, and analyzing the multiple-view geometry of transparent objects is one of the long-standing challenges in computer vision. In this thesis, we look at novel approaches to analyzing images of transparent surfaces to deduce their geometric and photometric properties. First, we analyze the multiview geometry of the simple case of planar refraction. We show that the image of a 3D line is a quartic curve, and thus derive the first imaging model that accounts for planar refraction. We then use this approach to derive other properties that involve multiple cameras, such as fundamental and homography matrices. Finally, we propose approaches to estimate the refractive surface parameters and camera poses from images. We then extend our approach to derive algorithms for recovering the geometry of multiple planar refractive surfaces from a single image. We propose a simple technique to compute the normals of such surfaces in various scenarios, by equating our setup to an axial camera. We then show that the same model can be used to reconstruct reflective surfaces under a piecewise planar assumption. We show encouraging 3D reconstruction results and analyse the accuracy of the results obtained with this approach. We then turn to using both geometric and photometric cues for reconstructing transparent 3D surfaces. We show that, in the presence of known illumination, we can recover the shape of such objects from single or multiple views. The Fresnel equations are the cornerstone of our approach, and we both derive and analyze their use for 3D reconstruction. Finally, we show that our approach can be used to produce high-quality reconstructions, and discuss other potential future applications.
9

Multiview Face Detection And Free Form Face Recognition For Surveillance

Anoop, K R 05 1900
Face detection and recognition against a given database has become one of the important problems in computer vision. A simple approach to face detection in video is to run a learning-based face detector on every frame. But such an approach is computationally expensive and completely ignores the temporal continuity present in videos. Moreover, the search space can be reduced by utilizing visual cues extracted according to the task at hand (a top-down approach). Once detection is done, the next step is to perform face recognition against the available database. But the faces output by the face detector are neither aligned nor well cropped, and are prone to scale change. We call such faces free-form faces. Existing face recognition algorithms, however, assume that faces are properly aligned and cropped and have the same scale as the faces in the database, which is a highly constrained setting. In this thesis, we propose an integrated detect-track framework for multiview face detection in videos. We overcome the limitations of frame-based approaches by utilizing the temporal continuity present in videos and by incorporating top-down information about the task. We model the problem using the concept of experiential sampling [2]. This consists of determining certain key positions that are relevant to the task (face detection). These key positions are referred to as attention samples, and multiview face detection is performed only at these locations. These statistical samples are estimated from visual cues, past experience and temporal continuity; the estimation is modeled as a Bayesian filtering problem, which is solved using particle filters. In order to detect all views, we use a tracker integrated with the detector and devise a novel track termination algorithm using concepts from Track Before Detect (TBD) [26]. Such an approach is computationally efficient and also results in a lower false positive rate. We provide experiments showing the efficiency of the integrated detect-track approach compared with a multiview face detector without a tracker. For free-form face recognition, we propose to use Principal Geodesic Analysis (PGA) of the covariance descriptors obtained from Gabor filters. PGA is the analogue of Principal Component Analysis in Euclidean spaces (covariance descriptors lie on a Riemannian manifold). Such a descriptor is robust to alignment and scaling problems and is also of lower dimension. We also employ a sparse modeling technique for the face recognition task using these covariance descriptors, which are dimensionally reduced by transforming them onto a tangent space; we call the result the PGA feature. Further, we improve upon the recognition results of linear sparse modeling through a non-linear mapping of the PGA features, employing the “kernel trick” for these sparse models. We show that the kernelized sparse models using the PGA features are indeed very efficient for free-form face recognition by testing on two standard databases, namely the AR and YaleB databases.
10

Error resilience and concealment in MVC video over wireless networks

Ibrahim, Abdulkareem B. January 2015
Multi-view video is capable of presenting a full and accurate depth perception of a scene. The concept of multi-view video is becoming more useful especially in 3D display systems by enhancing the viewing of high resolution stereoscopic images from arbitrary viewpoints without the use of any special glasses. Like monoscopic video, the multi-view video is faced with different challenges such as: reliable compression, storage and bandwidth due to the increased number of views as well as the high sensitivity to transmission errors. All these may lead to a detrimental effect on the reconstructed views. The work in this thesis investigates the problems and challenges of transmission losses in a multi-view video bitstream over error prone wireless networks. Based on the network simulation results, the proposed technique is capable of addressing the problem of transmission losses. In practical wireless networks, transmission errors are inevitable and pose a serious challenge to the coded video data. The aim of this research effort is to examine the effect of these errors in a multi-view video bitstream when transmitted over a lossy channel. Moreover, this research work aims to develop a novel scheme that can make the multi-view coded videos more robust to transmission errors by minimizing the error effects and improving the perceptual quality. Multi-layer data partitioning as an error resilient technique is developed in JMVC 8.5 reference software in order to make the multi-view video bitstream more robust during transmission. In addition to that, we propose a simple decoding scheme that can support the decoding of the multi-layer data partitioning bitstream over channels with high error rate. The proposed technique is benchmarked with the already existing H.264/AVC data partitioning technique. 
The work in this thesis also employs the use of group of pictures as a coding parameter to investigate and reduce the effects of transmission errors in multi-view video transmitted over a very high error rate channel. The experiments are carried out with different error loss rates in order to evaluate the performance of these techniques in terms of perceptual quality when transmitted over a simulated erroneous channel. Errors are introduced using the Sirannon network simulator. The error performance of each technique is evaluated and analysed both objectively and subjectively after reconstruction. The results of the research investigation and simulation are presented and analysed in chapter six of the thesis.
