41 |
Primitive Direcursion and Difunctorial Semantics of Typed Object Calculus
Glimming, Johan, January 2007
In the first part of this thesis, we contribute to the semantics of typed object calculus by giving (a) a category-theoretic denotational semantics using partial maps and an algebraic compactness assumption, (b) a notion of "wrappers" by which algebraic datatypes can be represented as object types, and (c) proofs of computational soundness and adequacy of typed object calculus via Plotkin's FPC (with lazy operational semantics), thus making models of FPC suitable also for first-order typed object calculus (with recursive objects supporting method update, but not subtyping). It follows that a valid equation in the model induces operationally congruent terms in the language, so that program algebras can be studied. For (c), we also develop an extended first-order typed object calculus and prove subject reduction. The second part of the thesis concerns recursion principles on datatypes, including the untyped lambda calculus as a special case. Freyd showed that in certain domain-theoretic categories, locally continuous functors have minimal invariants, which possess a structure that he termed dialgebra. This gives rise to a category of dialgebras and homomorphisms, where the minimal invariants are initial, inducing a powerful recursion scheme (direcursion) on a complete partial order. We identify a problem that appears when we translate (co)iterative functions to direcursion, and as a solution to this problem we develop a recursion scheme (primitive direcursion). This immediately gives a number of examples of direcursive functions, improving on the situation in the literature, where only a few examples have appeared. By means of a case study, this line of work is connected to object calculus models. / Part II is also published as Technical Report, 2007, Oct, No. 2.
|
42 |
Sensitivity of high-resolution satellite sensor imagery to regenerating forest age and site preparation for wildlife habitat analysis
Wunderle, Ame Leontina, 11 April 2006
In west-central Alberta, increased landscape fragmentation has led to increased human use, with negative effects on wildlife such as the grizzly bear (Ursus arctos L.). Recently, grizzly bears in the Foothills Model Forest were found to select clear cuts of different age ranges as habitat, and to select or avoid certain clear cuts depending on the site preparation process employed. Satellite remote sensing offers a practical and cost-effective method by which cut areas, their age, and site preparation activities can be quantified. This thesis examines the utility of the spectral reflectance of SPOT-5 pansharpened imagery (2.5 m spatial resolution) to identify and map 44 regenerating stands sampled in August 2005. Using object-based classification with the Normalized Difference Moisture Index (NDMI), green, and short-wave infrared (SWIR) bands, 90% accuracy can be achieved in the detection of forest disturbance. Forest structural parameters were used to calculate the structural complexity index (SCI), the first loading of a principal components analysis. The NDMI, first-order standard deviation and second-order correlation texture measures were better able to explain differences in SCI among the 44 forest stands (R² = 0.74). The best window size for the texture measures was 5x5, indicating that this is a measure only detectable at a very high spatial resolution. Age classes of these cut blocks were analysed using linear discriminant analysis and were best separated (82.5%) with the SWIR and green spectral bands, second-order correlation under a 25x25 window, and the predicted SCI. Site preparation was best classified (90.9%) using the NDMI and homogeneity texture under a 5x5 window. Future applications of this research include the selection of high-probability grizzly habitat for high spatial resolution imagery acquisition for detailed mapping initiatives.
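The NDMI used above is commonly defined as (NIR - SWIR) / (NIR + SWIR). The following is a minimal sketch of that ratio, assuming reflectance values held in plain Python lists; the function names and the handling of all-zero pixels are illustrative assumptions, and the abstract does not state which SPOT-5 bands or preprocessing the thesis used.

```python
def ndmi(nir, swir):
    """Normalized Difference Moisture Index for one pixel: (NIR - SWIR) / (NIR + SWIR)."""
    total = nir + swir
    if total == 0:
        return 0.0  # assumption: report 0 for an all-zero pixel instead of raising
    return (nir - swir) / total


def ndmi_image(nir_band, swir_band):
    """Apply ndmi() element-wise to two equally sized 2-D lists of reflectance values."""
    return [[ndmi(n, s) for n, s in zip(nir_row, swir_row)]
            for nir_row, swir_row in zip(nir_band, swir_band)]
```

Values lie between -1 and 1 for non-negative reflectances; higher values generally indicate more canopy moisture.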
|
43 |
Optimization of Segmentation-Based Video Sequence Coding Techniques. Application to content based functionalities
Morros Rubio, Josep Ramon, 23 December 2004
This work addresses the problem of video compression with content-based functionalities in the framework of segmentation-based video coding systems. Two major problems are considered. The first concerns coding optimality in segmentation-based coding systems; here, the feasibility of a rate-distortion approach for a complete region-based coding system is shown. The second is how to provide content-based functionalities in the coding system proposed as a solution to the first problem.
Optimality, as defined in the framework of rate-distortion theory, deals with obtaining a representation of the video sequence that leads to a minimum distortion of the coded signal for a given bit budget. In the case of segmentation-based coding systems, this means obtaining an 'optimal' partition together with the best coding technique for each region of this partition, so that the result is optimal in an operational rate-distortion sense. The problem is formalized for independent, non-scalable coding. An algorithm to solve this problem is provided as well. This algorithm is applied to a specific segmentation-based coding system, the so-called SESAME. In SESAME, each frame is segmented into a set of regions that are coded independently. Segmentation involves both spatial and motion homogeneity criteria. To exploit temporal redundancy, a prediction for both the partition and the texture of the current frame is created by using motion information. The time evolution of each region is defined along the sequence (time tracking). The results are optimal (or near-optimal) for the given framework in a rate-distortion sense. The definition of the coding strategy involves a global optimization of the partition as well as of the coding technique and quality level for each region. Later, the investigation is extended to the problem of video coding optimization in the framework of a scalable video coding system that can address content-based functionalities. The focus is on the various types of content-based scalability and on object tracking. The generality of the problem has also been extended by including the spatial and temporal dependencies between frames and scalability layers in the optimization scheme. In this case the solution implies finding the optimal partition and set of quantizers for both the base and the enhancement layers.
Due to the coding dependencies of the enhancement layer with respect to the base layer, the partition and the set of quantizers of the enhancement layer depend on the decisions made on the base layer. A solution for the independent optimization problem (i.e. without taking into account dependencies between different frames or scalability layers) has also been proposed to reduce the computational complexity. These solutions are used to extend the SESAME coding system. The extended coding system, named XSESAME, supports different types of scalability (PSNR, spatial and temporal) as well as content-based functionalities, such as content-based scalability and object tracking. Two different operating modes for region selection in the enhancement layer have been presented: one (supervised) aimed at providing content-based functionalities at the enhancement layer, and the other (unsupervised) aimed at coding efficiency, without content-based functionalities. Integration of object tracking into the segmentation-based coding system is also investigated. In the general case, tracking is a very complex problem. If this capability has to be integrated into a coding system, additional problems arise due to conflicting requirements between coding efficiency and tracking accuracy. This is solved by using a double partition approach, where purely spatial criteria are used to re-segment the partition used for coding. The projection of the re-segmented partition results in a more precise adaptation to object contours. A merging step is performed a posteriori to eliminate the excess of regions produced by the re-segmentation.
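The idea of choosing, for a given bit budget, the coding technique of each region so that total distortion is minimal can be illustrated with the standard Lagrangian formulation for independently coded regions. The sketch below is an assumption-laden illustration, not the algorithm of the thesis: the (rate, distortion) tables, the function names and the search range for the multiplier are hypothetical, and the partition is taken as fixed.

```python
def pick_options(regions, lam):
    """For each region, choose the (rate, distortion) option minimizing D + lam * R."""
    return [min(options, key=lambda rd: rd[1] + lam * rd[0]) for options in regions]


def allocate(regions, bit_budget, iterations=50):
    """Bisect on the Lagrange multiplier to approach the bit budget from below.

    regions: one list per region of candidate (rate, distortion) pairs.
    Returns the chosen pair per region. If even the lowest-rate choice
    exceeds the budget, that lowest-rate choice is returned unchanged.
    """
    lo, hi = 0.0, 1e6                 # assumed search range for the multiplier
    best = pick_options(regions, hi)  # start from the choice that penalizes rate most
    for _ in range(iterations):
        lam = (lo + hi) / 2
        choice = pick_options(regions, lam)
        rate = sum(r for r, _ in choice)
        if rate <= bit_budget:
            best, hi = choice, lam    # feasible: try a smaller penalty on rate
        else:
            lo = lam                  # over budget: penalize rate more
    return best
```

A multiplier search of this kind only reaches operating points on the convex hull of the rate-distortion curve, which is one reason such results are usually described as optimal or near-optimal.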
|
45 |
A Supervised Approach For The Estimation Of Parameters Of Multiresolution Segmentation And Its Application In Building Feature Extraction From VHR Imagery
Dey, Vivek, 28 September 2011
With the advent of very high spatial resolution (VHR) satellites, the spatial detail within an image scene has increased considerably. This led to the development of object-based image analysis (OBIA) for the analysis of VHR satellite images. Image segmentation is the fundamental step of OBIA. However, a large number of techniques exist for remote sensing (RS) image segmentation. To identify the best ones for VHR imagery, a comprehensive literature review on image segmentation is performed. Based on that review, it is found that multiresolution segmentation, as implemented in the commercial software eCognition, is the most widely used technique and has been successfully applied to a wide variety of VHR images. However, multiresolution segmentation suffers from a parameter estimation problem. Therefore, this study proposes a solution to the parameter estimation problem in order to improve the efficiency of multiresolution segmentation for VHR images.
The solution aims to identify the optimal parameters, which correspond to the optimal segmentation. The solution to the parameter estimation problem is drawn from the equations that govern the merging of any two adjacent objects in multiresolution segmentation. It uses spectral, shape, size, and neighbourhood relationships in a supervised manner. In order to justify the results of the solution, a global segmentation accuracy evaluation technique is also proposed. The solution performs excellently on VHR images of different sensors, scenes, and land cover classes.
In order to demonstrate the applicability of the solution to a real-life problem, a building detection application based on multiresolution segmentation with the estimated parameters is carried out. The accuracy of the building detection is found to be nearly eighty percent. Finally, it can be concluded that the proposed solution is fast, easy to implement and effective for the intended applications.
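The merging decision referred to above can be illustrated with the spectral part of the fusion criterion as it is commonly described in the literature on multiresolution segmentation: two adjacent objects are merged when the increase in size-weighted standard deviation stays below the square of the scale parameter. The sketch is a simplified assumption, not the equations of the thesis or the eCognition implementation; it omits the shape (compactness and smoothness) terms, uses the population standard deviation, and all names are hypothetical.

```python
from statistics import pstdev


def color_heterogeneity_increase(obj1, obj2, band_weights):
    """Increase in size-weighted spectral standard deviation caused by a merge.

    obj1, obj2: lists of pixels, each pixel a tuple with one value per band.
    band_weights: one weight per band.
    """
    merged = obj1 + obj2
    increase = 0.0
    for band, weight in enumerate(band_weights):
        def weighted_sd(pixels):
            # number of pixels times the standard deviation of this band
            return len(pixels) * pstdev([p[band] for p in pixels])
        increase += weight * (weighted_sd(merged) - weighted_sd(obj1) - weighted_sd(obj2))
    return increase


def should_merge(obj1, obj2, band_weights, scale):
    """Spectral-only test: merge if the heterogeneity increase is below scale squared."""
    return color_heterogeneity_increase(obj1, obj2, band_weights) < scale ** 2
```

Under this simplified reading, a larger scale parameter tolerates larger increases in heterogeneity and therefore yields larger objects, which is why the choice of parameters matters for segmentation quality.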
|
46 |
DIGITAL INPAINTING ALGORITHMS AND EVALUATION
Mahalingam, Vijay Venkatesh, 01 January 2010
Digital inpainting is the technique of filling in the missing regions of an image or a video using information from the surrounding area. This technique has found widespread use in applications such as restoration, error recovery, multimedia editing, and video privacy protection. This dissertation addresses three significant challenges associated with existing and emerging inpainting algorithms and applications. The three key areas of impact are 1) structure completion for image inpainting algorithms, 2) a fast and efficient object-based video inpainting framework, and 3) perceptual evaluation of large-area image inpainting algorithms.
One of the main approaches of existing image inpainting algorithms to completing the missing information is a two-stage process: a structure completion step, which completes the boundaries of regions in the hole area, followed by a texture completion step using advanced texture synthesis methods. While the texture synthesis stage is important, it can be argued that structure completion is a vital component in improving perceptual image inpainting quality. To this end, we introduce a global structure completion algorithm for completing missing boundaries using symmetry as the key feature. While existing methods for symmetry completion require a priori information, our method takes a non-parametric approach, utilizing the invariant nature of curvature to complete missing boundaries.
Turning our attention from image to video inpainting, we readily observe that existing video inpainting techniques have evolved as an extension of image inpainting techniques. As a result, they suffer from various shortcomings including, among others, the inability to handle large missing spatio-temporal regions, execution times too slow for interactive use, and the presence of temporal and spatial artifacts. To address these major challenges, we propose a fundamentally different method based on an object-based framework for improving the performance of video inpainting algorithms. We introduce a modular inpainting scheme in which we first segment the video into constituent objects by using acquired background models, followed by inpainting of static background regions and dynamic foreground regions. For static background region inpainting, we use simple background replacement and occasional image inpainting. To inpaint dynamic moving foreground regions, we introduce a novel sliding-window-based dissimilarity measure in a dynamic programming framework. This technique can effectively inpaint large regions of occlusion, inpaint objects that are completely missing for several frames or that change in size and pose, and has minimal blurring and motion artifacts.
Finally, we direct our focus to experimental studies related to the perceptual quality evaluation of large-area image inpainting algorithms. The perceptual quality of large-area inpainting is inherently subjective, and yet no previous research has taken into account the subjective nature of the Human Visual System (HVS). We perform subjective experiments using an eye-tracking device with 24 subjects to analyze the effect of inpainting on human gaze. We experimentally show that the presence of inpainting artifacts directly impacts the gaze of an unbiased observer, and that this in turn has a direct bearing on the subjective rating of the observer. Specifically, we show that the gaze energy in the hole regions of an inpainted image shows marked deviations from normal behavior when the inpainting artifacts are readily apparent.
|
47 |
Virtualization services: scalable methods for virtualizing multicore systems
Raj, Himanshu, 10 January 2008
Multi-core technology is bringing parallel processing capabilities from servers to laptops and even handheld devices. At the same time, platform support for system virtualization is making it easier to consolidate server and client resources, when and as needed by applications. This consolidation is achieved by dynamically mapping the virtual machines on which applications run to underlying physical machines and their processing cores. Low-cost processor and I/O virtualization methods that scale efficiently to different numbers of processing cores and I/O devices are key enablers of such consolidation.
This dissertation develops and evaluates new methods for scaling virtualization functionality to multi-core and future many-core systems. Specifically, it re-architects virtualization functionality to improve scalability and better exploit multi-core system resources. Results from this work include a self-virtualized I/O abstraction, which virtualizes I/O so as to flexibly use different platforms' processing and I/O resources. Flexibility affords improved performance and resource usage and, most importantly, better scalability than that offered by current I/O virtualization solutions. Further, by describing system virtualization as a service provided to virtual machines and the underlying computing platform, this service can be enhanced to provide new and innovative functionality. For example, a virtual device may provide obfuscated data to guest operating systems to maintain data privacy; it could mask differences in device APIs or properties to deal with heterogeneous underlying resources; or it could control access to data based on the 'trust' properties of the guest VM.
This thesis demonstrates that extended virtualization services are superior to existing operating system or user-level implementations of such functionality, for multiple reasons. First, this solution technique makes more efficient use of the key performance-limiting resources in multi-core systems, namely memory and I/O bandwidth. Second, this solution technique better exploits the parallelism inherent in multi-core architectures and exhibits good scalability properties, in part because at the hypervisor level there is more precise control over which resources are used, and how, to realize extended virtualization services. Improved control over resource usage makes it possible to provide value-added functionalities for both guest VMs and the platform.
Specific instances of virtualization services described in this thesis are a network virtualization service that exploits heterogeneous processing cores, a storage virtualization service that provides location-transparent access to block devices by extending the functionality provided by the network virtualization service, a multimedia virtualization service that allows efficient media device sharing based on semantic information, and an object-based storage service with enhanced access control.
|
48 |
Filmmixning i ljudformatet Dolby Atmos : Processer inom produktion av objektbaserade filmmixar / Film mixing in the audio format Dolby Atmos: Processes in the production of object-based film mixes
Ronquist, Ludwig, January 2018
This study is based on a case study in which the film Pool, directed by Anders Lennberg, is mixed in the cinema sound format Dolby Atmos. The aim is to gain deeper knowledge of the processes involved in film mixing in the object-based audio format Dolby Atmos, and of its potential for immersion and film storytelling. The study started from three research questions: 1) Which creative and technical mixing strategies are used in the film Pool to raise the potential degree of immersion? 2) How can different mixing strategies, specific to the object-based audio format Dolby Atmos, be designed to contribute to the narrative functions of film sound? 3) What do the production processes look like when mixing film in Dolby Atmos? The study used the research-through-design approach together with autoethnography. The investigation thus consisted of mixing the film in two iterations and studying the mixing processes and the choices made. External evaluation was then provided by the director and by a focus group interview with people from the sound and music production programme at Högskolan Dalarna. The result is a number of mixing strategies that were applied to heighten immersion in the film and to contribute to the narrative functions of the sound. With the panning capabilities of Dolby Atmos and the ability to place sounds in the height dimension, the study has succeeded in showing that the film mixer has greater creative scope for creating immersion and contributing to the narrative functions of the sound.
|
49 |
Segmentação de movimento coerente aplicada à codificação de vídeos baseada em objetos [Coherent motion segmentation applied to object-based video coding]
Silva, Luciano Silva da, January 2011
The variety of electronic devices for digital video recording and playback is growing rapidly, increasing the availability of such information on many different platforms. The development of efficient ways of storing, transmitting and accessing such data therefore becomes increasingly important. In this context, video coding plays a key role in compressing data, optimizing resource usage for storing and transmitting digital video. At the same time, tasks involving video analysis, manipulation and content-based search also become increasingly relevant, forming a basis for several applications that exploit the abundance of information in digital video. Often the solution to these problems makes use of video segmentation, which consists of dividing a video into regions that are homogeneous according to certain characteristics such as color, texture, motion or some semantic aspect.
In this thesis, a new method for segmentation of videos in their constituent objects based on motion coherence of regions is proposed. The proposed segmentation method initially identifies the correspondences of sparsely sampled points along different video frames. Then, it performs clustering of point sets that have similar trajectories. Finally, a pixelwise classification is obtained from these sampled point sets. The proposed method does not assume any camera model or global motion model to the scene and/or objects. Still, it allows the identification of multiple objects, without knowing the number of objects a priori. In order to validate the proposed segmentation method, an object-based video coding approach was developed. According to this approach, the motion of an object is represented by affine transformations, while object texture and shape are simultaneously coded, in a progressive way. The developed video coding method yields functionalities such as progressive transmission and object scalability. Experimental results obtained by the proposed segmentation and coding methods are presented, and compared to other methods from the literature. Videos coded by the proposed method are compared in terms of PSNR to videos coded by the reference software JM H.264/AVC, version 16.0, showing the distance of the proposed method from the sate of the art in terms of coding efficiency, while providing functionalities of object-based video coding. The segmentation method proposed in this work resulted in two publications, one in the proceedings of SIBGRAPI 2007 and another in the journal IEEE Transactions on Image Processing.
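The PSNR comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes 8-bit samples (peak value 255) and frames given as flat sequences of equal length, and the function names `mse` and `psnr` are hypothetical.

```python
import math


def mse(reference, decoded):
    """Mean squared error between two equally sized sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("frames must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumption: 8-bit peak of 255)."""
    error = mse(reference, decoded)
    if error == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / error)
```

A higher PSNR means the decoded frame is closer to the reference, which is how two coders are typically compared at a similar bit rate.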
|
50 |
Algoritmo rápido para segmentação de vídeos utilizando agrupamento de clusters. Monma, Yumi. January 2014
This work proposes a fast algorithm for segmenting moving parts in a video, based on the detection of closed volumes in three-dimensional space (scene surfaces with closed contours). The input video is preprocessed with an edge detection algorithm based on level lines to produce the objects. The detected objects are clustered using a combination of mean shift clustering and ensemble clustering (meta-clustering). To further reduce the computation time, two measures can be combined: filtering objects by size and using only a few frames of the video for clustering. Because the detection method guarantees that objects keep the same label across multiple frames, frame selection has little effect on the final result. Depending on the intended application, the detected clusters can be refined in a post-processing step.
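As a rough illustration of the mean shift step named in the abstract, the sketch below runs a plain flat-kernel mean shift on hypothetical 2D feature points. It is an assumption-laden toy, not the author's implementation: the feature representation, the `bandwidth` value, and the function names are invented for the example, and the ensemble clustering stage is not shown.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Move each point to the mean of its neighbours until it stops moving."""
    modes = []
    for start in points:
        current = start
        for _ in range(max_iter):
            # Flat kernel: every point within `bandwidth` counts equally.
            neighbours = [p for p in points if math.dist(p, current) <= bandwidth]
            mean = tuple(sum(axis) / len(neighbours) for axis in zip(*neighbours))
            moved = math.dist(mean, current)
            current = mean
            if moved < tol:
                break
        modes.append(current)
    return modes


def label_modes(modes, bandwidth):
    """Give the same label to modes that converged to nearly the same place."""
    centers, labels = [], []
    for mode in modes:
        for index, center in enumerate(centers):
            if math.dist(mode, center) <= bandwidth / 2:
                labels.append(index)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels, centers


# Hypothetical feature points forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
labels, centers = label_modes(mean_shift(points, bandwidth=1.0), bandwidth=1.0)
```

Mean shift does not need the number of clusters in advance; the bandwidth alone controls how coarse the grouping is.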
|