  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

3D position estimation of sports players through multi-view tracking

Vos, Robert (Robbie) 12 1900 (has links)
Thesis (MSc (Mathematics))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: Extracting data from video streams and using the data to better understand the observed world allows many systems to automatically perform tasks that would ordinarily need to be completed by humans. One such problem with a wide range of applications is that of detecting and tracking people in a video sequence. This thesis looks specifically at the problem of estimating the positions of players on a sports field, as observed by a multi-view camera setup. Previous attempts at solving the problem are discussed, after which the problem is broken down into three stages: detection, 2D tracking and 3D position estimation. Possible solutions to each of the problems are discussed and compared to one another. Motion detection is found to be a fast and effective solution to the problem of detecting players in a single view. Tracking players in 2D image coordinates is performed by implementing a hierarchical approach to the particle filter. The hierarchical approach is chosen as it improves the computational complexity without compromising on accuracy. Finally, 3D position estimation is done by multi-view, forward-projection triangulation. The components are combined to form a full system that is able to find and locate players on a sports field. The overall system that is developed is able to detect, track and triangulate player positions. The components are tested individually and found to perform well. By combining the components and introducing feedback between them, the results of the individual components as well as those of the overall system are improved. / AFRIKAANSE OPSOMMING (translated): By extracting data from a video stream and using that data to better understand the observed world, many computer systems can automatically complete tasks that would previously have had to be done by a person. One such problem with a wide field of application is finding and following people in a video. This thesis looks specifically at finding the positions of players on a sports field, given a number of cameras viewing the field. Previous systems that attempted to solve this problem are reviewed, after which the problem is divided into three parts: detecting the players, tracking the players in 2D, and estimating the players' positions in 3D. Possible solutions to each of these parts are discussed and compared with one another. Motion detection is found to be a simple way to find the players. They are then tracked in 2D image coordinates using a hierarchical implementation of the particle filter. The hierarchical implementation is chosen because it improves the speed of the particle filter without degrading its accuracy. Finally, the 3D position is found by multi-view, forward-projection triangulation. The various components are combined to form a complete system that can find and locate players on a field. The complete system that was developed is able to find players, follow them and determine their positions. Each of the individual components is tested, and they are found to work well on their own. By combining the components and establishing feedback between the various components, the results of the individual components, as well as those of the complete system, are further improved.
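The final stage described above, multi-view forward-projection triangulation, can be sketched as a linear least-squares problem: each view's projection matrix and observed 2D player position contribute two linear constraints on the homogeneous 3D point, and the point is recovered as the null vector of the stacked system. A minimal sketch with synthetic cameras; the camera matrices are illustrative, not the thesis's setup:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Triangulate a 3D point from two views by the linear DLT method.

    P1, P2 : 3x4 camera projection matrices.
    x1, x2 : 2D image points (u, v) of the same player in each view.
    """
    # Each view gives two linear constraints on the homogeneous point X:
    # u * (P[2] @ X) - P[0] @ X = 0  and  v * (P[2] @ X) - P[1] @ X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two synthetic cameras observing the same point from different positions.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])              # camera at origin
P2 = np.hstack([np.eye(3), np.array([[-5.0], [0], [0]])])  # shifted 5 units in x
X_true = np.array([1.0, 2.0, 10.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.round(X_est, 6))   # recovers (1, 2, 10) up to numerical precision
```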
12

End-to-end 3D video communication over heterogeneous networks

Mohib, Hamdullah January 2014 (has links)
Three-dimensional technology, more commonly referred to as 3D technology, has revolutionised many fields, including entertainment, medicine, and communications. In addition to 3D films, games, and sports channels, 3D perception has made tele-medicine a reality. Consumer electronics manufacturers predicted that, by 2015, 30% of all HD panels in homes would be 3D enabled. Stereoscopic cameras, a comparatively mature technology among 3D systems, are now used by ordinary citizens to produce 3D content and share it at the click of a button, just as they do with 2D content, via sites like YouTube. But technical challenges still exist, including with autostereoscopic multi-view displays. Because of its increased data volume, 3D content raises many complex considerations for transmission and storage, including how it should be represented and which compression format is best. Any decision must be taken in the light of the available bandwidth or storage capacity, the desired quality, and user expectations. Free-viewpoint navigation also remains partly unsolved. The most pressing issue standing in the way of widespread uptake of consumer 3D systems is the ability to deliver 3D content to heterogeneous consumer displays over heterogeneous networks. Optimising 3D video communication must consider the entire pipeline, from the video source through transmission to the end display. Multi-view video offers the most compelling solution for 3D, providing motion parallax without the need to wear headgear. Optimising multi-view video for delivery and display could increase the demand for true 3D in the consumer market. This thesis focuses on end-to-end quality optimisation in 3D video communication/transmission, offering solutions for optimisation at the compression, transmission, and decoder levels.
13

Shape Estimation under General Reflectance and Transparency

Morris, Nigel Jed Wesley 31 August 2011 (has links)
In recent years there has been significant progress in increasing the scope, accuracy and flexibility of 3D photography methods. However, there are still significant open problems in which the complex optical properties of mirror-like or transparent objects cause many assumptions of traditional algorithms to break down. In this work we present three approaches that attempt to deal with some of these challenges using a few camera views and simple illumination. First, we consider the problem of reconstructing the 3D position and surface normal of points on a time-varying refractive surface. We show that two viewpoints are sufficient to solve this problem in the general case, even if the refractive index is unknown. We introduce a novel "stereo matching" criterion called refractive disparity, appropriate for refractive scenes, and develop an optimization-based algorithm for individually reconstructing the position and normal of each point projecting to a pixel in the input views. Second, we present a new method for reconstructing the exterior surface of a complex transparent scene with an inhomogeneous interior. We capture images from each viewpoint while moving a proximal light source to a 2D or 3D set of positions, giving a 2D (or 3D) dataset per pixel, called the scatter-trace. The key is that while light transport within a transparent scene's interior can be exceedingly complex, a pixel's scatter-trace has a highly constrained geometry that reveals the direct surface reflection and leads to a simple "scatter-trace stereo" algorithm for computing the exterior surface geometry. Finally, we develop a reconstruction system for scenes with reflectance properties ranging from diffuse to specular. We capture images of the scene as it is illuminated by a planar, spatially non-uniform light source. We then show that if the source is translated to a parallel position farther from the scene, a particular scene point integrates a magnified region of light from the plane. We observe this magnification at each pixel and show how it relates to the source-relative depth of the surface. Next, we show how a calibration relating the camera and source planes provides robustness to specular objects and recovery of 3D surface points.
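Reasoning about refractive surfaces starts from Snell's law, which governs how each camera ray bends at the surface; the vector form is easy to state in code. This sketch is a generic building block for refractive-scene geometry, not the thesis's reconstruction algorithm:

```python
import numpy as np

def refract(d, n, eta):
    """Bend an incoming unit ray d at a surface with unit normal n.

    eta is the ratio n1/n2 of refractive indices (e.g. air-to-water ~ 1/1.33).
    Returns the refracted unit direction, or None on total internal reflection.
    This is the standard vector form of Snell's law.
    """
    cos_i = -np.dot(n, d)                # cosine of the incidence angle
    sin2_t = eta**2 * (1.0 - cos_i**2)   # squared sine of the transmitted angle
    if sin2_t > 1.0:
        return None                      # total internal reflection
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n

# A ray hitting a flat water surface (normal pointing up) at 45 degrees.
d = np.array([1.0, 0.0, -1.0]) / np.sqrt(2)   # incoming unit direction
n = np.array([0.0, 0.0, 1.0])                 # surface normal
t = refract(d, n, 1.0 / 1.33)                 # air into water
print(np.degrees(np.arcsin(np.hypot(t[0], t[1]))))   # ~32.1 deg, per Snell's law
```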
14

Applications of Structure-from-Motion Photogrammetry to Fluvial Geomorphology

Dietrich, James 14 January 2015 (has links)
Since 2011, Structure-from-Motion Multi-View Stereo photogrammetry (SfM or SfM-MVS) has gone from an overlooked computer vision technique to an emerging methodology for collecting low-cost, high-spatial-resolution three-dimensional data for topographic or surface modeling in many academic fields. This dissertation examines the applications of SfM to the field of fluvial geomorphology. My research objectives were to determine the error and uncertainty inherent in SfM datasets, to use SfM to map and monitor geomorphic change in a small river restoration project, and to use SfM to map and extract data for examining multi-scale geomorphic patterns along 32 kilometers of the Middle Fork John Day River. SfM provides extremely consistent results, although there are systematic errors resulting from certain survey patterns that need to be accounted for in future applications. Monitoring change in small restoration stream channels with SfM gave a more complete spatial perspective on small-scale geomorphic change than traditional cross sections. Helicopter-based SfM was an excellent platform for low-cost, large-scale fluvial remote sensing, and the data extracted from the imagery provided multi-scalar perspectives on downstream patterns of channel morphology. This dissertation makes many recommendations for better and more efficient SfM surveys at all of the spatial scales surveyed. By implementing the improvements laid out here and by other authors, SfM will become a powerful tool that makes 3D data collection more accessible to the wider geomorphic community.
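Monitoring geomorphic change from repeat surveys like these is commonly implemented as a DEM of difference (DoD) with a minimum level of detection: subtract the two elevation models and discard change smaller than the propagated survey uncertainty. A minimal sketch with illustrative numbers; the error values and threshold are assumptions, not the dissertation's:

```python
import numpy as np

def dem_of_difference(dem_new, dem_old, sigma_new, sigma_old, t=1.96):
    """DEM of difference with a minimum level of detection (LoD).

    Change smaller than t * propagated uncertainty is treated as noise
    and masked to zero (t = 1.96 keeps ~95%-confident change).
    """
    dod = dem_new - dem_old
    lod = t * np.sqrt(sigma_new**2 + sigma_old**2)   # propagate per-survey errors
    return np.where(np.abs(dod) >= lod, dod, 0.0)

# Illustrative 1D cross-section, 5 cm survey error in each DEM (metres).
old = np.array([10.00, 10.00, 10.00, 10.00])
new = np.array([10.02, 10.30, 9.60, 10.05])
change = dem_of_difference(new, old, 0.05, 0.05)
print(change)   # centimetre-scale differences masked; 0.30 m and -0.40 m kept
```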
15

Aprendizado de máquina parcialmente supervisionado multidescrição para realimentação de relevância em recuperação de informação na WEB / Partially supervised multi-view machine learning for relevance feedback in WEB information retrieval

Soares, Matheus Victor Brum 28 May 2009 (has links)
(Translated from the Portuguese abstract:) Nowadays, the most common means of searching for information is the WEB, so it is important to look for efficient methods of retrieving that information. WEB search engines usually use keywords to express a query. However, characterising the desired information is not trivial: different users with different needs may be interested in related but distinct information when issuing the same query. The relevance feedback process makes the user an active participant in the search process. The general idea is that, after the user performs a WEB search, he or she can indicate which of the retrieved sites are relevant and which are not. The user's opinion can then be used to reorder the results so that the sites relevant to that user are returned more readily. In this context, and considering that in the vast majority of cases a query returns a very large number of matching WEB sites, of which the user labels only a small number as relevant or not relevant, we have the ideal scenario for partially supervised learning, since this class of learning algorithms requires a small number of labelled examples and a large number of unlabelled examples. Thus, starting from the hypothesis that partially supervised learning is appropriate for inducing a classifier that can serve as a relevance feedback filter for WEB searches, the goal of this work is to explore partially supervised learning algorithms, specifically those that use multiple descriptions (views) of the data, to assist in retrieving WEB sites. To evaluate this hypothesis, a tool called C-SEARCH was designed and developed that reorders the retrieved sites based on the user's indications. Experiments show that, for generic queries whose results clearly separate relevant from irrelevant sites, the system achieves better results for the user. / As the WEB is nowadays the most common source of information, it is very important to find reliable and efficient methods to retrieve this information. However, the WEB is a highly volatile and heterogeneous information source, so keyword-based querying may not be the best approach when little information is given. This is because different users with different needs may want distinct information, although related to the same keyword query. The process of relevance feedback makes it possible for the user to interact actively with the search engine. The main idea is that, after performing an initial search on the WEB, the process enables the user to indicate, among the retrieved sites, a small number considered relevant or irrelevant according to his or her information need. The user's preferences can then be used to rearrange the sites returned in the initial search, so that relevant sites are ranked first. As in most cases a search returns a large number of WEB sites that fit the keyword query, this is an ideal situation for partially supervised machine learning algorithms, which require a small number of labeled examples and a large number of unlabeled examples. Thus, based on the assumption that partially supervised learning is appropriate to induce a classifier that can be used as a filter for relevance feedback in WEB information retrieval, the aim of this work is to explore the use of a partially supervised machine learning algorithm, more specifically one that uses multi-view data, in order to assist WEB search. To this end, a computational tool called C-SEARCH, which reorders the search results using the user's feedback, has been implemented. Experimental results show that when the keyword query is generic and there is a clear distinction between relevant and irrelevant sites, recognizable by the user, the system can achieve good results.
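The multi-view partially supervised setting described above is typified by co-training: two learners, one per description (view) of the data, start from the few user-labelled examples and iteratively label confident examples from the unlabelled pool for each other. The sketch below illustrates the loop on synthetic data with a toy centroid classifier; it is not the classifier or the data used by C-SEARCH:

```python
import numpy as np

class Centroid:
    """Tiny nearest-centroid classifier standing in for any per-view learner."""
    def fit(self, X, y):
        self.c = {k: X[y == k].mean(axis=0) for k in np.unique(y)}
        return self
    def _dists(self, X):
        ks = sorted(self.c)
        return ks, np.stack([np.linalg.norm(X - self.c[k], axis=1) for k in ks])
    def predict(self, X):
        ks, d = self._dists(X)
        return np.array(ks)[d.argmin(axis=0)]
    def confidence(self, X):
        _, d = self._dists(X)
        return np.abs(d[0] - d[1])   # margin between the two classes

def co_train(X1, X2, y, labeled, rounds=20, per_round=2):
    """Co-training: each view's learner labels its most confident
    unlabeled points, which then become training data for both views."""
    y = y.copy()
    known = set(labeled)
    for _ in range(rounds):
        for Xa in (X1, X2):
            idx = sorted(known)
            clf = Centroid().fit(Xa[idx], y[idx])
            pool = np.array([i for i in range(len(y)) if i not in known])
            if len(pool) == 0:
                return y, known
            pick = pool[np.argsort(clf.confidence(Xa[pool]))[-per_round:]]
            y[pick] = clf.predict(Xa[pick])
            known.update(int(i) for i in pick)
    return y, known

# Synthetic two-view data: two well-separated classes, one label per class.
rng = np.random.default_rng(0)
n = 40
true = np.repeat([0, 1], n // 2)
X1 = rng.normal(true[:, None] * 6.0, 1.0, size=(n, 2))  # "view 1" features
X2 = rng.normal(true[:, None] * 6.0, 1.0, size=(n, 2))  # "view 2" features
labeled = [0, n - 1]
y = np.full(n, -1)
y[labeled] = true[labeled]
pred, known = co_train(X1, X2, y, labeled)
print((pred == true).mean())   # near-perfect on this easy synthetic data
```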
16

Statistical analysis of neuronal data : development of quantitative frameworks and application to microelectrode array analysis and cell type classification

Cotterill, Ellese January 2017 (has links)
With increasing amounts of data being collected in various fields of neuroscience, there is a growing need for robust techniques for the analysis of this information. This thesis focuses on the evaluation and development of quantitative frameworks for the analysis and classification of neuronal data from a variety of contexts. Firstly, I investigate methods for analysing spontaneous neuronal network activity recorded on microelectrode arrays (MEAs). I perform an unbiased evaluation of the existing techniques for detecting ‘bursts’ of neuronal activity in these types of recordings, and provide recommendations for the robust analysis of bursting activity in a range of contexts using both existing and adapted burst detection methods. These techniques are then used to analyse bursting activity in novel recordings of human induced pluripotent stem cell-derived neuronal networks. Results from this review of burst analysis methods are then used to inform the development of a framework for characterising the activity of neuronal networks recorded on MEAs, using properties of bursting as well as other common features of spontaneous activity. Using this framework, I examine the ontogeny of spontaneous network activity in in vitro neuronal networks from various brain regions, recorded on both single and multi-well MEAs. I also develop a framework for classifying these recordings according to their network type, based on quantitative features of their activity patterns. Next, I take a multi-view approach to classifying neuronal cell types using both the morphological and electrophysiological features of cells. I show that a number of multi-view clustering algorithms can more reliably differentiate between neuronal cell types in two existing data sets, compared to single-view clustering techniques applied to either the morphological or electrophysiological ‘view’ of the data, or a concatenation of the two views. 
To close, I examine the properties of the cell types identified by these methods.
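A common baseline among the burst detectors evaluated in work like this is an inter-spike-interval (ISI) threshold: a run of at least a minimum number of spikes whose consecutive intervals all stay below a threshold counts as a burst. A minimal sketch with illustrative parameter values, not the thesis's recommended settings:

```python
def isi_bursts(spike_times, max_isi=0.1, min_spikes=3):
    """Detect bursts as runs of >= min_spikes spikes whose consecutive
    inter-spike intervals are all <= max_isi (seconds).
    Returns a list of (start_time, end_time) tuples."""
    bursts = []
    run = [spike_times[0]] if spike_times else []
    for prev, t in zip(spike_times, spike_times[1:]):
        if t - prev <= max_isi:
            run.append(t)          # interval short enough: extend the run
        else:
            if len(run) >= min_spikes:
                bursts.append((run[0], run[-1]))
            run = [t]              # long gap: start a new candidate run
    if len(run) >= min_spikes:
        bursts.append((run[0], run[-1]))
    return bursts

# A lone spike at 0 s, a 4-spike burst around 1 s, an isolated pair after 3 s.
train = [0.0, 1.00, 1.05, 1.08, 1.12, 3.0, 3.5]
print(isi_bursts(train))   # -> [(1.0, 1.12)]
```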
17

The Effects of a Multi-View Camera System on Spatial Cognition, Cognitive Workload and Performance in a Minimally Invasive Surgery Task

January 2019 (has links)
Minimally invasive surgery is a surgical technique known for its reduced patient recovery time. The surgeon operates on the body through small incisions made near the point of operation, using long-shafted instruments and an endoscopic camera, while viewing the live camera feed on a nearby display screen. Multiple camera views are used in industries such as surveillance and professional gaming to give users a spatial awareness advantage about what is happening in the 3D space presented to them on 2D displays, but the concept has not yet effectively broken into the medical industry. This thesis tests a multi-view camera system in which three cameras are inserted into a laparoscopic surgical training box along with two surgical instruments, to determine the system's impact on spatial cognition, perceived cognitive workload, and the overall time needed to complete the task, compared to the traditional single-camera setup. The task, a peg transfer, is non-medical and is one of five tasks typically used to train surgeons' motor skills when they are first learning minimally invasive surgical procedures. It was conducted by 30 people randomly assigned to one of two conditions: one display or three displays. The results indicated that with three displays the overall time initially needed to complete the task was slower, the task was perceived to be completed more easily and with less strain, and participants had a slightly higher performance rate. / Dissertation/Thesis / Masters Thesis Human Systems Engineering 2019
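The between-groups comparison described above (one display vs. three displays, 30 participants) is typically analysed with an independent-samples Welch's t-test on the completion times. A minimal sketch with invented completion times; none of these numbers come from the study:

```python
import math

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    se2a, se2b = va / len(a), vb / len(b)               # squared standard errors
    t = (ma - mb) / math.sqrt(se2a + se2b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2a + se2b) ** 2 / (se2a**2 / (len(a) - 1) + se2b**2 / (len(b) - 1))
    return t, df

# Invented completion times (seconds) for the two display conditions.
one_display = [95, 102, 88, 110, 97, 105]
three_displays = [120, 131, 115, 140, 126, 119]
t, df = welch_t(one_display, three_displays)
print(round(t, 2), round(df, 1))   # large negative t: three displays slower here
```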
18

Inferring 3D Structure with a Statistical Image-Based Shape Model

Grauman, Kristen, Shakhnarovich, Gregory, Darrell, Trevor 17 April 2003 (has links)
We present an image-based approach to infer 3D structure parameters using a probabilistic "shape+structure" model. The 3D shape of a class of objects may be represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes can then be estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We augment the shape model to incorporate structural features of interest; novel examples with missing structure parameters may then be reconstructed to obtain estimates of these parameters. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a dataset of thousands of pedestrian images generated from a synthetic model, we can perform accurate inference of the 3D locations of 19 joints on the body based on observed silhouette contours from real images.
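The reconstruction-with-missing-parameters idea can be illustrated with plain PCA standing in for the paper's mixture-of-probabilistic-PCA prior: concatenate shape features and structure parameters into one training vector, learn a linear subspace, then recover the structure block of a new shape-only observation by least squares in that subspace. All data below are synthetic and the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training set: both the 10 "shape" features and the single
# "structure" parameter are driven by the same 2D hidden factor.
z = rng.normal(size=(200, 2))             # hidden factors
W_shape = rng.normal(size=(2, 10))        # factor -> shape features
w_struct = np.array([[2.0], [-1.0]])      # factor -> structure parameter
train = np.hstack([z @ W_shape, z @ w_struct])

# Fit a linear (PCA) subspace to the joint shape+structure vectors.
mean = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
B = Vt[:2]                                # top 2 principal directions

def infer_structure(new_shape):
    """Fit the subspace coefficients to the observed shape block only,
    then read off the missing structure coordinate."""
    Bs, Bt = B[:, :10], B[:, 10:]         # shape / structure blocks of the basis
    coeff, *_ = np.linalg.lstsq(Bs.T, new_shape - mean[:10], rcond=None)
    return (mean[10:] + coeff @ Bt)[0]

z_new = np.array([0.5, -1.5])
estimate = infer_structure(z_new @ W_shape)
print(estimate)   # close to the true value 2*0.5 - 1*(-1.5) = 2.5
```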
19

Large-scale and high-quality multi-view stereo

Vu, Hoang Hiep 05 December 2011 (has links) (PDF)
Acquisition of 3D models of real objects and scenes is indispensable and useful in many practical applications, such as digital archives, the game and entertainment industries, engineering, and advertising. There are two main methods for 3D acquisition: laser-based reconstruction (an active method) and image-based reconstruction from multiple images of the scene taken from different points of view (a passive method). While laser-based reconstruction achieves high accuracy, it is complex, expensive and difficult to set up for large-scale outdoor reconstruction. Image-based, or multi-view stereo, methods are more versatile, easier, faster and cheaper. When we began this thesis, most multi-view methods could handle only low-resolution images under controlled conditions. This thesis targets multi-view stereo at both large scale and high accuracy. We significantly improve some previous methods and combine them into a remarkably effective multi-view pipeline with GPU acceleration. From high-resolution images, we produce highly complete and accurate meshes that achieve the best scores in many internationally recognized benchmarks. Aiming at even larger scale, on the one hand we develop divide-and-conquer approaches in order to reconstruct many small parts of a big scene; on the other hand, to combine the separate partial results, we create a new merging method that can automatically and quickly merge hundreds of meshes. With all these components, we successfully reconstruct highly accurate watertight meshes of cities and historical monuments from large collections of high-resolution images (around 1600 five-megapixel images).
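Merging many partial meshes requires, at minimum, welding coincident vertices along the seams where neighbouring reconstructions overlap. The thesis's merging method is more sophisticated; the sketch below shows only that basic welding step (quantise coordinates, deduplicate, reindex faces):

```python
import numpy as np

def weld(vertices, faces, tol=1e-6):
    """Merge duplicate vertices (within tol) and reindex the faces.

    vertices : (n, 3) float array, faces : (m, 3) int array.
    Returns the deduplicated vertices and the remapped faces.
    """
    # Quantise coordinates so vertices within tol snap to the same key.
    keys = np.round(vertices / tol).astype(np.int64)
    uniq, index, inverse = np.unique(keys, axis=0, return_index=True,
                                     return_inverse=True)
    return vertices[index], inverse.reshape(-1)[faces]

# Two adjacent triangles from different partial meshes sharing an edge.
v1 = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0]])
v2 = np.array([[1.0, 0, 0], [0, 1, 0], [1, 1, 0]])   # repeats two vertices
f1 = np.array([[0, 1, 2]])
f2 = np.array([[0, 1, 2]]) + len(v1)                 # offset into the stack
V, F = weld(np.vstack([v1, v2]), np.vstack([f1, f2]))
print(len(V), F.tolist())   # 4 vertices; faces share two indices along the seam
```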
