About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Taking Man Out of the Loop: Methods to Reduce Human Involvement in Search and Surveillance Applications

Brink, Kevin Michael December 2010 (has links)
There has always been a desire to apply technology to human endeavors to increase a person's capabilities and to reduce the number or skill level of the people involved, or to replace the people altogether. Three fundamental areas are investigated where technology can enable the reduction or removal of humans in complex tasks. The first area of research is the rapid calibration of multiple-camera systems whose cameras share an overlapping field of view, enabling 3D computer vision applications. A simple method for the rapid calibration of such systems is introduced. The second area of research is the autonomous exploration of hallways or other urban-canyon environments in the absence of a global positioning system (GPS), using only an inertial measurement unit (IMU) and a monocular camera. Desired paths that generate accurate vehicle state estimates for simple ground vehicles are identified, and the benefits of integrated estimation and control are investigated. It is demonstrated that considering estimation accuracy is essential to produce efficient guidance and control. The Schmidt-Kalman filter is applied to the vision-aided inertial navigation system in a novel manner, reducing the state vector size significantly. The final area of research is a decentralized, swarm-based approach to source localization that uses a high-fidelity environment model to directly provide vehicle updates. The approach is an extension of a standard quadratic model that provides linear updates. The new approach leverages information from the higher-order terms of the environment model, showing dramatic improvement over the standard method.
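The consider-state idea behind the Schmidt-Kalman filter mentioned above can be sketched compactly. The snippet below is a minimal NumPy illustration, not the thesis's implementation: the trailing entries of the state vector are assumed to be "consider" states (for example, static feature positions) whose gain rows are zeroed, so they are never corrected while their uncertainty still shapes the update through the full covariance.

```python
import numpy as np

def schmidt_kalman_update(x, P, z, H, R, n_est):
    """One Schmidt-Kalman measurement update (illustrative sketch).

    The first n_est entries of x are estimated states; the remaining
    'consider' states keep their a priori values.
    """
    y = z - H @ x                      # innovation
    S = H @ P @ H.T + R                # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # full Kalman gain
    K[n_est:, :] = 0.0                 # consider states receive no correction

    x_new = x + K @ y
    I = np.eye(P.shape[0])
    # Joseph form remains a valid covariance update for this deliberately
    # suboptimal (partially zeroed) gain.
    P_new = (I - K @ H) @ P @ (I - K @ H).T + K @ R @ K.T
    return x_new, P_new
```

Only the estimated block needs the full update machinery; the consider block is carried along for consistency, which is where the reduction in effective state size comes from.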
2

Perceived Image Quality Assessment for Stereoscopic Vision

Akhter, Roushain 07 April 2011 (has links)
This thesis describes an automatic evaluation approach for estimating the quality of stereo displays and vision systems using image features. The method is inspired by the human visual system. Display of stereo images is widely used to enhance the viewing experience of three-dimensional (3D) visual displays and communication systems. Applications are numerous and range from entertainment to more specialized uses such as 3D visualization and broadcasting, robot tele-operation, object recognition, body exploration, 3D teleconferencing, and therapeutic purposes. Consequently, perceived image quality is important for assessing the performance of 3D imaging applications. There is no doubt that subjective testing (i.e., asking human viewers to rank the quality of stereo images) is the most accurate method for quality evaluation, since it reflects true human perception. However, such assessments are time-consuming and expensive, and they cannot be done in real time. Therefore, the goal of this research is to develop objective quality evaluation methods (computational models that can automatically predict perceived image quality) that correlate well with subjective judgments, as required in the field of quality assessment. I believe that the perceived distortion and disparity of any stereoscopic display are strongly dependent on local features, such as edge (non-uniform) and non-edge (uniform) areas. Therefore, in this research, I propose a no-reference (NR) objective quality assessment method for coded stereoscopic images based on segmented local features of artifacts and disparity. Local feature information, such as relative disparity estimated from edge and non-edge areas, as well as the blockiness, blur, and zero-crossing within image blocks, is evaluated in this method. A block-based edge dissimilarity approach is used for disparity estimation. I use the Toyama stereo image database to evaluate the performance and to compare it with other approaches, both qualitatively and quantitatively.
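As a rough illustration of the kind of coding-artifact features the abstract refers to, the sketch below computes generic proxies for blockiness, sharpness (the inverse of blur) and zero-crossing rate on a grayscale image with NumPy. The feature definitions are placeholders chosen for brevity, computed over the whole image rather than per block, and are not the thesis's exact formulations.

```python
import numpy as np

def artifact_features(gray, block=8):
    """Rough global proxies for blockiness, sharpness and zero-crossing rate.

    gray: 2D grayscale image array, assumed larger than one coding block.
    """
    g = gray.astype(np.float64)

    # Blockiness: mean absolute luminance jump across 8x8 block boundaries.
    col_jump = np.abs(g[:, block::block] - g[:, block - 1:-1:block]).mean()
    row_jump = np.abs(g[block::block, :] - g[block - 1:-1:block, :]).mean()
    blockiness = 0.5 * (col_jump + row_jump)

    # Sharpness proxy: average absolute gradient (lower values suggest blur).
    sharpness = 0.5 * (np.abs(np.diff(g, axis=1)).mean()
                       + np.abs(np.diff(g, axis=0)).mean())

    # Zero-crossing rate of the horizontal second difference.
    d2 = np.diff(g, n=2, axis=1)
    zero_crossings = (np.sign(d2[:, :-1]) * np.sign(d2[:, 1:]) < 0).mean()

    return blockiness, sharpness, zero_crossings
```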
3

Towards a Robust and Efficient Deep Neural Network for the Lidar Point Cloud Perception

Zhou, Zixiang 01 January 2023 (has links) (PDF)
In recent years, LiDAR has emerged as a crucial perception tool for robotics and autonomous vehicles. However, most LiDAR perception methods are adapted from 2D image-based deep learning methods, which are not well suited to the unique geometric structure of LiDAR point cloud data. This domain gap poses challenges for the fast-growing set of LiDAR perception tasks. This dissertation aims to investigate deep network structures tailored to LiDAR point cloud data and thereby design a more efficient and robust LiDAR perception framework. Our approach to this challenge is twofold. First, we recognize that LiDAR point cloud data has an imbalanced and sparse distribution in 3D space, which is not effectively captured by traditional voxel-based convolution methods that treat the 3D map uniformly. To address this issue, we aim to develop a more efficient feature extraction method by either counteracting the imbalanced feature distribution or incorporating global contextual information using a transformer decoder. Second, beyond the gap between the 2D and 3D domains, we acknowledge that different LiDAR perception tasks have unique requirements and therefore typically require separate network designs, resulting in significant network redundancy. To address this, we aim to improve the efficiency of the network design by developing a unified multi-task network that shares the feature extraction stage and performs different tasks using task-specific heads. More importantly, we aim to enhance the accuracy of the different tasks by leveraging the multi-task learning framework to enable mutual improvement. We propose different models based on these motivations and evaluate them on several large-scale LiDAR point cloud perception datasets, achieving state-of-the-art performance. Lastly, we summarize the key findings of this dissertation and propose future research directions.
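To make the shared-backbone, task-specific-head layout concrete, here is a minimal PyTorch sketch. The module and head names are invented for illustration; a real LiDAR network of the kind described would use sparse 3D convolutions or transformer blocks rather than the tiny per-point MLP used here.

```python
import torch
import torch.nn as nn

class MultiTaskLidarNet(nn.Module):
    """Illustrative multi-task layout: one shared feature extractor,
    several lightweight task heads reusing the same features."""

    def __init__(self, in_dim=4, feat_dim=64, num_classes=20):
        super().__init__()
        # Shared per-point feature extractor (stand-in for a real backbone).
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )
        self.semantic_head = nn.Linear(feat_dim, num_classes)  # per-point labels
        self.center_head = nn.Linear(feat_dim, 3)              # instance center offsets

    def forward(self, points):                  # points: (N, in_dim)
        feats = self.backbone(points)
        return self.semantic_head(feats), self.center_head(feats)

# Example: 1024 points with (x, y, z, intensity) features.
logits, offsets = MultiTaskLidarNet()(torch.randn(1024, 4))
```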
4

Visual odometry and mapping in natural environments for arbitrary camera motion models

Terzakis, George January 2016 (has links)
This is a thesis on outdoor monocular visual SLAM in natural environments. The techniques proposed herein aim at estimating camera pose and the 3D geometrical structure of the surrounding environment. The problem statement was motivated by the GPS-denied scenario for a sea-surface vehicle developed at Plymouth University named Springer. The algorithms proposed in this thesis are mainly adapted to Springer's environmental conditions, so that the vehicle can navigate using a vision-based localization system when GPS is not available; such environments include estuarine areas, forests and occasional semi-urban territories. The research objectives are constrained versions of the ever-abiding problems in the fields of multiple view geometry and mobile robotics. The research proposes new techniques or improves existing ones for problems such as scene reconstruction, relative camera pose recovery and filtering, always in the context of the aforementioned landscapes (i.e., rivers, forests, etc.). Although visual tracking is paramount for the generation of data point correspondences, this thesis focuses primarily on the geometric aspect of the problem as well as the probabilistic framework in which the optimization of pose and structure estimates takes place. Besides algorithms, the deliverables of this research should include the respective implementations and test data in the form of a software library and a dataset containing footage of estuarine regions taken from a boat, along with synchronized sensor logs. This thesis is not the final analysis on vision-based navigation. It merely proposes various solutions to the localization problem of a vehicle navigating in natural environments, either on land or on the surface of the water. Although these solutions can provide position and orientation estimates when GPS is not available, they have limitations, and there is still a vast new world of ideas to be explored.
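For readers unfamiliar with relative camera pose recovery, the sketch below shows a generic two-view recovery using OpenCV's calibrated essential-matrix pipeline. It assumes a standard pinhole intrinsic matrix K and matched pixel coordinates; it is not the thesis's estimator, which targets arbitrary camera motion models and a filtering framework.

```python
import cv2

def relative_pose(pts1, pts2, K):
    """Estimate relative rotation R and translation direction t between two
    views from matched pixel points (Nx2 float arrays) and intrinsics K."""
    E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                      method=cv2.RANSAC, prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return R, t  # t is unit-norm: monocular scale is unobservable
```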
5

Feature Extraction Based Iterative Closest Point Registration for Large Scale Aerial LiDAR Point Clouds

Graehling, Quinn R. January 2020 (has links)
No description available.
6

Measurement of range of motion of human finger joints, using a computer vision system

Ben-Naser, Abdusalam January 2011 (has links)
Assessment of finger range of motion (ROM) is often required for monitoring the effectiveness of rehabilitative treatments and for evaluating patients' functional impairment. Several devices are used to measure this motion, such as wire tracing, tracing onto paper, and mechanical and electronic goniometry. These devices are quite cheap, excluding electronic goniometry; however, their drawbacks are a lack of accuracy and the time-consuming nature of the measurement process. The work described in this thesis considers the design, implementation and validation of a new medical measurement system for evaluating the range of motion of the human finger joints, intended to replace the current measurement tools. The proposed system is a non-contact measurement device based on computer vision technology and has many advantages over the existing measurement devices: it achieves better accuracy, it can be operated by a semi-skilled person, and it saves time for the evaluator. The computer vision system in this study consists of CCD cameras to capture the images; a frame grabber to convert the analogue signals from the cameras into digital signals that can be manipulated by a computer; ultraviolet (UV) light to illuminate the measurement space; software to process the images and perform the required computation; and a darkened enclosure to accommodate the cameras and UV light and to shield the working area from any undesirable ambient light. Two calibration techniques were used to calibrate the cameras: Direct Linear Transformation (DLT) and Tsai's method. A calibration piece suited to this application was designed and manufactured. A steel hand model was used to measure the finger joint angles. The average error in measuring the finger angles with this system was around 1 degree, compared with 5 degrees for the existing techniques.
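The Direct Linear Transformation mentioned above admits a compact implementation: given at least six 3D-2D correspondences from the calibration piece, the 3x4 projection matrix is the null-space solution of a homogeneous linear system. The sketch below is a minimal, unnormalised version for illustration, not the thesis's calibration code.

```python
import numpy as np

def dlt_projection_matrix(X, x):
    """Direct Linear Transformation: fit a 3x4 projection matrix P with x ~ P X.

    X: (N, 3) world points on the calibration piece, x: (N, 2) pixel points,
    N >= 6. Returns the right singular vector of the smallest singular value,
    reshaped to 3x4.
    """
    A = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        A.append([Xw, Yw, Zw, 1, 0, 0, 0, 0, -u * Xw, -u * Yw, -u * Zw, -u])
        A.append([0, 0, 0, 0, Xw, Yw, Zw, 1, -v * Xw, -v * Yw, -v * Zw, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)
```

In practice the correspondences would be normalised first and the result refined nonlinearly; Tsai's method additionally models lens distortion.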
7

Machine learning for blob detection in high-resolution 3D microscopy images

Ter Haak, Martin January 2018 (has links)
The aim of blob detection is to find regions in a digital image that differ from their surroundings with respect to properties like intensity or shape. Bio-image analysis is a common application, where blobs can denote regions of interest that have been stained with a fluorescent dye. In image-based in situ sequencing of ribonucleic acid (RNA), for example, the blobs are local intensity maxima (i.e. bright spots) corresponding to the locations of specific RNA nucleobases in cells. Traditional methods of blob detection rely on simple image processing steps that must be guided by the user. The problem is that the user must seek the optimal parameters for each step, which are often specific to one image and cannot be generalised to other images. Moreover, some of the existing tools are not suitable for the scale of the microscopy images, which are often in very high resolution and 3D. Machine learning (ML) is a collection of techniques that give computers the ability to "learn" from data. To eliminate the dependence on user parameters, the idea is to apply ML to learn the definition of a blob from labelled images. The research question is therefore how ML can be used effectively to perform blob detection. A blob detector is proposed that first extracts a set of relevant and non-redundant image features, then classifies pixels as blobs, and finally uses a clustering algorithm to split up connected blobs. The detector works out-of-core, meaning it can process images that do not fit in memory, by dividing the images into chunks. The results prove the feasibility of this blob detector and show that it can compete with other popular software for blob detection. Unlike other tools, however, the proposed blob detector does not require parameter tuning, making it easier to use and more reliable.
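The feature extraction, pixel classification and clustering pipeline described above can be sketched in a few lines with SciPy and scikit-learn. Everything here is an illustrative stand-in: the features are simple Gaussian and Laplacian-of-Gaussian responses, a random forest plays the role of the pixel classifier, and connected-component labelling substitutes for the thesis's clustering step that splits touching blobs; there is no out-of-core chunking.

```python
import numpy as np
from scipy import ndimage as ndi
from sklearn.ensemble import RandomForestClassifier

def pixel_features(img, sigmas=(1, 2, 4)):
    """Per-pixel feature stack: Gaussian and Laplacian-of-Gaussian responses."""
    feats = [ndi.gaussian_filter(img, s) for s in sigmas]
    feats += [ndi.gaussian_laplace(img, s) for s in sigmas]
    return np.stack(feats, axis=-1).reshape(-1, 2 * len(sigmas))

def train_and_detect(train_img, train_labels, new_img):
    """Train on one labelled image (labels: 1 = blob pixel, 0 = background),
    then detect blobs in a new image and label connected regions."""
    clf = RandomForestClassifier(n_estimators=50)
    clf.fit(pixel_features(train_img), train_labels.ravel())
    mask = clf.predict(pixel_features(new_img)).reshape(new_img.shape) > 0
    blobs, n_blobs = ndi.label(mask)    # connected-component "clustering"
    return blobs, n_blobs
```

Because the SciPy filters work on arrays of any dimensionality, the same sketch applies to 2D slices or full 3D volumes.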
8

Deep Brain Dynamics and Images Mining for Tumor Detection and Precision Medicine

Lakshmi Ramesh (16637316) 30 August 2023 (has links)
Automatic brain tumor segmentation in Magnetic Resonance Imaging (MRI) scans is essential for the diagnosis, treatment, and surgery of cancerous tumors. However, identifying hard-to-detect tumors poses a considerable challenge, as they usually vary in size, have irregular shapes, and exhibit vague invasion areas. Current advancements have not yet fully leveraged the dynamics in the multiple modalities of MRI, since they usually treat multi-modality as multi-channel, and early channel merging may not fully reveal inter-modal couplings and complementary patterns. In this thesis, we propose a novel deep cross-attention learning algorithm that maximizes the mining of subtle dynamics from each of the input modalities and then boosts feature fusion capability. More specifically, we have designed a Multimodal Cross-Attention Module (MM-CAM), equipped with a 3D Multimodal Feature Rectification and Feature Fusion Module. Extensive experiments have shown that the proposed deep learning architecture, empowered by the MM-CAM, produces higher-quality segmentation masks of the tumor subregions. Further, we have enhanced the algorithm with image matting refinement techniques. We propose to integrate a Progressive Refinement Module (PRM) and perform Cross-Subregion Refinement (CSR) for the precise identification of tumor boundaries. A Multiscale Dice Loss was also successfully employed to enforce additional supervision for the auxiliary segmentation outputs. This enhancement will facilitate effective matting-based refinement for medical image segmentation applications. Overall, this thesis, with deep learning, transformer-empowered pattern mining, and sophisticated architecture designs, will greatly advance deep brain dynamics and images mining for tumor detection and precision medicine.
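As a point of reference for the Dice supervision mentioned above, here is a minimal PyTorch sketch of a binary soft Dice loss; the multiscale variant in the abstract would simply average such a loss over the auxiliary outputs produced at several decoder resolutions. This is a generic formulation, not the thesis's exact loss.

```python
import torch

def dice_loss(pred_logits, target, eps=1e-6):
    """Soft Dice loss for a binary segmentation mask.

    pred_logits and target share the same shape; target values are in {0, 1}.
    """
    p = torch.sigmoid(pred_logits)
    intersection = (p * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (p.sum() + target.sum() + eps)
```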
9

Immersive Dynamic Scenes for Virtual Reality from a Single RGB-D Camera

Lai, Po Kong 26 September 2019 (has links)
In this thesis we explore the concepts and components that can be used as individual building blocks for producing immersive virtual reality (VR) content from a single RGB-D sensor. We identify the properties of immersive VR videos and propose a system composed of a foreground/background separator, a dynamic scene re-constructor and a shape completer. We initially explore the foreground/background separator component in the context of video summarization. More specifically, we examine how to extract trajectories of moving objects from video sequences captured with a static camera. We then present a new approach for video summarization via minimization of the spatial-temporal projections of the extracted object trajectories. New evaluation criteria are also presented for video summarization. These concepts of foreground/background separation can then be applied towards VR scene creation by extracting relevant objects of interest. We present an approach for the dynamic scene re-constructor component using a single moving RGB-D sensor. By tracking the foreground objects and removing them from the input RGB-D frames, we can feed the background-only data into existing RGB-D SLAM systems. The result is a static 3D background model onto which the foreground frames are then superimposed to produce a coherent scene with dynamic moving foreground objects. We also present a specific method for extracting moving foreground objects from a moving RGB-D camera, along with an evaluation dataset with benchmarks. Lastly, the shape completer component takes a single-view depth map of an object as input and "fills in" the occluded portions to produce a complete 3D shape. We present an approach that utilizes a new data-minimal representation, the additive depth map, which allows traditional 2D convolutional neural networks to accomplish the task. The additive depth map represents the amount of depth required to transform the input into the "back depth map" that would exist if there were a sensor exactly opposite the input. We train and benchmark our approach using existing synthetic datasets and also show that it can perform shape completion on real-world data without fine-tuning. Our experiments show that our data-minimal representation can achieve results comparable to existing state-of-the-art 3D networks while also being able to produce higher-resolution outputs.
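The additive depth-map representation described above reduces shape completion to predicting a per-pixel offset. The toy sketch below shows how a completed "back" surface would be recovered from an input depth map and a predicted additive map; the zero-means-missing convention is an assumption made for this example.

```python
import numpy as np

def back_depth_from_additive(input_depth, additive_depth):
    """Recover the 'back depth map' from the input depth plus the predicted
    additive depth, masking out pixels with no valid measurement."""
    valid = input_depth > 0                        # assume 0 marks missing depth
    return np.where(valid, input_depth + additive_depth, 0.0)

# Toy example: a flat front surface 1 m away on an object 0.2 m thick.
front = np.full((4, 4), 1.0)
back = back_depth_from_additive(front, np.full((4, 4), 0.2))   # -> 1.2 m everywhere
```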
