41 |
Advancing human pose and gesture recognition
Pfister, Tomas January 2015
This thesis presents new methods in two closely related areas of computer vision: human pose estimation, and gesture recognition in videos. In human pose estimation, we show that random forests can be used to estimate human pose in monocular videos. To this end, we propose a co-segmentation algorithm for segmenting humans out of videos, and an evaluator that predicts whether the estimated poses are correct or not. We further extend this pose estimator to new domains (with a transfer learning approach), and enhance its predictions by predicting the joint positions sequentially (rather than independently) in an image, and using temporal information in the videos (rather than predicting the poses from a single frame). Finally, we go beyond random forests, and show that convolutional neural networks can be used to estimate human pose even more accurately and efficiently. We propose two new convolutional neural network architectures, and show how optical flow can be employed in convolutional nets to further improve the predictions. In gesture recognition, we explore the idea of using weak supervision to learn gestures. We show that we can learn sign language automatically from signed TV broadcasts with subtitles by letting algorithms 'watch' the TV broadcasts and 'match' the signs with the subtitles. We further show that if even a small amount of strong supervision is available (as there is for sign language, in the form of sign language video dictionaries), this strong supervision can be combined with weak supervision to learn even better models.
|
42 |
A Deep Understanding of Structural and Functional Behavior of Tabular and Graphical Modules in Technical Documents
Alexiou, Michail January 2021
No description available.
|
43 |
Quantitative measurement of pH in stroke using chemical exchange saturation transfer magnetic resonance imaging
Tee, Yee Kai January 2013
Stroke is one of the leading causes of death and adult disability worldwide. The major therapeutic intervention for acute ischemic stroke is the administration of recombinant tissue plasminogen activator (rtPA) to help to restore blood flow to the brain. This has been shown to increase the survival rate and to reduce the disability of ischemic stroke patients. However, rtPA is associated with intracranial haemorrhage and thus its administration is currently limited to only about 5% of ischemic stroke patients. More advanced imaging techniques can be used to better stratify patients for rtPA treatment. One new imaging technique, chemical exchange saturation transfer (CEST) magnetic resonance imaging, can potentially image intracellular pH and since tissue acidification happens prior to cerebral infarction, CEST has the potential to predict ischemic injury and hence to improve patient selection. Despite this potential, most studies have generated pH-weighted rather than quantitative pH maps; the most widely used metric to quantify the CEST effect is only able to generate qualitative contrast measurements and suffers from many confounds. The greatest clinical benefit of CEST imaging lies in its ability to non-invasively measure quantitative pH values which may be useful to identify salvageable tissue. The quantitative techniques and work presented in this thesis thus provide the necessary analysis to determine whether a threshold for the quantified CEST effect or for pH exists to help to define tissue outcome following stroke; to investigate the potential of CEST for clinical stroke imaging; and subsequently to facilitate clinical translation of CEST for acute stroke management.
|
44 |
From interactive to semantic image segmentation
Gulshan, Varun January 2011
This thesis investigates two well-defined problems in image segmentation, viz. interactive and semantic image segmentation. Interactive segmentation involves power assisting a user in cutting out objects from an image, whereas semantic segmentation involves partitioning pixels in an image into object categories. We investigate various models and energy formulations for both these problems in this thesis. In order to improve the performance of interactive systems, low-level texture features are introduced as a replacement for the more commonly used RGB features. To quantify the improvement obtained by using these texture features, two annotated datasets of images are introduced (one consisting of natural images, and the other consisting of camouflaged objects). A significant improvement in performance is observed when using texture features for the case of monochrome images and images containing camouflaged objects. We also explore adding mid-level cues such as shape constraints into interactive segmentation by introducing the idea of geodesic star convexity, which extends the existing notion of a star convexity prior in two important ways: (i) it allows for multiple star centres as opposed to single stars in the original prior and (ii) it generalises the shape constraint by allowing for geodesic paths as opposed to Euclidean rays. Global minima of our energy function can be obtained subject to these new constraints. We also introduce Geodesic Forests, which exploit the structure of shortest paths in implementing the extended constraints. These extensions to star convexity allow us to use such constraints in a practical segmentation system. This system is evaluated by means of a “robot user” to measure the amount of interaction required in a precise way, and it is shown that having shape constraints reduces user effort significantly compared to existing interactive systems.
We also introduce a new and harder dataset which augments the existing GrabCut dataset with more realistic images and ground truth taken from the PASCAL VOC segmentation challenge. In the latter part of the thesis, we bring in object-category-level information in order to make the interactive segmentation tasks easier, and move towards fully automated semantic segmentation. An algorithm to automatically segment humans from cluttered images given their bounding boxes is presented. A top-down segmentation of the human is obtained using classifiers trained to predict segmentation masks from local HOG descriptors. These masks are then combined with bottom-up image information in a local GrabCut-like procedure. This algorithm is later completely automated to segment humans without requiring a bounding box, and is quantitatively compared with other semantic segmentation methods. We also introduce a novel way to acquire large quantities of segmented training data relatively effortlessly using the Kinect. In the final part of this work, we explore various semantic segmentation methods based on learning using bottom-up super-pixelisations. Different methods of combining multiple super-pixelisations are discussed and quantitatively evaluated on two segmentation datasets. We observe that simple combinations of independently trained classifiers on single super-pixelisations perform almost as well as complex methods based on jointly learning across multiple super-pixelisations. We also explore CRF-based formulations for semantic segmentation, and introduce a novel visual-words-based object boundary description in the energy formulation. The object appearance and boundary parameters are trained jointly using structured output learning methods, and the benefit of adding pairwise terms is quantified on two different datasets.
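The star convexity prior mentioned in this abstract can be illustrated with a small check. The sketch below covers only the original single-centre, Euclidean-ray notion (a shape is star convex about a centre if every straight segment from the centre to a foreground pixel stays inside the foreground). It does not implement the geodesic extension or the energy minimisation described in the thesis, and the function names and the use of Bresenham lines to discretise rays are assumptions made for illustration.

```python
def bresenham(p, q):
    """Grid cells on the discrete straight line from p to q, inclusive."""
    (r0, c0), (r1, c1) = p, q
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    sr = 1 if r1 >= r0 else -1
    sc = 1 if c1 >= c0 else -1
    err = dr - dc
    r, c = r0, c0
    cells = []
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            return cells
        e2 = 2 * err
        if e2 > -dc:
            err -= dc
            r += sr
        if e2 < dr:
            err += dr
            c += sc


def is_star_convex(mask, centre):
    """True if every foreground pixel is joined to `centre` by a discrete
    ray that lies entirely inside the foreground (hypothetical helper)."""
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value and not all(mask[i][j] for i, j in bresenham(centre, (r, c))):
                return False
    return True


# A filled square is star convex about its middle pixel.
square = [[1, 1, 1],
          [1, 1, 1],
          [1, 1, 1]]
square_ok = is_star_convex(square, (1, 1))

# A U shape is not: the ray from the bottom centre to a top corner
# has to cross the empty middle column.
u_shape = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
u_ok = is_star_convex(u_shape, (2, 1))
```

In a segmentation system the constraint would be enforced during optimisation rather than tested afterwards; the check is only meant to make the definition concrete.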
|
45 |
On wide dynamic range logarithmic CMOS image sensors
Choubey, Bhaskar January 2006
Logarithmic sensors are capable of capturing the wide dynamic range of intensities available in nature with a minimum number of bits and minimal post-processing. A simple circuit able to perform logarithmic capture is one utilising a MOS device in weak inversion. However, the output of this pixel is crippled by fixed pattern noise (FPN). Techniques proposed to reduce this noise fail to produce high quality images because they do not account for high gain variations in the pixel. An electronic calibration technique is proposed which is capable of reducing both multiplicative and additive FPN. Contrast properties matching those of the human eye are reported from these sensors. With reduced FPN, the pixel performance at low intensities becomes a concern. In these regions, the high leakage current of the CMOS process affects the logarithmic pixel. To reduce this current, two different techniques are tested, one using a modified circuit and the other a modified layout. The layout technique is observed to reduce the leakage current. In addition, this layout can be used to linearise the output of the logarithmic pixel in low light regions. The unique response, linear at low light and logarithmic at high light, is further investigated. A new model based on the device physics is derived to represent this response. The fixed pattern noise profile is also investigated. An intelligent iterative scheme is proposed and verified to extract the photocurrent flowing in the pixel and correct the fixed pattern noise utilising the new model. Future research ideas leading to better designs of logarithmic pixels and post-processing of these signals are proposed at the end of the thesis.
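The distinction between additive and multiplicative fixed pattern noise can be made concrete with a generic two-point calibration. The sketch below assumes an idealised pixel response y = a + b ln(x), where the per-pixel offset a is the additive term and the per-pixel gain b is the multiplicative term. This model, the function names and the numbers are assumptions for illustration; the sketch is not the electronic calibration technique or the device-physics model developed in the thesis.

```python
import math


def calibrate(y1, y2, x1, x2):
    """Estimate per-pixel offsets and gains from the responses y1 and y2
    to two uniform reference intensities x1 and x2."""
    span = math.log(x2) - math.log(x1)
    gains = [(v2 - v1) / span for v1, v2 in zip(y1, y2)]
    offsets = [v1 - b * math.log(x1) for v1, b in zip(y1, gains)]
    return offsets, gains


def correct(y, offsets, gains):
    """Undo offset and gain so every pixel reports ln(intensity)."""
    return [(v - a) / b for v, a, b in zip(y, offsets, gains)]


# Three simulated pixels with mismatched offsets and gains (made-up values).
true_offsets = [0.10, -0.05, 0.30]
true_gains = [1.00, 1.20, 0.85]


def respond(x):
    """Raw output of each simulated pixel under uniform intensity x."""
    return [a + b * math.log(x) for a, b in zip(true_offsets, true_gains)]


offsets, gains = calibrate(respond(1.0), respond(1000.0), 1.0, 1000.0)
flat = correct(respond(50.0), offsets, gains)  # every entry is close to ln(50)
```

Correcting only the offset would leave the gain mismatch in place, which is the failure mode the abstract attributes to earlier techniques.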
|
46 |
Tumour vessel structural analysis and its application in image analysis
Wang, Po January 2010
Abnormal vascular structure has been identified as one of the major characteristics of tumours. In this thesis, we carry out quantitative analysis on different tumour vascular structures and research the relationship between vascular structure and its transportation efficiency. We first study segmentation methods to extract the binary vessel representations from microscope images. We found that local phase-hysteresis thresholding is able to segment vessel objects from noisy microscope images. We also study methods to extract the centre lines of segmented vessel objects, a process termed skeletonization. We modified the conventional thinning method to regularize the extremely asymmetrical structure found in the segmented vessel objects. We found that this method is capable of producing vessel skeletons with satisfactory accuracy. We have developed software for 3D vessel structural analysis. This software consists of four major parts: image segmentation, vessel skeletonization, skeleton modification and structure quantification. It implements the local phase-hysteresis thresholding and structure regularization-thinning methods. A GUI was introduced to enable users to alter the skeleton structures based on their subjective judgements. Radius and inter-branch length quantification can be conducted based on the segmentation and skeletonization results. The accuracy of the segmentation, skeletonization and quantification methods has been tested on several synthesized data sets. The change of tumour vascular structure after drug treatment was then investigated. We proposed metrics to quantify tumour vascular geometry and statistically analysed the effect of the tested drugs on normalizing tumour vascular structure. Finally, we developed a spatio-temporal model to simulate the delivery of oxygen and 3-18F-fluoro-1-(2-nitro-1-imidazolyl)-2-propanol (Fmiso), the hypoxia tracer that gives the PET signal in Fmiso PET scanning.
This model is based on compartmental models, but also considers the spatial diffusion of oxygen and Fmiso. We validated our model on in vitro spheroid data and simulated the oxygen and Fmiso distribution on the segmented vessel images. We contend that the tumour Fmiso distribution (as observed in Fmiso PET imaging) is caused by the abnormal tumour vascular structure, which in turn arises from the tumour angiogenesis process. We outline a modelling framework to research the relationships between tumour angiogenesis, vessel structure and Fmiso distribution, which is going to be the focus of our future work.
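The hysteresis step of the segmentation described in this abstract can be sketched as follows. This is plain hysteresis thresholding on a 2D array of scores: pixels at or above a high threshold seed the result, and pixels at or above a low threshold are kept only if they connect to a seed. In the thesis the thresholds are applied to a local phase response, which is not reproduced here; the function name, the 8-connectivity and the example values are assumptions.

```python
from collections import deque


def hysteresis_threshold(image, low, high):
    """Keep pixels >= high, plus pixels >= low that are 8-connected
    to a kept pixel. Returns a grid of booleans."""
    rows, cols = len(image), len(image[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Strong pixels seed the region growing.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))
    # Grow into weak pixels that touch an already kept pixel.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not keep[nr][nc] and image[nr][nc] >= low):
                    keep[nr][nc] = True
                    queue.append((nr, nc))
    return keep


# The 0.5 next to the strong 0.9 is kept; the isolated 0.5 is dropped.
demo = hysteresis_threshold([[0.9, 0.5, 0.1, 0.5]], low=0.4, high=0.8)
```

The two thresholds let faint vessel segments survive when they are attached to a confidently detected vessel, while isolated noise of the same strength is rejected.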
|
47 |
MRI image analysis for abdominal and pelvic endometriosis
Chi, Wenjun January 2012
Endometriosis is an oestrogen-dependent gynaecological condition defined as the presence of endometrial tissue outside the uterine cavity. The condition is predominantly found in women in their reproductive years, and is associated with significant chronic pelvic and abdominal pain and infertility. According to a recent study, the disease is believed to affect approximately 33% of women. Currently, surgical intervention, often laparoscopic surgery, is the gold standard for diagnosing the disease and it remains an effective and common treatment method for all stages of endometriosis. Magnetic resonance imaging (MRI) of the patient is performed before surgery in order to locate any endometriosis lesions and to determine whether a multidisciplinary surgical team meeting is required. In this dissertation, our goal is to use image processing techniques to aid surgical planning. Specifically, we aim to improve the quality of the existing images, and to automatically detect bladder endometriosis lesions in MR images as a form of bladder wall thickening. One of the main problems posed by abdominal MRI is the sparse anisotropic frequency sampling process. As a consequence, the resulting images consist of thick slices and have gaps between those slices. We have devised a method to fuse multi-view MRI consisting of axial/transverse, sagittal and coronal scans, in an attempt to restore an isotropic densely sampled frequency plane of the fused image. In addition, the proposed fusion method is steerable and is able to fuse component images in any orientation. To achieve this, we apply the Riesz transform for image decomposition and reconstruction in the frequency domain, and we propose an adaptive fusion rule to fuse multiple Riesz-components of images in different orientations.
The adaptive fusion is parameterised and switches between combining frequency components via the mean and maximum rules, which is effectively a trade-off between smoothing the intrinsically noisy images and retaining the sharp delineation of features. We first validate the method using simulated images, and compare it with another fusion scheme using the discrete wavelet transform. The results show that the proposed method is better in both accuracy and computational time. Improvements of fused clinical images against unfused raw images are also illustrated. For the segmentation of the bladder wall, we investigate the level set approach. While the traditional gradient-based feature detection is prone to intensity non-uniformity, we present a novel way to compute phase congruency as a reliable feature representation. In order to avoid the phase wrapping problem with inverse trigonometric functions, we devise a mathematically elegant and efficient way to combine multi-scale image features via geometric algebra. As opposed to the original phase congruency, the proposed method is more robust against noise and hence more suitable for clinical data. To address the practical issues in segmenting the bladder wall, we suggest two coupled level set frameworks to utilise information in two different MRI sequences of the same patients, the T2- and T1-weighted images. The results demonstrate a dramatic decrease in the number of failed segmentations compared with those done using a single kind of image. The resulting automated segmentations are finally validated by comparison with manual segmentations done in 2D.
|
48 |
Computer-assisted volumetric tumour assessment for the evaluation of patient response in malignant pleural mesothelioma
Chen, Mitchell January 2011
Malignant pleural mesothelioma (MPM) is a form of aggressive tumour that is almost always associated with prior exposure to asbestos. Currently responsible for over 47,000 deaths worldwide each year and rising, it poses a serious threat to global public health. Many clinical studies of MPM, including its diagnosis, prognostic planning, and the evaluation of a treatment, necessitate the accurate quantification of tumours based on medical image scans, primarily computed tomography (CT). Currently, clinical best practice requires application of the MPM-adapted Response Evaluation Criteria in Solid Tumours (MPM-RECIST) scheme, which provides a uni-dimensional measure of the tumour's size. However, the low CT contrast between the tumour and surrounding tissues, the extensive elongated growth pattern characteristic of MPM, and, as a consequence, the pronounced partial volume effect, collectively contribute to the significant intra- and inter-observer variations in MPM-RECIST values seen in clinical practice, which in turn greatly affect clinical judgement and outcome. In this thesis, we present a novel computer-assisted approach to evaluate MPM patient response to treatments, based on the volumetric segmentation of tumours (VTA) on CT. We have developed a 3D segmentation routine based on the Random Walk (RW) segmentation framework by L. Grady, which is notable for its good performance in handling weak tissue boundaries and the ability to segment any arbitrary shapes with appropriately placed initialisation points. Results also show its benefit with regard to computation time, as compared to other candidate methods such as level sets. We have also added a boundary enhancement regulariser to RW, to improve its performance with smooth MPM boundaries. The regulariser is inspired by anisotropic diffusion. To reduce the required level of user supervision, we developed a registration-assisted segmentation option. 
Finally, we achieved effective and highly manoeuvrable partial volume correction by applying a reverse diffusion-based interpolation. To assess its clinical utility, we applied our method to a set of 48 CT studies from a group of 15 MPM patients and compared the findings to the MPM-RECIST observations made by a clinical specialist. Correlations confirm the utility of our algorithm for assessing MPM treatment response. Furthermore, our 3D algorithm found applications in monitoring the patient quality of life and palliative care planning. For example, segmented aerated lungs demonstrated very good correlation with the VTA-derived patient responses, suggesting their use in assessing the pulmonary function impairment caused by the disease. Likewise, segmented fluids highlight sites of pleural effusion and may potentially assist in intra-pleural fluid drainage planning. Throughout this thesis, to meet the demands of probabilistic analyses of data, we have used the Non-Parametric Windows (NPW) probability density estimator. NPW outperforms the histogram in terms of its smoothness and the kernel density estimator in terms of its parameter setting, and preserves signal properties such as the order of occurrence and band-limitedness of the sample, which are important for tissue reconstruction from discrete image data. We have also worked on extending this estimator to analysing vector-valued quantities, which are essential for multi-feature studies involving values such as image colour, texture, heterogeneity and entropy.
|
49 |
Development and application of image analysis techniques to study structural and metabolic neurodegeneration in the human hippocampus using MRI and PET
Bishop, Courtney Alexandra January 2012
Despite the association between hippocampal atrophy and a vast array of highly debilitating neurological diseases, such as Alzheimer’s disease and frontotemporal lobar degeneration, tools to accurately and robustly quantify the degeneration of this structure still largely elude us. In this thesis, we firstly evaluate previously-developed hippocampal segmentation methods (FMRIB’s Integrated Registration and Segmentation Tool (FIRST), Freesurfer (FS), and three versions of a Classifier Fusion (CF) technique) on two clinical MR datasets, to gain a better understanding of the modes of success and failure of these techniques, and to use this acquired knowledge for subsequent method improvement (e.g., FIRSTv3). Secondly, a fully automated, novel hippocampal segmentation method is developed, termed Fast Marching for Automated Segmentation of the Hippocampus (FMASH). This combined region-growing and atlas-based approach uses a 3D Sethian Fast Marching (FM) technique to propagate a hippocampal region from an automatically-defined seed point in the MR image. Region growth is dictated by both subject-specific intensity features and a probabilistic shape prior (or atlas). Following method development, FMASH is thoroughly validated on an independent clinical dataset from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), with an investigation of the dependency of such atlas-based approaches on their prior information. In response to our findings, we subsequently present a novel label-warping approach to effectively account for the detrimental effects of using cross-dataset priors in atlas-based segmentation. Finally, a clinical application of MR hippocampal segmentation is presented, with a combined MR-PET analysis of wholefield and subfield hippocampal changes in Alzheimer’s disease and frontotemporal lobar degeneration. 
This thesis therefore contributes both novel computational tools and valuable knowledge for further neurological investigations in both the academic and the clinical field.
|
50 |
Simultaneous recognition, localization and mapping for wearable visual robots
Castle, Robert Oliver January 2009
With the advent of ever smaller and more powerful portable computing devices, and ever smaller cameras, wearable computing is becoming more feasible. The ever increasing numbers of augmented reality applications are allowing users to view additional data about their world overlaid on their world using portable computing devices. The main aim of this research is to enable a user of a wearable robot to explore large environments automatically viewing augmented reality at locations and on objects of interest. To implement this research a wearable visual robotic assistant is designed and constructed. Evaluation of the different technologies results in a final design that combines a shoulder mounted self stabilizing active camera, and a hand held magic lens into a single portable system. To enable the wearable assistant to locate known objects, a system is designed that combines an established method for appearance-based recognition with one for simultaneous localization and mapping using a single camera. As well as identifying planar objects, the objects are located relative to the camera in 3D by computing the image-to-database homography. The 3D positions of the objects are then used as additional measurements in the SLAM process, which routinely uses other point features to acquire and maintain a map of the surroundings, irrespective of whether objects are present or not. The monocular SLAM system is then replaced with a new method for building maps and tracking. Instead of tracking and mapping in a linear frame-rate driven manner, this adopted method separates the mapping from the tracking. This allows higher density maps to be constructed, and provides more robust tracking. The flexible framework provided by this method is extended to support multiple independent cameras, and multiple independent maps, allowing the user of the wearable two-camera robot to escape the confines of the desk top and explore arbitrarily sized environments. 
The final part of the work brings together the parallel tracking and multiple mapping system with the recognition and localization of planar objects from a database. The method is able to build multiple feature rich maps of the world and simultaneously recognize, reconstruct and localize objects within these maps. The object reconstruction process uses the spatially separated keyframes from the tracking and mapping processes to recognize and localize known objects in the world. These are then used for augmented reality overlays related to the objects.
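The abstract relies on a homography between the camera image and a stored database image of a planar object. As a reminder of what such a mapping does, the sketch below applies a given 3x3 homography to a 2D point using homogeneous coordinates. The matrix values and the function name are made up for illustration; estimating the homography from feature matches and recovering the 3D object position from it, as the thesis does, are not shown.

```python
def apply_homography(H, point):
    """Map a 2D point through the 3x3 homography H (nested lists)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)


# Hypothetical homography: scale by 2, then shift by (10, 5).
H_example = [[2.0, 0.0, 10.0],
             [0.0, 2.0, 5.0],
             [0.0, 0.0, 1.0]]
mapped = apply_homography(H_example, (3.0, 4.0))
```

Because a homography is defined only up to scale, multiplying every entry of the matrix by the same non-zero constant leaves the mapped point unchanged.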
|