41. Optimisation for image processing. Luong, Vu Ngoc Duy, January 2014.
The main purpose of optimisation in image processing is to compensate for missing or corrupted image data, or to find good correspondences between input images. We note that image data is essentially infinite-dimensional and needs to be discretised at a certain level of resolution. Most image processing methods find a suboptimal solution, given the characteristics of the problem, and while the general optimisation literature is vast, there does not seem to be an accepted universal method for all image problems. In this thesis, we consider three interrelated optimisation approaches that exploit the problem structure of various relaxations of three common image processing problems.

1. The first approach, to the image registration problem, is based on a nonlinear programming model. Image registration is an ill-posed problem and suffers from many undesired local optima; certain regularisers or constraints are needed to remove these unwanted solutions. In this thesis, prior knowledge of rigid structures in the images is included in the problem using linear and bilinear constraints. The aim is to match two images while maintaining the rigid structure of certain parts of the images. A sequential quadratic programming algorithm employing dimensional reduction is used to solve the resulting discretised constrained optimisation problem. We show that pre-processing of the constraints can reduce the problem dimensionality. Experimental results demonstrate the better performance of our proposed algorithm compared to current methods.

2. The second approach is based on discrete Markov Random Fields (MRFs), which have been successfully used in machine learning, artificial intelligence and image processing, including for the image registration problem. In the discrete MRF model, the domain of the image problem is fixed (relaxed) to a certain range, so the optimal solution to the relaxed problem can be found in the predefined domain. The original discrete MRF problem is NP-hard, and relaxations are needed to obtain a suboptimal solution in polynomial time. One popular approach is the linear programming (LP) relaxation; however, the LP relaxation of the MRF (LP-MRF) is excessively high dimensional and contains sophisticated constraints, so even one iteration of a standard LP solver (e.g. an interior-point algorithm) may take too long to terminate. The dual decomposition technique has been used to formulate a convex, non-differentiable dual of the LP-MRF that has geometrical advantages, and this has led to the development of first-order methods that take the MRF structure into account. The methods considered in this thesis for solving the dual LP-MRF are the projected subgradient method and mirror descent using nonlinear weighted distance functions. An analysis of the convergence properties of the methods is provided, along with improved convergence rate estimates. Experiments on synthetic data and an image segmentation problem show promising results.

3. The third approach employs a hierarchy of models of the problem for computing the search directions. The first two approaches are specialised methods for image problems at a certain level of discretisation. As input images are infinite-dimensional, all computational methods require their discretisation at some level. Clearly, high-resolution images carry more information, but they lead to very large scale, ill-posed optimisation problems; by contrast, although a low-level discretisation suffers from loss of information, it benefits from low computational cost.
In addition, a coarser representation of a fine image problem can be treated as a relaxation of that problem, i.e. the coarse problem is less ill-conditioned. Therefore, propagating the solution of a good coarse approximation to the fine problem can potentially improve the fine-level solution. With the aim of utilising low-level information within the high-level process, we propose a multilevel optimisation method for the convex composite optimisation problem, which consists of minimising the sum of a smooth convex function and a simple non-smooth convex function. The method iterates between fine and coarse levels of discretisation, in the sense that the search direction is computed using information from either the gradient or a solution of the coarse model. We show that the proposed algorithm is a contraction on the optimal solution and demonstrate excellent performance in experiments on image restoration problems.
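To make the convex composite setting concrete, here is a minimal sketch, assuming a standard LASSO-type instance, of the fine-level building block that the multilevel method accelerates: proximal gradient descent (ISTA) for minimising a smooth least-squares term plus an l1 term. The multilevel algorithm of the thesis replaces some of these gradient steps with directions derived from a coarse model; that machinery is not reproduced here.

```python
# A minimal sketch (not the thesis algorithm itself) of proximal gradient
# descent (ISTA) for the convex composite problem min_x f(x) + g(x), with
# f(x) = 0.5*||Ax - b||^2 smooth and g(x) = lam*||x||_1 simple non-smooth.
import numpy as np

def ista(A, b, lam, steps=200):
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of grad f
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - b)           # gradient of the smooth part
        z = x - grad / L                   # forward (gradient) step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox of lam*||.||_1
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100); x_true[:5] = 1.0   # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(40)
print(np.round(ista(A, b, lam=0.1)[:8], 2))
```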
42. Realising affect-sensitive multimodal human-computer interface: hardware and software infrastructure. Shen, Jie, January 2014.
With the industry's recent paradigm shift from PC-centred applications to services delivered through ubiquitous computing in a more human-centred manner, multimodal human-computer interfaces (MHCI) have become an emerging research topic. As an important but often neglected aspect, the lack of appropriate system integration tools hinders the development of MHCI systems. The work presented in this thesis therefore aims to deliver hardware and software infrastructure to facilitate the full development cycle of MHCI systems. Specifically, we first built a hardware platform for synchronised multimodal data capture, to support and facilitate automatic human behaviour understanding from multiple audiovisual sensors. We then developed a software framework, called the HCI^2 Framework, to facilitate the modular development and rapid prototyping of readily-applicable MHCI systems. As a proof of concept, we also present an affect-sensitive game with the humanoid robot NAO developed using the HCI^2 Framework.

Studies on automatic human behaviour understanding require high-bandwidth recording from multiple cameras, as well as from other sensors such as microphones and eye-gaze trackers. In addition, sensor fusion should be realised with high accuracy so as to achieve tight synchronisation between sensors and, in turn, enable studies of correlation between various behavioural signals. Using commercial off-the-shelf components may compromise quality and accuracy due to several issues, including handling the combined data rate from multiple sensors, unknown offset and rate discrepancies between independent hardware clocks, the absence of trigger inputs or outputs in the hardware, and the existence of different methods for time-stamping the recorded data. To achieve accurate synchronisation, we centralise the synchronisation task by recording all trigger or timestamp signals with a multi-channel audio interface. For sensors without an external trigger signal, we let the computer that captures the sensor data periodically generate timestamp signals from its serial port output. These signals can also be used as a common time base to synchronise multiple asynchronous audio interfaces. The resulting data recording platform, built upon two consumer-grade PCs, is capable of capturing 8-bit video data with 1024 x 1024 spatial resolution and 59.1 Hz temporal resolution from at least 14 cameras, together with 8 channels of 24-bit audio at 96 kHz and eye-gaze tracking results sampled at a frequency of 60 or 120 Hz. The attained synchronisation accuracy is, to date, unprecedented.

To facilitate rapid development of readily-applicable MHCI systems using algorithms designed to detect and track behavioural signals (e.g. a face detector, facial fiducial point tracker, expression recogniser, etc.), a software integration framework is required. The proposed software framework, called the HCI^2 Framework, is built upon a publish/subscribe (P/S) architecture. It implements a shared-memory-based data transport protocol for message delivery and a TCP-based system management protocol; the latter ensures that the integrity of the system structure is maintained at runtime. With the inclusion of 'bridging modules', the HCI^2 Framework is interoperable with other software frameworks, including Psyclone and ActiveMQ. In addition to the core communication middleware, we also present the integrated development environment (IDE) of the HCI^2 Framework.
It provides a complete graphical environment to support every step in a typical MHCI system development process, including module development, debugging, packaging and management, as well as whole-system management and testing. Quantitative evaluation indicates that our framework outperforms other similar tools in terms of average message latency and maximum data throughput under a typical single-PC scenario. To demonstrate the HCI^2 Framework's capability to integrate heterogeneous modules, we present several example modules working with a variety of hardware and software. We also present two use cases of the HCI^2 Framework: a computer game, called CamGame, based on hand-held marker(s) and low-cost camera(s), and the human affective signal analysis component of the Fun Robotic Outdoor Guide (FROG) project (http://www.frogrobot.eu/). Using the HCI^2 Framework, we further developed the Mimic-Me Game, an interactive game played with the NAO humanoid robot in which the robot 'mimics' the player's facial expression using a combination of body gestures and audio cues. A multimodal dialogue model has been designed and implemented to enable the robot to interact with the human player in a naturalistic way using only natural language, head movement and facial expressions.
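As a concrete illustration of the synchronisation approach described above, the following sketch estimates the unknown offset and rate discrepancy between two independent clocks from a shared periodic timestamp signal, using a least-squares line fit. The sampling rates, noise levels and variable names are illustrative assumptions, not values from the thesis.

```python
# Hedged sketch: if the same periodic timestamp signal is observed on two
# independent clocks, their offset and rate discrepancy can be estimated by
# a least-squares line fit, after which any event time on clock A can be
# mapped onto the time base of clock B.
import numpy as np

rng = np.random.default_rng(1)
t_a = np.arange(0.0, 10.0, 0.1)                 # event times on clock A (s)
true_rate, true_offset = 1.0 + 2e-5, 0.37       # unknown drift and offset
t_b = true_rate * t_a + true_offset + 1e-4 * rng.standard_normal(t_a.size)

rate, offset = np.polyfit(t_a, t_b, 1)          # fit t_b ~ rate * t_a + offset

def a_to_b(t):
    """Map a time on clock A to the time base of clock B."""
    return rate * t + offset

print(f"estimated rate={rate:.6f}, offset={offset:.4f}")
print(f"residual RMS = {np.sqrt(np.mean((a_to_b(t_a) - t_b) ** 2)):.2e} s")
```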
43. Optimising reconfigurable systems for real-time applications. Chau, Thomas Chun Pong, January 2014.
This thesis addresses the problem of designing real-time reconfigurable systems. The first contribution of this thesis is to propose novel data structures and memory architectures for accelerating real-time proximity queries, with potential application to robotic surgery. We optimise performance while maintaining accuracy through several techniques, including mixed precision, function transformation and streaming data flow. Significant speedup is achieved using our reconfigurable system over double-precision CPU, GPU and FPGA designs. The second contribution is an adaptation methodology for real-time sequential Monte Carlo methods: to adapt to the workload over time, different configurations with various performance and power consumption trade-offs are loaded onto the FPGAs dynamically. A promising energy reduction has been achieved in addition to speedup over CPU and GPU designs, and the approach is evaluated in an application to robot localisation. The third contribution is a design flow for the automated mapping and optimisation of real-time sequential Monte Carlo methods. Machine learning algorithms are used to search for an optimal parameter set that produces the highest solution quality while satisfying all timing and resource constraints. This approach is evaluated in an application to air traffic management.
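For readers unfamiliar with the class of algorithms being accelerated, the following is a minimal bootstrap particle filter (the basic sequential Monte Carlo method) applied to a toy one-dimensional robot localisation problem. The motion and sensor models are assumptions chosen for illustration; the thesis designs concern the efficient FPGA realisation of this kind of loop, not this particular model.

```python
# Minimal bootstrap particle filter for 1-D robot localisation: predict with
# a noisy motion model, weight particles by observation likelihood, resample.
import numpy as np

rng = np.random.default_rng(2)
N = 1000
particles = rng.uniform(0.0, 10.0, N)                 # initial belief
true_pos = 3.0

for step in range(20):
    true_pos += 0.5                                   # robot moves 0.5 per step
    particles += 0.5 + 0.1 * rng.standard_normal(N)   # predict: motion + noise
    obs = true_pos + 0.2 * rng.standard_normal()      # noisy position sensor
    w = np.exp(-0.5 * ((obs - particles) / 0.2) ** 2) # Gaussian likelihood
    w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)] # multinomial resampling

print(f"true={true_pos:.2f}, estimate={particles.mean():.2f}")
```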
44. Improving Internet path performance through detour routing. Haddow, Thom, January 2014.
With the rise of cloud computing, distributed services are supplanting the role of traditional host-based systems. The performance of such applications depends on the properties of the network that connects their nodes. However, measurement studies have shown that the end-to-end performance of almost all network paths is suboptimal with regard to latency and throughput: alternative paths that could improve upon these metrics can be shown to exist, but applications have no means by which to exploit them. The performance of network paths can be improved with detour routing, an approach which enhances path performance by redirecting end-to-end communication flows via tertiary detour nodes, exploiting otherwise unrealised connectivity in the network. However, discovering effective detour nodes for arbitrary end-to-end Internet paths incurs a high measurement cost, and discovering such nodes in a scalable fashion remains an open problem. Existing proposals have been restricted to optimising simple metrics, such as latency, or have exploited third-party infrastructure to gather measurements. Where they have been evaluated in a practical context, such systems have shown limited performance improvement, especially with regard to throughput. In this thesis, we show through large-scale measurement that the existence of detour routes is widespread, and we develop a concrete architecture for improving latency and throughput on arbitrary Internet paths. We find that to achieve effective detour routing in practice, it is necessary to consider entirely separate approaches for latency and throughput. We propose two novel approaches for scalable detour discovery: a network-structure-based approach for discovering latency detours, which identifies detour paths by analysing AS-paths; and a statistical approach for discovering bandwidth detours, which identifies the most effective detours based on their aggregate detouring potential. Furthermore, we establish that network-layer detouring cannot be effective for optimising TCP throughput, and instead develop a transport-layer approach, which is demonstrated to achieve significant bandwidth improvements over a diverse range of Internet paths.
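To make the basic idea concrete, here is a toy sketch of one-hop detour selection: given pairwise latency measurements, relay a flow through a third node whenever the two-leg path beats the direct one. The latency matrix is fabricated for illustration; the hard problem addressed by the thesis is obtaining such measurements scalably for arbitrary paths.

```python
# Toy one-hop latency detouring: compare the direct path A -> B with every
# relayed path A -> C -> B and pick the best detour node, if any.
import numpy as np

lat = np.array([                 # lat[i][j]: measured latency i -> j (ms)
    [0, 80, 20, 50],
    [80, 0, 30, 60],
    [20, 30, 0, 40],
    [50, 60, 40, 0],
], dtype=float)

def best_detour(src, dst):
    direct = lat[src, dst]
    via = lat[src, :] + lat[:, dst]        # cost of relaying through each node
    via[[src, dst]] = np.inf               # exclude the endpoints themselves
    k = int(np.argmin(via))
    return (k, via[k]) if via[k] < direct else (None, direct)

node, cost = best_detour(0, 1)
print(f"A->B: direct {lat[0, 1]:.0f} ms, best detour via node {node} = {cost:.0f} ms")
```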
45. Wiki-Health: from quantified self to self-understanding. Li, Yang, January 2014.
Today, healthcare providers are experiencing explosive growth in data, and medical imaging represents a significant portion of that data. Meanwhile, the pervasive use of mobile phones and the rising adoption of sensing devices, enabling people to collect data independently at any time or place, are leading to a torrent of sensor data. The scale and richness of the sensor data currently being collected and analysed is growing rapidly. The key challenge we face is how to effectively manage and make use of this abundance of easily generated and diverse health data. This thesis investigates the challenges posed by the explosive growth of available healthcare data and proposes a number of potential solutions. As a result, a big data service platform, named Wiki-Health, is presented to provide a unified solution for collecting, storing, tagging, retrieving, searching and analysing personal health sensor data. Additionally, it allows users to reuse and remix data, along with analysis results and analysis models, to make health-related knowledge discovery available to individual users on a massive scale. To tackle the challenge of efficiently managing the high volume and diversity of big data, Wiki-Health introduces a hybrid data storage approach capable of storing structured, semi-structured and unstructured sensor data and sensor metadata separately. A multi-tier cloud storage system, CACSS, has been developed and serves as a component of the Wiki-Health platform, allowing it to manage the storage of unstructured and semi-structured data, such as medical imaging files. CACSS provides comprehensive features such as global data de-duplication, performance awareness and data caching services. The design of this hybrid approach allows Wiki-Health to handle heterogeneous formats of sensor data. To evaluate the proposed approach, we have developed an ECG-based health monitoring service and a virtual sensing service on top of the Wiki-Health platform. The two services demonstrate the feasibility and potential of using the Wiki-Health framework to enable better utilisation and comprehension of the vast amounts of sensor data available from different sources, and both show significant potential for real-world applications.
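As an illustration of one of the CACSS features mentioned above, the sketch below shows the generic principle behind global data de-duplication: objects are addressed by a hash of their content, so byte-identical uploads are stored only once. This is a hedged, simplified stand-in, not the CACSS implementation.

```python
# Content-addressed storage with de-duplication: identical files (e.g.
# repeated medical images) are stored once, regardless of who uploads them.
import hashlib

class DedupStore:
    def __init__(self):
        self.blobs = {}          # content hash -> bytes (stored once)
        self.index = {}          # user-visible name -> content hash

    def put(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)     # skip write if already stored
        self.index[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        return self.blobs[self.index[name]]

store = DedupStore()
store.put("scan_patient_a.dcm", b"...imaging bytes...")
store.put("scan_copy.dcm", b"...imaging bytes...")      # duplicate content
print(len(store.blobs), "unique blob(s) for", len(store.index), "names")
```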
46. Ungrounded haptic feedback for hand-held surgical robots. Payne, Christopher, January 2015.
Surgical robotic technology has evolved over the last few decades, from autonomous systems to master-slave and cooperatively-controlled assistive robots. Whilst these various approaches have proven to be technically successful, clinical adoption of robotic technology remains moderate, largely as a result of its financial cost. An alternative approach that has recently been explored is the integration of mechatronic technology into surgical devices that are held in the surgeon's hands and are unattached to a grounding frame. These ungrounded hand-held devices exploit the existing dexterity of the surgeon's hand, which allows them to be simpler, physically compact, lower cost, more easily integrated into the surgical workflow, and subject to fewer barriers to clinical translation. This thesis explores the use of mechatronic technology in ungrounded, hand-held surgical tools for the purpose of augmenting a surgeon's haptic perception. During microsurgery in particular, the tool-tissue manipulation forces are often so low that they cannot be perceived by the operating surgeon. This thesis initially proposes a hand-held device that can amplify these sub-threshold forces to magnitudes that can be perceived by human subjects. The mechatronic force amplification concept is then evolved for use in microsurgical forceps designs; in this case, haptic perception is diminished by the elastic spring return of the forceps, which is significantly greater in magnitude than the micro-scale manipulation forces. Having investigated the force amplification concept, the thesis turns to vibrotactile feedback of predefined force thresholds. The concept is studied through the clinical exemplar of microneurosurgery: a device is proposed that can inform the operating surgeon when they are exerting excessive force, based on a force threshold at which iatrogenic injury of neurovascular tissue is known to occur. Finally, an ungrounded force-feedback strategy is investigated for use with a hand-held device that incorporates position-based active constraints of the tool tip.
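The following fragment illustrates the two feedback strategies described above: amplifying a sensed sub-threshold tool-tissue force by a fixed gain before displaying it to the hand, and triggering vibrotactile feedback when the force exceeds a predefined safety threshold. The gain, the threshold value and the loop structure are assumptions for illustration only, not parameters from the thesis.

```python
# Illustrative haptic feedback loop: (i) force amplification by a fixed gain,
# (ii) vibrotactile warning above an assumed injury-risk force threshold.
PERCEPTION_GAIN = 20.0        # assumed amplification applied to sensed force
SAFETY_THRESHOLD_N = 0.3      # assumed injury-risk threshold (newtons)

def feedback_step(sensed_force_n: float):
    """One iteration of the haptic feedback loop."""
    display_force = PERCEPTION_GAIN * sensed_force_n   # force amplification
    vibrate = sensed_force_n > SAFETY_THRESHOLD_N      # threshold warning
    return display_force, vibrate

for f in (0.005, 0.02, 0.35):                          # sensed forces in N
    amplified, warn = feedback_step(f)
    print(f"sensed {f:.3f} N -> display {amplified:.2f} N, vibrate={warn}")
```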
47. Scalable, data-driven brain decoding using functional MRI. Markides, Loizos, January 2014.
Functional Magnetic Resonance Imaging (fMRI) has established the field of brain decoding, meaning the prediction of the task that a subject is performing in the MRI scanner, given the corresponding images. This has been quite successful, especially when attempting discrimination between the representations of two or four distinct stimuli across the brains of multiple subjects. However, there are currently only a few studies that deal with ways to improve the scalability of existing brain decoding methodologies, so that the resulting classifiers can discriminate among tens or hundreds of possible stimuli. Such advances have potential for the creation of rigorous brain-computer interfaces, which could establish a solid communication channel with people in a vegetative state. In this work, I propose and evaluate a series of methods leading to the development of a new data-driven, scalable brain decoding framework that enables better stimulus discrimination. The methods include:

(1) A novel inter-subject spatial feature selection method that can be run directly on the native brain images of each subject, and which is not sensitive to differences in brain morphology between subjects.

(2) Three novel data-driven feature selection methods that use statistical association metrics to select regions that exhibit similar behaviour across subjects over the course of a given experiment (a simplified sketch of this idea follows the list). These methods aim to promote enhanced exploratory power and are not susceptible to region-specific variations of the haemodynamic response function.

(3) Two novel data-driven temporal denoising algorithms that can be used to improve the signal-to-noise ratio of any given task-related fMRI image, and which impose constraints on neither the experimental design nor the nature of the stimuli involved.

(4) A thorough evaluation of four intensity normalisation techniques that are commonly used for across-subjects and across-sessions decoding, to determine their applicability to across-datasets decoding.

(5) A novel feature compression and information recovery method that aims to lower the system memory requirements for training and testing a large-scale brain decoding model using multiple datasets simultaneously.
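The following is the promised sketch of the across-subject selection idea in (2): keep the voxels whose time courses agree most consistently across subjects during the same experiment. The data are synthetic, and the mean pairwise correlation used here is a simplified stand-in for the statistical association metrics developed in the thesis.

```python
# Select voxels whose time courses are consistently correlated across
# subjects; the first 20 synthetic voxels carry a shared task-driven signal.
import numpy as np

rng = np.random.default_rng(3)
n_subjects, n_time, n_voxels = 5, 120, 200
shared = rng.standard_normal(n_time)                 # task-driven signal

data = rng.standard_normal((n_subjects, n_time, n_voxels))
data[:, :, :20] += shared[None, :, None]             # first 20 voxels respond

def consistency(v):
    """Mean pairwise across-subject correlation of voxel v's time course."""
    c = np.corrcoef(data[:, :, v])                   # subjects x subjects
    return c[np.triu_indices(n_subjects, k=1)].mean()

scores = np.array([consistency(v) for v in range(n_voxels)])
selected = np.argsort(scores)[-20:]
print(sorted(selected))                              # mostly voxels 0..19
```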
48. Robust online subspace learning. Liwicki, Stephan, January 2014.
In this thesis, I aim to advance the theory of online non-linear subspace learning through the development of strategies which are both efficient and robust. Subspace learning methods are very popular in computer vision and have been employed for numerous tasks. With the increasing need for real-time applications, the formulation of online (i.e. incremental and real-time) learning methods is a vibrant research field that has received much attention from the research community. A major advantage of incremental systems is that they update the hypothesis during execution, thus allowing for the incorporation of the real data seen in the testing phase. Tracking acts as an attractive and popular evaluation tool for incremental systems, and thus the connection between online learning and adaptive tracking is seen commonly in the literature. The systems proposed in this thesis facilitate learning from noisy input data, e.g. data corrupted by occlusions, cast shadows and pose variations, which are challenging problems for general tracking frameworks.

First, a fast and robust alternative to standard L2-norm principal component analysis (PCA) is introduced, which I coin Euler PCA (e-PCA). The formulation of e-PCA is based on robust, non-linear kernel PCA (KPCA) with a cosine-based kernel function that is expressed via an explicit feature space. When applied to tracking, face reconstruction and background modelling, promising results are achieved.

In the second part, the problem of matching vectors of 3D rotations is explicitly targeted. A novel distance which is robust for 3D rotations is introduced and formulated as a kernel function. The kernel leads to a new representation of 3D rotations, the full-angle quaternion (FAQ) representation. Using FAQs, I then propose 3D object recognition from point clouds, and object tracking with colour values. Next, a domain-specific kernel function designed for visual data is presented. As this kernel is indefinite, KPCA with Krein-space kernels is introduced, and an exact incremental learning framework for the new kernel is developed. In a tracker framework, the presented online learning outperforms the competitors on nine popular and challenging video sequences.

In the final part, the generalised eigenvalue problem is studied. Specifically, incremental slow feature analysis (SFA) with indefinite kernels is proposed and applied to temporal video segmentation and tracking with change detection. As online SFA allows for drift detection, further improvements are achieved in the evaluation of the tracking task.
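As a rough illustration of the e-PCA idea, the sketch below maps pixel intensities onto the complex unit circle with an explicit Euler feature map and runs ordinary (complex) PCA in that space; the cosine-based dissimilarity this induces saturates for outlying pixel values, which is the source of the robustness. The value of alpha and the reconstruction rule follow my reading of the published e-PCA formulation and should be checked against the thesis.

```python
# Sketch of the explicit feature map behind Euler PCA: intensities x in [0, 1]
# become z = exp(i * alpha * pi * x); PCA is then performed in this space.
import numpy as np

alpha = 1.9                                  # assumed value, per e-PCA papers
rng = np.random.default_rng(4)
X = rng.uniform(0.0, 1.0, (50, 64))          # 50 images, 64 pixels, in [0, 1]

Z = np.exp(1j * alpha * np.pi * X)           # explicit Euler feature map
mu = Z.mean(axis=0)
U, s, Vh = np.linalg.svd(Z - mu, full_matrices=False)
V = Vh[:5].conj().T                          # top-5 complex principal axes

Z_hat = mu + (Z - mu) @ V @ V.conj().T       # project/reconstruct in feature space
X_hat = np.mod(np.angle(Z_hat), 2 * np.pi) / (alpha * np.pi)  # back to intensities
print("reconstruction RMSE:", np.sqrt(np.mean((X - X_hat) ** 2)).round(3))
```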
49. WikiSensing: a collaborative sensor management system with trust assessment for big data. Silva, Dilshan, January 2014.
Big Data for sensor networks and collaborative systems has become ever more important in the digital economy, and is a focal point of technological interest while posing many noteworthy challenges. This research addresses some of these challenges in the areas of online collaboration and Big Data for sensor networks. It demonstrates WikiSensing (www.wikisensing.org), a high-performance, heterogeneous, collaborative data cloud for the management and analysis of real-time sensor data. The system is based on a Big Data architecture with comprehensive functionality for smart-city sensor data integration and analysis. The system is fully functional and served as the main data management platform for the 2013 UPLondon Hackathon. The system is unique in that it introduces a novel methodology incorporating online collaboration with sensor data: while other platforms are available for sensor data management, WikiSensing is one of the first to enable online collaboration by providing services to store and query dynamic sensor information without any restriction on the type and format of the sensor data. An emerging challenge for collaborative sensor systems is modelling and assessing the trustworthiness of sensors and their measurements. This is directly relevant to WikiSensing as an open, collaborative sensor data management system: if the trustworthiness of the sensor data can be accurately assessed, WikiSensing becomes more than just a collaborative data management system for sensors, but also a platform that informs its users about the validity of its data. This research therefore presents a new generic framework for capturing and analysing sensor trustworthiness, considering the different forms of evidence available to the user. It uses an extensible set of metrics that can represent such evidence, and uses Bayesian analysis to develop a trust classification model. Several publications have resulted from this work, with others at the final stage of submission. Further improvements are also planned to make the platform a cloud service accessible to any online user, building up a community of collaborators for smart-city research.
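As a hedged illustration of Bayesian trust assessment of the kind the framework above generalises, the sketch below treats each measurement as consistent or inconsistent evidence (for example, agreement with neighbouring sensors) and maintains a Beta posterior over a sensor's reliability. The evidence model and the classification threshold are assumptions for illustration, not the thesis framework itself.

```python
# Beta-Bernoulli trust model: each consistent/inconsistent observation
# updates a Beta posterior over the sensor's underlying reliability.
class SensorTrust:
    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha, self.beta = alpha, beta      # Beta(1, 1) = uniform prior

    def observe(self, consistent: bool):
        if consistent:
            self.alpha += 1                      # supporting evidence
        else:
            self.beta += 1                       # conflicting evidence

    @property
    def reliability(self):
        return self.alpha / (self.alpha + self.beta)   # posterior mean

sensor = SensorTrust()
for outcome in [True, True, True, False, True, True]:
    sensor.observe(outcome)
label = "trusted" if sensor.reliability > 0.7 else "untrusted"
print(f"posterior mean reliability = {sensor.reliability:.2f} -> {label}")
```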
50. Segmentation of pelvic structures from preoperative images for surgical planning and guidance. Gao, Qinquan, January 2014.
Prostate cancer is one of the most frequently diagnosed malignancies globally and the second leading cause of cancer-related mortality in males in the developed world. In recent decades, many techniques have been proposed for prostate cancer diagnosis and treatment. With the development of imaging technologies such as CT and MRI, image-guided procedures have become increasingly important as a means of improving clinical outcomes. Analysis of preoperative images and construction of 3D models prior to treatment help doctors to better localise and visualise the structures of interest, plan the procedure, diagnose disease and guide the surgery or therapy. This requires efficient and robust medical image analysis and segmentation technologies. The thesis focuses on the development of segmentation techniques in pelvic MRI for image-guided robotic-assisted laparoscopic radical prostatectomy and external-beam radiation therapy. A fully automated multi-atlas framework is proposed for bony pelvis segmentation in MRI, using the guidance of an MRI AE-SDM. With the guidance of the AE-SDM, a multi-atlas segmentation algorithm is used to delineate the bony pelvis in a new MRI where no CT is available. The proposed technique outperforms state-of-the-art algorithms for MRI bony pelvis segmentation. Using the SDM of the pelvis and its segmented surface, an accurate 3D pelvimetry system is designed and implemented to measure a comprehensive set of pelvic geometric parameters, in order to examine the relationship between these parameters and the difficulty of robotic-assisted laparoscopic radical prostatectomy. This system can be used in both manual and automated modes, with a user-friendly interface. A fully automated and robust multi-atlas based segmentation has also been developed to delineate the prostate in diagnostic MR scans, which show large variation in both the intensity and the shape of the prostate. Two image analysis techniques are proposed: patch-based label fusion with local appearance-specific atlases, and multi-atlas propagation via a manifold graph on a database of both labelled and unlabelled images when only limited labelled atlases are available. The proposed techniques achieve more robust and accurate segmentation results than other multi-atlas based methods. The seminal vesicles are also a structure of interest for therapy planning, particularly for external-beam radiation therapy. As existing methods fail at the very onerous task of segmenting the seminal vesicles, a multi-atlas learning framework via random decision forests with graph-cuts refinement is further proposed to solve this difficult problem. Motivated by the performance of this technique, I further extend the multi-atlas learning to segment the prostate fully automatically using multispectral (T1- and T2-weighted) MR images, via hybrid random forest (RF) classifiers and a multi-image graph-cuts technique. The proposed method compares favourably to the previously proposed multi-atlas based prostate segmentation. The work in this thesis covers different techniques for pelvic image segmentation in MRI. These techniques have been continually developed and refined, and their application to different specific problems shows ever more promising results.
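To make the multi-atlas idea concrete, the following toy sketch shows the shared skeleton of such methods: several atlas label maps (here assumed already registered to the target) are fused per voxel, in the simplest case by majority voting. The thesis uses considerably more sophisticated fusion (patch-based and appearance-specific weighting, random forests, graph-cuts refinement); only the generic skeleton is shown, with fabricated data.

```python
# Multi-atlas label fusion by per-voxel majority vote over registered atlases.
import numpy as np

rng = np.random.default_rng(5)
n_atlases, shape = 7, (8, 8)

truth = np.zeros(shape, dtype=int)
truth[2:6, 2:6] = 1                              # toy "prostate" region

# simulate registered atlas labels: truth corrupted by ~10% random flips
atlases = np.stack([
    np.where(rng.random(shape) < 0.1, 1 - truth, truth)
    for _ in range(n_atlases)
])

fused = (atlases.sum(axis=0) > n_atlases // 2).astype(int)  # majority vote
print("voxel agreement with truth:", (fused == truth).mean())
```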