71

Conversational CBR for Improved Patient Information Acquisition

Marthinsen, Tor Henrik Aasness January 2007 (has links)
In this thesis we describe our study of two knowledge-intensive Conversational Case-Based Reasoning (CCBR) systems and their methods, looking in particular at how they solve inferencing and question ranking. We then describe our own design for a CCBR system that helps patients share their experiences of drug side effects with other patients. We describe how we create cases and how our question selection methods work, present an example of the domain model, and include a simulation of a patient dialogue. The design we have created is a good basis for implementing a knowledge-intensive CCBR system. It should outperform a standard CCBR system because its inferencing and question ranking methods lessen the cognitive load on the user and require fewer answered questions to reach a good solution.
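The abstract does not spell out its ranking algorithm, but question selection in CCBR is often framed as choosing the question that best discriminates among the remaining candidate cases. The sketch below is a minimal, hypothetical illustration of that idea using Shannon entropy; the names and the scoring choice are assumptions, not taken from the thesis.

```python
# Hypothetical sketch of entropy-based question ranking for a CCBR dialogue.
# Prefers questions whose answers best discriminate among the remaining cases.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of answer values."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def rank_questions(cases, questions, answered):
    """Score each unanswered question by how evenly it splits the candidate cases.

    cases     -- list of dicts mapping question id -> recorded answer
    questions -- iterable of question ids
    answered  -- set of question ids the patient has already answered
    """
    scores = {}
    for q in questions:
        if q in answered:
            continue
        values = [c[q] for c in cases if q in c]
        if len(values) > 1:
            scores[q] = entropy(values)  # higher entropy = more discriminating
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: three side-effect cases, two candidate questions.
cases = [
    {"drug": "A", "nausea": "yes"},
    {"drug": "A", "nausea": "no"},
    {"drug": "B", "nausea": "yes"},
]
print(rank_questions(cases, ["drug", "nausea"], answered=set()))
```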
72

Real-Time Simulation and Visualization of Large Sea Surfaces

Løset, Tarjei Kvamme January 2007 (has links)
The open ocean is the setting for enterprises that require extensive monitoring, planning and training. In the offshore industry, virtual environments have been embraced to improve such processes. The presented work focuses on real-time simulation and visualization of open seas: very large water surfaces dominated by wind-driven waves, but also influenced by watercraft activity and offshore installations. The implemented system treats sea surfaces as periodic elevation fields, obtained by synthesis from statistically sampled frequency spectra. Visible repetition across the surface, a consequence of this periodic nature, is avoided by decomposing the elevation-field synthesis over two or more discrete spectra with different frequency scales. A GPU-based water solver is also included. Its implementation features a convenient input interface, which exploits hardware rasterization both for efficiency and to supply the algorithm with arbitrary data, e.g. smooth, connected deflective paths. Finally, polygonal representations of visible ocean regions are obtained using a GPU-accelerated tessellation scheme suitable for wave fields. The result is realistic, unbounded ocean surfaces with natural distributions of wind-driven waves, avoiding the artificial periodicity associated with previous similar techniques. Further, the simulation allows for superposed boat wakes and surface obstacles in regions of interest. With the proposed tessellation scheme, the visualization is economical with regard to data transfer, in keeping with the goal of delivering highly interactive rendering rates.
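The "synthesis from statistically sampled frequency spectra" the abstract mentions is commonly done by weighting Gaussian noise with a wave spectrum and applying an inverse FFT, the approach popularized by Tessendorf. The 1D sketch below illustrates only that step; the Phillips-style spectrum, the constants, and the output scaling are illustrative assumptions, not the thesis's exact formulation.

```python
# Minimal 1D sketch of spectral wave synthesis: sample a frequency spectrum,
# weight complex Gaussian noise by it, and inverse-FFT to obtain a periodic
# elevation field. Spectrum shape and constants are illustrative only.
import numpy as np

def synthesize_elevation(n=512, length=1000.0, wind_speed=10.0, g=9.81, seed=0):
    rng = np.random.default_rng(seed)
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)   # wavenumbers
    L = wind_speed ** 2 / g                              # largest wave scale
    spectrum = np.zeros(n)
    nonzero = k != 0
    # Phillips-style shape: suppress waves longer than L, decay at high k.
    spectrum[nonzero] = np.exp(-1.0 / (k[nonzero] * L) ** 2) / k[nonzero] ** 4
    amps = (rng.normal(size=n) + 1j * rng.normal(size=n)) * np.sqrt(spectrum / 2.0)
    return np.real(np.fft.ifft(amps)) * n                # periodic height field

heights = synthesize_elevation()
print(heights.shape, heights.std())
```

Summing two such fields synthesized at different frequency scales is the kind of decomposition the abstract uses to hide the tiling period.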
73

Implementing LOD for physically-based real-time fire rendering

Tangvald, Lars January 2007 (has links)
In this paper, I present a framework for implementing level of detail (LOD) for 3D physically-based fire rendering running on the GPU. While realistic fire rendering that runs in real time exists, it is generally not used in real-time applications such as games, due to its high computational cost. Most research into fire rendering is concerned only with the fire itself, not with how it can best be included in larger scenes containing a multitude of other complex objects. I present methods for increasing the efficiency of a physically-based fire renderer without harming its visual quality, by dynamically adjusting the detail level of the fire according to its importance for the current view. I adapt methods created both for LOD and for other areas to alter the detail level of the visualization and simulation of the fire. The desired detail level is calculated by evaluating conditions such as visibility and distance from the viewpoint, and is then used to adjust the detail level of the fire's visualization and simulation. The implementation of the framework could not be completed in time, but a number of tests were run to determine the effect of the different methods used. The results indicate that by adjusting the simulation and visualization of the fire, large performance gains can be achieved without significantly harming the visual quality of the fire rendering.
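As a hedged illustration of the evaluation step the abstract describes, the sketch below folds visibility, distance, and screen coverage into a single detail factor and scales simulation parameters by it. The names, thresholds, and the particular combination rule are invented for illustration, not taken from the thesis.

```python
# Hypothetical LOD evaluation for a fire effect: combine view conditions into
# a scalar detail factor, then scale simulation/visualization parameters.
def fire_lod(distance, visible, screen_coverage, max_distance=100.0):
    """Return a detail factor in [0, 1]; 0 means the fire can be skipped."""
    if not visible or distance >= max_distance:
        return 0.0
    distance_term = 1.0 - distance / max_distance
    return distance_term * min(1.0, screen_coverage)

def apply_lod(lod, base_particles=4096, base_grid=128):
    """Scale particle count and simulation grid resolution with the LOD."""
    return max(1, int(base_particles * lod)), max(8, int(base_grid * lod))

print(apply_lod(fire_lod(distance=25.0, visible=True, screen_coverage=0.4)))
```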
74

Segmentation of Medical Images Using CBR

Rieck, Christian Marshall January 2007 (has links)
This paper describes a case-based reasoning system that is used to guide the parameters of a segmentation algorithm. Instead of using a fixed set of parameters that gives the best average result over all images, the parameters are tuned to maximize the score for each image separately. The system's foundation is a set of 20 cases, each containing one 3D MRI image and the parameters needed for its optimal segmentation. When a new image is presented to the system, a new case is generated and compared to the stored cases based on image similarity. The parameters from the best-matching case are then used to segment the new image. The key issue is the use of an iterative approach that lets the system adapt the parameters to suit the new image better, if necessary. Each iteration consists of a segmentation and a revision of the result, repeated until the system approves the result. The revision is based on metadata stored in each case, checking whether the result has the expected properties as defined by the case. The results show that case-based reasoning can be combined with segmentation in image processing, both for choosing a good set of starting parameters and for using case-specific knowledge to guide their adaptation. A set of challenges for future research is identified and discussed at length.
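Here is a runnable toy of the retrieve-segment-revise loop just described, with a plain intensity threshold standing in for the thesis's segmentation algorithm and a target foreground fraction standing in for its case metadata; all of these stand-ins are assumptions made for illustration.

```python
# Toy retrieve-segment-revise loop. The real system segments 3D MRI volumes;
# here "segmentation" is simple thresholding of a 1D intensity array.
import numpy as np

def segment(image, threshold):
    return image > threshold

def similarity(a, b):
    return -abs(a.mean() - b.mean())  # crude: closer mean intensity = more similar

def segment_with_cbr(image, case_base, max_iter=10, tol=0.05):
    # Retrieve: reuse the parameter from the most similar stored case.
    best = max(case_base, key=lambda c: similarity(image, c["image"]))
    threshold = best["threshold"]
    expected = best["foreground_fraction"]        # case metadata used in revision
    for _ in range(max_iter):
        mask = segment(image, threshold)
        frac = mask.mean()
        if abs(frac - expected) < tol:            # revision approves the result
            return mask, threshold
        threshold += 0.1 * (frac - expected)      # adapt the parameter and retry
    return mask, threshold

rng = np.random.default_rng(1)
case_base = [{"image": rng.normal(0.5, 0.2, 1000), "threshold": 0.6,
              "foreground_fraction": 0.3}]
mask, t = segment_with_cbr(rng.normal(0.55, 0.2, 1000), case_base)
print(f"final threshold {t:.3f}, foreground {mask.mean():.2f}")
```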
75

Improving the Performance of Parallel Applications in Chip Multiprocessors with Architectural Techniques

Jahre, Magnus January 2007 (has links)
Chip Multiprocessors (CMPs), or multi-core architectures, are a new class of processor architectures in which multiple processing cores are placed on the same physical chip. For a single application to reach the performance potential of these architectures, it must be multi-threaded. In such applications, the processing cores cooperate to solve a single task, which in many cases requires a large amount of inter-processor communication. Consequently, CMPs need to support this communication efficiently. To investigate inter-processor communication in CMPs, a good understanding of the state of the art in CMP design options, interconnect network design, and cache coherence protocols is required. Furthermore, a good computer architecture simulator is needed to evaluate both new and conventional architectural solutions. The M5 simulator is used for this purpose and has been extended with a generic split-transaction bus, a crossbar based on the IBM Power 5 crossbar, a butterfly network, and an ideal interconnect. The unrealistic ideal interconnect provides an upper bound on the performance improvement available from enhancing the interconnect. In addition, a directory-based coherence protocol proposed by Stenström has been implemented. The performance of 2-, 4- and 8-core CMPs with crossbar and bus interconnects, private L1 caches, and shared L2 caches is investigated; the bus and the crossbar are the conventional ways of implementing the L1-to-L2 cache interconnect. These configurations have been evaluated with multiprogrammed workloads from the SPEC2000 benchmark suite and parallel, scientific benchmarks from the SPLASH-2 benchmark suite. With multiprogrammed workloads, the crossbar configurations perform nearly as well as a configuration with an ideal interconnect. However, the performance of the crossbar CMPs is similar to that of the bus CMPs when there is intensive L1-to-L1 cache communication, the reason being limited L1-to-L1 bandwidth. The bus CMPs suffer severe performance degradation on some benchmarks for all processor counts and workload classes. A butterfly interconnect is proposed to alleviate the L1-to-L1 communication bottleneck. With 8 processor cores, the butterfly CMP performs on average 3.9 times better than the bus CMP and 3.8 times better than the crossbar CMP; these numbers are based on the WaterNSquared, Raytrace, Radix and LUNoncontig benchmarks, as the other SPLASH-2 benchmarks had issues with the M5 thread implementation in these configurations. For the multiprogrammed workloads, the butterfly CMPs are slightly slower than the crossbar CMPs.
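To make the bus-versus-crossbar contrast concrete, here is a deliberately crude contention model: a shared bus serializes every L1-to-L2 transfer, while a crossbar serializes only the transfers that collide on the same L2 bank. The cycle counts and bank mapping are invented and ignore coherence traffic and arbitration; this is not the M5 model used in the thesis.

```python
# Toy interconnect contention model for one batch of L1-to-L2 requests.
def bus_cycles(requests, cycles_per_transfer=10):
    return len(requests) * cycles_per_transfer      # bus: one transfer at a time

def crossbar_cycles(requests, n_banks=4, cycles_per_transfer=10):
    per_bank = [0] * n_banks
    for addr in requests:
        per_bank[addr % n_banks] += 1               # contend only within a bank
    return max(per_bank) * cycles_per_transfer      # banks proceed in parallel

requests = list(range(8))                           # 8 cores, distinct addresses
print(bus_cycles(requests), crossbar_cycles(requests))  # 80 vs 20
```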
76

Directional Decomposition of Images: Implementation Issues Including GPU Techniques

Dubois, Jérôme January 2008 (has links)
Directional decomposition of an image consists of separating it into several components, each containing directional information in some specific directions. It has many applications in digital image processing, such as image enhancement and linear feature detection, and could be used on seismic data to help geophysicists find faults. In this thesis, we look at a directional filter bank (DFB) introduced by Bamberger and Smith and at how to implement it efficiently on the CPU and the GPU. Graphics Processing Units (GPUs) are becoming increasingly suitable for general scientific computing, and applications with suitable properties run much faster on a GPU than on a CPU. For instance, NVIDIA CUDA (Compute Unified Device Architecture) is a programming interface that lets users program NVIDIA General Purpose GPUs (GPGPUs) in a C-like fashion for data-parallel, computation-intensive work. We translate the DFB algorithm from a theoretical signal-processing description into an algorithmic description from a computer scientist's point of view, including a readable C implementation. Tools are developed to ease our DFB investigation, including a tailored library for manipulating images in suitable text-based and binary formats and for generating test images with suitable properties. Several implementations of 1D filter banks are also provided. Finally, part of the Bamberger DFB is implemented efficiently using the CUDA environment for NVIDIA GPUs. We show that directional filter banks can be executed efficiently on GPUs and demonstrate that the CPU-GPU bandwidth affects performance considerably. Hence, care should be taken to do as many steps as possible on the GPU before returning results to the CPU.
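For intuition about what a directional component is, the sketch below partitions an image's 2D spectrum into angular wedges and inverse-transforms each wedge. This is not the Bamberger-Smith filter bank itself (which uses fan filters and resampling for maximal decimation); it is only a compact frequency-domain illustration of directional decomposition.

```python
# Simplified directional decomposition: mask the 2D FFT by orientation wedges.
import numpy as np

def directional_components(image, n_directions=4):
    F = np.fft.fft2(image)
    ky, kx = np.meshgrid(np.fft.fftfreq(image.shape[0]),
                         np.fft.fftfreq(image.shape[1]), indexing="ij")
    angle = np.mod(np.arctan2(ky, kx), np.pi)     # orientations folded to [0, pi)
    components = []
    for i in range(n_directions):
        lo, hi = i * np.pi / n_directions, (i + 1) * np.pi / n_directions
        mask = (angle >= lo) & (angle < hi)       # one angular wedge of spectrum
        components.append(np.real(np.fft.ifft2(F * mask)))
    return components

image = np.random.default_rng(0).normal(size=(64, 64))
parts = directional_components(image)
print(len(parts), np.allclose(sum(parts), image))  # wedges partition the spectrum
```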
77

Co-design implementation of FPGA hardware acceleration of DNA motif identification

Linvåg, Elisabeth January 2008 (has links)
Pattern matching in bioinformatics is a rapidly growing discipline with a great need to search through large amounts of data. At NTNU, a prototype specified in VHDL has been developed for an FPGA solution that identifies short motifs or patterns in genetic data using a Position-Weight Matrix (PWM). But programming FPGAs in VHDL is a complicated and time-consuming process that requires intimate knowledge of how the hardware works, and the prototype is not yet complete in terms of required functionality. Consequently, a desirable alternative is to make use of co-design languages, which facilitate the use of hardware by software developers and integrate the environments for software and hardware development. This thesis deals with the specification and implementation of a co-design-based alternative to the existing VHDL-based solution, as well as an evaluation of productivity versus final performance of the newly developed solution compared to the VHDL-based one. The chosen co-design language is Impulse-C, created by Impulse Accelerated Technologies Inc., a co-design language designed for data-flow-oriented applications but flexible enough to support other programming models as well. The programming model simplifies the expression of highly parallel algorithms through well-defined data communication, message passing, and synchronization mechanisms. The affiliated development environment, CoDeveloper, contains tools that allow the FPGA system to be developed and debugged using Impulse-C. Its software-to-hardware compiler and optimizer translates C-language processes to (RTL) VHDL code, optimizing the generated logic and identifying opportunities for parallelism. The ease of use of the CoDeveloper environment is evaluated in this thesis, based on the author's experiences with the tools. In total, four variations of the Impulse-C solution have been implemented: a basic solution and a multicore solution, each in a floating-point and a 'fixed-point' version. The implemented solutions are analyzed through various experiments described in this thesis, carried out in simulation using CoDeveloper. Attempts were made to get the solutions to run on the target platform, the Cray XD1 supercomputer Musculus, but these were unsuccessful; a wrong choice of properties and constraints in Xilinx ISE is believed to have caused the FPGA programming file to be generated incorrectly, and there was no time to confirm and correct this. Some information about device utilization and performance could still be extracted from the Xilinx ISE 'Static timing' and 'Place and route' reports.
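For readers unfamiliar with the computation being accelerated, the sketch below scores every window of a DNA sequence against a position-weight matrix and reports windows above a threshold; this inner loop is what the FPGA implementations parallelize. The toy matrix and threshold are invented for illustration and are not the thesis's data.

```python
# PWM motif scan: slide a motif-length window over the sequence and sum the
# per-position weights for the observed bases; report high-scoring windows.
def pwm_scan(sequence, pwm, threshold):
    """pwm is a list of {base: weight} dicts, one per motif position."""
    motif_len = len(pwm)
    hits = []
    for i in range(len(sequence) - motif_len + 1):
        score = sum(pwm[j][sequence[i + j]] for j in range(motif_len))
        if score >= threshold:
            hits.append((i, score))
    return hits

# Toy 3-position motif favouring "TAT".
pwm = [{"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0},
       {"A": 2.0, "C": -1.0, "G": -1.0, "T": -1.0},
       {"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0}]
print(pwm_scan("GGTATCCTATG", pwm, threshold=5.0))  # hits at positions 2 and 7
```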
78

Online Meat Cutting Optimisation

Wikborg, Uno January 2008 (has links)
Nortura, Norway's largest producer of meat, faces many challenges in its operation. One of these is deciding which products to make out of each slaughtered animal. The meat can be made into different products, some more valuable than others; however, the products must also find buyers, so it is important to produce what the customers ask for. This thesis is about a computer system, based on online optimisation, that helps the meat cutters decide what to make. Two different meat cutting plants were visited to specify how the system should work, and this information was used to develop a program that can recommend what to produce from carcasses during cutting. The system has been developed by considering both the attributes of the animals and the orders from the customers. The main focus of the thesis is how to deal with the fact that the attributes are known only for a small number of the animals, since they are measured right after slaughtering. A method has been developed to calculate what should be made from the different carcasses, and this method has been realised with both exact and heuristic algorithms.
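As a hedged sketch of the online decision (not the thesis's exact or heuristic algorithms), the code below greedily picks, for one carcass at a time, the cutting pattern that covers the most remaining customer demand; the patterns, yields, and scoring rule are all invented for illustration.

```python
# Greedy online choice of cutting pattern for one carcass against open orders.
def choose_pattern(carcass_weight, patterns, demand):
    """patterns: {name: {product: yield per kg}}; demand: {product: kg wanted}."""
    def covered(pattern):
        return sum(min(carcass_weight * y, demand.get(p, 0.0))
                   for p, y in pattern.items())
    best = max(patterns, key=lambda name: covered(patterns[name]))
    for product, y in patterns[best].items():     # consume the covered demand
        demand[product] = max(0.0, demand.get(product, 0.0) - carcass_weight * y)
    return best

demand = {"fillet": 30.0, "ground": 200.0}
patterns = {"fine_cut": {"fillet": 0.1, "ground": 0.5},
            "bulk_cut": {"ground": 0.8}}
print(choose_pattern(280.0, patterns, demand))    # picks whichever covers more
```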
79

Empirical evaluation of metric indexing methods

Fevang, Rune, Fossaa, Arne Bergene January 2008 (has links)
Metric indexing is a branch of search technology designed for searching non-textual data. Examples include image search (where the query is itself an image), document search (finding documents that are roughly equal), and search in high-dimensional Euclidean spaces. Metric indexing is based on the theory of metric spaces, where the only thing known about a set of objects is the distances between them (defined by a metric distance function). A large number of methods have been proposed to solve the metric indexing problem. In this thesis, we concentrate on new approaches to solving these problems, as well as on combining existing methods to create better ones. The methods studied include D-Index, GNAT, EMVP-Forest, HC, SA-Tree, SSS-Tree, M-Tree, PM-Tree, M*-Tree and PM*-Tree. These have all been implemented and tested against each other to find strengths and weaknesses. The thesis also studies a group of indexing methods called hybrid methods, which combine tree-based methods (like SA-Tree, SSS-Tree and M-Tree) with pivoting methods (like AESA and LAESA), and proposes a method for creating hybrid trees from existing trees by using features of the programming language. Hybrid methods are shown to be very promising: while they may have considerable overhead in construction time, CPU usage and/or memory usage, they offer large reductions in the number of distance computations. We also propose a new way of calculating the minimum spanning tree of a graph of metric objects, and show that it reduces the number of distance computations needed.
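The pivoting idea the hybrids borrow from AESA/LAESA can be shown in a few lines: precomputed distances to a handful of pivots give a triangle-inequality lower bound that lets many exact distance computations be skipped. The sketch below is a generic range search in this style, using a plain Euclidean metric; it is not code from the thesis.

```python
# LAESA-style range search: prune objects whose triangle-inequality lower
# bound |d(q, pivot) - d(obj, pivot)| already exceeds the search radius.
def laesa_range_search(objects, pivots, pivot_table, dist, query, radius):
    """pivot_table[i][j] holds dist(objects[i], pivots[j]), precomputed."""
    q_to_pivots = [dist(query, p) for p in pivots]
    results, computations = [], 0
    for i, obj in enumerate(objects):
        lower = max(abs(qp - pivot_table[i][j])
                    for j, qp in enumerate(q_to_pivots))
        if lower > radius:
            continue                               # pruned: no distance computed
        computations += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computations

dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
objects = [(i, i % 7) for i in range(100)]
pivots = [objects[0], objects[50]]
table = [[dist(o, p) for p in pivots] for o in objects]
print(laesa_range_search(objects, pivots, table, dist, (10, 3), 2.0))
```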
80

Optimizing & Parallelizing a Large Commercial Code for Modeling Oil-well Networks

Rudshaug, Atle January 2008 (has links)
In this project, a complex, serial application that models networks of oil wells is analyzed for today's parallel architectures. Through heavy use of the profiling tool Valgrind, several serial optimizations are achieved, yielding 30-50x speedups on previously dominant sections of the code across different architectures. Our initial main goal is to parallelize the application for GPGPUs (General Purpose Graphics Processing Units) such as the NVIDIA GeForce 8800GTX. However, our optimized application is shown not to have a high enough computational intensity to be suitable for GPU platforms, with data transfer over the PCI Express port proving to be a serious bottleneck. We then target the application at another, more common, parallel architecture -- the multi-core CPU. Instead of focusing on the low-level hotspots found by the profiler, a new approach is taken: by analyzing the functionality of the application and the problem it solves, the high-level structure of the application is identified. A thread pool combined with a task queue is implemented using PThreads in Linux, fitting the structure of the application; it also supports nested parallel queues while maintaining all serial dependencies. However, the sheer size and complexity of the serial application introduce many problems when going multithreaded. The tight coupling of all parts of the code introduces several race conditions, producing erroneous results in complex cases. Our focus hence shifts to developing models that, given benchmarks of the node times, help analyze how suitable applications that traverse dependence-tree structures -- such as our oil-well network application -- are for parallelization. First, we benchmark the serial execution of each child in the network and predict overall parallel performance by computing dummy tasks reflecting these times on the same tree structure, for two given well networks: a large and a small case. Based on these benchmarks, we then predict the speedup of the two cases, assuming balanced loads on each level of the network. Finally, the minimum time needed to calculate a given network is predicted; the predictions show low scalability, owing to the nature of the oil networks in the test cases. This project thus concludes that the amount of work needed to successfully introduce multithreading in this application might not be worth it, due to the serial dependencies in the problem the application solves. However, if there are multiple independent networks to be calculated, we suggest using Grid technology to manage multiple instances of the application simultaneously, either via script files or by adding DRMAA API calls to the application. This, in combination with further serial optimizations, is the way to achieve good speedup for these types of applications.
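The prediction at the end of the abstract has a simple form: if the children of a node can run in parallel but a parent must wait for all of them, the minimum parallel time of the network is its critical path, and the best possible speedup is the serial sum of node times divided by that path. The sketch below computes both for an invented tree of benchmarked node times; the tree shape and numbers are illustrative, not the thesis's well networks.

```python
# Critical-path bound on speedup for a dependence tree of benchmarked tasks.
def serial_time(node):
    """Total work: sum of all node times."""
    return node["time"] + sum(serial_time(c) for c in node.get("children", []))

def critical_path(node):
    """Minimum parallel time: a parent waits for its slowest child subtree."""
    children = node.get("children", [])
    longest = max((critical_path(c) for c in children), default=0.0)
    return node["time"] + longest

network = {"time": 1.0, "children": [
    {"time": 4.0, "children": [{"time": 2.0}, {"time": 2.0}]},
    {"time": 3.0},
    {"time": 3.0},
]}
t_serial, t_min = serial_time(network), critical_path(network)
print(f"serial {t_serial}, best parallel {t_min}, max speedup {t_serial / t_min:.2f}")
```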
