  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Parallel Processing Architecture for Solving Large Scale Linear Systems

Nagari, Arun 01 August 2009 (has links)
Solving linear systems of many variables is at the core of many scientific problems. Parallel processing techniques for solving such systems have received much attention in recent years. A key theme in the literature is the application of lower-upper (LU) decomposition, which factorizes an N×N square matrix into a lower and an upper triangular matrix; the resulting triangular systems can then be solved in O(N²) work. The LU decomposition itself, however, has a computational complexity of O(N³) and is a challenging process to parallelize. This thesis proposes a highly parallel methodology for solving large-scale, dense linear systems by means of a novel application of Cramer's rule. A numerically stable scheme is described, yielding an overall computational complexity of O(N) with N² processing units.
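The contrast the abstract draws can be made concrete with a plain LU factorization and triangular solve. This is a minimal illustrative sketch of the conventional O(N³) factorization / O(N²) solve approach, not the thesis's Cramer's-rule scheme:

```python
# Doolittle LU decomposition (no pivoting, for illustration only):
# A = L U with L unit lower triangular and U upper triangular.
def lu_decompose(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1.0
        for j in range(i, n):                      # row i of U
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):                  # column i of L
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U

def solve_lu(L, U, b):
    n = len(b)
    y = [0.0] * n                                  # forward substitution: L y = b
    for i in range(n):
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    x = [0.0] * n                                  # back substitution: U x = y
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x
```

Each triangular solve is O(N²), while the factorization's triply nested summations are the O(N³) cost the thesis sidesteps.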
72

Accelerating the Stochastic Simulation Algorithm Using Emerging Architectures

Jenkins, David Dewayne 01 December 2009 (has links)
In order for scientists to learn more about molecular biology, it is imperative that they have the ability to construct and evaluate models. Model statistics consistent with the chemical master equation can be obtained using Gillespie's stochastic simulation algorithm (SSA). Due to the stochastic nature of the Monte Carlo simulations, large numbers of simulations must be run in order to obtain accurate statistics for the species populations and reactions. However, the algorithm tends to be computationally heavy and leads to long simulation runtimes for large systems. In this research, the performance of Gillespie's stochastic simulation algorithm is analyzed and optimized using a number of techniques and architectures. These techniques include parallelizing simulations using streaming SIMD extensions (SSE), the Message Passing Interface (MPI) on multicore systems and computer clusters, and CUDA on NVIDIA graphics processing units. This research is an attempt to make the SSA a better option for modeling biological and chemical systems. Through this work, it is shown that accelerating the algorithm in both the serial and SSE implementations proved beneficial, while the CUDA implementation had lower than expected results.
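As a point of reference, Gillespie's direct method for a single decay reaction can be sketched in a few lines; the species, rate constant, and random seed below are illustrative choices, not taken from the thesis:

```python
import math
import random

# Gillespie direct-method SSA for one reaction, A -> B, with rate constant c.
# Returns the population of A remaining at time t_end.
def ssa_decay(a0, c, t_end, rng=random.Random(42)):
    t, a = 0.0, a0
    while a > 0:
        propensity = c * a                           # a(t) molecules, each firing at rate c
        tau = -math.log(rng.random()) / propensity   # exponential waiting time to next event
        if t + tau > t_end:
            break
        t += tau
        a -= 1                                       # fire the decay reaction once
    return a
```

Each run is one Monte Carlo sample; the thesis's point is that thousands of such independent runs are needed for good statistics, which is what makes SSE, MPI, and CUDA parallelization attractive.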
73

Vision-Based Reinforcement Learning Using A Consolidated Actor-Critic Model

Niedzwiedz, Christopher Allen 01 December 2009 (has links)
Vision-based machine learning agents are tasked with making decisions based on high-dimensional, noisy input, placing a heavy load on available resources. Moreover, observations typically provide only partial information about the environment state, necessitating robust state inference by the agent. Reinforcement learning provides a framework for decision making with the goal of maximizing long-term reward. This thesis introduces a novel approach to vision-based reinforcement learning through the use of a consolidated actor-critic model (CACM). The approach takes advantage of artificial neural networks as non-linear function approximators and the reduced computational requirements of the CACM scheme to yield a scalable vision-based control system. In this thesis, a comparison between the actor-critic model and the CACM is made. Additionally, the effect that observation prediction and correlated exploration have on the agent's performance is investigated.
74

Minimum Transmission Power Configuration in Real-Time Wireless Sensor Networks

Wang, Xiaodong 01 August 2009 (has links)
Multi-channel communications can effectively reduce channel competition and interference in a wireless sensor network, and thus achieve increased throughput and improved end-to-end delay guarantees with reduced power consumption. However, existing work relies only on a small number of orthogonal channels, resulting in degraded performance when a large number of data flows need to be transmitted on different channels. In this thesis, empirical studies are conducted to investigate the interference among overlapping channels. The results show that overlapping channels can also be utilized for improved real-time performance if the node transmission power is carefully configured. In order to minimize the overall power consumption of a network with multiple data flows under end-to-end delay constraints, a constrained optimization problem is formulated to configure the transmission power level for every node and assign overlapping channels to different data flows. Since the optimization problem has an exponential computational complexity, a heuristic algorithm based on Simulated Annealing is then presented to find a suboptimal solution. Extensive empirical results on a 25-mote testbed demonstrate that the proposed algorithm achieves better real-time performance and lower power consumption than two baselines, including a scheme using only orthogonal channels.
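The heuristic the abstract names can be illustrated with a generic simulated-annealing skeleton; the toy cost function and parameters below are assumptions for illustration, not the thesis's power/channel network model:

```python
import math
import random

# Generic simulated annealing: accept improvements always, accept worse
# moves with Boltzmann probability exp(-delta / t), cool geometrically.
def anneal(cost, neighbor, x0, t0=1.0, cooling=0.95, steps=500,
           rng=random.Random(0)):
    x, best = x0, x0
    t = t0
    for _ in range(steps):
        cand = neighbor(x, rng)
        delta = cost(cand) - cost(x)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x = cand
            if cost(x) < cost(best):
                best = x                   # track best configuration seen
        t *= cooling                       # geometric cooling schedule
    return best

# Toy example: minimize (x - 3)^2 over the integers, starting far away.
best = anneal(lambda x: (x - 3) ** 2,
              lambda x, rng: x + rng.choice([-1, 1]),
              x0=20)
```

In the thesis's setting the candidate state would be a joint assignment of per-node power levels and per-flow channels, with a cost combining power consumption and delay-constraint violations.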
75

Optimization of Digital Filter Design Using Hardware Accelerated Simulation

Liang, Getao 01 May 2007 (has links)
The goal of this research was to develop a scheme to optimize a digital filter design using an optimization engine and hardware-accelerated simulation on a Field Programmable Gate Array (FPGA). A parameterizable generic digital filter, fully implemented on a prototyping board with a Xilinx Virtex-II Pro xc2vp30-7-ff896 FPGA, was developed using Xilinx System Generator for DSP. The optimization engine, currently a random candidate generator that will eventually be replaced by a differential evolution engine, was implemented in MATLAB along with a candidate evaluator and other supporting programs. Automatic hardware co-simulations of 100 candidate filters were performed successfully, demonstrating that this approach is feasible, reliable, and efficient for complex systems.
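The placeholder optimization engine the abstract describes amounts to a random-candidate search loop; the sketch below substitutes a stand-in cost function for the FPGA co-simulation, and the target coefficients are invented for illustration:

```python
import random

# Random-candidate search: generate candidates, evaluate each, keep the best.
# In the thesis, evaluate() would be a hardware co-simulation of the filter.
def random_search(evaluate, sample, n_candidates=100, rng=random.Random(1)):
    best, best_cost = None, float("inf")
    for _ in range(n_candidates):
        cand = sample(rng)             # generate one candidate filter
        cost = evaluate(cand)          # score it (co-simulation in the thesis)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost

# Toy example: find a coefficient pair close to a made-up target.
target = (0.25, 0.5)
best, cost = random_search(
    lambda c: (c[0] - target[0]) ** 2 + (c[1] - target[1]) ** 2,
    lambda rng: (rng.random(), rng.random()))
```

A differential evolution engine would replace `sample` with mutation and crossover over a maintained population, while the evaluator interface stays the same.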
76

Study of Interaction Between Mexican Free-tailed Bats (Tadarida Brasiliensis) and Moths and Counting Moths in a Real Time Video

Kolli, Haritha 01 August 2007 (has links)
Brazilian free-tailed bats (Tadarida brasiliensis) are among the most abundant and widely distributed species in the southwestern United States in the summer. Because of their high metabolic needs and diverse diets, bats can impact the communities in which they live in a variety of important ways. The role of bats in pollination, seed dispersal, and insect control has proven to be extremely significant. Due to human ignorance, habitat destruction, fear, and the low reproductive rates of bats, bat populations are in decline. T. brasiliensis eats large quantities of insects but is not always successful in prey capture. In the face of unfavorable foraging conditions, bats reduce energy expenditure by roosting. By studying the interaction between bats and adult insects, along with the associated energetics, we estimate the pest control provided by bats in agro-ecosystems to help understand their ecological importance. To visualize the interaction between bats and adult insects, a simulator has been designed based on an individual-based modeling approach. Using the simulator, we investigated the effect of insect densities and their escape response on the foraging pattern of bats. Traditionally, synthetic pesticides were used to control pest populations, but recently the use of transgenic crops has become widespread because of benefits such as fewer pesticide applications and increased yield for growers. To study the effect of these transgenic crops on moth densities, and subsequently on bats' foraging activity, videos were recorded in fields in Texas. To count the moths in the videos, we utilized image segmentation techniques such as thresholding and connected component labeling, achieving accuracy of up to 90%.
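The two segmentation steps named in the abstract, thresholding and connected component labeling, can be sketched on a tiny synthetic frame; the image data and threshold below are made up:

```python
# Threshold a grayscale frame, then count 4-connected bright components
# ("moths") with an iterative flood fill.
def count_blobs(img, thresh):
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for r in range(h):
        for c in range(w):
            if img[r][c] > thresh and not seen[r][c]:
                blobs += 1
                stack = [(r, c)]                  # flood-fill one component
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and img[y][x] > thresh and not seen[y][x]:
                        seen[y][x] = True
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return blobs

# Synthetic 3x5 frame with two bright regions above a threshold of 5.
frame = [[0, 9, 9, 0, 0],
         [0, 9, 0, 0, 8],
         [0, 0, 0, 0, 8]]
```

On real video frames the same two steps apply per frame, after whatever background subtraction and noise filtering the footage requires.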
77

A SECURE INFORMATION INFRASTRUCTURE FOR SERVICE ORIENTED ARCHITECTURES

St. Onge, Joseph Giacomo 27 September 2006 (has links)
In today's ever-evolving design environments, a focus switch is needed from workstation-centric software tools to distributed services. For Computer-Aided Design, the use of distributed services has the potential to incorporate all of the software features needed for a given project into a new design system that utilizes services. Thus, the designer would have access to features that are not locally installed. This thesis presents a secure middleware solution for design environments. The secure middleware solution provides a system architecture and information infrastructure that facilitate the needs of the designer while also providing access to remote services. The system architecture and information infrastructure are designed with the designer in mind, providing access to any file at any time from any location, and the ability to submit jobs to any available service. These fundamental components are implemented so as not to compromise security or accountability. Enabling the system architecture are four fundamental technologies created for this system: (1) a Secure Java Messaging Service, (2) Verification Services, (3) Gateway and Directory Services, and (4) a Secure File System. Through the creation of these four technologies, the system architecture and information infrastructure were developed and deployed into a simulated design environment. Results showing the benefits of this design environment over other design environments are explored within this thesis. Overall, the secure middleware solution for design environments benefits designers and enterprises in a secure, traceable, and accountable manner.
78

ELECTRONIC DESIGN AUTOMATION FOR AN ENERGY-EFFICIENT COARSE-GRAIN RECONFIGURABLE FABRIC ARCHITECTURE

Stander, Justin Nathanial 25 September 2007 (has links)
In the past, those looking to accelerate computationally intensive applications through hardware implementations have had relatively few target platforms to choose from, each with wildly opposing benefits and drawbacks. The SuperCISC Energy-Efficient Coarse-Grain Reconfigurable Fabric provides an ultra-low-power alternative to field-programmable gate array (FPGA) devices and application-specific integrated circuits (ASICs). The proposed Fabric combines the reconfigurable nature and manageable Computer-Aided Design (CAD) flow of FPGAs with power and energy characteristics similar to those of an ASIC. This thesis establishes the design flow and explores issues central to the design space exploration of the SuperCISC Reconfigurable Fabric Project. The Fabric Interconnect Model specification facilitates rapid design space exploration across a range of Fabric models. Significant effort was put into the development of the Greedy Heuristic Fabric Mapper, which automates programming the Fabric to perform the desired hardware function. Coupled with additional automation, the Mapper allows C-code-specified application kernels to be converted into Fabric Configurations. The FIMFabricPrinter automates the verification, simulation, statistics gathering, and visualization of these Fabric Configurations. Results show the Fabric achieving power improvements of 68X to 369X, and energy improvements of 38X to 127X, over the same benchmarks performed on an FPGA device.
79

Static Timing Analysis Based Transformations of Super-Complex Instruction Set Hardware Functions

Ihrig, Colin James 09 June 2008 (has links)
Application-specific hardware implementations are an increasingly popular way of reducing execution time and power consumption in embedded systems. This application-specific hardware typically consumes a small fraction of the execution time and power that the equivalent software code would require. Modern electronic design automation (EDA) tools can be used to apply a variety of transformations to hardware blocks in an effort to achieve additional performance and power savings. A number of such transformations require a tool with knowledge of the design's timing characteristics. This thesis describes a static timing analyzer and two timing-analysis-based design automation tools. The static timing analyzer estimates the worst-case timing characteristics of a hardware data flow graph. These hardware data flow graphs are intermediate representations generated within a C-to-VHDL hardware acceleration compiler. Two EDA tools were then developed which utilize static timing analysis. An automated pipelining tool was developed to increase the throughput of large blocks of combinational logic generated by the hardware acceleration compiler. Another tool was designed to mitigate power consumption resulting from extraneous combinational switching. By inserting special signal buffers, known as delay elements, with preselected propagation delays, combinational functional units can be kept inactive until their inputs have stabilized. The hardware descriptions generated by both tools were synthesized, simulated, and power-profiled using existing commercial EDA tools. The results show that pipelining leads to an average performance increase of 3.3x, while delay elements saved between 25% and 33% of the power consumption when tested on a set of signal and image processing benchmarks.
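The core of a static timing analyzer over a combinational data-flow graph is a longest-path (worst-case arrival time) computation; the node delays and graph below are invented for illustration:

```python
# Worst-case arrival times over a combinational data-flow graph.
# delays: node -> propagation delay; fanin: node -> list of predecessor nodes.
def arrival_times(delays, fanin):
    memo = {}

    def arrive(n):                      # longest-path delay ending at node n
        if n not in memo:
            preds = fanin.get(n, [])
            memo[n] = delays[n] + (max(map(arrive, preds)) if preds else 0)
        return memo[n]

    return {n: arrive(n) for n in delays}

# Toy graph: two inputs feed a multiplier, whose result feeds an adder.
delays = {"a": 2, "b": 3, "mul": 5, "add": 1}
fanin = {"mul": ["a", "b"], "add": ["mul", "a"]}
```

The maximum arrival time over all outputs bounds the clock period; a pipelining tool cuts the graph where arrival times exceed the target, and a delay-element tool uses the same per-node times to decide when each unit's inputs have stabilized.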
80

ADVANCED HASHING SCHEMES FOR PACKET FORWARDING USING SET ASSOCIATIVE MEMORY ARCHITECTURES

Hanna, Michel Nabil 26 January 2010 (has links)
Building a high-performance IP packet forwarding (PF) engine remains a challenge due to increasingly stringent throughput requirements and the growing sizes of IP forwarding tables. The router has to match the incoming packet's IP address against the forwarding table. The matching process has to be done at wire speed, which is why scalability and low power consumption are features that PF engines must maintain. It is common for PF engines to use hash tables; however, the classic hashing downsides have to be dealt with (e.g., collisions and worst-case memory access time). While open-addressing hash tables generally provide good average-case search performance, their memory utilization and worst-case performance can degrade quickly due to collisions that lead to bucket overflows. Set associative memory can be used for hardware implementations of hash tables, with the property that each bucket of a hash table can be searched in one memory cycle. Hence, PF engine architectures based on associative memory will outperform those based on conventional Ternary Content Addressable Memory (TCAM) in terms of power and scalability. The two standard solutions to the overflow problem are either to use some sort of predefined probing (e.g., linear or quadratic) or to use multiple hash functions. This work presents two new hash schemes that extend both aforementioned solutions to tackle the overflow problem efficiently. The first is a hash probing scheme called Content-based HAsh Probing, or CHAP: a probing scheme based on the content of the hash table that avoids the classical side effects of predefined hash probing methods (i.e., the primary and secondary clustering phenomena) while reducing overflow. The second scheme, called Progressive Hashing, or PH, is a general multiple-hash scheme that also reduces overflow. PH splits the prefixes into groups, assigns each group one hash function, and then reuses some hash functions in a progressive fashion to reduce overflow. We show by experimenting with real IP lookup tables that both schemes outperform other hashing schemes.
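A minimal multiple-hash bucket-insertion loop, in the spirit of the overflow-reduction schemes discussed (not the actual CHAP or PH algorithms; the hash functions and sizes are made up), might look like this:

```python
# Bucketed hash table with a fixed bucket capacity, modeling set-associative
# memory: probing one bucket costs one memory cycle. Each key tries a
# sequence of hash functions and lands in the first non-full bucket.
def insert(table, key, bucket_size, hashes):
    for h in hashes:
        bucket = table[h(key) % len(table)]
        if len(bucket) < bucket_size:   # room in this bucket: place the key
            bucket.append(key)
            return True
    return False                        # overflow: every candidate bucket full

# Two toy deterministic hash functions over integer keys, and a 4-bucket
# table with 2-entry buckets.
hashes = [lambda k: k, lambda k: k * 7 + 3]
table = [[] for _ in range(4)]
```

CHAP instead derives the probe sequence from the table's contents to avoid primary and secondary clustering, and PH assigns hash functions to prefix groups and reuses them progressively; both reduce how often the `False` (overflow) branch is reached.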
