Global ETD Search

241	Efficiently mapping high-performance early vision algorithms onto multicore embedded platforms Apewokin, Senyo 09 January 2009 The combination of low-cost imaging chips and high-performance, multicore, embedded processors heralds a new era in portable vision systems. Early vision algorithms have the potential for highly data-parallel, integer execution. However, an implementation must operate within the constraints of embedded systems, including low clock rates, low-power operation, and limited memory. This dissertation explores new approaches to adapting novel pixel-based vision algorithms for tomorrow's multicore embedded processors. It presents:
- An adaptive, multimodal background modeling technique called Multimodal Mean that achieves high accuracy and frame rate performance with limited memory and a slow-clock, energy-efficient, integer processing core.
- A new workload partitioning technique to optimize the execution of early vision algorithms on multicore systems.
- A novel data transfer technique called cat-tail DMA that provides globally ordered, non-blocking data transfers on a multicore system.
By using efficient data representations, Multimodal Mean provides accuracy comparable to the widely used Mixture of Gaussians (MoG) multimodal method. However, it achieves a 6.2x improvement in performance while using 18% less storage than MoG when executing on a representative embedded platform. When this algorithm is adapted to a multicore execution environment, the new workload partitioning technique improves execution times by 25% with only a 125 ms system reaction time. It also reduces the overall number of data transfers by 50%. Finally, the cat-tail buffering technique reduces the data-transfer latency between execution cores and main memory by 32.8% over the baseline technique when executing Multimodal Mean.
This technique performs data transfers concurrently with code execution on individual cores, while maintaining global ordering through low-overhead scheduling to prevent collisions. Computer vision Embedded Multicore Computer vision Algorithms
242	Customization of floating-point units for embedded systems and field programmable gate arrays Chong, Michael Yee Jern, Computer Science & Engineering, Faculty of Engineering, UNSW January 2009 While Application Specific Instruction Set Processors (ASIPs) have allowed designers to create processors with custom instructions to target specific applications, floating-point units (FPUs) are still instantiated as non-customizable general-purpose units, which, if underutilized, waste area and performance. However, customizing FPUs manually is a complex and time-consuming process. Therefore, there is a need for an automated custom FPU generation scheme. This thesis presents a methodology for generating application-specific FPUs customized at the instruction level, with integrated datapath merging to minimize area. The methodology reduces the subset of floating-point instructions implemented to the minimum required for the application. Datapath merging is then performed on the required datapaths to minimize area. Previous datapath merging techniques failed to consider merging components of different bit-widths and thus ignored the bit-alignment problem in datapath merging. This thesis presents a novel bit-alignment solution for datapath merging. In creating the custom FPU, the subset of floating-point instructions that should be implemented in hardware has to be determined. Implementing more instructions in hardware reduces the cycle count of the application, but may lead to increased delay due to multiplexers inserted on the critical path during datapath merging. A rapid design space exploration was performed to explore the trade-offs. By performing this exploration, a designer can determine the number of instructions that should be implemented as a custom FPU and the number that should be left for software emulation, such that performance and area meet the designer's requirements.
Customized FPUs were generated for different Mediabench applications and compared to a fully featured reference FPU that implemented all floating-point operations. Reducing the floating-point instruction set reduced the FPU area by an average of 55%. Performing instruction reduction and then datapath merging reduced the FPU area by an average of 68%. Experiments showed that datapath merging without bit-alignment achieved an average area reduction of 10.1%. With bit-alignment, an average of 16.5% was achieved. Bit-alignment proved most beneficial when there was a diverse mix of different bit-widths in the datapaths. Performance of Field-Programmable Gate Arrays (FPGAs) used for floating-point applications is poor due to the complexity of floating-point arithmetic. Implementing floating-point units on FPGAs consumes a large amount of resources. Therefore, there is a need for embedded FPUs in FPGAs. However, if unutilized, they waste area on the FPGA die. To overcome this issue, this thesis presents a novel flexible multi-mode embedded FPU for FPGAs that can be configured to perform a wide range of operations. The floating-point adder and multiplier in the embedded FPU can each be configured to perform one double-precision operation or two single-precision operations in parallel. To increase flexibility further, access to the large integer multiplier, adder and shifters in the FPU is provided. The FPU is also capable of floating-point and integer multiply-add operations. Benchmark circuits were implemented on both a standard Xilinx Virtex-II FPGA and on the FPGA with embedded FPU blocks. The implementations on the FPGA with embedded FPUs showed mean area and delay improvements of 5.2x and 5.8x respectively for the double-precision benchmarks, and 4.4x and 4.2x for the single-precision benchmarks. embedded systems floating-point FPGA FPU
243	Enabling Gigabit IP for Embedded Systems Tsakiris, Nicholas, n.tsakiris@internode.on.net January 2009 Any practical implementation of chip design needs a hardware platform for prototyping and implementing FPGA-based programs, whether they are written in VHDL or Verilog. Communication between the platform and a computer is a useful feature of many hardware solutions, as it allows regular data transmission between the two devices. Furthermore, communicating between the platform and a computer at high speeds requires a specially constructed interface, one that the designer can modify as they choose. A number of commercial packages provide a hardware platform to perform this task; however, many of the available options have drawbacks. Some may require special hardware to connect to a computer using proprietary connectors or boards, which increases the cost and reduces the flexibility of any solution. Other options may have limited access to the internal structure of the interface, limiting the ability of the developer to modify the interface to suit their needs. There may be an extra cost to obtain the code for the interface, separate from the board, which can also tax design budgets. This dissertation provides a solution in the form of a Gigabit Ethernet connection with a custom IP/network layer written in VHDL to facilitate the connection. With an increasing number of IP-enabled devices available, such as IPTV and set-top boxes, the ability to link hardware using Ethernet is very useful, and so the development of a lean and capable network layer was considered a suitable focus for the project. The overall goal has been to provide an interface which is cheap, open, robust and efficient, retaining the flexibility a developer might require to modify the code to their needs.
After covering some basic background information about the project, the dissertation looks at the requirements of the board and interface, as well as the alternative interface solutions that were considered before deciding on Gigabit Ethernet. The protocols used in Ethernet are then covered, with an explanation of both the structure of each and its relevance to the implementation. The Finite State Machines which control operation of the interface are covered in depth, with an explanation of how they connect to each other and how they fit in the data flow between the computer and the board. Error correction and reliability are discussed, as well as any remaining components critical to the operation of the interface. Pipelining, the method of design which provides the speed required for Gigabit Ethernet, is covered along with the extra speed optimisation techniques used in the design, such as RAM swinging buffers. Testing and synthesis, which ensure the design is as robust as possible both in simulations and in real-world applications, are also covered. The final design was implemented on a Xilinx Spartan 3 FPGA (XC3S5000-5FG900C) and is capable of a maximum clock frequency of 128.287 MHz, which is more than enough to satisfy the requirements of Gigabit Ethernet under a variety of network conditions. The interface code occupies 1,166 slices of logic on the FPGA (3% of the total amount of logic available), making it sufficiently compact to run large projects on the same chip. The core was tested on physical hardware and performed correctly at real Gigabit line speeds. Configuration of the computer, along with the method of connecting to the board and transferring data, is described, with an explanation of the code run on the computer to make this possible. Finally, the dissertation provides an example application through the use of JPEG2000 image compression/decompression. embedded vhdl fpga jpeg jpeg2000 pipelining xilinx
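A rough check of the clock-rate claim in entry 243 can be sketched as follows. The abstract does not state the width of the interface datapath; the sketch assumes one byte moves per clock cycle, as in the standard GMII interface (8 bits at 125 MHz), and the helper names are hypothetical.

```python
# Minimal sketch, assuming an 8-bit datapath that moves one byte per clock
# cycle (as in GMII). The abstract does not state the datapath width, and
# the function names below are hypothetical.

GIGABIT_MBPS = 1000.0  # nominal Gigabit Ethernet line rate in Mbit/s


def throughput_mbps(clock_mhz: float, datapath_bits: int = 8) -> float:
    """Peak throughput when one datapath word is transferred per cycle."""
    return clock_mhz * datapath_bits


def min_clock_mhz(datapath_bits: int = 8) -> float:
    """Lowest clock frequency that sustains the Gigabit line rate."""
    return GIGABIT_MBPS / datapath_bits


reported_fmax_mhz = 128.287  # maximum clock frequency quoted in the abstract
required_mhz = min_clock_mhz()
headroom = reported_fmax_mhz / required_mhz - 1.0

print(f"required clock:  {required_mhz:.3f} MHz")
print(f"peak throughput: {throughput_mbps(reported_fmax_mhz):.3f} Mbit/s")
print(f"clock headroom:  {headroom:.1%}")
```

Under that assumption the quoted frequency clears the required clock by a few percent; a wider datapath would lower the required clock proportionally.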
245	Event pattern detection for embedded systems / Carlson, Jan, January 2007 Diss. Västerås : Mälardalens högskola, 2007. / S. 153-[169]: Bibliografi.
246	Chaotic embedding of the Whitehead continuum / Jubran, Isa S. January 1992 Thesis (Ph. D.)--Oregon State University, 1993. / Typescript (photocopy). Includes bibliographical references (leaves 88-91). Also available on the World Wide Web.
247	A reconfigurable simulator for coupled conveyors Hayslip, Nunzio. January 2006 Thesis (M.S.)--University of Akron, Dept. of Electrical and Computer Engineering, 2006. / "December, 2006." Title from electronic thesis title page (viewed 12/31/2008) Advisor, Shivakumar Sastry; Committee members, Nathan Ida, James E. Grover; Department Chair, Alex De Abreu Garcia; Dean of the College, George K. Haritos; Dean of the Graduate School, George R. Newkome. Includes bibliographical references.
248	Minimum power consumption for rate monotonic tasks Huang, Chiao Ching, Baskiyar, Sanjeev, January 2008 (PDF) Thesis (M.S.)--Auburn University, 2008. / Abstract. Vita. Includes bibliographical references (p. 49-50).
249	A graphical approach to testing real-time embedded devices a thesis / Day, Steven Michael. Kearns, Timothy J. January 1900 Thesis (M.S.)--California Polytechnic State University, 2009. / Title from PDF title page; viewed on July 2, 2009. "June 2009." "In partial fulfillment of the requirements for the degree [of] Master of Science in Computer Science." "Presented to the faculty of California Polytechnic State University, San Luis Obispo." Major professor: Tim Kearns, Ph.D. Includes bibliographical references (p. 62-65).
250	Deployed software analysis Diep, Madeline M. January 2009 Thesis (Ph.D.)--University of Nebraska-Lincoln, 2009. / Title from title screen (site viewed June 26, 2009). PDF text: 169 p. : ill. ; 3 Mb. UMI publication number: AAT 3350444. Includes bibliographical references. Also available in microfilm and microfiche formats.
