Global ETD Search

1	GPTT: A Cross-Platform Graphics Performance Tuning Tool for Embedded System Lin, Keng-Yu 22 August 2006 This thesis presents a new cross-platform graphics performance tool, GPTT (Graphics Performance Tuning Tool), designed to help developers find the performance bottlenecks of their games or applications on embedded systems. The functions of the performance tool are embedded into the standard graphics library, OpenGL ES, to achieve cross-platform support. To verify the proposed tool, we also implement the OpenGL ES specification in addition to the tool itself. The performance tool is separated into a visualization part and a measurement part, which reduces the load on the embedded system while the application is running. The tool identifies many bottlenecks that can be improved. Rendering Pipeline OpenGLES Performance Tuning Embedded System
2	OpenGL ES-based Emulator with Performance Tuning in the 3D Application Development Platform for Embedded Systems Hung, Chih-Yang 04 September 2009 Developing 3D applications for low-performance embedded systems is often constrained by hardware specifications (e.g. memory and processing efficiency). Existing OpenGL ES emulators are designed to provide a development environment for programmers, but these emulators often lack cross-platform performance tuning analysis for embedded systems and are suitable only for designated hardware. In this thesis, we present an OpenGL ES emulator with performance tuning for developing 3D applications for embedded systems without conforming to specific hardware. It can further help programmers emulate 3D applications on a PC for different development platforms. Emulator OpenGL ES Embedded System Performance Tuning
3	Performance Tuning of Big Data Platform : Cassandra Case Study Sathvik, Katam January 2016 Usage of cloud-based storage systems has gained a lot of prominence in the past few years. Every day millions of files are uploaded to and downloaded from cloud storage. This data cannot be handled by traditional databases and is considered to be Big Data. New powerful platforms have been developed to store and organize big and unstructured data. These platforms are called Big Data systems. Some of the most popular big data platforms are Mongo, Hadoop, and Cassandra. In this study, we used the Cassandra database management system because it is an open-source platform developed in Java. Cassandra has a masterless ring architecture. The data is replicated among all the nodes for fault tolerance. Unlike MySQL, Cassandra uses a per-column technique to store data. Cassandra is a NoSQL database system, which can handle unstructured data. Most Cassandra parameters are scalable and easy to configure. Amazon provides a cloud computing platform that helps a user perform heavy computing tasks on remote hardware systems. This cloud computing platform is known as Amazon Web Services (AWS). AWS also includes database deployment and network management services, which have a simple user experience. This document gives a detailed explanation of Cassandra database deployment on the AWS platform, followed by Cassandra performance tuning. The study investigates how changing Cassandra parameters affects read and write performance when Cassandra is deployed on the Elastic Compute Cloud platform. The performance of a three-node Cassandra cluster is evaluated. Using knowledge of the configuration parameters, a three-node Cassandra database is performance tuned and a draft model is proposed. A cloud environment suitable for the experiment is created on AWS. A three-node Cassandra database management system is deployed in this cloud environment.
The performance of this three-node architecture is evaluated and tested with different configuration parameters. The configuration parameters are selected based on how the Cassandra metrics behave as the parameters change. Selected parameters are changed and the performance difference is observed and analyzed. Using this analysis, a draft model is developed after performance tuning the selected parameters. This draft model is tested with different workloads and compared with the default Cassandra model. Changes to the key cache memory and memTable parameters improved the performance metrics. With increases in key cache size and save time period, read performance improved. This also affected system metrics, increasing CPU load and disk throughput and decreasing operation time. Changes to the memTable parameters affected write performance and disk space utilization. With an increase in the threshold value of the memTable flush writer, disk throughput increased and operation time decreased. The draft model derived from the performance evaluation has better write and read performance. Big Data platforms Cassandra Amazon Web Service Performance tuning.
4	Definition of Framework-based Performance Models for Dynamic Performance Tuning Cesar Galobardes, Eduardo 07 April 2006 Parallel and distributed programming constitutes a highly promising approach to improving the performance of many applications. However, in comparison to sequential programming, many new problems arise in all phases of the development cycle of this kind of application. For example, in the analysis phase of parallel/distributed programs, the programmer has to decompose the problem (data and/or code) to find the concurrency of the algorithm. In the design phase, the programmer has to be aware of the communication and synchronization conditions between tasks. In the implementation phase, the programmer has to learn how to use specific communication libraries and runtime environments but also to find a way of debugging programs. Finally, to obtain the best performance, the programmer has to tune the application by using monitoring tools, which collect information about the application's behavior. Tuning can be a very difficult task because it can be difficult to relate the information gathered by the monitor to the application's source code. Moreover, tuning can be even more difficult for those applications that change their behavior dynamically because, in this case, a problem might happen or not depending on the execution conditions. It can be seen that these issues require a high degree of expertise, which prevents the more widespread use of this kind of solution. One of the best ways to solve these problems would be to develop, as has been done in sequential programming, tools to support the analysis, design, coding, and tuning of parallel/distributed applications. In the particular case of performance analysis and/or tuning, it is important to note that the best way of analyzing and tuning parallel/distributed applications depends on some of their behavioral characteristics. 
If the application to be tuned behaves in a regular way then a static analysis (predictive or trace based) would be enough to find the application's performance bottlenecks and to indicate what should be done to overcome them. However, if the application changes its behavior from execution to execution or even dynamically changes its behavior in a single execution then the static analysis cannot offer efficient solutions for avoiding performance bottlenecks. In this case, dynamic monitoring and tuning techniques should be used instead. However, in dynamic monitoring and tuning, decisions must be taken efficiently, which means that the application's performance analysis outcome must be accurate and timely in order to effectively tackle problems; at the same time, intrusion on the application must be minimized because the instrumentation inserted in the application in order to monitor and tune it alters its behavior and could introduce performance problems that were not there before the instrumentation. This is more difficult to achieve if there is no information about the structure and behavior of the application; therefore, blind automatic dynamic tuning approaches have limited success, whereas cooperative dynamic tuning approaches can cope with more complex problems at the cost of asking for user collaboration. We have proposed a third approach. If a programming tool, based on the use of skeletons or frameworks, has been used in the development of the application then much information about the structure and behavior of the application is available and a performance model associated with the structure of the application can be defined for use by the dynamic tuning tool. The resulting tuning tool should produce the outcome of a collaborative one while behaving like an automatic one from the point of view of the application developer. Performance models Performance tuning Parallel applications Tecnologies 519.1
5	Optimalizace provozních režimů zážehového motoru / SI Engine Performance Tuning Beran, Martin January 2008 The main scope of this thesis is the performance tuning of a four-stroke petrol engine through the ECU. The thesis analyses the processes involved in engine management and describes and explains the individual signals processed and generated by the ECU. It designs measuring chains and optimal measurement procedures, on the basis of which an optimal methodology has been assembled for optimizing the individual operating modes of the engine.
6	High-Performance Matrix Multiplication: Hierarchical Data Structures, Optimized Kernel Routines, and Qualitative Performance Modeling Wu, Wenhao 02 August 2003 The optimal implementation of matrix multiplication on modern computer architectures is of great importance for scientific and engineering applications. However, achieving the optimal performance for matrix multiplication has been continuously challenged both by the ever-widening performance gap between the processor and memory hierarchy and the introduction of new architectural features in modern architectures. The conventional way of dealing with these challenges benefits significantly from the blocking algorithm, which improves the data locality in the cache memory, and from the highly tuned inner kernel routines, which in turn exploit the architectural aspects on the specific processor to deliver near peak performance. A state-of-the-art improvement of the blocking algorithm is the self-tuning approach that utilizes "heroic" combinatorial optimization of parameter spaces. Other recent research approaches include the approach that explicitly blocks for the TLB (Translation Lookaside Buffer) and the hierarchical formulation that employs memory-friendly Morton Ordering (a space-filling curve methodology). This thesis compares and contrasts the TLB-blocking-based and Morton-Order-based methods for dense matrix multiplication, and offers a qualitative model to explain the performance behavior. Comparisons to the performance of self-tuning library and the "vendor" library are also offered for the Alpha architecture. The practical benchmark experiments demonstrate that neither conventional blocking-based implementations nor the self-tuning libraries are optimal to achieve consistent high performance in dense matrix multiplication of relatively large square matrix size. 
Instead, architectural constraints and issues evidently restrict the critical path and options available for optimal performance, so that the relatively simple strategy and framework presented in this study offers higher and flatter overall performance. Interestingly, maximal inner kernel efficiency is not a guarantee of global minimal multiplication time. Also, efficient and flat performance is possible at all problem sizes that fit in main memory, rather than "jagged" performance curves often observed in blocking and self-tuned blocking libraries. performance tuning matrix multiplication hierarchical matrix storage cache model
7	On the design of architecture-aware algorithms for emerging applications Kang, Seunghwa 30 January 2011 This dissertation maps various kernels and applications to a spectrum of programming models and architectures and also presents architecture-aware algorithms for different systems. The kernels and applications discussed in this dissertation have widely varying computational characteristics. For example, we consider both dense numerical computations and sparse graph algorithms. This dissertation also covers emerging applications from image processing, complex network analysis, and computational biology. We map these problems to diverse multicore processors and manycore accelerators. We also use new programming models (such as Transactional Memory, MapReduce, and Intel TBB) to address the performance and productivity challenges in the problems. Our experiences highlight the importance of mapping applications to appropriate programming models and architectures. We also find several limitations of current system software and architectures and directions to improve those. The discussion focuses on system software and architectural support for nested irregular parallelism, Transactional Memory, and hybrid data transfer mechanisms. We believe that the complexity of parallel programming can be significantly reduced via collaborative efforts among researchers and practitioners from different domains. This dissertation participates in the efforts by providing benchmarks and suggestions to improve system software and architectures. MapReduce Nested parallelism Parallel algorithm Algorithm engineering Performance tuning GPU Transactional memory Algorithms Parallel algorithms Multiprocessors
8	Optimisation of a Graph Visualization Tool: Vizz3D Carlsson, Johan January 2006 Vizz3D is a graph visualization tool developed at Växjö University. It is used to visualize different aspects of software systems in 3D, based on the static analysis of source code. It can optionally use Java3D or OpenGL as a graphics library. In order to visualize huge 3D structures, performance is very important, because the structures must be redrawn with no delay when a user interacts with the system. If there were a delay, the user would lose cognitive orientation because the interaction and the feedback would not match. Vizz3D was not capable of running huge visualizations fast enough, and therefore careful optimisation was essential. Additionally, the Vizz3D tool is just at the beginning of its software life cycle. For optimisation, JOGL (Java Bindings for OpenGL) was chosen. The extension with a JOGL version was necessary since the GL4Java (OpenGL for Java) wrapper used for the implementation of Vizz3D is no longer supported. JOGL was therefore needed to ensure future maintainability. The JOGL version of Vizz3D was optimised to be able to visualize huge graphs with acceptable performance. To determine which areas of Vizz3D consumed most of its resources, profiling was used. The system performance was improved in several respects: computational performance, scalability, perceived performance, RAM footprint and start-up time. The results were then evaluated using benchmarking techniques. After optimisation, the performance of Vizz3D was considerably improved, so that huge graphs can now be visualized with acceptable performance. Optimisation Performance Tuning Vizz3D Graph Visualization Java OpenGL JOGL Computer science Datalogi
9	Optimisation of a Graph Visualization Tool: Vizz3D Carlsson, Johan January 2006 Vizz3D is a graph visualization tool developed at Växjö University. It is used to visualize different aspects of software systems in 3D, based on the static analysis of source code. It can optionally use Java3D or OpenGL as a graphics library. In order to visualize huge 3D structures, performance is very important, because the structures must be redrawn with no delay when a user interacts with the system. If there were a delay, the user would lose cognitive orientation because the interaction and the feedback would not match. Vizz3D was not capable of running huge visualizations fast enough, and therefore careful optimisation was essential. Additionally, the Vizz3D tool is just at the beginning of its software life cycle. For optimisation, JOGL (Java Bindings for OpenGL) was chosen. The extension with a JOGL version was necessary since the GL4Java (OpenGL for Java) wrapper used for the implementation of Vizz3D is no longer supported. JOGL was therefore needed to ensure future maintainability. The JOGL version of Vizz3D was optimised to be able to visualize huge graphs with acceptable performance. To determine which areas of Vizz3D consumed most of its resources, profiling was used. The system performance was improved in several respects: computational performance, scalability, perceived performance, RAM footprint and start-up time. The results were then evaluated using benchmarking techniques. After optimisation, the performance of Vizz3D was considerably improved, so that huge graphs can now be visualized with acceptable performance. Optimisation Performance Tuning Vizz3D Graph Visualization Java OpenGL JOGL Computer Sciences Datavetenskap (datalogi)
10	CONFPROFITT: A CONFIGURATION-AWARE PERFORMANCE PROFILING, TESTING, AND TUNING FRAMEWORK Han, Xue 01 January 2019 Modern computer software systems are complicated. Developers can change the behavior of the software system through software configurations. The large number of configuration options and their interactions make the task of software tuning, testing, and debugging very challenging. Performance is one of the key aspects of non-functional qualities, where performance bugs can cause significant performance degradation and lead to poor user experience. However, performance bugs are difficult to expose, primarily because detecting them requires specific inputs, as well as specific configurations. While researchers have developed techniques to analyze, quantify, detect, and fix performance bugs, many of these techniques are not effective in highly-configurable systems. To improve the non-functional qualities of configurable software systems, testing engineers need to be able to understand the performance influence of configuration options, adjust the performance of a system under different configurations, and detect configuration-related performance bugs. This research will provide an automated framework that allows engineers to effectively analyze performance-influencing configuration options, detect performance bugs in highly-configurable software systems, and adjust configuration options to achieve higher long-term performance gains. To understand real-world performance bugs in highly-configurable software systems, we first perform a performance bug characteristics study on three large-scale open-source projects. Many researchers have studied the characteristics of performance bugs from bug reports, but few have reported what the experience is when trying to replicate confirmed performance bugs from the perspective of non-domain experts such as researchers. 
This study is meant to report the challenges and potential workarounds in replicating confirmed performance bugs. We also want to share a performance benchmark that provides real-world performance bugs for evaluating future performance testing techniques. Inspired by our performance bug study, we propose a performance profiling approach that can help developers understand how configuration options and their interactions can influence the performance of a system. The approach uses a combination of dynamic analysis and machine learning techniques, together with configuration sampling techniques, to profile the program execution and analyze configuration options relevant to performance. Next, the framework leverages natural language processing and information retrieval techniques to automatically generate test inputs and configurations to expose performance bugs. Finally, the framework combines reinforcement learning and dynamic state reduction techniques to guide the subject application towards achieving higher long-term performance gains. Configurable Software System Performance Testing Performance Bugs Performance Tuning Software Engineering

Search results