About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Our metadata is collected from universities around the world. If you manage a university, consortium, or national archive and want to be added, details can be found on the NDLTD website.
1

Modular Objective-C run-time library

Váša, Kryštof January 2013
This thesis presents an analysis of the currently available Objective-C run-time libraries (the GCC, Apple, and Étoilé run-times), their prerequisites, and their dependencies on the particular platform and operating system. The result of the analysis is a design for a modular run-time library that allows dynamic configuration of each component for the need at hand (e.g., disabling run-time locks in a single-threaded environment). The resulting design can also be easily ported to atypical platforms (e.g., a kernel or an experimental OS) and extended feature-wise (e.g., adding support for Objective-C categories or associated objects). A prototype implementation of such a modular run-time for Objective-C is also included.
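One way to read the "disable run-time locks" example is as a null-object component swapped in at configuration time. The Python sketch below illustrates only that pattern; the class and method names are invented here and are not taken from the thesis.

```python
import threading

class NoOpLock:
    """Null lock for single-threaded configurations: entering and leaving is free."""
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        return False

class ModularRuntime:
    """Hypothetical run-time whose components are chosen at configuration time."""
    def __init__(self, threaded=True):
        # The thesis's example: run-time locks become no-ops when single-threaded.
        self.class_lock = threading.Lock() if threaded else NoOpLock()
        self.classes = {}

    def register_class(self, name, cls):
        with self.class_lock:         # costs nothing in the single-threaded build
            self.classes[name] = cls

rt = ModularRuntime(threaded=False)   # e.g., a kernel or experimental-OS port
rt.register_class("NSObject", object)
```
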
2

Design and implementation of reconfigurable DSP circuit architectures on FPGA

Heron, Jean-Paul Stephen January 1998
No description available.
3

Run-time optimization of adaptive irregular applications

Yu, Hao 15 November 2004
Compared to traditional compile-time optimization, run-time optimization can offer significant performance improvements when parallelizing and optimizing adaptive irregular applications, because it performs program analysis and adaptive optimizations during program execution. Run-time techniques can succeed where static techniques fail because they exploit the characteristics of the input data, the program's dynamic behavior, and the underlying execution environment. When optimizing adaptive irregular applications for parallel execution, a common observation is that the effectiveness of the optimizing transformations depends on the program's input data and its dynamic phases. This dissertation presents a set of run-time optimization techniques that match the characteristics of a program's dynamic memory access patterns with the appropriate optimization (parallelization) transformations. First, we present a general adaptive algorithm selection framework that automatically and adaptively selects at run time the best performing, functionally equivalent algorithm for each of its execution instances. The selection process is based on automatically generated off-line prediction models and on characteristics (collected and analyzed dynamically) of the algorithm's input data. In this dissertation, we specialize this framework for the automatic selection of reduction algorithms. We identified a small set of machine-independent, high-level characterization parameters and then deployed an off-line, systematic experimental process to generate prediction models. These models, in turn, match the parameters to the best optimization transformations for a given machine. The technique has been evaluated thoroughly in terms of applications, platforms, and programs' dynamic behaviors. Specifically, for reduction algorithm selection, the selected algorithm performs within 2% of optimal and is on average 60% better than "Replicated Buffer," the default parallel reduction algorithm specified by the OpenMP standard. To reduce the overhead of speculative run-time parallelization, we developed an adaptive run-time parallelization technique that dynamically chooses efficient shadow structures to record a program's dynamic memory access patterns for parallelization. This technique complements the original speculative run-time parallelization technique, the LRPD test, in parallelizing loops with sparse memory accesses. The techniques presented in this dissertation have been implemented in an optimizing research compiler and can be viewed as effective building blocks for comprehensive run-time optimization systems, e.g., feedback-directed optimization systems and dynamic compilation systems.
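To make the adaptive selection idea concrete, here is a minimal sketch: two functionally equivalent reduction algorithms, a cheap run-time characterization of the input, and a threshold rule standing in for the off-line-trained prediction model. All names and the threshold value are illustrative assumptions, not the dissertation's code.

```python
import random

def replicated_buffer_reduce(updates, size):
    """Dense strategy: accumulate into a full-size buffer (per thread, in the real case)."""
    out = [0.0] * size
    for idx, val in updates:
        out[idx] += val
    return out

def sparse_hash_reduce(updates, size):
    """Sparse strategy: hash only the touched indices, then expand."""
    acc = {}
    for idx, val in updates:
        acc[idx] = acc.get(idx, 0.0) + val
    return [acc.get(i, 0.0) for i in range(size)]

def density(updates, size):
    """Machine-independent input characterization collected at run time."""
    return len({idx for idx, _ in updates}) / size

def reduce_adaptive(updates, size, threshold=0.1):
    """Stand-in for the off-line-trained model: pick the algorithm per instance."""
    algo = sparse_hash_reduce if density(updates, size) < threshold else replicated_buffer_reduce
    return algo(updates, size)

updates = [(random.randrange(10_000), 1.0) for _ in range(200)]   # sparse pattern
print(sum(reduce_adaptive(updates, 10_000)))                      # 200.0
```
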
4

Predicting Performance Run-time Metrics in Fog Manufacturing using Multi-task Learning

Nallendran, Vignesh Raja 26 February 2021
The integration of Fog-Cloud computing in manufacturing has given rise to a new paradigm called Fog manufacturing. Fog manufacturing is a distributed computing platform that integrates a Fog-Cloud collaborative computing strategy to facilitate responsive, scalable, and reliable data analysis in manufacturing networks. The computation services provided by Fog-Cloud computing can effectively support quality prediction, process monitoring, and diagnosis efforts in a timely manner for manufacturing processes. However, the communication and computation resources for Fog-Cloud computing are limited in Fog manufacturing. Therefore, it is important to utilize the computation services effectively, based on optimal computation task offloading, scheduling, and hardware autoscaling strategies, to finish the computation tasks on time without compromising the quality of the computation service. A prerequisite for adopting such optimal strategies is to accurately predict the run-time metrics (e.g., time-latency) of the Fog nodes by capturing their inherent stochastic nature in real time, because these run-time metrics are directly related to the performance of the computation service in Fog manufacturing. Specifically, because the computation flow and the data-querying activities vary between Fog nodes in practice, the run-time metrics that reflect performance in the Fog nodes are heterogeneous in nature, and the performance cannot be effectively modeled through traditional predictive analysis. In this thesis, a multi-task learning methodology is adopted to predict the run-time metrics that reflect performance in Fog manufacturing while addressing the heterogeneities among the Fog nodes. A Fog manufacturing testbed is employed to evaluate the prediction accuracies of the proposed model and benchmark models. The proposed model can be further extended to computation task offloading and architecture optimization in Fog manufacturing to minimize time-latency and improve the robustness of the system. / Master of Science / Smart manufacturing aims at utilizing the Internet of Things (IoT), data analytics, cloud computing, etc. to handle varying market demand without compromising productivity or quality in a manufacturing plant. To support these efforts, Fog manufacturing has been identified as a suitable computing architecture to handle the surge of data generated from IoT devices. In Fog manufacturing, computational tasks are completed locally through interconnected computing devices called Fog nodes. However, the communication and computation resources in Fog manufacturing are limited, so their effective utilization requires optimal strategies to schedule the computational tasks and assign them to the Fog nodes. A prerequisite for adopting such strategies is to accurately predict the performance of the Fog nodes. In this thesis, a multi-task learning methodology is adopted to predict performance in Fog manufacturing. Because the computation flow and the data-querying activities vary between Fog nodes in practice, the metrics that reflect performance in the Fog nodes are heterogeneous in nature and cannot be effectively modeled through conventional predictive analysis. A Fog manufacturing testbed is employed to evaluate the prediction accuracies of the proposed model and benchmark models.
The results show that the multi-task learning model has better prediction accuracy than the benchmarks and that it can model the heterogeneities among the Fog nodes. The proposed model can further be incorporated into scheduling and assignment strategies to effectively utilize Fog manufacturing's computational services.
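As an illustration of multi-task prediction of per-node run-time metrics, the sketch below treats each Fog node's latency as one task and fits a joint model that shares a sparsity pattern across tasks. The data is synthetic and the feature names are assumptions; the thesis's actual formulation and testbed data are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n_samples, n_features, n_nodes = 200, 5, 3   # observations, workload features, Fog nodes

# Hypothetical workload features (e.g., task size, queue length, payload bytes)
X = rng.normal(size=(n_samples, n_features))
# Nodes share which features matter (rows of W) but differ in magnitude: heterogeneity
W = rng.normal(size=(n_features, n_nodes)) * np.array([1, 1, 0, 0, 1])[:, None]
Y = X @ W + 0.1 * rng.normal(size=(n_samples, n_nodes))   # per-node time-latency

# One model for all nodes: the tasks are coupled through a shared sparsity structure
model = MultiTaskLasso(alpha=0.05).fit(X, Y)
print(model.predict(X[:2]))                  # predicted latency for each of the 3 nodes
```
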
5

A Genetic Algorithm-Based Place-and-Route Compiler For A Run-time Reconfigurable Computing System

Kahne, Brian C. 14 May 1997
Configurable Computing is a technology which attempts to increase computational power by customizing the computational platform to the specific problem at hand. An experimental computing model known as wormhole run-time reconfiguration allows for partial reconfiguration and is highly scalable. In this approach, configuration information and data are grouped together in a computing unit called a stream, which can tunnel through the chip, creating a series of interconnected pipelines. The Colt/Stallion project at Virginia Tech implements this computing model in integrated circuits. In order to create applications for this platform, a compiler is needed which can convert a human-readable description of an algorithm into the sequences of configuration information understood by the chip itself. This thesis covers two compilers which perform this task. The first compiler, Tier1, requires a programmer to explicitly describe placement and routing inside the chip; this could be considered equivalent to an assembler for a traditional microprocessor. The second compiler, Tier2, allows the user to express a problem as a dataflow graph. The actual placing and routing of this graph onto the physical hardware is taken care of through the use of a genetic algorithm. A description of the two languages is presented, followed by example applications. In addition, experimental results are included which examine the behavior of the genetic algorithm and how alterations to various genetic operator probabilities affect performance. / Master of Science
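The genetic-algorithm placement idea can be sketched generically: individuals are placements of cells into slots, fitness is total Manhattan wirelength, and the mutation and crossover operators are permutation-safe. The toy netlist and operator choices below are assumptions for illustration; Tier2's actual encoding and the Colt architecture details are not modeled.

```python
import random

N_CELLS, N_SLOTS = 8, 16
NETS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]   # toy netlist

def xy(slot):
    return divmod(slot, 4)            # 4x4 grid of slots

def wirelength(placement):
    """Fitness: total Manhattan distance over all nets (lower is better)."""
    total = 0
    for a, b in NETS:
        (r1, c1), (r2, c2) = xy(placement[a]), xy(placement[b])
        total += abs(r1 - r2) + abs(c1 - c2)
    return total

def mutate(p):
    """Swap two cells' slots: keeps the placement conflict-free."""
    p = p[:]
    i, j = random.sample(range(N_CELLS), 2)
    p[i], p[j] = p[j], p[i]
    return p

def crossover(a, b):
    """Take a prefix from one parent, fill the rest from the other without conflicts."""
    cut = random.randrange(1, N_CELLS)
    return a[:cut] + [s for s in b if s not in a[:cut]][:N_CELLS - cut]

pop = [random.sample(range(N_SLOTS), N_CELLS) for _ in range(30)]
for _ in range(100):
    pop.sort(key=wirelength)
    elite = pop[:10]                  # keep the best placements
    pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in range(20)]
print("best wirelength:", min(map(wirelength, pop)))
```
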
6

Optimizing Distributed Transactions: Speculative Client Execution, Certified Serializability, and High Performance Run-Time

Pandey, Utkarsh 01 September 2016
On-line services already form an important part of modern life, with immense potential for growth. Most of these services are supported by transactional systems, which in many cases are backed by database management systems (DBMS). Many on-line services use replication to ensure high availability, fault tolerance, and scalability. Replicated systems typically consist of different nodes running the service, coordinated by a distributed algorithm which aims to drive all the nodes along the same sequence of states by providing a total order over their operations. Thus, optimizing both local DBMS operations (through concurrency control) and the distributed algorithm driving replicated services can enhance the performance of on-line services. Deferred Update Replication (DUR) is a well-known approach to designing scalable replicated systems. In this method, the database is fully replicated on each distributed node. User threads perform transactions locally and optimistically before a total order is reached. DUR-based systems find their best usage when remote transactions rarely conflict. Even in such scenarios, transactions may abort due to local contention on nodes. A generally adopted method to alleviate local contention is to invoke a local certification phase to check whether a transaction conflicts with other local transactions already completed. If so, the given transaction is aborted locally without burdening the ordering layer. However, this approach still results in many local aborts, which significantly degrades performance. The first main contribution of this thesis is PXDUR, a DUR-based transactional system which enhances the performance of DUR-based systems by alleviating local contention and increasing the transaction commit rate. PXDUR alleviates local contention by allowing speculative forwarding of shared objects from locally committed transactions awaiting total order to running transactions. PXDUR allows transactions running in parallel to use speculative forwarding, thereby enabling the system to utilize highly parallel multi-core platforms. PXDUR also enhances performance by optimizing the transaction commit process: it allows committing transactions to skip read-set validation when it is safe to do so. PXDUR achieves performance gains of an order of magnitude over its closest competitors under favorable conditions. Transactions also form an important part of centralized DBMSs, which tend to support multi-threaded access to utilize highly parallel hardware platforms. Applications can be wrapped in transactions which then access the DBMS under the rules of concurrency control. This allows users to develop applications that run on DBMSs without worrying about synchronization. Serializability is the de facto standard form of isolation required by transactions for many applications. The existing methods employed by DBMSs to enforce serializability rely on explicit fine-grained locking. This eager-locking approach is pessimistic and can be too conservative for many applications, and it can severely limit the performance of DBMSs, especially in scenarios with moderate to high contention. This leads to the second major contribution of this thesis: TSAsR, an adaptive transaction processing framework which can be applied to DBMSs to improve performance. TSAsR allows the DBMS's internal synchronization to be more relaxed and enforces serializability through the processing of external metadata in an optimistic manner.
It does not require any changes in the application code and achieves orders-of-magnitude performance improvements for high- and moderate-contention cases. Replicated transaction processing systems require a distributed algorithm to keep the system consistent by ensuring that each node executes the same sequence of deterministic commands. These algorithms generally employ State Machine Replication (SMR). Enhancing the performance of such algorithms is a potential way to increase the performance of distributed systems. However, developing new SMR algorithms is limited in production settings because of the huge verification cost involved in proving their correctness. There are frameworks that allow easy specification of SMR algorithms and subsequent verification; however, algorithms implemented in such frameworks give poor performance. This leads to the third major contribution of this thesis: Verified JPaxos, a JPaxos-based runtime system which can be integrated with an easy-to-verify I/O automaton based on the Multipaxos protocol. Multipaxos is specified in Higher Order Logic (HOL) for ease of verification, and the specification is used to generate executable code representing the Multipaxos state changes (an I/O automaton). The runtime drives the HOL-generated code and interacts with the service and network to create a fully functional replicated Multipaxos system. The runtime inherits its design from JPaxos, along with some optimizations. It achieves significant improvement over a state-of-the-art SMR verification framework while remaining comparable in performance to non-verified systems. / Master of Science
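Of the three contributions, the DUR local certification step is the easiest to sketch in isolation: a transaction aborts locally when its read set intersects the write set of any transaction that committed after it began, sparing the ordering layer. This is a generic illustration with invented names, not PXDUR's implementation (which adds speculative forwarding and safe validation skipping on top).

```python
class LocalCertifier:
    """Minimal DUR-style local certification (illustrative, not PXDUR's code)."""
    def __init__(self):
        self.commit_log = []          # (commit_position, write_set) of local commits
        self.next_pos = 0

    def begin(self):
        """A transaction's snapshot is the log position when it starts."""
        return self.next_pos

    def certify_and_commit(self, snapshot, read_set, write_set):
        """Abort locally if any later commit wrote something this transaction read."""
        for pos, ws in self.commit_log:
            if pos >= snapshot and ws & read_set:
                return False          # local abort: never burdens the ordering layer
        self.commit_log.append((self.next_pos, frozenset(write_set)))
        self.next_pos += 1
        return True                   # locally committed, now awaiting total order

c = LocalCertifier()
t1, t2 = c.begin(), c.begin()         # two concurrent local transactions
print(c.certify_and_commit(t1, {"x"}, {"x"}))   # True: commits
print(c.certify_and_commit(t2, {"x"}, {"y"}))   # False: read of x is stale
```
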
7

An Investigation of Run-time Operations in a Heterogeneous Desktop Grid Environment: The Texas Tech University Desktop Grid Case Study

Perez, Jerry Felix 01 January 2013
The goal of this dissertation study was to evaluate the existing desktop grid (DG) scheduling algorithm. The evaluation built on simulation analyses of DGs previously performed by researchers in the field of DG scheduling optimization, with the aim of improving the current run-time (RT) framework of the DG at TTU. The author analyzed the RT of an actual DG, enabling other investigators to compare theoretical results with the results of this case study. Two statistical methods were used to formulate and validate predictive models: multiple linear regression and graphical exploratory data analysis. Using both methods, the author determined that the theoretical model could predict the significance of four independent variables (resource fragmentation, computational volatility, resource management, and grid job scheduling) on the dependent variables (quality of service and job performance) affecting RT. After an experimental case study analysis of the DG variables, the author identified the best DG resources for optimizing the run-time performance of the DG at TTU. The projected outcome of this investigation is improved job scheduling techniques for the DG at TTU.
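The multiple-linear-regression side of this methodology can be sketched as follows, with synthetic stand-ins for the four independent variables; the coefficients and data are invented for illustration and are not drawn from the TTU study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 120
# Hypothetical stand-ins for the study's four independent variables:
frag, volat, mgmt, sched = (rng.normal(size=n) for _ in range(4))
# Synthetic run-time response with noise (coefficients invented):
run_time = 5 + 2.0 * frag + 1.5 * volat + 0.2 * mgmt + 3.0 * sched + rng.normal(size=n)

X = sm.add_constant(np.column_stack([frag, volat, mgmt, sched]))
fit = sm.OLS(run_time, X).fit()
print(fit.pvalues)   # significance of each predictor, as in the dissertation's model
```
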
8

Monitoring and Checking of Discrete Event Simulations

Ulu, Buket 01 January 2003
Discrete event simulation is a widely used technique for decision support. The results of the simulation must be reliable for critical decision-making problems; therefore, much research has concentrated on the verification and validation of simulations. In this thesis, we apply a well-known dynamic verification technique, the assertion checking method, as a validation technique. Our aim is to validate particular runs of the simulation model, rather than the model itself. As a case study, the operations of a manufacturing cell have been simulated. The cell, the METUCIM Laboratory at the Mechanical Engineering Department of METU, has a robot and a conveyor to carry materials, two machines to manufacture items, and a quality control station to measure the correctness of the manufactured items. The simulation is monitored and checked using the Monitoring and Checking (MaC) tool, a prototype developed at the University of Pennsylvania. The separation of low-level implementation details (pertaining to the code) from the high-level requirement specifications (pertaining to the simuland) helps keep the monitoring and checking of simulations at an abstract level.
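In the spirit of MaC's separation of requirement specifications from the simulation code, a run-validation checker can be sketched as a monitor over the event trace alone. The event vocabulary and the two assertions below are invented for illustration; they do not reproduce the METUCIM specifications.

```python
def check_run(events, max_queue=5):
    """Validate one simulation run against requirement-level assertions,
    using only the event trace (not the simulation code itself)."""
    queue = 0
    violations = []
    for t, kind in events:
        queue += 1 if kind == "arrive" else -1
        # Assertions about the simuland, independent of implementation details:
        if queue < 0:
            violations.append((t, "departure from an empty conveyor"))
        if queue > max_queue:
            violations.append((t, "conveyor capacity exceeded"))
    return violations

trace = [(0, "arrive"), (1, "arrive"), (2, "depart"), (3, "depart"), (4, "depart")]
print(check_run(trace))   # [(4, 'departure from an empty conveyor')]
```
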
9

Compiling an Interpreted Processing Language: Improving Performance in a Large Telecommunication System

Mejstad, Valdemar, Tångby, Karl-Johan January 2001
In this report we evaluate different techniques for increasing the performance of an interpreted processing language in a telecommunication system called Billing Gateway R8. We implemented a prototype in which we first translate the language into C++ code and then compile it using a C++ compiler. With our prototype we observed a threefold increase in processing throughput, compared to the original system, when running on a symmetric multiprocessor with four CPUs under full load. The prototype also showed better scalability than Billing Gateway R8, due to less use of dynamic memory management.
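The translate-then-compile approach can be illustrated with a deliberately tiny example: emit C++ for a filter rule once, so the hot path runs native code instead of interpreter dispatch. The mini-language and function names here are assumptions; the actual Billing Gateway processing language is far richer.

```python
def emit_cpp(field, op, threshold):
    """Translate a toy filter rule (e.g. 'duration > 60') into C++ source."""
    return (
        f'#include <cstdint>\n'
        f'extern "C" bool match(int64_t {field}) {{\n'
        f'    return {field} {op} {threshold};  // native code, no interpreter dispatch\n'
        f'}}\n'
    )

# The emitted source would be compiled once with a C++ compiler and loaded,
# replacing per-record interpretation of the rule:
print(emit_cpp("duration", ">", 60))
```
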
