91. Continuous system-wide profiling of High Performance Computing parallel applications : Profiling high performance applications. Dugani, Vishwanath, January 2016.
Profiling identifies the parts of an application's code that are executed, using hardware performance counters, and thereby characterizes the application's performance. Profiling has long been standard in the development process, focused on a single execution of a single program. As computing systems have evolved, understanding the bigger picture across multiple machines has become increasingly important. As supercomputing grows in pervasiveness and scale, understanding the performance and utilization characteristics of parallel applications is critically important, because even minor performance improvements translate into large cost savings. The study first surveys various profiling tools. Perfminer was then integrated into SCANIA's Linux clusters to profile CFD and FEA applications, exploiting the batch queue system's features for continuous system-wide profiling, which provides performance insights for high performance applications with negligible overhead. Perfminer provides stable, accurate profiles, serves as a cluster-scale tool for performance analysis, and effectively highlights micro-architectural bottlenecks.
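Perfminer's internals are not described in the abstract; as a hedged illustration of the underlying mechanism only, the sketch below samples hardware performance counters around a batch job with Linux perf stat. The wrapper, its event list, and the CSV parsing are assumptions for the example, not Perfminer's actual implementation.

```python
import subprocess
import json
import sys

# Hypothetical batch-system hook: run the job's command under `perf stat`
# and collect a few hardware counters with negligible overhead.
# This sketches the mechanism only; Perfminer's real collection path differs.
EVENTS = "cycles,instructions,cache-misses,branch-misses"

def profile_job(cmd):
    """Run `cmd` under perf stat and return counter values keyed by event."""
    result = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", EVENTS, "--"] + cmd,
        capture_output=True, text=True,
    )
    counters = {}
    # With -x, perf stat writes CSV lines to stderr: value,unit,event,...
    for line in result.stderr.splitlines():
        fields = line.split(",")
        if len(fields) >= 3 and fields[0].strip().isdigit():
            counters[fields[2]] = int(fields[0])
    return counters

if __name__ == "__main__":
    counts = profile_job(sys.argv[1:])
    # Instructions per cycle is a first-order micro-architectural health metric.
    if "cycles" in counts and "instructions" in counts:
        counts["ipc"] = counts["instructions"] / counts["cycles"]
    print(json.dumps(counts, indent=2))
```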
92. Dependable Distributed Control System : Redundancy and Concurrency defects. Johansson, Bjarne, January 2022.
Intelligent devices, interconnectivity, and information exchange are characteristics often associated with Industry 4.0. A peer-to-peer-oriented architecture with the network as the system center succeeds the traditional controller-centric topology used in today's distributed control systems, improving information exchange in future designs. The network-centric architecture allows IT solutions such as cloud, fog, and edge computing to enter the automation industry; these solutions rely on virtualization techniques such as virtual machines and containers. Virtualization technology, combined with virtual instance management, provides the elasticity that cloud computing is known for. Container management systems like Kubernetes can scale the number of containers to match service demand and redeploy containers affected by failures. Distributed control systems form the core of the automation infrastructure in many critical applications and domains. This criticality places high dependability requirements on the systems; dependability is essential. High-quality software and redundancy solutions are traditional ways to increase dependability, and dependability is the common denominator of the challenges addressed in this thesis, which range from concurrency defect localization with static code analysis to the use of the failure recovery mechanisms provided by container management systems in a control system context. We evaluate the feasibility of locating concurrency defects in embedded industrial software with static code analysis. Furthermore, we propose a deployment-agnostic failure detection and role selection mechanism for controller redundancy in a network-centric context. Finally, we use the container management system Kubernetes to orchestrate a cluster of virtualized controllers and evaluate its failure recovery properties in combination with redundant virtualized controllers that use the proposed failure detection and role selection solution.
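The thesis's deployment-agnostic mechanism is not reproduced in the abstract; the sketch below is a generic illustration of heartbeat-based failure detection with deterministic role selection between redundant controllers, under assumed names, an assumed UDP multicast transport, and arbitrary timing constants. It shows the general idea only, not the proposed solution.

```python
import socket
import struct
import time

# Illustrative heartbeat-based role selection between redundant controllers.
# Each controller multicasts its (id, priority) periodically; the live
# controller with the highest priority takes the primary role. The group
# address, period, and timeout below are arbitrary example values.
GROUP, PORT = "239.1.2.3", 5007
HEARTBEAT_PERIOD = 0.1   # seconds between heartbeats
FAILURE_TIMEOUT = 0.35   # silence longer than this counts as a failure

class Controller:
    def __init__(self, node_id, priority):
        self.node_id, self.priority = node_id, priority
        self.peers = {}  # node_id -> (priority, last_seen)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.sock.bind(("", PORT))
        mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
        self.sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        self.sock.settimeout(HEARTBEAT_PERIOD)

    def step(self):
        """Send one heartbeat, collect peers' heartbeats, decide our role."""
        self.sock.sendto(struct.pack("!II", self.node_id, self.priority),
                         (GROUP, PORT))
        try:
            data, _ = self.sock.recvfrom(8)
            nid, prio = struct.unpack("!II", data)
            if nid != self.node_id:
                self.peers[nid] = (prio, time.monotonic())
        except socket.timeout:
            pass
        # Drop peers whose heartbeats have stopped (failure detection).
        now = time.monotonic()
        self.peers = {n: (p, t) for n, (p, t) in self.peers.items()
                      if now - t < FAILURE_TIMEOUT}
        # Role selection: highest live priority wins; ties broken by id.
        best = max([(self.priority, self.node_id)] +
                   [(p, n) for n, (p, _) in self.peers.items()])
        return "primary" if best == (self.priority, self.node_id) else "backup"
```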
93. Power-Aware Software Development For EMCA DSP. Zhang, Meishenglan, January 2017.
The advent of FinFET technology necessitates a shift towards early dynamic power awareness, not only for ASIC block designers but also for the software engineers who develop code for those blocks. CMOS dynamic power is typically reduced by optimizing the RTL models in terms of switching activity and clock gating efficiency; there is not much to be done after a model is committed. Programmable blocks, though, like the Phoenix 4 Digital Signal Processor (EMCA, Ericsson Multi Core Architecture), can have a "second chance" at low power even after silicon is produced, through efficient use of the software source code to influence the dynamic power metrics. This requires a "full stack" of power awareness all the way from the DSP hardware model up to the software development IDE. This thesis work aims at two goals. The first is to realize a prototype, encapsulated flow for DSP software developers that connects the software IDE entry point to the low-level, complex hardware power analysis tools. The second is to demonstrate how software can be used as an auxiliary knob to exploit potential tradeoffs and improve the DSP's dynamic power metrics. This hypothesis is tested by rescheduling operations on the DSP's resources, either manually or implicitly through the compiler. Moreover, a method to align and compare algorithms, when it is possible to trade performance for power, is devised, and the estimation results are compared against real silicon measurements. The results show that the developed analysis flow is reliable and very efficient for the given purpose, facilitating quick power exploration and profiling even for people with limited knowledge of low-level hardware. This is mainly realized by a unique feature that associates specific lines in the source code with the toggling behavior of the hardware model during execution. Based on that, the tradeoffs between power and performance for several test cases are demonstrated at both the assembly and C levels, with good correlation versus silicon. Overall, this work's outcome hints that the compiler and software teams have many options to consider for optimizing dynamic power in products already in the field.
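As background for these tradeoffs, the classical first-order model of CMOS dynamic power is P = alpha * C * V^2 * f, where the switching activity alpha is the only factor software can still influence after tape-out. The following toy estimator is built on that textbook formula alone; all constants are placeholders, and it does not represent Ericsson's power analysis flow.

```python
# Toy first-order CMOS dynamic power estimator: P = alpha * C * V^2 * f.
# Software cannot change C, V, or f after tape-out, but it can change the
# switching activity alpha (e.g., by rescheduling operations), which is the
# "second chance" for low power discussed above. All constants are
# illustrative placeholders, not EMCA figures.

C_EFF = 1.2e-9   # effective switched capacitance in farads (placeholder)
VDD = 0.8        # supply voltage in volts (placeholder)
FREQ = 1.0e9     # clock frequency in hertz (placeholder)

def dynamic_power(alpha):
    """Dynamic power in watts for a given average toggle activity alpha."""
    return alpha * C_EFF * VDD ** 2 * FREQ

# Two hypothetical schedules of the same kernel: the second interleaves
# operations so fewer bits toggle per cycle, trading a few extra cycles
# for lower switching activity.
schedules = {"baseline": (0.30, 1000), "rescheduled": (0.24, 1060)}
for name, (alpha, cycles) in schedules.items():
    energy = dynamic_power(alpha) * cycles / FREQ  # joules for the kernel
    print(f"{name}: {dynamic_power(alpha):.3f} W, {energy * 1e9:.2f} nJ")
```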
94. Component-based software design of embedded real-time systems. Wiklander, Jimmie, January 2009.
Embedded systems have become commonplace in today's society, and their complexity and number of functions are steadily increasing. This can be attributed to the unceasing advances in microprocessor technology and the continuous delivery of more powerful and power-efficient microprocessors, which, in turn, allow more elaborate software implementations. Consequently, there is a strong interest in finding methods and tools that support flexible and efficient development of embedded software. Since these qualities are typically attributed to component-based design, it makes sense to develop new design techniques for embedded systems based on components. This thesis aims to adapt the traditional component-based design approach to the development of embedded real-time software. Component-based design relies on the existence of consistent and coherent models of individual components that can be composed to model the whole system. However, the special characteristics of embedded systems make such modeling challenging. One reason is that embedded systems typically exhibit a strong integration between hardware and software, which leads to a need for a common design space, or at least the possibility to create consistent models of both hardware and software components of an embedded system. Another reason is that the majority of embedded systems can be viewed as real-time systems, and it is therefore necessary to express timing requirements alongside functional properties in the model. To overcome these difficulties, we adopt a reactive perspective, in which the functionality of both hardware and software is described in terms of time-constrained reactions of reactive objects. This enables capturing the complete functionality of the system (hardware and software) along with its timing requirements in a single model. The reactive view lies behind the modeling framework for embedded real-time systems and the component-based software design methodology presented in this thesis. The methodology allows both functional and timing properties of a system model to be preserved during the implementation process by means of a seamless transition between a model and an implementation, whereas the modeling framework enables the developer to offer platform-independent correctness for real-time systems, provided that the software can be scheduled on a given hardware platform. Further, this thesis includes a case study in which the methodology is used for designing a real-life system. The case study demonstrates the potential of the methodology to bring the benefits of classical component-based design to the realm of embedded systems. / ESIS
95. Programming embedded real-time systems : implementation techniques for concurrent reactive objects. Aittamaa, Simon, January 2011.
An embedded system is a computer system that is part of a larger device with hardware and mechanical parts. Such a system often has limited resources (such as processing power, memory, and energy) and typically has to meet hard real-time requirements. Today, as the area of application of embedded systems is constantly increasing, resulting in higher demands on system performance and a growing complexity of embedded software, there is a clear trend towards multi-core and multi-processor systems. Such systems are inherently concurrent, but programming concurrent systems using the traditional abstractions (i.e., explicit threads of execution) has been shown to be both difficult and error-prone. The natural solution is to raise the abstraction level and make concurrency implicit, in order to aid the programmer in the task of writing correct code. However, raising the abstraction level always carries an inherent cost. In this thesis we consider one possible concurrency model, the concurrent reactive object approach, which offers implicit concurrency at the object level. This model has been implemented in the programming language Timber, which primarily targets the development of real-time systems. It is also implemented in TinyTimber, a subset of the C language closely matching Timber's execution model. We quantify various costs of a TinyTimber implementation of the model (such as context switching and message passing overheads) on a number of hardware platforms and compare them to the costs of the more common thread-based approach. We then demonstrate how some of these costs can be mitigated using the Stack Resource Policy. On a separate track, we present a feasibility test for garbage collection in a reactive real-time system with automatic memory management, which is a necessary component for verifying the correctness of a real-time system implemented in Timber.
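The concurrent reactive object model can be sketched outside Timber as well. The following minimal Python illustration captures the core idea, implicit concurrency at the object level with at most one reaction executing per object at a time, using a message queue and a reaction loop per object. It is a didactic sketch of the model, not TinyTimber's C implementation, and all names are invented for the example.

```python
import queue
import threading

class ReactiveObject:
    """Didactic sketch of a concurrent reactive object: methods never run
    directly; they are posted as asynchronous messages and executed one at
    a time by the object's own reaction loop, so object state needs no locks.
    """
    def __init__(self):
        self._inbox = queue.Queue()
        threading.Thread(target=self._react_loop, daemon=True).start()

    def _react_loop(self):
        while True:
            method, args = self._inbox.get()
            method(*args)  # at most one reaction per object at any time

    def async_(self, method, *args):
        """Post a message; the caller continues immediately."""
        self._inbox.put((method, args))

class Counter(ReactiveObject):
    def __init__(self):
        super().__init__()
        self.value = 0  # safely confined to this object's reaction loop

    def _increment(self, amount):
        self.value += amount

    def increment(self, amount=1):
        self.async_(self._increment, amount)

if __name__ == "__main__":
    import time
    c = Counter()
    for _ in range(1000):
        c.increment()   # concurrent senders would also be safe
    time.sleep(0.1)     # let the reaction loop drain its inbox
    print(c.value)      # 1000
```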
96. Robust industrial automation software: outsets for non-determinism and real-time execution. Lindner, Marcus, January 2016.
Studies of the industrial standard IEC 61499 and its relation to the RTFM Model of Computation form the basis of this thesis. An overview of industrial automation software in general, and within the scope of Svenska Kraftnät, introduces the subject of software-related issues. The thesis focuses on selected properties that are important for software development to improve the robustness of industrial automation software. Among these, timing is essential due to its importance in real-time applications. An example case from the nuclear power plant Forsmark in Sweden illustrates problems correlated with timing issues and makes the lack of overall system modelling (including timing) evident. A review of the relevant industrial standards for software development in industrial applications provides a background for various aspects of software compliance with safety requirements. Special attention is given to the standards IEC 61131 and IEC 61499 for industrial software development and their programming and execution models. The presented RTFM framework defines a concurrent model of execution based on tasks and resources, together with a timing semantics, designed from the outset for the development of embedded real-time systems. It can serve as the scheduling and resource management layer for the run-time environments of industrial applications while addressing the aforementioned issues. Mappings from the functional layer (IEC 61499 function block networks) and the safety layer (PLCopen safety function blocks) to RTFM show the applicability and the possibility of using IEC 61499 as an overall, distributed, and hierarchical model. A discussion of options for future work presents choices for the second half of the PhD studies. Formal methods for program specification and verification open up an interesting path to further increase the robustness of industrial automation software. / The function of frequency converters in preparedness-critical systems
97. DISTRIBUTED ARTIFICIAL INTELLIGENCE FOR ANOMALY DETECTION IN A MODULAR MANUFACTURING ENVIRONMENT. Hodzic, Hana, January 2023.
This thesis investigates anomaly detection and classification in a simulated modular manufacturing environment using the machine learning algorithm Random Forest. The algorithm is tested on a local computer and on an embedded device, specifically the Raspberry Pi. The performance of the Random Forest models is evaluated for anomaly detection and classification tasks, considering different evaluation metrics and execution time. The results indicate variations in model performance across different modules and classification tasks. It is observed that the limited computing resources of the Raspberry Pi lead to significantly higher prediction times for anomaly detection tasks compared to a computer, highlighting the impact of embedded systems' constraints on ML model execution.
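As a sketch of the kind of experiment described, the following trains a scikit-learn Random Forest on synthetic sensor data and measures per-sample prediction time, the metric that separates the Raspberry Pi from the desktop results. The data, features, and anomaly labels are fabricated placeholders, not the thesis's simulated manufacturing modules.

```python
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Synthetic stand-in for one module's sensor readings: 4 features,
# label 0 = normal, 1 = anomaly. Placeholder data, not the thesis dataset.
rng = np.random.default_rng(seed=0)
normal = rng.normal(0.0, 1.0, size=(900, 4))
anomaly = rng.normal(3.0, 1.5, size=(100, 4))
X = np.vstack([normal, anomaly])
y = np.array([0] * 900 + [1] * 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)

# Per-sample prediction latency: the figure that differs between a desktop
# CPU and a resource-constrained device such as the Raspberry Pi.
start = time.perf_counter()
pred = clf.predict(X_te)
elapsed = time.perf_counter() - start
print(classification_report(y_te, pred))
print(f"mean prediction time: {elapsed / len(X_te) * 1e6:.1f} us/sample")
```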
98. LP_MQTT - A Low-Power IoT Messaging Protocol Based on MQTT Standard. Antony, Anchu; Myladi Kelambath, Deepthi, January 2024.
In the Internet of Things (IoT) era, the MQTT protocol plays a big part in enabling uninterrupted communication between connected devices. With its publish/subscribe messaging system and central broker framework, MQTT's lightweight design has made it vital for IoT connectivity. Nonetheless, challenges remain, especially regarding energy consumption, because the majority of IoT devices operate under constrained power sources. In line with this, our research proposes an intelligent algorithm with which the MQTT broker can make informed decisions. The algorithm computes ideal wake-up times for subscriber clients from historical data, using machine learning (ML) regression techniques in the background to produce substantial energy savings. The study combines regression machine learning approaches with the incorporation of quality-of-service levels into the decision framework through the introduction of operational modes designed for effective client management. The research therefore aims to enhance the efficiency of MQTT, making it applicable across diverse IoT applications by addressing both the broker and the client sides simultaneously. This versatile approach improves the performance and sustainability of MQTT, further strengthening it as a building block for energy-efficient and responsive communication in the IoT. Deep learning approaches building on the regression results will be the next leap, optimizing energy consumption and resource allocation within IoT networks and unlocking new frontiers of efficiency for a sustainable connected future.
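The LP_MQTT algorithm itself is not specified in the abstract; purely as an illustrative sketch, the following shows one way a broker-side component could regress a subscriber's next wake-up time from its past message inter-arrival intervals. The sliding-window features, the linear model, and all names and constants are assumptions for the example, not the protocol's actual design.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative broker-side predictor: from a subscriber's recent message
# inter-arrival intervals, regress the next interval and schedule the
# client's wake-up just before the predicted arrival. Window size and the
# guard margin are arbitrary example values, not LP_MQTT parameters.
WINDOW = 5          # number of past intervals used as features
GUARD = 0.9         # wake slightly early to avoid missing a message

def predict_next_interval(intervals):
    """Fit on sliding windows of past intervals; predict the next one."""
    intervals = np.asarray(intervals, dtype=float)
    X = np.array([intervals[i:i + WINDOW]
                  for i in range(len(intervals) - WINDOW)])
    y = intervals[WINDOW:]
    model = LinearRegression().fit(X, y)
    return float(model.predict(intervals[-WINDOW:].reshape(1, -1))[0])

# Example: a sensor that publishes roughly every 60 s, drifting slowly.
history = [60.2, 59.8, 60.5, 61.0, 60.7, 61.3, 61.1, 61.8, 62.0, 62.4]
next_gap = predict_next_interval(history)
print(f"predicted next interval: {next_gap:.1f} s")
print(f"suggested sleep before waking: {GUARD * next_gap:.1f} s")
```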
99. Communication Protocols on the PIC24EP and Arduino - A Tutorial for Undergraduate Students. Chintapalli, Srikar, January 2017.
No description available.
100. A SystemC model for the eBrain. Stathis, Dimitrios, January 2017.
The development of neural networks has become one of the most interesting topics in the scientific community. Systems based on brain behavior can find applications in a wide variety of fields, from simulating the brain to better understand it (applications in neuroscience) to control theory and supercomputing. Brain-like systems could possibly be a new kind of computer architecture that leads us away from the classic von Neumann architecture and helps us bypass the problems we now face, with Moore's law slowing down and complex problems becoming all the more common. With brain-like computing, we might be on the road to computer systems that are no longer programmed but taught. To date, the most common platforms for simulating such systems are GPGPUs and supercomputers, but they lack scalability, and real-time simulations are far from trivial. Because of that, there is interest in custom hardware implementations of such systems (in ASICs or FPGAs). In this work, we focus on the ASIC design of such a system, specifically the characterization and design space exploration of the eBrain architecture, a hardware architecture for the BCPNN model. During the design process of an ASIC, characterization requires simulating the synthesized physical design of the RTL model, and such simulations take an extensive amount of time. In this thesis, to tackle this problem, a SystemC model of the architecture is developed. The model can be modified to fit different configurations of a general hardware architecture. The SystemC model can be used to reduce simulation time and, by using back-annotated data from synthesized parts of the hardware architecture, to provide an accurate characterization of the design. In this work, we go through the basics of the BCPNN model and the eBrain architecture. We then develop a model that can emulate the behavior of the eBrain architecture in a probabilistic manner, and a specific configuration is chosen for exploration. Furthermore, floating-point units are synthesized at the physical level so that their power measurements can be back-annotated into the model. Moreover, the BCPNN equations are explored and implemented at the RTL level using the floating-point units. Finally, an example configuration is simulated and its results are presented.
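The BCPNN equations are only referenced above; the sketch below implements the commonly published form of the BCPNN learning rule, in which exponentially filtered traces estimate unit and pairwise activation probabilities and the weights and biases are their log-ratios. The time constant, epsilon floor, and network size are placeholders, and this is the textbook formulation, not necessarily the exact equation set realized in eBrain.

```python
import numpy as np

# Hedged sketch of the standard BCPNN learning rule: low-pass-filtered
# traces p_i, p_j, p_ij estimate unit and pairwise activation
# probabilities, and the weights/biases are their log-ratios:
#     w_ij = log(p_ij / (p_i * p_j)),   b_j = log(p_j).
# Time constant and epsilon are placeholder values, not eBrain parameters.
N = 8               # number of units (placeholder size)
TAU_P = 100.0       # probability-trace time constant, in update steps
EPS = 1e-4          # floor keeping the logarithms finite

p_i = np.full(N, EPS)          # presynaptic probability traces
p_j = np.full(N, EPS)          # postsynaptic probability traces
p_ij = np.full((N, N), EPS)    # co-activation probability traces

def bcpnn_update(x_pre, x_post):
    """One incremental trace update followed by weight/bias read-out."""
    k = 1.0 / TAU_P
    p_i[:] += k * (x_pre - p_i)
    p_j[:] += k * (x_post - p_j)
    p_ij[:] += k * (np.outer(x_pre, x_post) - p_ij)
    w = np.log(np.maximum(p_ij, EPS) /
               np.maximum(np.outer(p_i, p_j), EPS))
    b = np.log(np.maximum(p_j, EPS))
    return w, b

# Drive the rule with random binary activity as a smoke test.
rng = np.random.default_rng(seed=1)
for _ in range(1000):
    x = (rng.random(N) < 0.2).astype(float)
    w, b = bcpnn_update(x, x)
print("weight matrix diagonal (co-active units strengthen):", np.diag(w))
```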