111

Quality-of-Service Aware Design and Management of Embedded Mixed-Criticality Systems

Ranjbar, Behnaz 06 December 2022 (has links)
Nowadays, implementing a complex system that executes various applications with different levels of assurance is a growing trend in modern embedded real-time systems to meet cost, timing, and power consumption requirements. The medical device, automotive, and avionics industries are the most common safety-critical domains exploiting these systems, known as Mixed-Criticality (MC) systems. MC applications are real-time, and to ensure their correctness, it is essential to meet strict timing requirements as well as functional specifications. The correct design of such MC systems requires a thorough understanding of the system's functions and their importance to the system. A failure or deadline miss has a different impact depending on the criticality level of the function, ranging from no effect to catastrophic consequences. Failure in the execution of tasks with higher criticality levels (HC tasks) may lead to system failure and cause irreparable damage, whereas Low-Criticality (LC) tasks assist the system in carrying out its mission successfully, but their failure has less impact on the system's functionality and does not cause the system itself to fail. In order to guarantee MC system safety, tasks are analyzed under different assumptions to obtain different Worst-Case Execution Times (WCETs) corresponding to the multiple criticality levels and the operation mode of the system. If the execution time of at least one HC task exceeds its low WCET, the system switches from low-criticality mode (LO mode) to high-criticality mode (HI mode). Then, all HC tasks continue executing under the high WCET to guarantee the system's safety. In this HI mode, all or some LC tasks are dropped/degraded in favor of HC tasks to ensure the correct execution of HC tasks. Determining an appropriate low WCET for each HC task is crucial in designing efficient MC systems and maximizing Quality of Service (QoS). However, even when the low WCETs are set correctly, dropping/degrading LC tasks in the HI mode is not recommended, due to its negative impact on other functions or on the entire system's ability to accomplish its mission correctly. Therefore, how to analyze task dropping in the HI mode is a significant challenge in designing efficient MC systems: it must be considered to guarantee the successful execution of all HC tasks and prevent catastrophic damage while improving the QoS. Due to the continuous rise in the computational demand of MC tasks in safety-critical applications, such as autonomous driving control, designers are motivated to deploy MC applications on multi-core platforms. Although the parallel execution offered by multi-core platforms helps to improve QoS and meet real-time requirements, the high power consumption and temperature of the cores may make the system more susceptible to failures and instability, which is not desirable in MC applications. Therefore, improving the QoS while managing the power consumption and guaranteeing real-time constraints is the critical issue in designing such MC systems on multi-core platforms. This thesis addresses the challenges associated with efficient MC system design. We first focus on application analysis, proposing a novel approach to determine appropriate low WCETs that provides a reasonable trade-off between the number of LC tasks scheduled at design-time and the probability of mode switching at run-time, thereby improving system utilization and QoS.
The approach presents an analytic-based scheme to obtain low WCETs based on the Chebyshev theorem at design-time. We also show the relationship between the low WCETs and mode switching probability, and formulate and solve the problem for improving resource utilization and reducing the mode switching probability. Further, we analyze the LC task dropping in the HI mode to improve QoS. We first propose a heuristic in which a new metric is defined that determines the number of allowable drops in the HI mode. Then, the task schedulability analysis is developed based on the new metric. Since the occurrence of the worst-case scenario at run-time is a rare event, a learning-based drop-aware task scheduling mechanism is then proposed, which carefully monitors the alterations in the behavior of MC systems at run-time to exploit the dynamic slacks for improving the QoS. Another critical design challenge is how to improve QoS using the parallel feature of multi-core platforms while managing the power consumption and temperature of these platforms. We develop a tree of possible task mapping and scheduling at design-time to cover all possible scenarios of task overrunning and reduce the LC task drop rate in the HI mode while managing the power and temperature in each scenario of task scheduling. Since the dynamic slack is generated due to the early execution of tasks at run-time, we propose an online approach to reduce the power consumption and maximum temperature by using low-power techniques like DVFS and task re-mapping, while preserving the QoS. Specifically, our approach examines multiple tasks ahead to determine the most appropriate task for the slack assignment that has the most significant effect on power consumption and temperature. However, changing the frequency and selecting a proper task for slack assignment and a suitable core for task re-mapping at run-time can be time-consuming and may cause deadline violation. Therefore, we analyze and optimize the run-time scheduler.:1. Introduction 1.1. Mixed-Criticality Application Design 1.2. Mixed-Criticality Hardware Design 1.3. Certain Challenges and Questions 1.4. Thesis Key Contributions 1.4.1. Application Analysis and Modeling 1.4.2. Multi-Core Mixed-Criticality System Design 1.5. Thesis Overview 2. Preliminaries and Literature Reviews 2.1. Preliminaries 2.1.1. Mixed-Criticality Systems 2.1.2. Fault-Tolerance, Fault Model and Safety Requirements 2.1.3. Hardware Architectural Modeling 2.1.4. Low-Power Techniques and Power Consumption Model 2.2. Related Works 2.2.1. Mixed-Criticality Task Scheduling Mechanisms 2.2.2. QoS Improvement Methods in Mixed-Criticality Systems 2.2.3. QoS-Aware Power and Thermal Management in Multi-Core Mixed-Criticality Systems 2.3. Conclusion 3. Bounding Time in Mixed-Criticality Systems 3.1. BOT-MICS: A Design-Time WCET Adjustment Approach 3.1.1. Motivational Example 3.1.2. BOT-MICS in Detail 3.1.3. Evaluation 3.2. A Run-Time WCET Adjustment Approach 3.2.1. Motivational Example 3.2.2. ADAPTIVE in Detail 3.2.3. Evaluation 3.3. Conclusion 4. Safety- and Task-Drop-Aware Mixed-Criticality Task Scheduling 4.1. Problem Objectives and Motivational Example 4.2. FANTOM in detail 4.2.1. Safety Quantification 4.2.2. MC Tasks Utilization Bounds Definition 4.2.3. Scheduling Analysis 4.2.4. System Upper Bound Utilization 4.2.5. A General Design Time Scheduling Algorithm 4.3. Evaluation 4.3.1. Evaluation with Real-Life Benchmarks 4.3.2. Evaluation with Synthetic Task Sets 4.4. Conclusion 5. 
Learning-Based Drop-Aware Mixed-Criticality Task Scheduling 5.1. Motivational Example and Problem Statement 5.2. Proposed Method in Detail 5.2.1. An Overview of the Design-Time Approach 5.2.2. Run-Time Approach: Employment of SOLID 5.2.3. LIQUID Approach 5.3. Evaluation 5.3.1. Evaluation with Real-Life Benchmarks 5.3.2. Evaluation with Synthetic Task Sets 5.3.3. Investigating the Timing and Memory Overheads of ML Technique 5.4. Conclusion 6. Fault-Tolerance and Power-Aware Multi-Core Mixed-Criticality System Design 6.1. Problem Objectives and Motivational Example 6.2. Design Methodology 6.3. Tree Generation and Fault-Tolerant Scheduling and Mapping 6.3.1. Making Scheduling Tree 6.3.2. Mapping and Scheduling 6.3.3. Time Complexity Analysis 6.3.4. Memory Space Analysis 6.4. Evaluation 6.4.1. Experimental Setup 6.4.2. Analyzing the Tree Construction Time 6.4.3. Analyzing the Run-Time Timing Overhead 6.4.4. Peak Power Management and Thermal Distribution for Real-Life and Synthetic Applications 6.4.5. Analyzing the QoS of LC Tasks 6.4.6. Analyzing the Peak Power Consumption and Maximum Temperature 6.4.7. Effect of Varying Different Parameters on Acceptance Ratio 6.4.8. Investigating Different Approaches at Run-Time 6.5. Conclusion 7. QoS- and Power-Aware Run-Time Scheduler for Multi-Core Mixed-Criticality Systems 7.1. Research Questions, Objectives and Motivational Example 7.2. Design-Time Approach 7.3. Run-Time Mixed-Criticality Scheduler 7.3.1. Selecting the Appropriate Task to Assign Slack 7.3.2. Re-Mapping Technique 7.3.3. Run-Time Management Algorithm 7.3.4. DVFS governor in Clustered Multi-Core Platforms 7.4. Run-Time Scheduler Algorithm Optimization 7.5. Evaluation 7.5.1. Experimental Setup 7.5.2. Analyzing the Relevance Between a Core Temperature and Energy Consumption 7.5.3. The Effect of Varying Parameters of Cost Functions 7.5.4. The Optimum Number of Tasks to Look-Ahead and the Effect of Task Re-mapping 7.5.5. The Analysis of Scheduler Timings Overhead on Different Real Platforms 7.5.6. The Latency of Changing Frequency in Real Platform 7.5.7. The Effect of Latency on System Schedulability 7.5.8. The Analysis of the Proposed Method on Peak Power, Energy and Maximum Temperature Improvement 7.5.9. The Analysis of the Proposed Method on Peak power, Energy and Maximum Temperature Improvement in a Multi-Core Platform Based on the ODROID-XU3 Architecture 7.5.10. Evaluation of Running Real MC Task Graph Model (Unmanned Air Vehicle) on Real Platform 7.6. Conclusion 8. Conclusion and Future Work 8.1. Conclusions 8.2. Future Work
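The Chebyshev-based scheme mentioned in this abstract can be illustrated with a small sketch. Assuming only the mean and standard deviation of a HC task's profiled execution times are known, Cantelli's one-sided form of the Chebyshev inequality gives a candidate low WCET for a chosen per-job overrun (mode-switch) probability. This is only an illustration of the idea, with hypothetical numbers, not the thesis's BOT-MICS formulation.

```python
import math

def low_wcet_bound(mean: float, std: float, p_switch: float) -> float:
    """Illustrative low-WCET candidate from Cantelli's (one-sided Chebyshev) inequality.

    For any distribution with the given mean and std,
    P(C > mean + k*std) <= 1 / (1 + k^2).  Choosing k = sqrt(1/p_switch - 1)
    bounds the overrun (mode-switch) probability of a single job by p_switch.
    """
    k = math.sqrt(1.0 / p_switch - 1.0)
    return mean + k * std

# Hypothetical profiled execution times (ms) of one HC task.
mean_ms, std_ms = 4.0, 0.5
for p in (0.1, 0.01, 0.001):
    print(f"target overrun prob {p:>6}: low WCET ~ {low_wcet_bound(mean_ms, std_ms, p):.2f} ms")
```

Smaller target probabilities yield larger low WCETs, which reduces mode switching but leaves less utilization for scheduling LC tasks at design-time; that is the trade-off the abstract refers to.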
112

Automatic methods for distribution of data-parallel programs on multi-device heterogeneous platforms

Moreń, Konrad 07 February 2024 (has links)
This thesis deals with the problem of finding effective methods for programming and distributing data-parallel applications on heterogeneous multiprocessor systems. These systems are ubiquitous today, ranging from embedded devices with low power consumption to high-performance distributed systems, and demand for them is growing steadily due to the increasing number of data-intensive applications and the general growth of digital applications. Systems with multiple devices offer higher performance but unfortunately add complexity to software development. Programming heterogeneous multiprocessor systems presents several unique challenges compared to single-device systems. The first challenge is the programmability of such systems. Despite constant innovations in programming languages and frameworks, they are still limited: they are either platform specific, like CUDA, which supports only NVIDIA GPUs, or applied at a low level of abstraction, such as OpenCL. Application developers who design OpenCL programs must manually distribute data to the different devices and synchronize the distributed computations. These manual steps reduce developer productivity. To reduce the programming complexity and the development time, this thesis introduces two approaches that automatically distribute and synchronize data-parallel workloads. Another challenge is multi-device hardware utilization. In contrast to single-device platforms, the application optimization process for a multi-device system is even more complicated: application designers must not only apply optimization strategies specific to a single-device architecture, but also focus on careful workload balancing across all the platform processors. For the balancing problem, this thesis proposes a method based on a platform model created with machine learning techniques. Using machine learning, the thesis automatically builds a reliable platform model that is portable and adaptable to different platform setups, with minimal manual involvement from programmers.
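A common way to realize the workload balancing described here is to split a data-parallel index range among devices in proportion to the throughput predicted by a platform model. The sketch below assumes hypothetical per-device throughput values (for example, produced by a learned platform model) and only computes chunk sizes; it illustrates the balancing idea, not the thesis's actual framework.

```python
def partition_workload(total_items: int, predicted_throughput: dict[str, float]) -> dict[str, int]:
    """Split a data-parallel range among devices proportionally to predicted throughput."""
    total = sum(predicted_throughput.values())
    shares = {dev: int(total_items * t / total) for dev, t in predicted_throughput.items()}
    # Hand any rounding remainder to the fastest device.
    fastest = max(predicted_throughput, key=predicted_throughput.get)
    shares[fastest] += total_items - sum(shares.values())
    return shares

# Hypothetical throughputs (work items per second) coming from a learned platform model.
print(partition_workload(1_000_000, {"cpu": 120.0, "igpu": 300.0, "dgpu": 900.0}))
```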
113

Parallel Viterbi Search For Continuous Speech Recognition On A Multi-Core Architecture

Parihar, Naveen 11 December 2009 (has links)
State-of-the-art speech-recognition systems can successfully perform simple tasks in real-time on most computers, when the tasks are performed in controlled and noise-free environments. However, current algorithms and processors are not yet powerful enough for real-time large-vocabulary conversational speech recognition in noisy, real-world environments. Parallel processing can improve the real-time performance of speech recognition systems and increase their applicability, and developing an effective approach to parallelization is especially important given the recent trend toward multi-core processor design. In this dissertation, we introduce methods for parallelizing a single-pass across-word n-gram lexical-tree based Viterbi recognizer, which is the most popular architecture for Viterbi-based large-vocabulary continuous speech recognition. We parallelize two different open-source implementations of such a recognizer, one developed at Mississippi State University and the other developed at Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen in Germany. We describe three methods for parallelization. The first, called parallel fast likelihood computation, parallelizes likelihood computations by decomposing mixtures among CPU cores, so that each core computes the likelihood of the set of mixtures allocated to it. A second method, lexical-tree division, parallelizes the search-management component of a speech recognizer by dividing the lexical tree among the cores. A third and alternative method for parallelizing the search-management component, called lexical-tree copies decomposition, dynamically distributes the active lexical-tree copies among the cores. All parallelization methods were tested on two and four cores of an Intel Core2 Quad processor and significantly improved real-time performance. Several challenges for parallelizing a lexical-tree based Viterbi speech recognizer are also identified and discussed.
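The "parallel fast likelihood computation" idea can be sketched as follows: the Gaussian mixtures are partitioned into blocks, and each worker evaluates the log-likelihoods of its block for the current feature vector. This is a schematic illustration (diagonal-covariance Gaussians, a generic process pool rather than threads pinned to cores), not the recognizers' actual implementation; the dimensions in the demo are hypothetical.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def mixture_loglikes(args):
    """Log-likelihoods of one feature vector under a block of diagonal-covariance Gaussians."""
    x, means, variances, log_weights = args
    diff = x - means                                                  # (n_mix_block, dim)
    ll = -0.5 * np.sum(diff * diff / variances + np.log(2 * np.pi * variances), axis=1)
    return ll + log_weights                                           # (n_mix_block,)

def parallel_likelihoods(x, means, variances, log_weights, n_workers=4):
    """Split the mixture set into n_workers blocks and score them in parallel."""
    blocks = [(x, m, v, w) for m, v, w in zip(np.array_split(means, n_workers),
                                              np.array_split(variances, n_workers),
                                              np.array_split(log_weights, n_workers))]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return np.concatenate(list(pool.map(mixture_loglikes, blocks)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, n_mix = 39, 4096                     # e.g., 39-dim features, 4096 Gaussians (hypothetical)
    x = rng.standard_normal(dim)
    means = rng.standard_normal((n_mix, dim))
    variances = np.full((n_mix, dim), 1.0)
    log_weights = np.full(n_mix, -np.log(n_mix))
    print(parallel_likelihoods(x, means, variances, log_weights)[:4])
```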
114

Power-Efficient Nanophotonic Architectures for Intra- and Inter-Chip Communication

Kennedy, Matthew D. 15 July 2016 (has links)
No description available.
115

ANALYSIS AND MITIGATION OF THE NONLINEAR IMPAIRMENTS IN FIBER-OPTIC COMMUNICATION SYSTEMS

NADERI, SHAHI SINA 10 1900 (has links)
Fiber-optic communication systems have revolutionized the telecommunications industry and have played a major role in the advent of the Information Age. Thousands of kilometers of optical fiber are used by telecommunications companies to transmit telephone signals, Internet communication, and cable television signals throughout the world, so working in this area has always been interesting. This thesis analyzes the nonlinearity of fiber-optic systems and proposes a system to mitigate fiber nonlinear effects. The topics of this thesis can be categorized into two parts. In the first part of the thesis (Chapters 2, 3, and 4), analytical models are developed for fiber-optic nonlinear effects. It is important to have an accurate analytical model so that the impact of a specific system/signal parameter on the performance can be assessed quickly without doing time-consuming Monte-Carlo simulations. In the second part (Chapters 5 and 6), a multi-core/fiber architecture is proposed to reduce the nonlinear effects.
In Chapter 2, intrachannel nonlinear impairments are studied and an analytical model for the calculation of the power spectral density (PSD) and variance of the nonlinear distortion is obtained for a quadrature phase-shift keying (QPSK) signal. For QPSK signals, intrachannel four-wave mixing (IFWM) is the only stochastic nonlinear distortion. To develop the analytical model, first-order perturbation theory is used. For a Gaussian pulse shape, a closed-form formula is obtained for the PSD of IFWM. For non-Gaussian pulses, it is not possible to find the PSD analytically; however, using a stationary phase approximation, convolutions become multiplications and a simple analytical expression for the PSD of the nonlinear distortion can be found. The total PSD is obtained by adding the amplified spontaneous emission (ASE) PSD to that of the nonlinear distortion. Using the total PSD, the bit error ratio (BER) can be obtained analytically for a QPSK system. The analytically estimated BER is found to be in good agreement with numerical simulations. Significant computational effort can be saved using the analytical model as compared to numerical simulations, without sacrificing much accuracy.
In Chapter 3, the same approach as in Chapter 2 is used to find an analytical expression for the PSD of the intrachannel nonlinear distortion of a fiber-optic system based on quadrature amplitude modulation (QAM) signals. Unlike the QPSK signal, intrachannel cross-phase modulation (IXPM) is a stochastic process for the QAM signal, which leads to an increase of the nonlinear distortion variance. In this chapter, analytical expressions for the PSDs of self-phase modulation (SPM), IXPM, IFWM, and their correlations are obtained for the QAM signal. Simulation results show good agreement between the analytical model and numerical simulation.
In Chapter 4, inter-channel nonlinear impairment is studied. This time, a first-order perturbation technique is used to develop an analytical model for SPM and cross-phase modulation (XPM) distortions in a wavelength division multiplexing (WDM) system based on QAM. In this case, SPM distortion is deterministic and does not contribute to the nonlinear noise variance. On the other hand, XPM is stochastic and contributes to the noise variance. In this chapter, the effects of input launch power, fiber dispersion, system reach, and channel spacing on the nonlinear noise variance are investigated as well.
In Chapter 5, a single-channel multi-core/fiber architecture is proposed to reduce intrachannel fiber nonlinear effects. Based on the analytical model obtained in the first part of the thesis, the nonlinear distortion variance scales as P^3, where P is the fiber input launch power, which suggests that decreasing the fiber input power can reduce the nonlinear distortion significantly. In this system, the input power is divided between multiple cores/fibers by a power splitter at the input of each span, and a power combiner adds the output fields of the multiple cores/fibers so that one amplifier can be used for each span. In this case, each core/fiber receives less power and hence adds less nonlinear distortion to the signal. In a practical system, individual fiber parameters are not identical, so the optical pulses propagating in the fibers undergo different amounts of phase shift and timing delay due to fluctuations of the fibers' propagation constants and inverse group speeds. Optical and electrical equalizers are proposed to compensate for these inter-core/fiber dispersions. In the case of an optical equalizer, adaptive time shifters and phase shifters are adjusted such that the maximum power is obtained at the output of the power combiner. Our numerical simulation results show that for unrepeatered systems, the performance (Q factor) is improved by 6.2 dB using an 8-core/fiber configuration as compared to a single-core fiber system. In addition, for a multi-span system, the transmission reach at a BER of 2.1×10^-3 is quadrupled in the 8-core/fiber configuration.
In Chapter 6, a multi-channel multi-core/fiber architecture is proposed to reduce the inter-channel nonlinear distortions. In this architecture, different channels of a WDM system are interleaved between multiple cores/fibers, which increases the channel spacing in each core/fiber. Higher channel spacing decreases the inter-channel nonlinear impairments in each core/fiber, which leads to system performance improvement. At the end of each span, a multiplexer adds the channels from different cores/fibers so that one amplifier can be used for all of the channels. Unlike the single-channel multi-core/fiber system, the WDM multi-core/fiber system does not require equalizers, since different cores/fibers carry channels with different frequencies. Simulation results show that for a 39-span system, the 4-core/fiber system with negligible crosstalk outperforms the single-core system by 2.2 dBQ20. The impact of crosstalk between cores of a multi-core fiber (MCF) on the system performance is studied. The simulation results show that the performance of the multi-core WDM system is less sensitive to the crosstalk effect compared to conventional multi-core systems, since the propagating channels in the cores are not correlated in the frequency domain. / Doctor of Philosophy (PhD)
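The power-scaling argument behind the multi-core/fiber architecture can be summarized as follows. This is a simplified illustration of the scaling only; it ignores splitter/combiner losses and how the per-core distortion fields recombine at the output.

```latex
\[
\sigma^{2}_{\mathrm{NL}}(P)\;\propto\;P^{3}
\qquad\Longrightarrow\qquad
\sigma^{2}_{\mathrm{NL}}\!\left(\tfrac{P}{N}\right)\;\propto\;\frac{P^{3}}{N^{3}}
\]
```

Splitting the launch power P equally over N cores/fibers thus reduces the nonlinear distortion generated within each core by roughly a factor of N^3 relative to launching the full power into one core; the net system benefit (for example, the reported 6.2 dB Q-factor gain for N = 8) is smaller, since the per-core contributions recombine at the combiner and other impairments such as ASE and inter-core mismatch remain.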
116

Quality-of-Service Aware Design and Management of Embedded Mixed-Criticality Systems

Ranjbar, Behnaz 12 April 2024 (has links)
Nowadays, implementing a complex system that executes various applications with different levels of assurance is a growing trend in modern embedded real-time systems to meet cost, timing, and power consumption requirements. The medical device, automotive, and avionics industries are the most common safety-critical domains exploiting these systems, known as Mixed-Criticality (MC) systems. MC applications are real-time, and to ensure their correctness, it is essential to meet strict timing requirements as well as functional specifications. The correct design of such MC systems requires a thorough understanding of the system's functions and their importance to the system. A failure or deadline miss has a different impact depending on the criticality level of the function, ranging from no effect to catastrophic consequences. Failure in the execution of tasks with higher criticality levels (HC tasks) may lead to system failure and cause irreparable damage, whereas Low-Criticality (LC) tasks assist the system in carrying out its mission successfully, but their failure has less impact on the system's functionality and does not cause the system itself to fail. In order to guarantee MC system safety, tasks are analyzed under different assumptions to obtain different Worst-Case Execution Times (WCETs) corresponding to the multiple criticality levels and the operation mode of the system. If the execution time of at least one HC task exceeds its low WCET, the system switches from low-criticality mode (LO mode) to high-criticality mode (HI mode). Then, all HC tasks continue executing under the high WCET to guarantee the system's safety. In this HI mode, all or some LC tasks are dropped/degraded in favor of HC tasks to ensure the correct execution of HC tasks. Determining an appropriate low WCET for each HC task is crucial in designing efficient MC systems and maximizing Quality of Service (QoS). However, even when the low WCETs are set correctly, dropping/degrading LC tasks in the HI mode is not recommended, due to its negative impact on other functions or on the entire system's ability to accomplish its mission correctly. Therefore, how to analyze task dropping in the HI mode is a significant challenge in designing efficient MC systems: it must be considered to guarantee the successful execution of all HC tasks and prevent catastrophic damage while improving the QoS. Due to the continuous rise in the computational demand of MC tasks in safety-critical applications, such as autonomous driving control, designers are motivated to deploy MC applications on multi-core platforms. Although the parallel execution offered by multi-core platforms helps to improve QoS and meet real-time requirements, the high power consumption and temperature of the cores may make the system more susceptible to failures and instability, which is not desirable in MC applications. Therefore, improving the QoS while managing the power consumption and guaranteeing real-time constraints is the critical issue in designing such MC systems on multi-core platforms. This thesis addresses the challenges associated with efficient MC system design. We first focus on application analysis, proposing a novel approach to determine appropriate low WCETs that provides a reasonable trade-off between the number of LC tasks scheduled at design-time and the probability of mode switching at run-time, thereby improving system utilization and QoS.
The approach presents an analytic-based scheme to obtain low WCETs based on the Chebyshev theorem at design-time. We also show the relationship between the low WCETs and mode switching probability, and formulate and solve the problem for improving resource utilization and reducing the mode switching probability. Further, we analyze the LC task dropping in the HI mode to improve QoS. We first propose a heuristic in which a new metric is defined that determines the number of allowable drops in the HI mode. Then, the task schedulability analysis is developed based on the new metric. Since the occurrence of the worst-case scenario at run-time is a rare event, a learning-based drop-aware task scheduling mechanism is then proposed, which carefully monitors the alterations in the behavior of MC systems at run-time to exploit the dynamic slacks for improving the QoS. Another critical design challenge is how to improve QoS using the parallel feature of multi-core platforms while managing the power consumption and temperature of these platforms. We develop a tree of possible task mapping and scheduling at design-time to cover all possible scenarios of task overrunning and reduce the LC task drop rate in the HI mode while managing the power and temperature in each scenario of task scheduling. Since the dynamic slack is generated due to the early execution of tasks at run-time, we propose an online approach to reduce the power consumption and maximum temperature by using low-power techniques like DVFS and task re-mapping, while preserving the QoS. Specifically, our approach examines multiple tasks ahead to determine the most appropriate task for the slack assignment that has the most significant effect on power consumption and temperature. However, changing the frequency and selecting a proper task for slack assignment and a suitable core for task re-mapping at run-time can be time-consuming and may cause deadline violation. Therefore, we analyze and optimize the run-time scheduler.:1. Introduction 1.1. Mixed-Criticality Application Design 1.2. Mixed-Criticality Hardware Design 1.3. Certain Challenges and Questions 1.4. Thesis Key Contributions 1.4.1. Application Analysis and Modeling 1.4.2. Multi-Core Mixed-Criticality System Design 1.5. Thesis Overview 2. Preliminaries and Literature Reviews 2.1. Preliminaries 2.1.1. Mixed-Criticality Systems 2.1.2. Fault-Tolerance, Fault Model and Safety Requirements 2.1.3. Hardware Architectural Modeling 2.1.4. Low-Power Techniques and Power Consumption Model 2.2. Related Works 2.2.1. Mixed-Criticality Task Scheduling Mechanisms 2.2.2. QoS Improvement Methods in Mixed-Criticality Systems 2.2.3. QoS-Aware Power and Thermal Management in Multi-Core Mixed-Criticality Systems 2.3. Conclusion 3. Bounding Time in Mixed-Criticality Systems 3.1. BOT-MICS: A Design-Time WCET Adjustment Approach 3.1.1. Motivational Example 3.1.2. BOT-MICS in Detail 3.1.3. Evaluation 3.2. A Run-Time WCET Adjustment Approach 3.2.1. Motivational Example 3.2.2. ADAPTIVE in Detail 3.2.3. Evaluation 3.3. Conclusion 4. Safety- and Task-Drop-Aware Mixed-Criticality Task Scheduling 4.1. Problem Objectives and Motivational Example 4.2. FANTOM in detail 4.2.1. Safety Quantification 4.2.2. MC Tasks Utilization Bounds Definition 4.2.3. Scheduling Analysis 4.2.4. System Upper Bound Utilization 4.2.5. A General Design Time Scheduling Algorithm 4.3. Evaluation 4.3.1. Evaluation with Real-Life Benchmarks 4.3.2. Evaluation with Synthetic Task Sets 4.4. Conclusion 5. 
Learning-Based Drop-Aware Mixed-Criticality Task Scheduling 5.1. Motivational Example and Problem Statement 5.2. Proposed Method in Detail 5.2.1. An Overview of the Design-Time Approach 5.2.2. Run-Time Approach: Employment of SOLID 5.2.3. LIQUID Approach 5.3. Evaluation 5.3.1. Evaluation with Real-Life Benchmarks 5.3.2. Evaluation with Synthetic Task Sets 5.3.3. Investigating the Timing and Memory Overheads of ML Technique 5.4. Conclusion 6. Fault-Tolerance and Power-Aware Multi-Core Mixed-Criticality System Design 6.1. Problem Objectives and Motivational Example 6.2. Design Methodology 6.3. Tree Generation and Fault-Tolerant Scheduling and Mapping 6.3.1. Making Scheduling Tree 6.3.2. Mapping and Scheduling 6.3.3. Time Complexity Analysis 6.3.4. Memory Space Analysis 6.4. Evaluation 6.4.1. Experimental Setup 6.4.2. Analyzing the Tree Construction Time 6.4.3. Analyzing the Run-Time Timing Overhead 6.4.4. Peak Power Management and Thermal Distribution for Real-Life and Synthetic Applications 6.4.5. Analyzing the QoS of LC Tasks 6.4.6. Analyzing the Peak Power Consumption and Maximum Temperature 6.4.7. Effect of Varying Different Parameters on Acceptance Ratio 6.4.8. Investigating Different Approaches at Run-Time 6.5. Conclusion 7. QoS- and Power-Aware Run-Time Scheduler for Multi-Core Mixed-Criticality Systems 7.1. Research Questions, Objectives and Motivational Example 7.2. Design-Time Approach 7.3. Run-Time Mixed-Criticality Scheduler 7.3.1. Selecting the Appropriate Task to Assign Slack 7.3.2. Re-Mapping Technique 7.3.3. Run-Time Management Algorithm 7.3.4. DVFS governor in Clustered Multi-Core Platforms 7.4. Run-Time Scheduler Algorithm Optimization 7.5. Evaluation 7.5.1. Experimental Setup 7.5.2. Analyzing the Relevance Between a Core Temperature and Energy Consumption 7.5.3. The Effect of Varying Parameters of Cost Functions 7.5.4. The Optimum Number of Tasks to Look-Ahead and the Effect of Task Re-mapping 7.5.5. The Analysis of Scheduler Timings Overhead on Different Real Platforms 7.5.6. The Latency of Changing Frequency in Real Platform 7.5.7. The Effect of Latency on System Schedulability 7.5.8. The Analysis of the Proposed Method on Peak Power, Energy and Maximum Temperature Improvement 7.5.9. The Analysis of the Proposed Method on Peak power, Energy and Maximum Temperature Improvement in a Multi-Core Platform Based on the ODROID-XU3 Architecture 7.5.10. Evaluation of Running Real MC Task Graph Model (Unmanned Air Vehicle) on Real Platform 7.6. Conclusion 8. Conclusion and Future Work 8.1. Conclusions 8.2. Future Work
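The run-time slack-assignment step described above can be pictured with a small greedy sketch: when dynamic slack appears, look at the next few ready tasks and give the slack (that is, run at a lower DVFS frequency) to the task whose slowdown saves the most dynamic energy while still finishing within its deadline. The cubic power-frequency model, the discrete frequency levels, and the task fields used here are illustrative assumptions, not the thesis's actual cost functions or scheduler.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    wcet_at_fmax: float   # remaining worst-case execution time at f_max (ms)
    deadline: float       # deadline relative to 'now' (ms)

def energy_at(freq_ratio: float, exec_time_at_fmax: float) -> float:
    """Dynamic energy ~ f^3 * t = f^3 * (t_fmax / f) = f^2 * t_fmax (normalized units)."""
    return (freq_ratio ** 2) * exec_time_at_fmax

def pick_task_for_slack(ready: list[Task], slack: float, lookahead: int = 3,
                        freq_levels=(1.0, 0.8, 0.6)):
    """Among the next `lookahead` ready tasks, choose the (task, frequency) pair that
    saves the most energy while the stretched execution still meets the deadline."""
    best = None
    for task in ready[:lookahead]:
        for f in freq_levels:
            stretched = task.wcet_at_fmax / f
            if stretched <= min(task.deadline, task.wcet_at_fmax + slack):
                saving = energy_at(1.0, task.wcet_at_fmax) - energy_at(f, task.wcet_at_fmax)
                if best is None or saving > best[2]:
                    best = (task, f, saving)
    return (best[0], best[1]) if best else None

ready = [Task("t1", 4.0, 9.0), Task("t2", 2.0, 5.0), Task("t3", 6.0, 20.0)]
print(pick_task_for_slack(ready, slack=3.0))   # picks t1 at 0.6 under this toy model
```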
117

Quantitative phase imaging through an ultra-thin lensless fiber endoscope

Sun, Jiawei, Wu, Jiachen, Wu, Song, Goswami, Ruchi, Girardo, Salvatore, Cao, Liangcai, Guck, Jochen, Koukourakis, Nektarios, Czarske, Juergen W. 08 April 2024 (has links)
Quantitative phase imaging (QPI) is a label-free technique providing both morphology and quantitative biophysical information in biomedicine. However, applying such a powerful technique to in vivo pathological diagnosis remains challenging. Multi-core fiber bundles (MCFs) enable ultra-thin probes for in vivo imaging, but current MCF imaging techniques are limited to amplitude imaging modalities. We demonstrate a computational lensless microendoscope that uses an ultra-thin bare MCF to perform quantitative phase imaging with microscale lateral resolution and nanoscale axial sensitivity of the optical path length. The incident complex light field at the measurement side is precisely reconstructed from the far-field speckle pattern at the detection side, enabling digital refocusing in a multi-layer sample without any mechanical movement. The accuracy of the quantitative phase reconstruction is validated by imaging the phase target and hydrogel beads through the MCF. With the proposed imaging modality, three-dimensional imaging of human cancer cells is achieved through the ultra-thin fiber endoscope, promising widespread clinical applications.
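Once the complex field at the fiber facet has been recovered, digital refocusing of the kind mentioned here is typically done by numerically propagating the field with the angular spectrum method. The sketch below is a generic angular-spectrum propagator in NumPy following the standard textbook formulation, not the authors' specific reconstruction pipeline; the field and distances in the example are placeholders.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, dz):
    """Propagate a sampled complex field by a distance dz (same units as wavelength and dx)."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)
    fy = np.fft.fftfreq(ny, d=dx)
    FX, FY = np.meshgrid(fx, fy)
    k = 2 * np.pi / wavelength
    kz_sq = k**2 - (2 * np.pi * FX)**2 - (2 * np.pi * FY)**2
    kz = np.sqrt(np.maximum(kz_sq, 0.0))              # evanescent components are suppressed
    transfer = np.exp(1j * kz * dz) * (kz_sq > 0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

# Example: refocus a (hypothetical) reconstructed facet field by 50 micrometres.
facet_field = np.ones((256, 256), dtype=complex)       # placeholder complex field
refocused = angular_spectrum_propagate(facet_field, wavelength=0.532, dx=1.0, dz=50.0)
phase_map = np.angle(refocused)                        # quantitative phase at the new plane
```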
118

Application du concept des transactions pour la modélisation et la simulation multicoeur des systèmes sur puce / Applying the transaction concept to multi-core modeling and simulation of systems-on-chip

Anane, Amine 01 1900 (has links)
Avec la complexité croissante des systèmes sur puce, de nouveaux défis ne cessent d'émerger dans la conception de ces systèmes en matière de vérification formelle et de synthèse de haut niveau. Plusieurs travaux autour de SystemC, considéré comme la norme pour la conception au niveau système, sont en cours afin de relever ces nouveaux défis. Cependant, à cause du modèle de concurrence complexe de SystemC, relever ces défis reste toujours une tâche difficile. Ainsi, nous pensons qu'il est primordial de partir sur de meilleures bases en utilisant un modèle de concurrence plus efficace. Par conséquent, dans cette thèse, nous étudions une méthodologie de conception qui offre une meilleure abstraction pour modéliser des composants parallèles en se basant sur le concept de transaction. Nous montrons comment, grâce au raisonnement simple que procure le concept de transaction, il devient plus facile d'appliquer la vérification formelle, le raffinement incrémental et la synthèse de haut niveau. Dans le but d'évaluer l'efficacité de cette méthodologie, nous avons fixé l'objectif d'optimiser la vitesse de simulation d'un modèle transactionnel en profitant d'une machine multicoeur. Nous présentons ainsi l'environnement de modélisation et de simulation parallèle que nous avons développé. Nous étudions différentes stratégies d'ordonnancement en matière de parallélisme et de surcoût de synchronisation. Une expérimentation faite sur un modèle du transmetteur Wi-Fi 802.11a a permis d'atteindre une accélération d'environ 1.8 en utilisant deux threads. Avec 8 threads, bien que la charge de travail des différentes transactions n'était pas importante, nous avons pu atteindre une accélération d'environ 4.6, ce qui est un résultat très prometteur. / With the increasing complexity of SoCs, new challenges continue to emerge in the design of these systems in terms of formal verification and high-level synthesis. Several research efforts around SystemC, considered the de facto standard for system-level design, are underway to meet these new challenges. However, because of the complex concurrency model of SystemC, these challenges remain difficult tasks. Thus, we believe it is important to start on a better footing by using a more effective concurrency model. Therefore, in this thesis, we study a design methodology that provides a better abstraction for modeling parallel components based on the concept of transaction. We show how, through simple reasoning about transactions, it becomes easier to apply formal verification, incremental refinement and high-level synthesis. In order to evaluate the effectiveness of this methodology, we set the goal of optimizing the simulation speed of a transactional model by taking advantage of a multicore machine. We present the modeling and parallel simulation environment that we developed, and study different scheduling strategies in terms of parallelism and synchronization overhead. An experiment made on a Wi-Fi 802.11a transmitter model achieved a speedup of about 1.8 using two threads. With 8 threads, although the workload of individual transactions was not significant, we could reach a speedup of 4.6, which is a very promising result.
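The reported speedups are consistent with a simple Amdahl's-law estimate: a 1.8x speedup on two threads implies a parallel fraction of roughly 0.89, which predicts about 4.5x on eight threads, close to the 4.6x observed. The snippet below is just that back-of-the-envelope check, not part of the thesis.

```python
def parallel_fraction(speedup: float, n: int) -> float:
    """Invert Amdahl's law S = 1 / ((1 - f) + f / n) to recover the parallel fraction f."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)

def amdahl_speedup(f: float, n: int) -> float:
    return 1.0 / ((1.0 - f) + f / n)

f = parallel_fraction(1.8, 2)            # ~0.889 from the two-thread measurement
print(f, amdahl_speedup(f, 8))           # predicts ~4.5 on eight threads
```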
119

Stratégie de placement et d'ordonnancement de tâches logicielles pour architectures reconfigurables sous contrainte énergétique / Mapping and scheduling strategy of OS tasks into reconfigurable architectures under energy constraint

Gammoudi, Aymen 26 June 2018 (has links)
La conception de systèmes temps-réel embarqués se développe de plus en plus avec l'intégration croissante de fonctionnalités critiques pour les applications de surveillance, notamment dans le domaine biomédical, environnemental, domotique, etc. Le développement de ces systèmes doit relever divers défis en termes de minimisation de la consommation énergétique. Gérer de tels dispositifs embarqués, entièrement autonomes, nécessite cependant de résoudre différents problèmes liés à la quantité d'énergie disponible dans la batterie, à l'ordonnancement temps-réel des tâches qui doivent être exécutées avant leurs échéances, aux scénarios de reconfiguration, particulièrement dans le cas d'ajout de tâches, et à la contrainte de communication pour pouvoir assurer l'échange des messages entre les processeurs, de façon à assurer une autonomie durable jusqu'à la prochaine recharge et ce, tout en maintenant un niveau de qualité de service acceptable du système de traitement. Pour traiter cette problématique, nous proposons dans ces travaux une stratégie de placement et d'ordonnancement de tâches permettant d'exécuter des applications temps-réel sur une architecture contenant des cœurs hétérogènes. Dans cette thèse, nous avons choisi d'aborder cette problématique de façon incrémentale pour traiter progressivement les problèmes liés aux contraintes temps-réel, énergétique et de communications. Tout d'abord, nous nous intéressons particulièrement à l'ordonnancement des tâches sur une architecture mono-cœur. Nous proposons une stratégie d'ordonnancement basée sur le regroupement des tâches dans des packs pour pouvoir calculer facilement les nouveaux paramètres des tâches afin de réobtenir la faisabilité du système. Puis, nous l'avons étendu pour traiter le cas de l'ordonnancement sur une architecture multi-cœurs homogènes. Finalement, une extension de ce dernier sera réalisée afin d'arriver à l'objectif principal qui est l'ordonnancement des tâches pour les architectures hétérogènes. L'idée est de prendre progressivement en compte des contraintes d'exécution de plus en plus complexes. Nous formalisons tous les problèmes en utilisant la formulation ILP afin de pouvoir produire des résultats optimaux. L'idée est de pouvoir situer nos solutions proposées par rapport aux solutions optimales produites par un solveur et par rapport aux autres algorithmes de l'état de l'art. Par ailleurs, la validation par simulation des stratégies proposées montre qu'elles engendrent un gain appréciable vis-à-vis des critères considérés importants dans les systèmes embarqués, notamment le coût de la communication entre cœurs et le taux de rejet des tâches. / The design of embedded real-time systems is developing more and more with the increasing integration of critical functionalities for monitoring applications, particularly in the biomedical, environmental, and home automation domains. The development of these systems faces various challenges, particularly in terms of minimizing energy consumption. Managing such fully autonomous embedded devices requires solving various problems related to the amount of energy available in the battery, the real-time scheduling of tasks that must be executed before their deadlines, the reconfiguration scenarios, especially in the case of adding tasks, and the communication constraint needed to ensure message exchange between cores, so as to ensure lasting autonomy until the next recharge while maintaining an acceptable level of quality of service for the processing system.
To address this problem, we propose in this work a new task placement and scheduling strategy for executing real-time applications on an architecture containing heterogeneous cores. In this thesis, we have chosen to tackle the problem in an incremental manner in order to deal progressively with the real-time, energy, and communication constraints. First, we focus on the scheduling of tasks for a single-core architecture. We propose a new scheduling strategy based on grouping tasks into packs, so that the new task parameters can easily be computed in order to restore the feasibility of the system. We then extend it to address task scheduling on a homogeneous multi-core architecture. Finally, this is extended further to achieve the main objective, which is the scheduling of tasks for heterogeneous architectures. The idea is to take increasingly complex execution constraints into account step by step. We formalize the proposed strategy as an optimization problem using integer linear programming (ILP) and compare the proposed solutions with the optimal results provided by the CPLEX solver. In addition, validation by simulation shows that the proposed strategies provide an appreciable gain with respect to criteria considered important in embedded systems, in particular the cost of inter-core communication and the rejection rate of new tasks.
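A minimal version of the kind of ILP used to obtain such reference optimal solutions could look as follows, with binary variables assigning each task to a core, per-core utilization bounds, and an inter-core communication cost in the objective. This is a simplified illustration under assumed notation (C_ij: WCET of task i on core j, T_i: period, U_j: schedulable utilization bound of core j, c_ik: data exchanged between communicating tasks i and k); the thesis's actual formulation also covers energy and reconfiguration constraints.

```latex
\[
\begin{aligned}
\min_{x,\,z}\;& \sum_{(i,k)\in E} c_{ik}\Bigl(1-\sum_{j\in J} z_{ikj}\Bigr)
  && \text{(inter-core communication cost)}\\
\text{s.t. }& \sum_{j\in J} x_{ij}=1 && \forall i\in\mathcal{T}
  && \text{(each task mapped to exactly one core)}\\
& \sum_{i\in\mathcal{T}} \frac{C_{ij}}{T_i}\,x_{ij}\le U_j && \forall j\in J
  && \text{(per-core utilization bound)}\\
& z_{ikj}\le x_{ij},\quad z_{ikj}\le x_{kj} && \forall (i,k)\in E,\ j\in J
  && \text{(linearization of ``same core'')}\\
& x_{ij}\in\{0,1\},\quad z_{ikj}\in[0,1]
\end{aligned}
\]
```

Because the objective rewards making the z variables large, the solver sets z_ikj = 1 exactly when tasks i and k share core j, so the standard lower-bounding linearization constraints are not needed.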
120

Massively Parallel Cartesian Discrete Ordinates Method for Neutron Transport Simulation / SN cartésien massivement parallèle pour la simulation neutronique

Moustafa, Salli 15 December 2015 (has links)
La simulation haute-fidélité des coeurs de réacteurs nucléaires nécessite une évaluation précise du flux neutronique dans le coeur du réacteur. Ce flux est modélisé par l'équation de Boltzmann ou équation du transport neutronique. Dans cette thèse, on s'intéresse à la résolution de cette équation par la méthode des ordonnées discrètes (SN) sur des géométries cartésiennes. Cette méthode fait intervenir un schéma d'itérations à source, incluant un algorithme de balayage sur le domaine spatial qui regroupe l'essentiel des calculs effectués. Compte tenu du très grand volume de calcul requis par la résolution de l'équation de Boltzmann, de nombreux travaux antérieurs ont été consacrés à l'utilisation du calcul parallèle pour la résolution de cette équation. Jusqu'ici, ces algorithmes de résolution parallèles de l'équation du transport neutronique ont été conçus en considérant la machine cible comme une collection de processeurs mono-coeurs indépendants, et ne tirent donc pas explicitement profit de la hiérarchie mémoire et du parallélisme multi-niveaux présents sur les super-calculateurs modernes. Ainsi, la première contribution de cette thèse concerne l'étude et la mise en oeuvre de l'algorithme de balayage sur les super-calculateurs massivement parallèles modernes. Notre approche combine à la fois la vectorisation par des techniques de la programmation générique en C++, et la programmation hybride par l'utilisation d'un support d'exécution à base de tâches: PaRSEC. Nous avons démontré l'intérêt de cette approche grâce à des modèles de performances théoriques, permettant également de prédire le partitionnement optimal. Par ailleurs, dans le cas de la simulation des milieux très diffusifs tels que le coeur d'un REP, la convergence du schéma d'itérations à source est très lente. Afin d'accélérer sa convergence, nous avons implémenté un nouvel algorithme (PDSA), adapté à notre implémentation hybride. La combinaison de ces techniques nous a permis de concevoir une version massivement parallèle du solveur SN Domino. Les performances de la partie Sweep du solveur atteignent 33.9% de la performance crête théorique d'un super-calculateur à 768 cores. De plus, un calcul critique d'un réacteur de type REP 900MW à 26 groupes d'énergie mettant en jeu 10^12 DDLs a été résolu en 46 minutes sur 1536 coeurs. / High-fidelity nuclear reactor core simulations require a precise knowledge of the neutron flux inside the reactor core. This flux is modeled by the linear Boltzmann equation, also called the neutron transport equation. In this thesis, we focus on solving this equation using the discrete ordinates method (SN) on Cartesian meshes. This method involves a source iteration scheme including a sweep over the spatial mesh that gathers the vast majority of the computations in the SN method. Due to the large amount of computation required to solve the Boltzmann equation, numerous research works have focused on optimizing the time to solution by developing parallel algorithms for the transport equation. However, these algorithms were designed by considering a super-computer as a collection of independent cores, and therefore do not explicitly take into account the memory hierarchy and multi-level parallelism available inside modern super-computers.
Therefore, we first proposed a strategy for designing an efficient parallel implementation of the sweep operation on modern architectures, combining the SIMD paradigm, through C++ generic programming techniques, with an emerging task-based runtime system: PaRSEC. We demonstrated the need for such an approach using theoretical performance models that also predict optimal partitionings. We then studied the challenge of converging the source iteration scheme in highly diffusive media such as PWR cores, and implemented and studied the convergence of a new acceleration scheme (PDSA) that naturally suits our hybrid parallel implementation. The combination of all these techniques has enabled us to develop a massively parallel version of the SN Domino solver. It is capable of tackling the challenges posed by neutron transport simulations and compares favorably with state-of-the-art solvers such as Denovo. The performance of the PaRSEC implementation of the sweep operation reaches 6.1 Tflop/s on 768 cores, corresponding to 33.9% of the theoretical peak performance of this set of computational resources. For a typical 26-group PWR calculation involving 1.02×10^12 DoFs, the time to solution required by the Domino solver is 46 minutes using 1536 cores.
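The sweep that dominates this computation can be illustrated, in its simplest serial form, by a one-dimensional diamond-difference discrete-ordinates solver with source iteration (isotropic scattering, vacuum boundaries, made-up cross sections). This toy version only shows the structure, directional sweeps nested inside a scattering-source loop, that the thesis parallelizes in 3-D over thousands of cores.

```python
import numpy as np

def sn_slab(nx=100, n_angles=8, sigma_t=1.0, sigma_s=0.5, q_ext=1.0,
            length=10.0, tol=1e-8, max_iters=500):
    """1-D diamond-difference SN solver with source iteration (vacuum boundaries)."""
    dx = length / nx
    mu, w = np.polynomial.legendre.leggauss(n_angles)    # Gauss-Legendre ordinates and weights
    phi = np.zeros(nx)                                   # scalar flux
    for _ in range(max_iters):
        q = 0.5 * (sigma_s * phi + q_ext)                # isotropic scattering + fixed source
        phi_new = np.zeros(nx)
        for n in range(n_angles):
            psi_in = 0.0                                 # vacuum incoming angular flux
            cells = range(nx) if mu[n] > 0 else range(nx - 1, -1, -1)
            for i in cells:                              # the sweep along direction mu[n]
                am = abs(mu[n])
                psi_cell = (q[i] + 2.0 * am / dx * psi_in) / (sigma_t + 2.0 * am / dx)
                psi_out = 2.0 * psi_cell - psi_in        # diamond-difference closure
                phi_new[i] += w[n] * psi_cell
                psi_in = psi_out
        if np.max(np.abs(phi_new - phi)) < tol:          # source iteration converged
            return phi_new
        phi = phi_new
    return phi

print(sn_slab()[:5])
```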
