Architectures adaptatives basse consommation pour les communications sans-fil / Low-power adaptive architectures for wireless communications
Lenoir, Vincent 28 September 2015
This thesis falls within the field of connected objects, now known as the Internet of Things (IoT). The field emerged from the democratization of the Internet since the early 2000s and the shift to highly mobile devices, made possible by the miniaturization of embedded systems. In this context, energy efficiency is essential, since projections at the time pointed to tens of billions of connected devices by 2020. For ease of deployment and use, a large share of the data exchanged in these networks travels over a wireless link, whose implementation accounts for a significant part of the power consumption. Energy efficiency is generally treated as a matter of refining hardware architectures, often helped by favorable technology scaling. This paradigm quickly reaches its limits, however, because it necessarily implies a heavily constrained design sized for the worst operating conditions, even though those conditions do not apply most of the time. Wireless communications are a typical case, since the radio channel is highly variable because of propagation effects and interference. Our study therefore focused on the design of a transmission chain whose link budget can be tuned dynamically according to the actual signal attenuation, in order to reduce the system's power consumption. In particular, the thesis contributed to the design of a self-adaptive receiver dedicated to the IEEE 802.15.4 standard, proposing both a reconfigurable digital baseband (modem) architecture and a method for automatic control of the operating point.
More precisely, the work relies on two approaches, compressive sampling and partial sampling, to reduce the amount of data to process and thereby the internal activity of the arithmetic operators. In return, demodulation requires a higher SNR, which degrades the receiver sensitivity and hence the link budget. The solution, implemented in an STMicroelectronics 65 nm LP CMOS process, has a small hardware footprint compared with a conventional architecture, at only 23.4 kgates. Based on the physical model of the circuit that was developed, the power consumed to demodulate a packet is estimated at 278 µW when the modem is fully active. It can be lowered progressively to 119 µW, which corresponds to a 10 dB loss of sensitivity. The implemented modem and its control loop thus save 30 % of energy on average in a typical use case.
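The control idea in this abstract, giving up receiver sensitivity to save power whenever the link has margin, can be sketched roughly as follows. Only the two end points (278 µW at full sensitivity, 119 µW at a 10 dB sensitivity loss) come from the abstract; the intermediate modes, the function name `select_mode` and the safety margin are hypothetical values chosen for illustration, not the thesis's control law.

```python
# Operating modes as (sensitivity loss in dB, demodulation power in µW).
# Only the first and last entries are taken from the abstract; the two
# intermediate points are made-up placeholders.
MODES = [
    (0.0, 278.0),   # full modem, best sensitivity (from the abstract)
    (3.0, 220.0),   # hypothetical
    (6.0, 170.0),   # hypothetical
    (10.0, 119.0),  # most degraded mode (from the abstract)
]


def select_mode(link_margin_db, safety_margin_db=2.0):
    """Return the lowest-power mode whose sensitivity loss fits in the margin.

    link_margin_db is the received signal level minus the full-sensitivity
    threshold; safety_margin_db is an assumed guard band.
    """
    usable = link_margin_db - safety_margin_db
    best = MODES[0]  # fall back to the full modem when there is no margin
    for loss_db, power_uw in MODES:
        if loss_db <= usable and power_uw < best[1]:
            best = (loss_db, power_uw)
    return best


if __name__ == "__main__":
    for margin in (0.0, 5.0, 8.5, 15.0):
        loss, power = select_mode(margin)
        print(f"margin {margin:4.1f} dB -> loss {loss:4.1f} dB, {power:.0f} uW")
```

A real control loop would estimate the margin from received packets and would probably add hysteresis to avoid oscillating between modes; both are left out of this sketch.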
Algoritmos e Arquiteturas VLSI para Detectores MIMO com Decisão Suave / Algorithms and VLSI Architectures for Soft-Decision MIMO Detectors
Duarte, José Marcelo Lima 17 August 2012
The use of Multiple Input Multiple Output (MIMO) systems has enabled the recent evolution of wireless communication standards. The Spatial Multiplexing (SM) MIMO technique, in particular, provides a gain in transmission capacity that grows linearly with the minimum of the numbers of transmit and receive antennas. Near-capacity performance in SM-MIMO systems requires a soft-decision Maximum A Posteriori Probability (MAP) MIMO detector. However, such a detector is too complex for practical solutions. Hence, the goal of a MIMO detection algorithm intended for implementation is to approximate the ideal detector well while keeping complexity acceptable. Moreover, the algorithm needs to be mapped to a VLSI architecture with small area that meets the data rate required by mobile communication standards. Since Spatial Multiplexing is a recent technique, it is argued that there is still much room for development of related algorithms and architectures. Therefore, this thesis focuses on the study of suboptimal algorithms and VLSI architectures for broadband MIMO detectors with soft decision. As a result, novel algorithms have been developed, starting from proposed optimizations of already established algorithms. Based on these results, new MIMO detector architectures with configurable modulation and competitive area, performance and data rate are proposed. The developed algorithms have been extensively simulated and the architectures synthesized, so that the results can serve as a reference for other work in the area.
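To make concrete why the ideal soft-decision detector is considered too complex, here is a minimal sketch of an exhaustive max-log approximation of the MAP detector for a tiny QPSK system. It is not one of the thesis's algorithms; the function name, the bit mapping and the LLR sign convention are assumptions made for this example. The loop visits every candidate symbol vector, so its cost grows exponentially with the number of transmit antennas and bits per symbol.

```python
import itertools

# QPSK with an assumed Gray mapping: bit pair -> constellation point.
QPSK = {(0, 0): 1 + 1j, (0, 1): 1 - 1j, (1, 0): -1 + 1j, (1, 1): -1 - 1j}


def max_log_llrs(y, H, noise_var):
    """Exhaustive max-log LLRs for y = H s + n with QPSK on every antenna.

    y is a list of received samples and H a list of rows (channel matrix).
    Returns one LLR per transmitted bit; positive means bit 0 is more likely.
    """
    n_tx = len(H[0])
    # best[i][b] holds the smallest distance seen with bit i equal to b.
    best = [[float("inf"), float("inf")] for _ in range(2 * n_tx)]
    for bits in itertools.product((0, 1), repeat=2 * n_tx):
        s = [QPSK[(bits[2 * k], bits[2 * k + 1])] for k in range(n_tx)]
        # Squared Euclidean distance ||y - H s||^2 for this candidate.
        dist = sum(
            abs(y[r] - sum(H[r][c] * s[c] for c in range(n_tx))) ** 2
            for r in range(len(y))
        )
        for i, b in enumerate(bits):
            best[i][b] = min(best[i][b], dist)
    return [(d1 - d0) / noise_var for d0, d1 in best]


if __name__ == "__main__":
    # Noise-free 2x2 example with an identity channel.
    print(max_log_llrs([1 + 1j, -1 - 1j], [[1, 0], [0, 1]], 1.0))
```

Suboptimal detectors of the kind studied in the thesis aim to approach these LLR values while examining far fewer candidates.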
Detection algorithms and architectures for wireless spatial multiplexing in MIMO-OFDM systems
Myllylä, M. (Markus) 17 May 2011
The development of wireless telecommunication systems has been rapid during the last two decades and the data rates as well as the quality of service (QoS) requirements are continuously growing. Multiple-input multiple-output (MIMO) techniques in combination with orthogonal frequency-division multiplexing (MIMO–OFDM) have been identified as a promising approach for high spectral efficiency wideband systems.
The optimal detection method for a coded MIMO–OFDM system with spatial multiplexing (SM) is the maximum a posteriori (MAP) detector, which is often too complex for systems with high order modulation. Suboptimal linear detectors, such as the linear minimum mean square error (LMMSE) criterion based detection, offer low complexity solutions, but have poor performance in correlated fading channels. A list sphere detector (LSD) is a tree search based soft output detector that can be used to approximate the MAP detector with a lower computational complexity. The benefits of the more advanced detectors can be realized especially in a low SNR environment by, e.g., increasing the cell coverage. In this thesis, we consider the linear minimum mean square error (LMMSE) criterion based detectors and more advanced LSDs for detection of SM transmission.
The LSD algorithms are not, as such, feasible for hardware implementation. Therefore, we identify the design choices that affect the performance and implementation complexity of the LSD algorithms. We give guidelines for LSD algorithm design and propose suitable trade-off solutions for practical wireless systems. The more stringent requirements call for further research on architectures and implementation. In particular, it is important to address parallelism and pipelining in the architecture design to enable an optimal trade-off between resource usage and operating speed. We design a pipelined systolic array architecture for the LMMSE detector algorithms and efficient architectures for the LSD algorithms that take the algorithm properties into account.
We consider the VLSI implementation of the algorithms to study their true performance and complexity. The designed architectures are implemented on a field programmable gate array (FPGA) chip and in CMOS application specific integrated circuit (ASIC) technology. Finally, we present measurement results from a hardware testbed to verify the performance of the considered algorithms. In total, two LMMSE detectors and three LSDs are studied for the detection of SM transmission.
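As a small illustration of the linear detector mentioned in this abstract, the sketch below computes the textbook LMMSE estimate (H^H H + σ² I)^-1 H^H y for a 2x2 system in plain Python. It assumes unit-energy, uncorrelated transmit symbols and white noise; the function name is hypothetical, and the closed-form 2x2 inverse stands in for the systolic-array architecture described in the thesis, which is not reproduced here.

```python
def lmmse_detect_2x2(y, H, noise_var):
    """LMMSE estimate of s in y = H s + n for a 2x2 system.

    Computes (H^H H + noise_var * I)^-1 H^H y, assuming unit-energy,
    uncorrelated transmit symbols and white noise.
    """
    (h00, h01), (h10, h11) = H
    # Gram matrix G = H^H H with the noise term added on the diagonal.
    g00 = abs(h00) ** 2 + abs(h10) ** 2 + noise_var
    g11 = abs(h01) ** 2 + abs(h11) ** 2 + noise_var
    g01 = h00.conjugate() * h01 + h10.conjugate() * h11
    g10 = g01.conjugate()
    # Matched-filter output z = H^H y.
    z0 = h00.conjugate() * y[0] + h10.conjugate() * y[1]
    z1 = h01.conjugate() * y[0] + h11.conjugate() * y[1]
    # Closed-form inverse of the 2x2 matrix G applied to z.
    det = g00 * g11 - g01 * g10
    s0 = (g11 * z0 - g01 * z1) / det
    s1 = (-g10 * z0 + g00 * z1) / det
    return [s0, s1]


if __name__ == "__main__":
    H = [[1.0 + 0.2j, 0.5 - 0.1j], [0.3j, 0.9 + 0.0j]]
    s = [1 + 1j, -1 + 1j]
    y = [H[r][0] * s[0] + H[r][1] * s[1] for r in range(2)]
    print(lmmse_detect_2x2(y, H, noise_var=0.0))  # close to s without noise
```

With zero noise variance the estimate reduces to zero-forcing; a non-zero value trades residual interference against noise enhancement.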
Optimisation de blocs constitutifs d'un convertisseur A/N pipeline en technologie CMOS 0.18 µm pour utilisation en environnement spatial / Optimization of building blocks of a pipeline ADC in CMOS 0.18 µm technology for space applications
Perbet, Lucas 26 April 2017
Imaging is a major part of observing the Universe and the Earth from space, whether in the visible domain or not. In the space field, data is most often gathered by a CCD (Charge-Coupled Device) sensor that supplies analog voltages to an analog-to-digital converter (ADC), whose output is delivered to a processing chain and then sent to Earth. ADCs are therefore key elements in satellite imaging: their precision and speed determine the quality and representativeness of the binary signal. It is thus crucial to perform a high-quality conversion of the data (speed and precision) while making sure that the ADC withstands the radiation environment. The purpose of this thesis is to improve the robustness to the space environment (hardening), while optimizing the performance, of several elementary functions of a 14-bit, 5 MS/s pipeline ADC built in XFAB 0.18 µm technology. The three targeted functions are the switches (in particular, solving the problems linked to charge injection in a space environment), the comparators (hardening) and the switched-capacitor amplifier (gain improvement through a predictive technique with no penalty on power consumption).
ASIC implementation of LSTM neural network algorithm
Paschou, Michail January 2018
LSTM neural networks have been used for speech recognition, image recognition and other artificial intelligence applications for many years. Most applications run the LSTM algorithm and the required calculations on cloud computers. Off-line solutions include FPGAs and GPUs, but the most promising ones are ASIC accelerators designed for this purpose only. This report presents an ASIC design capable of performing multiple iterations of the LSTM algorithm on a unidirectional neural network architecture without peepholes. The proposed design offers options for arithmetic-level parallelism, as blocks are instantiated based on parameters. The internal structure of the design implements pipelined, parallel or serial solutions depending on which is optimal in each case. The implications of these decisions are discussed in detail in the report. The design process is described in detail, and the evaluation of the design is presented to measure the accuracy and error of the design output.
This thesis work resulted in a complete synthesizable ASIC design implementing an LSTM layer, a Fully Connected layer and a Softmax layer, which can classify data based on trained weight matrices and bias vectors. The design primarily uses a 16-bit fixed-point format with 5 integer and 11 fractional bits, but higher-precision representations are used in some blocks to reduce the output error. Additionally, a verification environment has been designed that can run simulations and evaluate the design output by comparing it, on a SystemVerilog testbench, with results produced by performing the same operations in 64-bit floating-point precision, and measure the resulting error. The results concerning the accuracy and the error margin of the design output are presented in this thesis report. The design went through logic and physical synthesis and resulted in a functional netlist for every tested configuration. Timing, area and power measurements on the generated netlists of various configurations of the design show consistency and are included in this report.
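The 16-bit format with 5 integer and 11 fractional bits mentioned in this abstract can be illustrated with a short sketch. It assumes that the 5 integer bits include the sign bit (so values cover roughly [-16, 16)) and that out-of-range values saturate; neither detail is stated in the abstract, and the helper names are hypothetical.

```python
FRAC_BITS = 11
WORD_BITS = 16
SCALE = 1 << FRAC_BITS               # 2048 steps per unit
Q_MIN = -(1 << (WORD_BITS - 1))      # -32768, i.e. -16.0
Q_MAX = (1 << (WORD_BITS - 1)) - 1   # 32767, just under +16.0


def to_fixed(x):
    """Quantise a float to a saturating 16-bit code with 11 fractional bits."""
    code = round(x * SCALE)          # Python rounds halves to even
    return max(Q_MIN, min(Q_MAX, code))


def to_float(code):
    """Convert a fixed-point code back to a float."""
    return code / SCALE


def fixed_mul(a, b):
    """Multiply two codes and rescale the wide product by 11 bits."""
    product = (a * b) >> FRAC_BITS   # arithmetic shift, rounds toward -inf
    return max(Q_MIN, min(Q_MAX, product))


if __name__ == "__main__":
    x = 0.1
    err = abs(to_float(to_fixed(x)) - x)
    print(f"quantisation error for {x}: {err:.6f} (bound {1 / (2 * SCALE):.6f})")
    print(to_float(fixed_mul(to_fixed(1.5), to_fixed(2.0))))
```

Comparing such quantised results against a 64-bit floating-point reference is the same kind of check the abstract describes for its verification environment.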
Validation of Power Dissipation of SerDes IPs
Kas, Adem January 2021
Post-silicon validation of a designed ASIC is an essential step in the product development process. During validation, all specifications of the ASIC have to be checked in a lab environment. Serializer/Deserializer (SerDes) blocks in an ASIC are used for high-speed serial data communication between distinct integrated circuits. The goal of the thesis is to validate the power consumption of SerDes IP blocks provided by different vendors in an ASIC. To validate power consumption, current and voltage values are read from the power supply lines. These values are then digitized and stored on a Raspberry Pi. To perform these operations, the initial firmware provided by the vendors is improved to control SerDes operations, and software is developed to control the Raspberry Pi. Power is measured at every possible data rate for each SerDes module. Power is also measured over the industry-standard temperature ranges at the highest possible data rate for each SerDes IP block. As a final step, the measured power consumption values are compared with the vendors' data.
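The measurement principle described here, reading voltage and current on a supply line and comparing the resulting power with the vendor figure, amounts to simple arithmetic. The sketch below shows it under assumed names and made-up sample values; it does not reflect the thesis's actual software or any vendor data.

```python
def average_power_w(voltages_v, currents_a):
    """Mean of the instantaneous products v[k] * i[k], in watts."""
    if len(voltages_v) != len(currents_a) or not voltages_v:
        raise ValueError("need two non-empty sample lists of equal length")
    return sum(v * i for v, i in zip(voltages_v, currents_a)) / len(voltages_v)


def relative_deviation(measured_w, datasheet_w):
    """Signed deviation of the measurement from the vendor figure."""
    return (measured_w - datasheet_w) / datasheet_w


if __name__ == "__main__":
    # Made-up samples for one supply rail at one data rate.
    volts = [0.90, 0.91, 0.89, 0.90]
    amps = [0.120, 0.118, 0.122, 0.121]
    p = average_power_w(volts, amps)
    print(f"measured {p * 1e3:.1f} mW, "
          f"{relative_deviation(p, 0.105) * 100:+.1f} % vs. an assumed 105 mW")
```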
Low Density Parity Check Encoder and Decoder on SiLago Coarse Grain Reconfigurable Architecture
Kong, Weijiang January 2019
Low-density parity-check (LDPC) codes are error correction codes that have been widely adopted as an optional error-correcting scheme in most of today's communication protocols. Current ASIC- or FPGA-based LDPC accelerators can reach Gbit/s data rates. However, the hardware cost of ASIC-based methods and the related interfaces is too high for integration into coarse grain reconfigurable architectures (CGRA). Moreover, on platforms aiming at high-level or system-level synthesis, they do not provide flexibility in low-performance, low-cost design scenarios. In this degree project, we establish connectivity between the SiLago CGRA and a typical QC-LDPC code defined in the IEEE 802.11n standard. We design lightweight LDPC encoder and decoder blocks using the FSM+Datapath design pattern. The encoder provides sufficient throughput and consumes very little area and power. The decoder provides sufficient performance for low-speed modulations while consuming significantly fewer hardware resources. Both encoder and decoder can cooperate with the SiLago-based DRRA through DiMArch, the standard Network-on-Chip (NoC) based shared memory, so extra interface hardware is no longer necessary. We verified our design through RTL simulation and synthesis. The encoder went through logic and physical synthesis, while the decoder went through logic synthesis only. The results show that our design is closely coupled with the SiLago CGRA while providing a low-performance, low-cost solution.
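For readers unfamiliar with quasi-cyclic (QC) LDPC codes, the sketch below shows how a small base matrix of circulant shifts expands into a binary parity-check matrix, and how a syndrome check flags a corrupted word. The base matrix and block size are toy values invented for this example, not the IEEE 802.11n matrices, and the function names are hypothetical.

```python
def expand_qc(base, z):
    """Expand a base matrix of circulant shifts into a binary matrix H.

    An entry of -1 means a z-by-z zero block; an entry s >= 0 means the
    identity matrix cyclically shifted right by s columns.
    """
    rows = []
    for base_row in base:
        for r in range(z):
            row = []
            for shift in base_row:
                block = [0] * z
                if shift >= 0:
                    block[(r + shift) % z] = 1
                row.extend(block)
            rows.append(row)
    return rows


def syndrome(H, word):
    """Return H * word over GF(2); all zeros means every check is satisfied."""
    return [sum(h & w for h, w in zip(row, word)) % 2 for row in H]


if __name__ == "__main__":
    H = expand_qc([[0, 1, -1], [-1, 0, 2]], z=3)   # toy 6x9 matrix
    word = [1] * 9                                 # satisfies every check here
    print(syndrome(H, word))
    word[0] ^= 1                                   # inject a single bit error
    print(syndrome(H, word))
```

Because each block is fully described by one shift value, hardware only needs to store the small base matrix, which is one reason QC codes suit lightweight encoders and decoders.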
Algorithmic Multi-Ported Memories Enabled Power-Efficient Pre-Distorter Design in ASIC / Algorithmiska multi-portad minnen möjliggjorde energieffektiv design av förvrängningskompenserare i ASIC
Shen, Xuying January 2023
The transition from the 5G to the 6G era is a pivotal juncture in contemporary wireless communication. In this context, Digital Pre-Distortion (DPD) technology has established itself as an effective method for linearizing power amplifiers. However, DPD faces a series of challenges, notably increased bandwidth, which necessitates more complex modeling techniques. This thesis focuses on the fact that DPD requires multi-ported memories for the Look-Up Tables that store correction coefficients, and identifies two research questions. First, it analyses the trade-offs among power, area, and delay as the number of read and write ports of Flip-Flop (FF)-based memories increases. Second, it evaluates and compares the performance of conventional and algorithmic FF-based multi-ported memories. As a Master's thesis project, this research draws on the knowledge and practical skills expected of a Master's student specializing in Embedded Systems. Conventional and algorithmic multi-ported memories are implemented and evaluated after a study of related work. An industrial Application-Specific Integrated Circuit (ASIC) design flow is then executed, with iterative refinements, and conclusions are drawn from an analysis of the tool reports. The results show that area and power consumption grow linearly with the number of ports in conventional multi-ported memories. The algorithmic multi-ported memory is a promising alternative, improving on all three dimensions of delay, area, and power consumption. The implemented memories can be integrated into the DPD forward path with customized port numbers in the future, offering adaptability in port configuration and better timing, area and power. They also provide a useful point of reference for engineers developing FF-based multi-ported memories for ASICs.
Efficient Side-Channel Aware Elliptic Curve Cryptosystems over Prime Fields
Karakoyunlu, Deniz 08 August 2010
"Elliptic Curve Cryptosystems (ECCs) are utilized as an alternative to traditional public-key cryptosystems, and are more suitable for resource limited environments due to smaller parameter size. In this dissertation we carry out a thorough investigation of side-channel attack aware ECC implementations over finite fields of prime characteristic including the recently introduced Edwards formulation of elliptic curves, which have built-in resiliency against simple side-channel attacks. We implement Joye's highly regular add-always scalar multiplication algorithm both with the Weierstrass and Edwards formulation of elliptic curves. We also propose a technique to apply non-adjacent form (NAF) scalar multiplication algorithm with side-channel security using the Edwards formulation. Our results show that the Edwards formulation allows increased area-time performance with projective coordinates. However, the Weierstrass formulation with affine coordinates results in the simplest architecture, and therefore has the best area-time performance as long as an efficient modular divider is available."
Elastic circuits in FPGA
Silva, Thiago de Oliveira January 2017
The advance of microelectronics over recent decades has brought increased density to integrated circuits, allowing highly complex functions to be implemented in smaller silicon areas. As a side effect of this large-scale integration, wire latencies have become a larger fraction of a design's data propagation delay, turning timing closure into a challenging task that often demands several iterations among design phases. By reviewing Latency-Insensitive theory, this work explores the Elastic Design methodology in synchronous circuits, with the objective of addressing the impact of increased wire latency on the integrated circuit design flow without requiring a major paradigm change from designers. To exemplify the elasticization process, the educational Neander microprocessor architecture is implemented synchronously and then converted into an Elastic Circuit by using a latency-insensitive protocol for the data transfers between the design's computational processes. Both designs are validated on an FPGA platform, using well-established synchronous design tools and flow. The timing and area comparison between the designs shows that the Elastic version can offer performance gains for more complex systems at the price of increased area. These results show that the Elastic Design methodology is a good candidate for designing complex integrated circuits without costly iterations between design phases. The methodology also reuses the widely adopted synchronous design tools, resulting in a cost-effective alternative for designers.
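The latency-insensitive data transfers described in this abstract rely on a handshake that lets either side stall without losing or duplicating data. The cycle-level model below is a rough sketch of that idea using a two-slot buffer with valid/ready signals; the signal names, the buffer depth and the class itself are assumptions made for illustration and are not taken from the thesis.

```python
from collections import deque


class ElasticBuffer:
    """Two-slot buffer with a valid/ready handshake (hypothetical model)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.slots = deque()

    def cycle(self, in_valid, in_data, out_ready):
        """Advance one clock cycle.

        Returns (in_ready, out_valid, out_data) as seen during this cycle.
        A transfer happens on a side only when valid and ready are both true.
        """
        in_ready = len(self.slots) < self.capacity
        out_valid = len(self.slots) > 0
        out_data = self.slots[0] if out_valid else None
        if out_valid and out_ready:
            self.slots.popleft()
        if in_valid and in_ready:
            self.slots.append(in_data)
        return in_ready, out_valid, out_data


def run(tokens, ready_pattern):
    """Push tokens through the buffer while the consumer stalls at will."""
    buf, received, i = ElasticBuffer(), [], 0
    for out_ready in ready_pattern:
        in_valid = i < len(tokens)
        in_data = tokens[i] if in_valid else None
        in_ready, out_valid, out_data = buf.cycle(in_valid, in_data, out_ready)
        if in_valid and in_ready:
            i += 1                      # producer moves on only when accepted
        if out_valid and out_ready:
            received.append(out_data)   # consumer takes the token
    return received


if __name__ == "__main__":
    stalls = [False, False, True, False, True, True, True, True]
    print(run([1, 2, 3, 4], stalls))
```

Whatever stall pattern the consumer applies, tokens leave in the order they entered, which is the property that lets extra wire latency be absorbed without changing functional behaviour.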