About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Optimization Approaches for Analog Kernel to Speedup VHDL-AMS Simulation

Agrawal, Shishir 21 May 2002 (has links)
No description available.
42

Compiler Optimizations for Power on High Performance Processors

Rele, Siddharth N. 11 October 2001 (has links)
No description available.
43

Optimizations on Estimation and Positioning Techniques in Intelligent Wireless Systems

Myeung Suk Oh (18429750) 28 April 2024 (has links)
Wireless technologies across various applications aim to improve further by developing intelligent systems, where performance is optimized through adaptive policy selections that efficiently adjust to environment dynamics. As a result, accurate observation of the surrounding conditions, such as wireless channel quality and relative target location, becomes an important task. Although both channel estimation and wireless positioning problems have been well studied, with advanced wireless communications relying on complex technologies and being applied to diverse environments, optimization strategies tailored to their unique architectures and scenarios need to be further investigated. In this dissertation, four key research problems related to channel estimation and wireless positioning for intelligent wireless systems are identified and studied. First, a channel denoising problem in multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) systems is addressed, and a Q-learning-based successive denoising scheme, which utilizes a channel curvature magnitude threshold to recover unreliable channel estimates, is proposed. Second, a pilot assignment problem in scalable open radio access network (O-RAN) cell-free massive MIMO (CFmMIMO) systems is studied, and a low-complexity pilot assignment scheme based on a multi-agent deep reinforcement learning (MA-DRL) framework, together with a codebook search strategy, is proposed. Third, sensor selection/placement problems for wireless positioning are addressed, and dynamic and robust sensor selection schemes that minimize the Cramér-Rao lower bound (CRLB) are proposed. Lastly, a feature selection problem for deep learning-based wireless positioning is studied, and a feature size selection method, which weighs expected information gain against classification capability, along with a multi-channel positioning neural network, is proposed.
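The channel-curvature idea above lends itself to a compact illustration. The sketch below flags estimates whose discrete second difference (a curvature proxy) exceeds a threshold and patches them from their neighbours; it is only a minimal stand-in for the general idea, since the dissertation's Q-learning policy for successive recovery is omitted, and the function name and threshold value are hypothetical.

```python
import numpy as np

def curvature_flag_and_denoise(h_est, kappa_thr):
    """Flag channel estimates whose discrete curvature magnitude exceeds
    kappa_thr and replace them with the average of their immediate
    neighbours (a crude stand-in for the successive recovery step)."""
    h = np.asarray(h_est, dtype=complex).copy()
    # Discrete second difference as a curvature proxy for interior samples.
    curv = np.abs(h[:-2] - 2 * h[1:-1] + h[2:])
    unreliable = np.where(curv > kappa_thr)[0] + 1  # shift to interior index
    for k in unreliable:
        h[k] = 0.5 * (h[k - 1] + h[k + 1])
    return h, unreliable

# Toy usage: a smooth synthetic channel with one corrupted estimate.
true_h = np.exp(1j * 2 * np.pi * 0.01 * np.arange(64))
noisy_h = true_h.copy()
noisy_h[30] += 2.0
denoised, flagged = curvature_flag_and_denoise(noisy_h, kappa_thr=1.0)
print("flagged indices:", flagged)
```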
44

Compiler and Architecture Co-design for Reliable Computing

Jianping Zeng (19199323) 24 July 2024 (has links)
Reliability against errors, such as soft errors (transient bit flips in transistors caused by energetic particle strikes) and crash inconsistency arising from power failure, is as crucial as performance and power efficiency for a wide range of computing devices, from embedded systems to datacenters. If not properly addressed, these errors can lead to massive financial losses and even endanger human lives. Furthermore, the dynamic nature of modern computing workloads complicates the implementation of reliable systems as the likelihood and impact of these errors increase. Consequently, system designers often face a dilemma: sacrificing reliability for performance and cost-effectiveness, or incurring high manufacturing and/or run-time costs to maintain high system dependability. This trade-off can result in reduced availability and increased vulnerability to errors when reliability is not prioritized, or escalated costs when it is.

In light of this, this dissertation demonstrates for the first time that, with a synergistic compiler and architecture co-design, it is possible to achieve reliability while maintaining high performance and low hardware cost. We begin by illustrating how compiler/architecture co-design achieves near-zero-overhead soft error resilience for embedded cores (Chapter 2). Next, we introduce ReplayCache (Chapter 3), a software-only approach that ensures crash consistency for energy-harvesting systems built on embedded cores and outperforms the state of the art by 9x. Beyond embedded cores, reliability for server-class cores is even more vital due to their widespread adoption in performance-critical environments. With that in mind, we then propose VeriPipe (Chapter 4), which showcases how a straightforward microarchitectural technique can achieve near-zero-overhead soft error resilience for server-class cores with a storage overhead of just three registers and one countdown timer. Finally, we present two approaches to achieving performant crash consistency for server-class cores, by leveraging the existing dynamic register renaming in out-of-order cores (Chapter 5) and by repurposing Intel’s non-temporal path (Chapter 6), respectively. Through these innovations, this dissertation paves the way for more reliable and efficient computing systems, ensuring that reliability does not come at the cost of performance degradation or hardware complexity.
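As a rough illustration of the kind of redundancy that soft-error resilience schemes rely on, the sketch below runs a pure computation twice and compares the results. This is the classic software dual-execution check, not the compiler/architecture co-designed scheme the dissertation develops, and the helper name is hypothetical.

```python
def run_with_redundancy(fn, *args):
    """Execute a pure (side-effect-free, deterministic) computation twice
    and compare the outcomes; a mismatch signals a transient fault.
    Compiler-based schemes duplicate at the instruction level instead of
    re-running whole functions, but the detection principle is the same."""
    first = fn(*args)
    second = fn(*args)
    if first != second:
        raise RuntimeError("soft error suspected: redundant runs disagree")
    return first

# Toy usage with a deterministic kernel.
print(run_with_redundancy(sum, range(1_000_000)))
```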
45

Isothermality: making speculative optimizations affordable

Pereira, David John 22 December 2007 (has links)
Partial Redundancy Elimination (PRE) is a ubiquitous optimization used by compilers to remove repeated computations from programs. Speculative PRE (SPRE), which uses program profiles (statistics obtained from running a program), is more cognizant of trends in run-time behaviour and therefore produces better-optimized programs. Unfortunately, the optimal version of SPRE is a very expensive algorithm of high-order polynomial time complexity, unlike most compiler optimizations, which run effectively in linear time in the size of the program they are optimizing. This dissertation uses the concept of “isothermality”—the division of a program into a hot region and a cold region—to create the Isothermal SPRE (ISPRE) optimization, an approximation to optimal SPRE. Unlike SPRE, which creates and solves a flow network for each program expression being optimized—a very expensive operation—ISPRE uses two simple bit-vector analyses, optimizing all expressions simultaneously. We show, experimentally, that the ISPRE algorithm works, on average, nine times faster than the SPRE algorithm, while producing programs that are optimized competitively. This dissertation also harnesses the power of isothermality to empower another ubiquitous compiler optimization, Partial Dead Code Elimination (PDCE), which removes computations whose values are not used. Isothermal Speculative PDCE (ISPDCE) is a new, simple, and efficient optimization that requires only three bit-vector analyses. We show, experimentally, that ISPDCE produces better optimization than PDCE, while keeping a competitive running time. On account of their small analysis costs, ISPRE and ISPDCE are especially appropriate for use in Just-In-Time (JIT) compilers.
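To make the "two simple bit-vector analyses" concrete, here is a minimal sketch of one of them: a classic available-expressions dataflow pass over a control-flow graph. It illustrates the kind of analysis ISPRE composes over hot and cold regions rather than the published ISPRE algorithm itself, and the graph, gen/kill sets, and function name are made up for the example.

```python
def available_expressions(cfg, gen, kill, entry, universe):
    """Forward bit-vector analysis: AVAIL_out[b] = gen[b] | (AVAIL_in[b] - kill[b]),
    with AVAIL_in[b] the intersection of the predecessors' AVAIL_out.
    Python sets of expression names stand in for the bit vectors."""
    preds = {b: set() for b in cfg}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)
    avail_out = {b: set(universe) for b in cfg}   # optimistic initialization
    avail_out[entry] = set(gen[entry])
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b == entry:
                continue
            avail_in = set(universe)
            for p in preds[b]:
                avail_in &= avail_out[p]
            new_out = gen[b] | (avail_in - kill[b])
            if new_out != avail_out[b]:
                avail_out[b] = new_out
                changed = True
    return avail_out

# Toy CFG: entry -> hot loop {B1 <-> B2} -> exit, with one expression "a+b".
cfg = {"entry": ["B1"], "B1": ["B2"], "B2": ["B1", "exit"], "exit": []}
gen = {"entry": {"a+b"}, "B1": set(), "B2": {"a+b"}, "exit": set()}
kill = {b: set() for b in cfg}
print(available_expressions(cfg, gen, kill, "entry", {"a+b"}))
```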
46

Compiler techniques for the optimization of specialized software kernels

Σιουρούνης, Κωνσταντίνος 16 June 2011 (has links)
With the ever-increasing trend toward embedded and portable computing systems, an entire scientific field has formed around compiler optimization techniques for the specialized software kernels that run on these systems. The gains from applying optimization techniques are manifold. First, the kernels can finish executing in far less time, with far smaller memory requirements. Their demand for processing power also drops, which directly reduces energy consumption, increases autonomy in the case of portable systems, and lowers cooling requirements, since far less heat is dissipated. Gains are thus achieved on many fronts (execution time, memory footprint, autonomy, heat dissipation), making optimization one of the most rapidly growing fields. Beyond raw performance, for real-time embedded systems whose performance degrades when execution deadlines are missed (soft real time), and especially for those that fail outright when those deadlines are missed (hard real time), these techniques are essentially the only way to implement such systems at reasonable cost. Developing optimizations is not enough, however; it is equally important that the optimizations match the architecture of the system at hand. If the target architecture is ignored, optimizations can backfire and degrade system performance. This thesis optimizes the multiplication of a vector by a Toeplitz matrix. In the course of the work, a large number of schedules targeting the optimization of this operation were developed. After an in-depth study of the memory hierarchy, of the optimization techniques available for exploiting it more efficiently, and of the main compiler optimization techniques, the most important of the developed schedules are presented, each offering gains on different system architectures. In this way, a tool is developed that takes as input the architecture of the system on which the kernel is to be optimized; schedules unsuitable for that architecture are excluded first, and the remaining candidates are explored so that the most efficient one is selected.
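For reference, a straightforward baseline for the kernel in question is sketched below: a Toeplitz matrix is fully determined by its first column and first row, so the product can be formed from those 2n-1 values without materializing the full matrix. This is only the naive O(n²) schedule that architecture-aware schedules would start from; the function name and data layout are illustrative, not taken from the thesis.

```python
def toeplitz_matvec(col, row, x):
    """y = T @ x for a Toeplitz matrix given by its first column `col`
    (T[i][0]) and first row `row` (T[0][j], with row[0] == col[0]).
    Only the 2n-1 generating entries are stored, which is the basic
    memory-hierarchy win that tuned schedules try to exploit further."""
    n = len(x)
    assert len(col) == n and len(row) == n and row[0] == col[0]
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(n):
            t_ij = col[i - j] if i >= j else row[j - i]
            acc += t_ij * x[j]
        y[i] = acc
    return y

# Toy usage: the 3x3 Toeplitz matrix [[1,4,5],[2,1,4],[3,2,1]].
print(toeplitz_matvec([1, 2, 3], [1, 4, 5], [7, 8, 9]))  # -> [84.0, 58.0, 46.0]
```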
47

Adapting the polytope model for dynamic and speculative parallelization

Jimborean, Alexandra 14 September 2012 (has links)
In this thesis, we describe the design and implementation of a Thread-Level Speculation (TLS) software platform called VMAD, for "Virtual Machine for Advanced Dynamic analysis and transformation", whose main purpose is to speculatively parallelize a sequential loop nest in various ways by reordering its iterations. The transformation to apply is selected at run time with the goals of minimizing the number of rollbacks and maximizing performance. We perform code transformations by applying the polyhedral model, which we adapted to speculative parallelization at run time. To this end, we first build a code pattern that is patched by our runtime system according to profiling information collected on samples of the execution. Adaptability is ensured by considering code chunks of different sizes, executed successively, each parallelized differently or run sequentially depending on the observed memory access behavior. We show, on several benchmarks, that our platform yields good performance on codes that could not have been handled efficiently by previously proposed TLS systems.
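A heavily simplified picture of the validate/commit/rollback cycle such a runtime performs is sketched below. It executes iterations in chunks, logs their write sets, and restores a snapshot and re-runs the chunk sequentially on a conflict. It is purely illustrative: the real VMAD runtime runs iterations concurrently and selects a polyhedral transformation per chunk, and every name here is hypothetical.

```python
import copy

def run_in_speculative_chunks(loop_body, n_iters, chunk_size, state):
    """Execute a loop in chunks: each chunk runs speculatively while its
    iterations' write sets are logged; if two iterations of the chunk
    write the same location, the chunk is rolled back to a snapshot and
    re-executed sequentially.  `loop_body(it, state)` mutates `state`
    and returns the (read_set, write_set) of that iteration."""
    i = 0
    while i < n_iters:
        hi = min(i + chunk_size, n_iters)
        snapshot = copy.deepcopy(state)   # cheap stand-in for versioned memory
        seen_writes, conflict = set(), False
        for it in range(i, hi):
            _reads, writes = loop_body(it, state)
            if writes & seen_writes:      # write-write conflict => mis-speculation
                conflict = True
                break
            seen_writes |= writes
        if conflict:                      # rollback, then safe sequential re-execution
            state.clear()
            state.update(snapshot)
            for it in range(i, hi):
                loop_body(it, state)
        i = hi
    return state

# Toy usage: each iteration writes a distinct location, so no chunk rolls back.
def body(it, a):
    a[it] = a.get(it, 0) + 1
    return ({it}, {it})

print(run_in_speculative_chunks(body, 10, 4, {}))
```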
48

Contribution to the optimization of concrete mixes used in the production of structural blocks

Felipe, Alexsandro dos Santos. January 2010 (has links)
Abstract: Given the substantial growth of structural masonry in Brazil, many concrete block companies have seen the need to optimize their production process, since more challenging projects require greater quality control. This study proposes to improve the production of concrete artifacts by means of simple optimizations that reduce costs and ensure the company's efficient production. Studying in depth the various parameters of the formation of a dry concrete mix, such as cohesion, texture, compaction energy and axial compressive strength, all interdependent on one another, becomes very complex when assessed in a single study. However, proposing a study that collects the information presented by various authors expedites optimizing and creating research that may assist in improving the dosage of dry concrete, particularly in the manufacture of structural blocks. In this study, some equipment commonly used to manufacture these concrete artifacts was laboratory-adapted, enabling a direct correlation of cylindrical specimens with the blocks. One of the adaptations is the study based on the standardization of compaction energy, provided by the mini-proctor testing equipment, thus simulating the vibro-press machine. Other correlations, such as cohesion and compressive strength, were also obtained in the laboratory, therefore reducing the constant interferences in the plant's production process observed in many other studies. In this manner, it was possible to reliably assess the results. 
The study was conducted in three stages, always seeking the highest compacted dry specific mass of the mixture of aggregates. In the first stage, only two aggregates were used (fine sand and pebbles), commonly used at the plant. The second stage included adding the coarse sand and stone powder to correct the lack of resistance promoted by the high amount of fine sand from... (Complete abstract click electronic access below) / Advisor: Jefferson Sidney Camacho / Co-advisor: Maria da Consolação F. de Albuquerque / Committee: Jorge Luís Akasaki / Committee: Paulo César Primo Agostinho / Master's
49

Optimization and planning of capital allocation in financial institutions considering the Basel II Accord requirements for credit risk

Barros Filho, Cicero Venancio 28 April 2010 (has links)
Regulation has influenced the attitudes of financial institutions toward risk-taking. The global financial market recognizes the need to quantify and control the risks inherent in banking activities, processes that are converging toward a common standard in almost all countries. Following the recommendations of the Basel Committee on Banking Supervision (BCBS), regulatory agencies in many countries apply rules for sizing the minimum capital requirement of financial institutions. To provide a thread of ideas, the first chapter presents concepts, definitions and approaches to risk in financial institutions, covering the evolution of risk perception, theoretical frameworks for analysis and considerations on the interrelationship among these risks. The second chapter addresses the creation of central banks against the historical backdrop of global financial crises, focusing on how banking regulation came about. The third chapter presents, as the main object of this study, projections of parameters related to capital allocation, intended to meet new managerial requirements in financial institutions, considering the optimization of returns on risk-weighted assets and the strategic planning of capital and credit-line expansion, in accordance with the requirements standardized by banking supervisory agencies. The final considerations demonstrate the applicability of these projections as support for risk management and capital allocation policies in financial institutions.
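As a hedged numerical illustration of the sizing rule the abstract refers to: under the Basel standardized approach, risk-weighted assets are the sum of exposures times their regulatory risk weights, and the minimum capital requirement is 8% of that total. The weights and figures below are generic textbook values, not parameters taken from the dissertation's projections.

```python
def minimum_capital(exposures, risk_weights, capital_ratio=0.08):
    """Risk-weighted assets (RWA) = sum of exposure_i * risk_weight_i;
    the regulatory minimum capital is capital_ratio * RWA
    (8% under the Basel accords)."""
    rwa = sum(e * w for e, w in zip(exposures, risk_weights))
    return rwa, capital_ratio * rwa

# Toy portfolio: a 0%-weighted sovereign, a 35% residential mortgage,
# and a 100% unrated corporate exposure (illustrative Basel II weights).
rwa, k_min = minimum_capital([100.0, 200.0, 300.0], [0.00, 0.35, 1.00])
print(f"RWA = {rwa:.1f}, minimum regulatory capital = {k_min:.1f}")
```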
50

Contributions on approximate computing techniques and how to measure them

Rodriguez Cancio, Marcelino 19 December 2017 (has links)
Approximate Computing is based on the idea that significant improvements in CPU, energy and memory usage can be achieved when small levels of inaccuracy can be tolerated. This is an attractive concept, since the lack of resources is a constant problem in almost all computer science domains. From large supercomputers processing today’s social media big data, to small, energy-constrained embedded systems, there is always the need to optimize the consumption of some scarce resource. Approximate Computing proposes an alternative to this scarcity, introducing accuracy as yet another resource that can in turn be traded for performance, energy consumption or storage space. 
The first part of this thesis proposes the following two contributions to the field of Approximate Computing. Approximate Loop Unrolling: a compiler optimization that exploits the approximate nature of signal and time-series data to decrease the execution time and energy consumption of loops processing it. Our experiments showed that the optimization considerably increases the performance and energy efficiency of the optimized loops (150% - 200%) while preserving accuracy at acceptable levels. Primer: the first lossy compression algorithm for assembler instructions, which profits from programs’ forgiving zones to obtain a compression ratio that outperforms the current state of the art by up to 10%. The main goal of Approximate Computing is to improve the usage of resources such as performance or energy. Therefore, a fair deal of effort is dedicated to observing the actual benefit obtained by exploiting a given technique under study. One of the resources that has been historically challenging to measure accurately is execution time. Hence, the second part of this thesis proposes the following tool. AutoJMH: a tool to automatically create performance microbenchmarks in Java. Microbenchmarks provide the finest-grained performance assessment. Yet, requiring a great deal of expertise, they remain a craft of a few performance engineers. The tool allows (thanks to automation) the adoption of microbenchmarks by non-experts. Our results show that the generated microbenchmarks match the quality of payloads handwritten by performance experts and outperform those written by professional Java developers without experience in microbenchmarking.
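A minimal sketch of the idea behind Approximate Loop Unrolling follows: on smooth signal or time-series data, a loop can evaluate its body exactly only every few samples and interpolate the results in between. The thesis realizes this as a compiler transformation inside loops; the library-style helper below, its name, and its stride parameter are only an illustration of the accuracy/work trade-off.

```python
import math

def approx_map(f, xs, stride=2):
    """Evaluate f exactly only every `stride`-th sample and linearly
    interpolate the results in between, trading accuracy for fewer
    evaluations on smooth signal/time-series data."""
    n = len(xs)
    out = [0.0] * n
    exact = list(range(0, n, stride))
    if exact[-1] != n - 1:
        exact.append(n - 1)              # always evaluate the last point exactly
    for i in exact:
        out[i] = f(xs[i])
    for a, b in zip(exact, exact[1:]):   # fill the skipped points by interpolation
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = (1 - t) * out[a] + t * out[b]
    return out

# Toy usage on a smooth signal: only 3 of 11 samples are evaluated exactly.
xs = [k * 0.1 for k in range(11)]
print([round(v, 3) for v in approx_map(math.sin, xs, stride=5)])
```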
