Global ETD Search

1	Optimized SIMD architecture exploration and implementation for ultra-low energy processors / Εξερεύνηση και υλοποίηση βελτιστοποιημένης SIMD αρχιτεκτονικής για επεξεργαστές πολύ χαμηλής κατανάλωσης Δακουρού, Στεφανία 19 July 2012 On-line monitoring is an important challenge in future biotechnology applications, for instance in precision livestock farming, where there is a strong need for low-cost intelligent sensors to monitor animal welfare. On-line poultry monitoring can significantly improve the living conditions of hens in industrial farms. However, a very low-cost, low-energy solution is needed because of stringent battery limitations. Domain-specific ASIPs can be an ideal solution when they cover enough submarkets to increase the production volume (reducing the price) and when ultra-low-energy concepts are used in their realization. This work is part of a larger project that aims at high energy efficiency. The current study implements data parallelization, using a recently introduced software-controlled SIMD realization in an innovative way. The approaches used to determine the final instruction set of the architecture created for mapping the critical Gauss loop of the detection application are thoroughly explored. A redesign of the data-parallel data path, also referred to as the Soft-SIMD architecture, was necessary in order to optimize the instruction encoding. Furthermore, we have explored the capabilities that a commercial retargetable compiler tool, such as Target’s, can offer for our target design, and we suggest some potential modifications that would make the tool more efficient and useful for a designer’s needs in such an architecture. This study thereby also demonstrates the promising results obtained by experimenting with workarounds for the current design limitations of the Target tool. 
Finding the right balance between efficiency and flexibility requires the ability to quickly evaluate alternative architectures through simulations and testing techniques. The methods developed for exactly this purpose, with the help of Target’s retargetable IP Designer tool suite, are discussed in detail. By exploiting the profiling information produced by the ISS and reading the assembly code produced by the C compiler, it is possible to identify the instructions in the critical loop and optimize them using a number of techniques that are discussed. The main purpose of this optimization is to reduce the cycle count of the application and thereby the overall power consumption. VHDL files of the optimized and unoptimized processors are generated automatically using the HDL generation tool. However, when examining a bio-imaging application instantiated from the ULP-ASIP architectural template [FEENECS book], many other issues arise as well. In particular, the way such implementations have to be tested should be taken into consideration. Preferably, the tests should be not only sufficient and efficient but also reusable, in the sense that test patterns can be generated not just for a specific application or group of applications but for the entire architectural template. Therefore, this study also illustrates a systematic test vector generation process for the ULP-ASIP template. Our goal is to formulate general principles, because such principles are reusable and can be applied to any instance, such as our present processor for the Gauss filter. Finally, the study is completed by presenting some realistic power numbers, based on layout back-annotation, for the data path components of the processor. 
Based on all the advanced optimizations and broad search-space explorations presented in this thesis, a heavily optimized ASIP architecture has been fully implemented, resulting in low cost and ultra-low energy consumption while still meeting all the performance requirements. / The automatic method for monitoring living organisms, as researched and published by the Division of Biosystems (BIOSYST) of K.U. Leuven [1], consists of a "computer vision" application that classifies their behaviour based on their responses. This biotechnology application develops a fully automated "computer vision" system for individual, confined hens. The application is divided into two algorithms: the first detects the monitored object (detection algorithm) and the second tracks it (tracking algorithm). The present study is part of a larger project and continues previous work in this area. Its aim is to explore the architecture created for mapping the critical Gauss loop of the detection algorithm, in order to determine the final instruction set of the ULP-ASIP SIMD processor. The techniques and approaches used to support the optimization of the instruction-set encoding are presented extensively in Chapter 2. In addition, during the architecture exploration, the defined instruction set and the mapping techniques are re-examined in order to reduce the overall execution cost. Finding the right balance between efficiency and flexibility requires the ability to quickly evaluate alternative architectures through simulations and testing techniques. 
Chapter 3 explains the methods developed for exactly this purpose, with the help of the IP Designer environment of Target Compiler Technologies, which offers a complete retargetable tool suite. However, a more systematic test vector generation process for the entire ULP-ASIP platform turned out to be a very significant advantage for validating the operation of the ULP-ASIP processor. Such a method is therefore analysed and presented extensively in Chapter 4. Finally, Chapter 5 presents the energy estimation for the data path of the processor. Based on all the advanced optimizations and broad search-space explorations presented in the preceding chapters, a heavily optimized, synthesizable ASIP architecture is fully implemented, resulting in a low-cost, ultra-low-energy platform that still meets all the performance requirements. SIMD Ultra-low energy processors ASIP Instruction set processors 621.395
2	Energy efficient instruction decoding in application-specific instruction-set processors / Αποκωδικοποίηση εντολών για χαμηλή κατανάλωση ενέργειας σε επεξεργαστές συνόλου εντολών ειδικού σκοπού Κάργας, Χρήστος 04 September 2013 With commercial processor design tools, a designer can quickly design a C-programmable ASIP for a specific application domain. Several such ASIPs are available for wireless (UWB baseband processing), encryption, and biomedical processing (particularly ECG beat detection). In traditional CPUs and DSPs, the impact of the instruction-set definition and the complexity of the instruction decoder can be substantial, especially in terms of power consumption. Fully orthogonal VLIW processors do not incur the cost of an instruction decoder as severely. Instead, the instruction word becomes very large, thereby shifting the (power) cost to the program memory or instruction cache. For the purposes of this thesis, a SIMD processor is developed and compared to a soft-SIMD processor to observe its area, performance, and energy efficiency for a bioimaging benchmark, and to observe how the processor description in the ASIP language nML defines the generated HDL. This SIMD processor is then made orthogonal, and iterative experiments investigate the impact on power of manipulating the instruction-set architecture in combination with the program memory size. It is also investigated how instruction-set reconfiguration can be exploited to improve power efficiency. From this investigation, guidelines for low-power ASIP design can be derived. / With modern processor design technology, a designer can easily design a programmable Application-Specific Instruction-set Processor (ASIP) for a specific range of applications. Several such processors are available for wireless applications, encryption, and biomedical applications (e.g. the ECG beat detection algorithm). In traditional processors and digital signal processors (DSPs), the definition of the instruction set and its complexity have a large impact, especially on power consumption. One possible solution to this problem is orthogonal Very Long Instruction Word (VLIW) processors. The term orthogonal processor denotes a processor with a horizontal instruction set, that is, a processor in which every combination of the available instructions and the addressing modes for accessing memory and the register file is possible. Orthogonal processors do not burden the instruction decoder as much. Instead, the instruction word becomes very large, which shifts the energy cost to the program memory or the instruction cache. For the purposes of this thesis, a SIMD processor was developed and compared with a soft-SIMD processor in order to study the required chip area, performance, and energy consumption for a biomedical application, as well as how the description of a processor in the ASIP description language nML determines the generated hardware description language (HDL) code. This processor is then made orthogonal, and iterative experiments are used to study the effect on energy consumption of changes to the instruction-set architecture and to the program memory size. The thesis also studies how the designer can exploit reconfiguration of the instruction set to improve energy consumption. Processors Low energy consumption Orthogonal instruction-set VLIW ASIP 621.391
3	Διαχείριση κοινόχρηστων πόρων σε πολυεπεξεργαστικά συστήματα ενός ολοκληρωμένου [Management of shared resources in single-chip multiprocessor systems] Πετούμενος, Παύλος 06 October 2011 This dissertation proposes methods for managing shared resources in computing systems where multiple processors share the same chip (Chip Multiprocessors, CMPs). Whereas until recently the design of a computing system aimed to satisfy the requirements of only one application per time period, it now also has to balance the requirements of different applications that compete for the same resources. In many cases, however, this alone is not enough. Even if some ideal sharing of the resource is achieved, the resource will not be able to serve the increased load satisfactorily unless the way the processors use it is also optimized. To address the problems that arise from sharing resources, this work proposes three alternative management mechanisms. The first methodology introduces a new theoretical model of cache sharing, which can be used while the programs that share the cache are executing. The methodology then uses this model to control cache sharing and to achieve fairness in how cache space is distributed among the processors. The second methodology presents a new technique for predicting the locality of cache accesses. Since locality is the main parameter that determines the usefulness of cache data, this prediction technique can drive management mechanisms that improve the utilization of cache space. 
As part of this methodology we present such a mechanism, which aims to minimize cache misses through a new replacement policy. The last methodology presented reduces the energy consumption of the instruction queue, which is one of the most energy-demanding structures of the processor. As part of this methodology, it is shown that the key to efficiently reducing the energy consumption of the instruction queue lies in its interaction with the memory subsystem. Based on this conclusion, we present a new mechanism for dynamically managing the size of the instruction queue, which combines an aggressive reduction of the processor’s energy consumption with the preservation of its high performance. / This dissertation proposes methodologies for the management of shared resources in chip multiprocessors (CMPs). Until recently, the design of a computing system had to satisfy the computational and storage needs of a single program during each time period. Now the designer instead has to balance the possibly conflicting needs of multiple programs competing for the same resources. In many cases, however, even this is not enough. Even if we could invent a perfect way to manage sharing, the resource could not deal efficiently with the increased load unless the way each processor uses the shared resource is also optimized. In order to handle the negative effects of resource sharing, this dissertation proposes three management mechanisms. The first one introduces a novel theoretical model of cache sharing, which can be used at run-time. Furthermore, our methodology uses the model to control sharing and to achieve fairness in the way the cache is shared among the processors. Our second methodology presents a new technique for predicting the locality of cache accesses. 
Since locality determines, almost entirely, the usefulness of cache data, our technique can be used to drive any management mechanism that strives to improve the efficiency of the cache. As part of our methodology, we present such a mechanism: a new cache replacement policy that tries to minimize cache misses through near-optimal replacement decisions. The last methodology presented in this dissertation targets the energy consumption of the processor. To that end, our methodology shows that the key to reducing the power consumption of the Issue Queue without disproportionate performance degradation lies in the interaction of the Issue Queue with the memory subsystem: as long as the management of the Issue Queue does not reduce the utilization of the memory subsystem, the effects of the management on the processor’s performance will be minimal. Based on this conclusion, we introduce a new mechanism for dynamically resizing the Issue Queue, which achieves aggressive downsizing and energy savings with almost no performance degradation. Caches Power consumption 004.35 Computer architecture Chip multiprocessors CPU caches Instruction queue Prediction mechanisms Power-aware management techniques
4	Σχεδιασμός και ανάπτυξη λογισμικού ΕΛ/ΛΑΚ (open source) για διαχείριση οποιασδήποτε ενσωματωμένης (embedded) και μη συσκευής / Extending and customizing OpenRSM for wireless embedded devices and LINUX Κουμούτσος, Κωνσταντίνος 25 May 2011 Embedded devices are a category of special-purpose computers that has grown rapidly in recent years. Unlike the familiar general-purpose computers, which can perform almost any function, embedded devices perform only specific functions that are predetermined at design time. Managing such devices, embedded or not, is a vast topic, since the diversity of their functions calls for a different approach to each of them in management practice. Few tools can manage all embedded systems with a single tool, so our research focuses on managing families of such devices, grouped by the special-purpose function they perform. The aim of this work is therefore to design and develop software for the group management of a family of embedded devices and of ordinary general-purpose computers running any operating system. The contribution of this work is summarized in the following components: 1. The embedded devices we focused on are network devices (wireless or wired) with many operating modes (Access Points, Clients, Repeaters, Point-to-Point, WDS, Transparent Clients, Routers). 2. The software created for special-purpose computers can run on both MS Windows and NIX operating systems. 3. The software was developed on the basis of the ORSM system, an open-source tool for the remote management of systems and networks. 
(An asterisk, both in the table of contents and in the main body of the work, marks the capabilities of the new software relative to the ORSM system.) In summary, the management capabilities cover the following functions: • Asset discovery (Inventory Process). • System performance monitoring (Monitoring). • Software installation and uninstallation (Software Deployment). • Remote control (Remote Desktop). • Execution of shell commands (Remote Command). / An embedded system is a special-purpose computer system designed to perform one or a few dedicated functions, often with real-time computing constraints. It is usually embedded as part of a complete device that includes hardware and mechanical parts. In contrast, a general-purpose computer, such as a personal computer, can do many different tasks depending on its programming. Embedded systems control many of the common devices in use today. Managing infrastructure that contains such devices (embedded and general-purpose computers) is usually demanding and expensive, but nevertheless essential for organizations. Few tools achieve effective management of such infrastructure topologies. At present, open management solutions are few and immature; however, tools such as OpenRSM aim to deliver lightweight, remote management that is easily customizable to cover the needs of small organizations. OpenRSM implements a generic management framework that models generalized use cases, which users can exploit to adapt the tool to their needs. However, given the maturity of the tool, it is unclear how easy it would be for users to extend it to manage new types of devices. As network environments grow into digital ecosystems, the management targets increase in number and diversity. 
Wireless active elements, handheld systems, and embedded devices are becoming common and need to be brought under standard management practices in the same manner as routers or workstations. This paper describes how the OpenRSM management functionality can be extended to provide customizable management of embedded devices, and more specifically of wireless access points (the symbol * marks the new extensions of ORSM). In general, the management capabilities that are embedded in the OpenRSM system and target wireless active elements are: inventory process, monitoring, firmware upgrade, saving/reloading configuration settings, remote commands, and discovery process. 005.26 Embedded device management General purpose computers Access point management Remote desktop Monitoring Software deployment Inventory process Remote command Infrastructure management OpenRSM

