31 |
No Hypervisor Is an Island : System-wide Isolation Guarantees for Low Level Code
Schwarz, Oliver (January 2016)
The times when malware was mostly written by curious teenagers are long gone. Nowadays, threats come from criminals, competitors, and government agencies. Some of them are very skilled and very targeted in their attacks. At the same time, our devices – for instance mobile phones and TVs – have become more complex, connected, and open for the execution of third-party software. Operating systems should separate untrusted software from confidential data and critical services. But their vulnerabilities often allow malware to break the separation and isolation they are designed to provide. To strengthen protection of select assets, security research has started to create complementary machinery such as security hypervisors and separation kernels, whose sole task is separation and isolation. The reduced size of these solutions allows for thorough inspection, both manual and automated. In some cases, formal methods are applied to create mathematical proofs of the security of these systems. The actual isolation solutions themselves are carefully analyzed, and included software is often even verified at the binary level. The role of other software and hardware for the overall system security has received less attention so far. The subject of this thesis is to shed light on these aspects, mainly on (i) unprivileged third-party code and its ability to influence security, (ii) peripheral devices with direct access to memory, and (iii) boot code and how we can selectively enable and disable isolation services without compromising security. The papers included in this thesis are both design- and verification-oriented, with an emphasis on the analysis of instruction set architectures. With the help of a theorem prover, we implemented various types of machinery for the automated information flow analysis of several processor architectures. The analysis is guaranteed to be both sound and accurate. / In the past, malicious software was mostly written by curious teenagers. Today our computers are under constant threat from government organizations, criminal groups, and perhaps even our business competitors. Some possess great expertise and can carry out focused attacks. At the same time, the technology around us (such as mobile phones and TV sets) has become more complex, connected, and open to executing third-party software. Operating systems should really isolate sensitive data and critical services from software that is not trustworthy. But their vulnerabilities usually make it possible for malicious software to get past the operating systems' security mechanisms. This has led to the development of complementary tools whose sole function is to improve the isolation of selected sensitive resources. Special virtualization software and separation kernels are examples of such tools. Since such solutions can be developed with a relatively small code base, it is possible to analyze them thoroughly, both manually and automatically. In some cases, formal methods are used to generate mathematical proofs that the system is secure. The isolation software itself is usually extensively verified, sometimes even at the assembly level. However, the impact of other components on the system's security has so far received less attention, with regard to both hardware and other software. This thesis attempts to shed light on these aspects, mainly (i) unprivileged third-party code and how it can affect security, (ii) peripheral devices with direct access to memory, and (iii) the boot code, as well as how isolation services can be enabled and disabled securely without restarting the system. The thesis is based on six previous publications that deal with both design and verification aspects, but mostly with the security analysis of instruction sets. Based on a theorem prover, we have developed various tools for the automated information flow analysis of processors. We have used these tools to clarify which registers unprivileged software has access to on ARM and MIPS machines. This analysis is guaranteed to be both sound and precise. As far as we know, we are the first to have published a solution for automated analysis and proof of information flow properties in standard instruction sets. / QC 20160919 / PROSPER / HASPOC
|
32 |
Design of Programmable Baseband Processors
Tell, Eric (January 2005)
The world of wireless communications is under constant change. Radio standards evolve and new standards emerge. More and more functionality is put into wireless terminals. For example, mobile phones need to handle both second- and third-generation mobile telephony as well as Bluetooth, and will soon also support wireless LAN functionality, reception of digital audio and video broadcasting, etc. These developments have led to an increased interest in software defined radio (SDR), i.e., radio devices that can be reconfigured via software. SDR would provide benefits such as low cost for multi-mode devices, reuse of the same hardware in different products, and increased product lifetime via software updates. One essential part of any software defined radio is a programmable baseband processor that is flexible enough to handle different types of modulation, different channel coding schemes, and different trade-offs between data rate and mobility. So far, programmable baseband solutions have mostly been used in high-end systems such as mobile telephony base stations, since their cost and power consumption have been considered too high for handheld terminals. In this work, a new low-power, low-silicon-area programmable baseband processor architecture aimed at multi-mode terminals is presented. The architecture is based on a customized DSP core and a number of hardware accelerators connected via a configurable network. The architecture offers a good trade-off between flexibility and performance through an optimized instruction set, efficient hardware acceleration of carefully selected functions, low memory cost, and low control overhead. One main contribution of this work is a study of important issues in programmable baseband processing such as software-hardware partitioning, instruction-level acceleration, low-power design, and memory issues. Further contributions are a unique optimized instruction set architecture, a unique architecture for efficient integration of hardware accelerators in the processor, and mapping of complete baseband applications to the presented architecture. The architecture has been proven in a manufactured demonstrator chip for wireless LAN applications. Wireless LAN firmware has been developed and run on the chip at full speed. Silicon area and measured power consumption have proven to be similar to those of a non-programmable ASIC solution.
|
33 |
Scheduling Algorithms for Instruction Set Extended Symmetrical Homogeneous Multiprocessor Systems-on-Chip
Montcalm, Michael R. (10 June 2011)
Embedded system designers face multiple challenges in fulfilling the runtime requirements of programs. Effective scheduling of programs is required to extract as much parallelism as possible. These scheduling algorithms must also improve speedup after instruction-set extensions have occurred. Scheduling of dynamic code at run time is made more difficult when the static components of the program are scheduled inefficiently. This research aims to optimize a program’s static code at compile time. This is achieved with four algorithms designed to schedule code at the task and instruction level. Additionally, the algorithms improve scheduling using instruction set extended code on symmetrical homogeneous multiprocessor systems. Using these algorithms, we achieve speedups up to 3.86X over sequential execution for a 4-issue 2-processor system, and show better performance than recent heuristic techniques for small programs. Finally, the algorithms generate speedup values for a 64-point FFT that are similar to the test runs.
|
34 |
Automatic Generation of Hardware for Custom Instructions
Necsulescu, Philip I (12 August 2011)
The Software/Hardware Implementation and Research Architecture (SHIRA) is a C-to-hardware toolchain developed by the Computer Architecture Research Group (CARG) of the University of Ottawa. A framework and algorithms to generate the hardware from an Intermediate Representation (IR) of the C code are needed. This dissertation presents the conception, design, and development of a module that generates the hardware for custom instructions identified by specialized SHIRA components without the need for any user interaction. The module is programmed in Java and takes a Data Flow Graph (DFG) as its input IR. It then generates VHDL code that targets Altera FPGAs. It is possible to use separate components for each operation or to set a maximum number for each component, which leads to component reuse and reduces chip area use. The performance improvement of the generated code is compared to using only the processor’s standard instruction set.
|
37 |
Integration of virtual platform models into a system-level design framework
Salinas Bomfim, Pablo E. (24 November 2010)
The fields of System-On-Chip (SOC) and Embedded Systems Design have received a lot of attention in recent years. As part of an effort to increase productivity and reduce the time-to-market of new products, different approaches for Electronic System-Level Design frameworks have been proposed. These different methods promise a transparent co-design of hardware and software without having to focus on the final hardware/software split.
In our work, we focused on enhancing the component database, modeling, and synthesis capabilities of the System-On-Chip Environment (SCE). We investigated two different virtual platform emulators (QEMU and OVP) for integration into SCE. Based on a comparative analysis, we opted to integrate the Open Virtual Platforms (OVP) models and tested the enhanced SCE simulation, design, and synthesis capabilities with a JPEG encoder application, which uses both custom hardware and software as part of the system.
Our approach not only provides fast functional verification support for designers (10+ times faster than cycle-accurate models), but also offers a good speed/accuracy trade-off when compared against integration of cycle-accurate or behavioral (host-compiled) models.
|
40 |
Optimized SIMD architecture exploration and implementation for ultra-low energy processors
Δακουρού, Στεφανία (19 July 2012)
On-line monitoring is an important challenge in future biotechnology applications, for instance in the domain of precision livestock farming, where there is a strong need for low-cost intelligent sensors to monitor animal welfare. On-line poultry monitoring can significantly improve the living conditions of hens in industrial farms. However, a very low-cost, low-energy solution is needed because of stringent battery limitations. Domain-specific ASIPs can be an ideal solution when they cover enough submarkets to increase the production volume (reducing the price) and ultra-low energy concepts are used for their realization.
This work is part of a larger project aiming at high energy efficiency. The current study implements data parallelization, using a recently introduced software-controlled SIMD realization in an innovative way. The approaches employed to determine the final instruction set of the architecture, which was created for mapping the critical Gauss loop of the detection application, are thoroughly explored. The redesign of the data-parallel data path, also referred to as the Soft-SIMD architecture, was necessary in order to achieve instruction encoding optimization.
Furthermore, we have explored the capabilities that a commercial retargetable compiler tool, such as Target, can offer for our target design, and we have suggested some potential modifications that would help the tool become more efficient and useful for a designer’s needs in such an architecture. This study thereby also demonstrates the promising results obtained by experimenting with detours around the current Target tool design limitations.
Finding the right balance between efficiency and flexibility requires the ability to quickly evaluate alternative architectures through simulations and testing techniques. The methods developed for exactly this purpose, with the help of Target’s IP Designer retargetable tool-suite, are discussed in detail. By exploiting the profiling information produced by the instruction set simulator (ISS), and by reading the assembly code produced by the C compiler, it is possible to identify the instructions in the critical loop and optimize them using a number of techniques that are discussed. The main purpose of this optimization is to reduce the cycle count of the application in order to reduce the overall power consumption. VHDL files of the optimized and unoptimized processor are automatically generated using the HDL generation tool.
However, when examining a bio-imaging application instantiated from the ULP-ASIP architectural template [FEENECS book], many other issues arise as well. In particular, the way these kinds of implementations have to be tested should be taken into consideration. Preferably, the test approach should be not only sufficient and efficient but also reusable, in the sense that test patterns can be generated not just for a specific application or group of applications but for the entire architectural template. Therefore, this study also illustrates a systematic test vector generation process for the ULP-ASIP template. Our goal is to formulate generalized principles, because such principles are reusable and can be applied to any instance, such as our present processor for the Gauss filter.
Finally, this study concludes by presenting realistic power numbers, based on layout back-annotation, for the data path components of the processor. Based on all the advanced optimizations and broad search space explorations presented in this thesis, a heavily optimized ASIP architecture has been fully implemented, which achieves low cost and ultra-low energy consumption while
still meeting all the performance requirements. / The automatic method for monitoring living organisms, as researched and published by the Biosystems Department (BIOSYST) of K.U. Leuven [1], consists of a "computer vision" application which classifies their behavior based on their responses. This biotechnology application develops a fully automated "computer vision" system for individual, confined hens. The application is divided into two algorithms, the first of which detects the monitored object (detection algorithm) and the second of which tracks it (tracking algorithm).
The present study is part of a larger project and a continuation of previous work developed in this area. The purpose of this study is to explore the architecture created for mapping the critical Gauss loop of the detection algorithm, in order to determine the final instruction set of the ULP-ASIP SIMD processor. The techniques and approaches used to support the optimization of the instruction set encoding are presented extensively in Chapter 2. In addition, during the architecture exploration, the defined instruction set and the mapping techniques are re-examined in order to reduce the overall execution cost. Finding the right balance between efficiency and flexibility requires the ability to quickly evaluate alternative architectures through simulations and testing techniques. Chapter 3 explains the methods developed for exactly this purpose, with the help of the IP design environment of TARGET Compiler Technologies, which offers a complete retargetable tool. However, a more systematic test vector generation process for the entire ULP-ASIP platform turned out to be a very important advantage for validating the operation of the ULP-ASIP processor. Therefore, such a method is analyzed and presented extensively in Chapter 4.
Finally, Chapter 5 presents the energy estimation of the processor's data path. Based on all the advanced optimizations and broad search space explorations presented in the previous chapters, a heavily optimized, synthesizable ASIP architecture is fully implemented, leading to a low-cost, very low energy platform while meeting all performance requirements.
|