11 |
Efficient architectures for error control using low-density parity-check codes
Haley, David, January 2004 (has links)
Recent designs for low-density parity-check (LDPC) codes have exhibited capacity-approaching performance for large block lengths, overtaking the performance of turbo codes. While theoretically impressive, LDPC codes present some challenges for practical implementation. In general, LDPC codes have higher encoding complexity than turbo codes, both in terms of computational latency and architecture size. Decoder circuits for LDPC codes have a high routing complexity and thus demand large amounts of circuit area. There has been recent interest in developing analog circuit architectures suitable for decoding. These circuits offer a fast, low-power alternative to the digital approach. Analog decoders also have the potential to be significantly smaller than digital decoders. In this thesis we present a novel and efficient approach to LDPC encoder/decoder (codec) design. We propose a new algorithm which allows the parallel decoder architecture to be reused for iterative encoding. We present a new class of LDPC codes which are iteratively encodable, exhibit good empirical performance, and provide a flexible choice of code length and rate. Combining the analog decoding approach with this new encoding technique, we design a novel time-multiplexed LDPC codec, which switches between analog decode and digital encode modes. In order to achieve this behaviour from a single circuit we have developed mode-switching gates. These logic gates are able to switch between analog (soft) and digital (hard) computation, and represent a fundamental circuit design contribution. Mode-switching gates may also be applied to built-in self-test circuits for analog decoders. Only a small overhead in circuit area is required to transform the analog decoder into a full codec. The encode operation can be performed two orders of magnitude faster than the decode operation, making the circuit suitable for full-duplex applications. Throughput of the codec scales linearly with block size, for both encode and decode operations. The low power and small area requirements of the circuit make it an attractive option for small portable devices.
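For orientation, the sketch below shows a textbook hard-decision bit-flipping decoder over a toy parity-check matrix. It is purely illustrative: the thesis's decoder operates on soft values in analog circuitry, and the matrix, code size, and error position here are invented for the example.

```python
import numpy as np

# Toy (7,4) parity-check matrix; practical LDPC matrices are large and sparse.
H = np.array([
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
], dtype=np.uint8)

def bit_flip_decode(received, H, max_iters=20):
    """Hard-decision bit-flipping: repeatedly flip the bit that participates
    in the most unsatisfied parity checks until every check passes."""
    c = received.copy()
    for _ in range(max_iters):
        syndrome = (H @ c) % 2            # which parity checks fail
        if not syndrome.any():
            return c                      # all checks satisfied: valid codeword
        votes = H.T @ syndrome            # per-bit count of failing checks
        c[np.argmax(votes)] ^= 1          # flip the worst offender
    return c

r = np.array([1, 0, 1, 0, 0, 1, 0], dtype=np.uint8)   # codeword 1011010 with bit 3 flipped
print(bit_flip_decode(r, H))                           # -> [1 0 1 1 0 1 0]
```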
|
12 |
Efficient Cache Organization For Application Specific And General Purpose Processors
Rajan, Kaushik, 05 1900 (has links)
The performance gap between processor and memory remains a major bottleneck in both application-specific and general-purpose processors. This thesis strives to ease this bottleneck by exploiting the characteristics of the application domain to improve the cache organization for two distinct processor architectures:
(1) application-specific processors for packet forwarding, and (2) general-purpose processors.
Packet forwarding algorithms make use of a trie data structure to determine the forwarding route. We observe that the locality characteristics of the nodes at various levels of such a trie are different. Nodes that are closer to the root node, especially those that are immediate children of the root node (level-one nodes), exhibit higher temporal locality than nodes lower down the trie. Based on this observation we propose a novel Heterogeneously Segmented Cache Architecture (HSCA) that uses separate caches for level-one and lower-level nodes, each with carefully chosen sizes. We also propose a new replacement policy to enhance the performance of HSCA. Performance evaluation indicates that HSCA results in up to 32% reduction in average memory access time over a unified cache that shares the same cache space among all levels of the trie. HSCA also outperforms a previously proposed results cache.
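As a rough illustration of the split-cache idea (not the hardware organization evaluated in the thesis), the following sketch routes level-one trie nodes to a small dedicated cache and all deeper nodes to a second cache; the capacities, the LRU policy, and the fetch callback are placeholders.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny software stand-in for a cache, keyed by node address."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = self.misses = 0

    def access(self, key, fetch):
        if key in self.store:
            self.store.move_to_end(key)      # refresh recency on a hit
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = fetch(key)                   # simulate going to memory
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict the least recently used node
        return value

class HeterogeneousTrieCache:
    """Level-one trie nodes get their own cache; deeper nodes share another."""
    def __init__(self, level_one_capacity=64, lower_capacity=1024):
        self.level_one = LRUCache(level_one_capacity)
        self.lower = LRUCache(lower_capacity)

    def lookup_node(self, node_addr, level, fetch):
        cache = self.level_one if level == 1 else self.lower
        return cache.access(node_addr, fetch)

cache = HeterogeneousTrieCache()
node = cache.lookup_node(node_addr=0x42, level=1, fetch=lambda addr: {"addr": addr})
```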
The use of a large root branching factor in a forwarding trie forces a large number of nodes into level one. Among these, only nodes that cover prefixes from the routing table are useful, while the rest are superfluous. We find that as many as 75% of the level-one nodes are superfluous. This leads to a skewed distribution of useful nodes among the cache sets of the level-one nodes cache. We propose a novel two-level mapping framework that achieves a better node-to-cache-set mapping and hence incurs fewer conflict misses. Two-level mapping first aggregates nodes into Initial Partitions (IPs) using lower-order bits and then remaps them from IPs into Refined Partitions (RPs), which form sets, based on some higher-order bits. It provides flexibility in placement by allowing each IP to choose a different remap function. We propose three schemes conforming to the framework. A speedup in average memory access time of as much as 16% is gained over HSCA.
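The shape of that computation might look like the sketch below; the bit widths and the per-IP remap functions are invented for illustration and are not the schemes proposed in the thesis.

```python
def two_level_set_index(addr, ip_bits, num_sets, remap):
    """Two-step placement: low-order bits pick an Initial Partition (IP);
    the IP's own remap function then spreads that IP's addresses over the
    Refined Partitions (the actual cache sets) using higher-order bits."""
    ip = addr & ((1 << ip_bits) - 1)        # step 1: aggregate by low-order bits
    high = addr >> ip_bits                  # step 2: remap using higher-order bits
    return remap[ip](high) % num_sets

# Illustrative remap table: each IP XORs the high bits with its own constant.
ip_bits, num_sets = 4, 64
remap = {ip: (lambda high, k=ip * 7: high ^ k) for ip in range(1 << ip_bits)}

for addr in (0x1A2B, 0x3C4D, 0x5E6F):
    print(hex(addr), "-> set", two_level_set_index(addr, ip_bits, num_sets, remap))
```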
In general-purpose processor architectures, the design objectives of caches at the various levels of the hierarchy are different. To ensure low access latencies, L1 caches are small and have low associativities, making them more susceptible to conflict misses. The extent of conflict misses incurred is governed by the placement function and the memory access patterns exhibited by the program. We propose a mechanism to learn the access characteristics of the program at runtime by analyzing its repetitive phases. We then make use of the two-level mapping framework to dynamically adapt the placement function. Further, we elegantly incorporate two-level mapping into the cache organization without increasing the cache access latency. Performance evaluation reveals that the proposed adaptive placement mechanism eliminates 32-36% of misses on average over a range of cache sizes.
To prevent expensive off-chip accesses, L2 caches are larger and have higher associativities. Hence, the replacement policy plays a significant role in determining L2 cache performance. Further, as the inherent temporal locality in memory accesses is filtered out by the L1 cache, an L2 cache using the widely prevalent LRU replacement policy incurs significantly more misses than the optimal replacement policy (OPT). We propose to bridge this gap through a novel replacement strategy that mimics the replacement decisions of OPT. The L2 cache is logically divided into two components, a Shepherd Cache (SC) with simple FIFO replacement and a Main Cache (MC) with an emulation of optimal replacement. The SC plays the dual role of caching lines and shepherding the replacement decisions of the MC towards optimal. Our proposed organization can cover 40% of the gap between LRU and OPT, resulting in a 7% overall speedup.
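A heavily simplified, single-set software sketch of this mechanism is given below, assuming a FIFO SC and an MC victim chosen by the latest (or never observed) reuse recorded while the graduating line waited in the SC; the actual hardware tracks reuse ordering differently and handles many details omitted here.

```python
from collections import deque

class ShepherdCacheSet:
    """One cache set split into a Shepherd Cache (FIFO) and a Main Cache."""

    def __init__(self, sc_ways=2, mc_ways=4):
        self.sc = deque()                  # FIFO of (tag, observed-reuse times)
        self.sc_ways, self.mc_ways = sc_ways, mc_ways
        self.mc = []                       # tags currently in the Main Cache
        self.clock = 0

    def access(self, tag):
        self.clock += 1
        # While a line waits in the SC, note when each MC line is next reused.
        for _, reuse in self.sc:
            if tag in self.mc and tag not in reuse:
                reuse[tag] = self.clock
        if tag in self.mc or any(t == tag for t, _ in self.sc):
            return "hit"
        self.sc.append((tag, {}))          # misses always enter the SC first
        if len(self.sc) > self.sc_ways:
            self._graduate()
        return "miss"

    def _graduate(self):
        tag, reuse = self.sc.popleft()     # oldest SC line moves into the MC
        if len(self.mc) < self.mc_ways:
            self.mc.append(tag)
            return
        # Victim: the MC line never reused since this line arrived, or else
        # the one whose first reuse came latest (an OPT-like decision).
        victim = max(self.mc, key=lambda t: reuse.get(t, float("inf")))
        self.mc[self.mc.index(victim)] = tag

cache = ShepherdCacheSet()
for t in "ABCABDACEBFAG":
    cache.access(t)
```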
|
14 |
COMPASS - A Guide For Selection Of Compression Strategies For Embedded Processors
Menon, Sreejith K, 07 1900 (links) (PDF)
No description available.
|
15 |
The Design of an Oncology Knowledge Base from an Online Health Forum
Omar Ramadan (12446526), 22 April 2022 (has links)
Knowledge base completion is an important task that allows scientists to reason over knowledge bases and discover new facts. In this thesis, a patient-centric knowledge base is designed and constructed using medical entities and relations extracted from the health forum r/cancer. The knowledge base stores information in binary relation triplets. It is enhanced with an is-a relation that is able to represent the hierarchical relationship between different medical entities. An enhanced Neural Tensor Network that utilizes the frequency of occurrence of relation triplets in the dataset is then developed to infer new facts from the enhanced knowledge base. The results show that when the enhanced inference model uses the enhanced knowledge base, a higher accuracy (73.2%) and recall@10 (35.4%) are obtained. In addition, this thesis describes a methodology for knowledge base and associated inference model design that can be applied to other chronic diseases.
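A minimal sketch of such a triplet store with an is-a hierarchy is shown below; the entities and relations in the example are hypothetical and are not drawn from the r/cancer data.

```python
class PatientKB:
    """Store (head, relation, tail) triplets and walk the is-a hierarchy."""

    def __init__(self):
        self.triplets = set()
        self.parents = {}                  # entity -> set of is-a parents

    def add(self, head, relation, tail):
        self.triplets.add((head, relation, tail))
        if relation == "is-a":
            self.parents.setdefault(head, set()).add(tail)

    def ancestors(self, entity):
        """Transitive closure of the is-a relation for one entity."""
        seen, stack = set(), list(self.parents.get(entity, ()))
        while stack:
            e = stack.pop()
            if e not in seen:
                seen.add(e)
                stack.extend(self.parents.get(e, ()))
        return seen

kb = PatientKB()
kb.add("erlotinib", "is-a", "targeted therapy")         # hypothetical facts
kb.add("targeted therapy", "is-a", "cancer treatment")
kb.add("erlotinib", "treats", "lung cancer")
print(kb.ancestors("erlotinib"))   # {'targeted therapy', 'cancer treatment'}
```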
|
16 |
SENSOR FUSION IN NEURAL NETWORKS FOR OBJECT DETECTION
Sheetal Prasanna (12447189), 12 July 2022 (has links)
Object detection is an increasingly popular tool used in many fields, especially in the development of autonomous vehicles. The task of object detection involves localizing objects in an image, constructing a bounding box to determine the presence and location of each object, and classifying each object into its appropriate class. Object detection applications are commonly implemented using convolutional neural networks along with feature pyramid networks to extract data.

Another commonly used technique in the automotive industry is sensor fusion. Each automotive sensor – camera, radar, and lidar – has its own advantages and disadvantages. Fusing two or more sensors together and using the combined information is a popular method of balancing the strengths and weaknesses of each independent sensor. Using sensor fusion within an object detection network has been found to be an effective method of obtaining accurate models. Accurate detection and classification of images is a vital step in the development of autonomous vehicles or self-driving cars.

Many studies have proposed methods to improve neural networks or object detection networks. Some of these techniques involve data augmentation and hyperparameter optimization. This thesis achieves the goal of improving a camera and radar fusion network by implementing various techniques within these areas. Additionally, a novel idea of integrating a third sensor, the lidar, into an existing camera and radar fusion network is explored in this research work.

The models were trained on the Nuscenes dataset, one of the biggest automotive datasets available today. Using the concepts of augmentation, hyperparameter optimization, sensor fusion, and annotation filters, the CRF-Net was trained to achieve an accuracy score 69.13% higher than the baseline.
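As a simple illustration of fusing camera and radar data at the input level (not the CRF-Net architecture itself), the sketch below rasterizes radar detections into an extra image channel and stacks it onto the RGB channels; the projection, shapes, and data are placeholders, since a real pipeline would use the calibrated sensor geometry.

```python
import numpy as np

def fuse_camera_radar(camera_img, radar_points, img_shape):
    """Early fusion sketch: paint radar returns into an extra channel so a
    conventional CNN detector can consume camera and radar jointly."""
    h, w = img_shape
    radar_channel = np.zeros((h, w), dtype=np.float32)
    for u, v, rcs in radar_points:             # pixel coords plus radar cross-section
        if 0 <= int(v) < h and 0 <= int(u) < w:
            radar_channel[int(v), int(u)] = rcs
    return np.dstack([camera_img, radar_channel])   # H x W x 4 input tensor

camera = np.random.rand(8, 8, 3).astype(np.float32)   # dummy image
radar = [(2, 3, 0.7), (5, 1, 0.4)]                     # dummy detections
print(fuse_camera_radar(camera, radar, (8, 8)).shape)  # (8, 8, 4)
```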
|
17 |
MULTI-SOURCE AND SOURCE-PRIVATE CROSS-DOMAIN LEARNING FOR VISUAL RECOGNITION
Qucheng Peng (12426570), 12 July 2022 (has links)
Domain adaptation is one of the hottest directions for addressing the annotation insufficiency problem in deep learning. However, general domain adaptation does not match the practical scenarios found in industry. In this thesis, we focus on the two concerns below.

The first is that labeled data are generally collected from multiple domains; in other words, multi-source adaptation is the more common situation. Simply extending single-source approaches to the multi-source case could cause sub-optimal inference, so specialized multi-source adaptation methods are essential. The main challenge in the multi-source scenario is a more complex divergence situation: not only does the divergence between the target and each source play a role, but the divergences among distinct sources matter as well. However, the significance of maintaining consistency among multiple sources did not receive enough attention in previous work. In this thesis, we propose an Enhanced Consistency Multi-Source Adaptation (EC-MSA) framework to address it from three perspectives. First, we mitigate feature-level discrepancy by cross-domain conditional alignment, narrowing the divergence between each source and the target domain class-wise. Second, we enhance multi-source consistency via dual mix-up, diminishing the disagreements among different sources. Third, we deploy a target distilling mechanism to handle the uncertainty of target prediction, aiming to provide high-quality pseudo-labeled target samples that benefit the previous two aspects. Extensive experiments are conducted on several common benchmark datasets and demonstrate that our model outperforms the state-of-the-art methods.

The second is that data privacy and security are necessary in practice. That is, we hope to keep the raw data stored locally while still obtaining a satisfactory model; in such a case, the risk of data leakage greatly decreases. Therefore, it is natural for us to combine the federated learning paradigm with domain adaptation. Under the source-private setting, the main challenge is to expose information from the source domain to the target domain while making sure that the communication process is safe. In this thesis, we propose a method named Fourier Transform-Assisted Federated Domain Adaptation (FTA-FDA) to alleviate the difficulties in two ways. We apply the Fast Fourier Transform to the raw data and transfer only the amplitude spectra during communication. Frequency-space interpolations between the two domains are then conducted, minimizing the discrepancies between them while keeping the raw data safe. Moreover, we perform prototype alignment using the model weights together with target features, aiming to reduce the discrepancy at the class level. Experiments on Office-31 demonstrate the effectiveness and competitiveness of our approach, and further analyses prove that our algorithm can help protect privacy and security.
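The amplitude-transfer step can be pictured with the generic Fourier mixing sketch below; the window size, interpolation weight, and random arrays are illustrative, and the federated communication protocol described above is not modeled here.

```python
import numpy as np

def fourier_amplitude_mix(source_img, target_img, beta=0.1, lam=0.5):
    """Blend the low-frequency amplitude spectrum of a source image with a
    target image's while keeping the source phase, shifting style toward the
    target domain without touching the underlying content."""
    fft_src = np.fft.fft2(source_img)
    fft_tgt = np.fft.fft2(target_img)
    amp_src, phase_src = np.abs(fft_src), np.angle(fft_src)
    amp_tgt = np.abs(fft_tgt)

    # Centre the low frequencies, then interpolate inside a small window.
    amp_src = np.fft.fftshift(amp_src)
    amp_tgt = np.fft.fftshift(amp_tgt)
    h, w = source_img.shape
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    win = (slice(ch - bh, ch + bh), slice(cw - bw, cw + bw))
    amp_src[win] = (1 - lam) * amp_src[win] + lam * amp_tgt[win]
    amp_src = np.fft.ifftshift(amp_src)

    mixed = np.fft.ifft2(amp_src * np.exp(1j * phase_src))
    return np.real(mixed)

src, tgt = np.random.rand(64, 64), np.random.rand(64, 64)
print(fourier_amplitude_mix(src, tgt).shape)   # (64, 64)
```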
|
18 |
COMPARING PSO-BASED CLUSTERING OVER CONTEXTUAL VECTOR EMBEDDINGS TO MODERN TOPIC MODELING
Samuel Jacob Miles (12462660), 26 April 2022 (has links)
Efficient topic modeling is needed to support applications that aim at identifying main themes from a collection of documents. In this thesis, a reduced vector embedding representation and particle swarm optimization (PSO) are combined to develop a topic modeling strategy that is able to identify representative themes from a large collection of documents. Documents are encoded using a reduced, contextual vector embedding from a general-purpose pre-trained language model (sBERT). A modified PSO algorithm (pPSO) that tracks particle fitness on a dimension-by-dimension basis is then applied to these embeddings to create clusters of related documents. The proposed methodology is demonstrated on three datasets across different domains. The first dataset consists of posts from the online health forum r/Cancer. The second dataset is a collection of NY Times abstracts and is used to compare the proposed model to LDA. The third is a standard benchmark dataset for topic modeling which consists of a collection of messages posted to 20 different newsgroups. It is used to compare state-of-the-art generative document models (i.e., ETM and NVDM) to pPSO. The results show that pPSO is able to produce interpretable clusters. Moreover, pPSO is able to capture both common topics as well as emergent topics. The topic coherence of pPSO is comparable to that of ETM, and its topic diversity is comparable to that of NVDM. The assignment parity of pPSO on a document completion task exceeded 90% for the 20News-Groups dataset. This rate drops to approximately 30% when pPSO is applied to the same Skip-Gram embedding derived from a limited, corpus-specific vocabulary which is used by ETM and NVDM.
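A plain PSO clustering sketch over document embeddings is given below for orientation; it omits the per-dimension fitness tracking that distinguishes pPSO, and the data, dimensionality, and hyperparameters are placeholders rather than the settings used in the thesis.

```python
import numpy as np

def pso_cluster(embeddings, k=5, n_particles=20, iters=100, w=0.72, c1=1.5, c2=1.5):
    """Each particle encodes k candidate centroids; fitness is the mean
    distance from every document embedding to its nearest centroid."""
    n, d = embeddings.shape
    rng = np.random.default_rng(0)
    pos = embeddings[rng.integers(0, n, size=(n_particles, k))]   # seed from data
    vel = np.zeros_like(pos)
    pbest, pbest_fit = pos.copy(), np.full(n_particles, np.inf)
    gbest, gbest_fit = pos[0].copy(), np.inf

    def fitness(centroids):
        dists = np.linalg.norm(embeddings[:, None, :] - centroids[None], axis=-1)
        return dists.min(axis=1).mean()

    for _ in range(iters):
        for i in range(n_particles):
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest_fit[i], pbest[i] = f, pos[i].copy()
            if f < gbest_fit:
                gbest_fit, gbest = f, pos[i].copy()
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel

    labels = np.linalg.norm(embeddings[:, None, :] - gbest[None], axis=-1).argmin(axis=1)
    return gbest, labels

docs = np.random.rand(200, 16)                 # stand-in for reduced sBERT embeddings
centroids, labels = pso_cluster(docs, k=4, iters=50)
print(np.bincount(labels))                     # documents per cluster
```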
|
19 |
EXPLORATION OF NOVEL EDUCATIONAL TOOLS BASED ON VISUALIZATION
Abel Andres Reyes Angulo (11237160), 06 August 2021 (has links)
The dynamic of how teaching is performed has changed abruptly in the past few years. Even before the COVID-19 pandemic, class modalities were changing: instructors were adopting new modalities for lectures, such as online and hybrid classes, and collaborative resources were becoming more popular over time. The current situation was just a catalyst for a shift that had already begun, namely the beginning of a new era for education.

This new education era implies new areas of study and the implementation of tools that promote an efficient learning process by adapting to everything involved in this change. Science, technology, engineering, and mathematics (STEM) education and healthcare are fields with noticeable demand for professionals in industry around the world. Therefore, the need to have more people academically prepared in these areas is highly prioritized. New learning tools that complement these fields must reflect the adoption of new technologies and the fact that this is a digital era. Emergent specialties like artificial intelligence and data science are traditionally taught at the university level, due to the complexity of some concepts and the background needed to develop skills in these areas. However, with the technology currently available, tools can be used as complementary learning resources for complex subjects. Visualization helps users learn by sharpening the sense of sight and making evident things that are hard to illustrate with words or numbers. Therefore, software for education based on visualization could provide the new tools needed for these emergent specialties in this new educational era. Features like interactivity, gaming, and multimedia resources can help make these tools more robust and complete.

In this work, the implementation of novel educational tools based on visualization was explored for emergent specialization areas such as machine learning in STEM and pathophysiology in healthcare. This work summarizes the implementation of three different projects that illustrate its general purpose, showing the relevance of the mentioned areas, and proposes educational tools based on visualization, adapting the proposal to each specialty and keeping in mind different target populations. The work related to each of the proposed tools includes the analysis needed to elaborate the content within the tool, a review of the software development, and testing sessions to identify the strengths and weaknesses of the tool. The tools are designed as frameworks so that the delivered content can be customized over time and cover different educational needs.
|
20 |
DATA-CENTRIC DECISION SUPPORT SYSTEM FRAMEWORK FOR SELECTED APPLICATIONS
Xiang Gu (11090106), 15 December 2021 (has links)
The web and digital technologies have been growing continuously over the past five years. The data generated by Internet of Things (IoT) devices are heterogeneous, which increases data storage and management difficulties. This thesis developed user-friendly data management system frameworks for a local environment and a cloud platform. The two frameworks were applied to two industrial applications: an agriculture informatics system and a personal healthcare management system. The systems are capable of information management and two-way communication through a user-friendly interface.
|