61 |
Communications with 1-Bit Quantization and Oversampling at the Receiver: Benefiting from Inter-Symbol-Interference
Krone, Stefan; Fettweis, Gerhard. January 2012 (has links)
1-bit analog-to-digital conversion is very attractive for low-complexity communications receivers. A major drawback, however, is the low spectral efficiency that results when sampling at the symbol rate. This can be improved through oversampling, which exploits the signal distortion caused by the transmission channel. This paper analyzes the achievable data rate of band-limited communications channels that are subject to additive noise and inter-symbol interference, with 1-bit quantization and oversampling at the receiver. It is shown that not only the channel noise but also the inter-symbol interference can be exploited to benefit from oversampling.
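A small simulation sketch (ours, not the paper's model; the channel taps, noise level, and oversampling factor are arbitrary) illustrates why oversampling helps a 1-bit receiver: ISI and noise shift the zero crossings of the waveform within a symbol interval, so the M sign samples per symbol carry information that a single symbol-rate sample would discard.

```python
import numpy as np

# Illustrative sketch (not from the paper): a 1-bit receiver oversampling
# an ISI channel output. Taps, noise level, and M are arbitrary choices.
rng = np.random.default_rng(0)
M = 4                                   # oversampling factor (samples per symbol)
taps = np.array([0.8, 0.5, 0.2])        # hypothetical ISI channel impulse response
symbols = rng.choice([-1.0, 1.0], 64)   # BPSK symbol stream

upsampled = np.repeat(symbols, M)       # M samples per symbol interval
received = np.convolve(upsampled, taps, mode="same")
received += 0.1 * rng.standard_normal(received.size)

quantized = np.sign(received)           # the receiver only sees sign bits

# The M sign bits within one symbol interval can differ, because ISI and
# noise move the zero crossings -- exactly the effect the paper exploits.
print(quantized.reshape(-1, M)[:8])
```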
62 |
Reduced Complexity Window Decoding Schedules for Coupled LDPC Codes
Hassan, Najeeb ul; Pusane, Ali E.; Lentmaier, Michael; Fettweis, Gerhard P.; Costello, Daniel J. January 2012 (has links)
Window decoding schedules are very attractive for message-passing decoding of spatially coupled LDPC codes. They take advantage of the inherent convolutional code structure and allow continuous transmission with low decoding latency and complexity. In this paper, we show that the decoding complexity can be further reduced if suitable message-passing schedules are applied within the decoding window. An improvement-based schedule is presented that adapts easily to different ensemble structures, window sizes, and channel parameters. Its combination with a serial (on-demand) schedule is also considered. Results from a computer-search-based schedule are shown for comparison.
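The skeleton below (a hedged sketch of ours, not the paper's exact schedules) shows where such schedules plug in: `update` stands in for one round of message passing at a spatial position, and stopping the window iterations once the target position is reliable is the kind of saving the schedules above formalize. A real improvement-based schedule would also choose *which* positions to update instead of sweeping uniformly.

```python
# Illustrative skeleton (ours): window decoding of a spatially coupled code.
def window_decode(num_positions, W, I_max, update, target_converged):
    for t in range(num_positions):                  # window slides right by one
        window = range(t, min(t + W, num_positions))
        for it in range(I_max):
            for pos in window:                      # uniform schedule: update every
                update(pos)                         # position in the window
            if target_converged(t):                 # stop early once the target
                break                               # symbol is reliable
        yield t, it + 1                             # decide position t, shift window

# Hypothetical stand-ins, just to make the skeleton runnable:
updates = []
for t, iters in window_decode(
        6, W=3, I_max=10,
        update=updates.append,
        target_converged=lambda t: len(updates) > 4 * (t + 1)):
    print(f"decided position {t} after {iters} window iterations")
```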
63 |
Non-regenerative Two-Hop Wiretap Channels using Interference Neutralization
Gerbracht, Sabrina; Jorswieck, Eduard A.; Zheng, Gan; Ottersten, Björn. January 2012 (has links)
In this paper, we analyze the achievable secrecy rates in the two-hop wiretap channel with four nodes, where the transmitter and the receiver have multiple antennas while the relay and the eavesdropper have only a single antenna each. The relay operates in amplify-and-forward mode, and all channels between the nodes are known perfectly by the transmitter. We discuss different transmission and protection schemes such as artificial noise (AN). Furthermore, we introduce interference neutralization (IN) as a new protection scheme. We compare the different schemes with respect to the high-SNR slope and the high-SNR power offset and illustrate the performance with simulation results. It is shown analytically, as well as by numerical simulations, that the high-SNR performance of the proposed IN scheme is better than that of AN.
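To make the idea concrete, here is one way a neutralization condition can be written; the notation is ours, and the paper's exact signal model may differ. With first-hop beamformer $\mathbf{w}_1$, transmitter-to-relay channel $\mathbf{h}_r$, relay gain $a$, relay-to-eavesdropper coefficient $g_e$, and transmitter-to-eavesdropper channel $\mathbf{h}_e$, a second-hop transmit beam $\mathbf{w}_2$ gives the eavesdropper the observation

$$ y_E^{(2)} = a\,g_e\,(\mathbf{h}_r^{H}\mathbf{w}_1)\,s + (\mathbf{h}_e^{H}\mathbf{w}_2)\,s + n_E^{(2)}, $$

so choosing $\mathbf{w}_2$ with $\mathbf{h}_e^{H}\mathbf{w}_2 = -\,a\,g_e\,\mathbf{h}_r^{H}\mathbf{w}_1$ cancels the copy of $s$ leaked through the relay, and the second hop reveals nothing new to the eavesdropper.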
64 |
ENERGY EFFICIENT EDGE INFERENCE SYSTEMS
Soumendu Kumar Ghosh (14060094). 07 August 2023 (has links)
Deep Learning (DL)-based edge intelligence has garnered significant attention in recent years due to the rapid proliferation of the Internet of Things (IoT), embedded, and intelligent systems, collectively termed edge devices. Sensor data streams acquired by these edge devices are processed by a Deep Neural Network (DNN) application that runs on the device itself or in the cloud. However, the high computational complexity and energy consumption of processing DNNs often limit their deployment on these edge inference systems due to limited compute, memory, and energy resources. Furthermore, high costs, strict application latency demands, data privacy, security constraints, and the absence of reliable edge-cloud network connectivity heavily impact edge application efficiency in the case of cloud-assisted DNN inference. Inevitably, performance and energy efficiency are of utmost importance in these edge inference systems, aside from the accuracy of the application. To facilitate energy-efficient edge inference systems running computationally complex DNNs, this dissertation makes three key contributions.
The first contribution adopts a full-system approach to Approximate Computing, a design paradigm that trades off a small degradation in application quality for significant energy savings. Within this context, we present the foundational concepts of AxIS, the first approximate edge inference system that jointly optimizes the constituent subsystems, leading to substantial energy benefits compared to optimizing each subsystem individually. To illustrate the efficacy of this approach, we demonstrate multiple versions of an approximate smart camera system that executes various DNN-based unimodal computer vision applications, showcasing how the sensor, memory, compute, and communication subsystems can all be synergistically approximated for energy-efficient edge inference.
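A toy example (ours, not AxIS itself) of the quality-for-energy trade at two of those subsystems: subsampling at the sensor and coarse quantization in memory each shrink the work downstream at the cost of a measurable quality loss.

```python
import numpy as np

# Toy illustration (ours) of the approximate-computing trade-off:
# approximate a frame at the "sensor" and in "memory", trading a small
# quality loss for a 4x reduction in pixels processed downstream.
x, y = np.meshgrid(np.linspace(0, 1, 224), np.linspace(0, 1, 224))
frame = 0.5 * (np.sin(8 * x) + np.cos(5 * y))      # synthetic test frame

approx = frame[::2, ::2]                           # sensor: 2x2 subsampling
approx = np.round(approx * 7) / 7                  # memory: coarse 15-level quantization
recon = np.repeat(np.repeat(approx, 2, axis=0), 2, axis=1)

mse = float(np.mean((frame - recon) ** 2))
print(f"pixels processed: {approx.size}/{frame.size}, MSE: {mse:.5f}")
```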
Building on this foundation, the second contribution extends AxIS to multimodal AI, harnessing data from multiple sensor modalities to impart human-like cognitive and perceptual abilities to edge devices. By exploring optimization techniques for multiple sensor modalities and subsystems, this research reveals the impact of synergistic modality-aware optimizations on system-level accuracy-efficiency (AE) trade-offs, culminating in the introduction of SysteMMX, the first AE scalable cognitive system that allows efficient multimodal inference at the edge. To illustrate the practicality and effectiveness of this approach, we present an in-depth case study centered around a multimodal system that leverages RGB and Depth sensor modalities for image segmentation tasks.
The final contribution focuses on optimizing the performance of an edge-cloud collaborative inference system through intelligent DNN partitioning and computation offloading. We delve into the realm of distributed inference across edge devices and cloud servers, unveiling the challenges associated with finding the optimal partitioning point in DNNs for significant inference latency speedup. To address these challenges, we introduce PArtNNer, a platform-agnostic and adaptive DNN partitioning framework capable of dynamically adapting to changes in communication bandwidth and cloud server load. Unlike existing approaches, PArtNNer does not require pre-characterization of underlying edge computing platforms, making it a versatile and efficient solution for real-world edge-cloud scenarios.
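For intuition, a brute-force baseline (ours; PArtNNer itself explicitly avoids this kind of per-platform pre-characterization and adapts online instead) shows how the optimal cut point moves with bandwidth. All latencies and tensor sizes below are hypothetical.

```python
# Hedged baseline (ours, not PArtNNer's algorithm): exhaustive search for
# the partition point minimizing end-to-end latency, given per-layer
# profiles. Cutting after layer k runs layers 1..k on the edge, k+1..n in
# the cloud, and uploads the tensor that crosses the cut.
def best_partition(edge_ms, cloud_ms, upload_bytes, bandwidth_bps):
    n = len(edge_ms)

    def latency_ms(k):
        transfer = 8e3 * upload_bytes[k] / bandwidth_bps   # uplink of cut tensor
        return sum(edge_ms[:k]) + transfer + sum(cloud_ms[k:])

    return min(range(n + 1), key=latency_ms)

# upload_bytes[k] = bytes crossing the cut after layer k
# (k = 0: raw input; k = n: final output, approximated as an upload).
edge_ms  = [12.0, 20.0, 18.0, 6.0]       # hypothetical edge latency per layer
cloud_ms = [ 1.0,  1.5,  1.2, 0.4]       # hypothetical cloud latency per layer
upload_bytes = [602112, 802816, 401408, 100352, 4000]

print(best_partition(edge_ms, cloud_ms, upload_bytes, 5e6))   # slow link: stay on edge
print(best_partition(edge_ms, cloud_ms, upload_bytes, 1e8))   # fast link: offload early
```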
Overall, this thesis provides novel insights, innovative techniques, and intelligent solutions to enable energy-efficient AI at the edge. The contributions presented herein serve as a solid foundation for future researchers to build upon, driving innovation and shaping the trajectory of research in edge AI.
65 |
HAEC News
January 2012 (has links)
No description available.
66 |
HAEC News
January 2012 (has links)
No description available.
67 |
Information Leakage Neutralization for the Multi-Antenna Non-Regenerative Relay-Assisted Multi-Carrier Interference Channel
Ho, Zuleita; Jorswieck, Eduard; Engelmann, Sabrina. January 2013 (has links)
In heterogeneous dense networks where spectrum is shared, users' privacy remains one of the major challenges. On a multi-antenna relay-assisted multi-carrier interference channel, each user shares the spectral and spatial resources with all other users. When the receivers are not only interested in their own signals but also in eavesdropping on other users' signals, the crosstalk on the spectral and spatial channels becomes information leakage. In this paper, we propose a novel secrecy-rate-enhancing relay strategy that utilizes both spectral and spatial resources, termed information leakage neutralization. To this end, the relay matrix is chosen such that the effective direct channel from the transmitter to the colluding eavesdropper equals the negative of the effective channel through the relay to the colluding eavesdropper, thus driving the information leakage to zero. Interestingly, the optimal relay matrix is in general not block-diagonal, which encourages the users to encode across the frequency channels. We propose two information leakage neutralization strategies, namely efficient information leakage neutralization (EFFIN) and local-optimized information leakage neutralization (LOPTIN). EFFIN provides a simple and efficient design of the relay processing matrix and the precoding matrices at the transmitters in scenarios with limited power and computational resources. LOPTIN, despite its higher complexity, provides better sum secrecy rate performance by optimizing the relay processing matrix and the precoding matrices jointly. The proposed methods are shown to improve the sum secrecy rates over several state-of-the-art baseline methods.
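In illustrative notation (ours), the neutralization condition described above can be written per eavesdropper as follows: with $\mathbf{h}_{TE}$ the effective direct transmitter-to-eavesdropper channel, $\mathbf{H}_{TR}$ the transmitter-to-relay channel, $\mathbf{g}_{RE}$ the relay-to-eavesdropper channel, and $\mathbf{F}$ the relay processing matrix (all stacked over the subcarriers),

$$ \mathbf{h}_{TE}^{H} + \mathbf{g}_{RE}^{H}\,\mathbf{F}\,\mathbf{H}_{TR} = \mathbf{0}^{T}, $$

so the direct leakage and the relayed leakage cancel at the eavesdropper. Because $\mathbf{F}$ acts on the stacked subcarrier signals, satisfying this condition generally requires off-block-diagonal entries, which is why the optimal relay mixes subcarriers and encourages encoding across frequency channels.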
68 |
ACCELERATING SPARSE MACHINE LEARNING INFERENCE
Ashish Gondimalla (14214179). 17 May 2024 (has links)
Convolutional neural networks (CNNs) have become important workloads due to their impressive accuracy in tasks like image classification and recognition. Convolution operations are compute-intensive, and this cost increases profoundly with newer and better CNN models. However, convolutions come with characteristics, such as sparsity, which can be exploited. In this dissertation, we propose three different works to capture sparsity for faster performance and reduced energy.

The first work is an accelerator design called SparTen for convolutions with fine-grained, two-sided sparsity (i.e., sparsity in both filters and feature maps). SparTen identifies the efficient inner join as the key primitive for hardware acceleration of sparse convolution. In addition, SparTen proposes load-balancing schemes for higher compute unit utilization. SparTen performs 4.7x, 1.8x, and 3x better than a dense architecture, a one-sided architecture, and SCNN, the previous state-of-the-art accelerator, respectively. The second work, BARISTA, scales up SparTen (and SparTen-like proposals) to large-scale implementations with as many compute units as recent dense accelerators (e.g., Google's Tensor Processing Unit) to achieve the full speedups afforded by sparsity. However, at such large scales, buffering, on-chip bandwidth, and compute utilization are highly intertwined: optimizing for one factor strains another and may invalidate some optimizations proposed in small-scale implementations. BARISTA proposes novel techniques to balance the three factors in large-scale accelerators. BARISTA performs 5.4x, 2.2x, 1.7x, and 2.5x better than dense, one-sided, naively scaled two-sided, and iso-area two-sided architectures, respectively. The last work, EUREKA, builds an efficient tensor core to execute dense, structured, and unstructured sparsity without losing efficiency. EUREKA achieves this by proposing novel techniques to improve compute utilization by slightly tweaking operand stationarity. EUREKA achieves speedups of 5x and 2.5x, along with energy reductions of 3.2x and 1.7x, over dense and structured-sparse execution, respectively. EUREKA incurs area and power overheads of only 6% and 11.5%, respectively, over Ampere.
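As a minimal sketch (ours) of the inner-join primitive that SparTen builds on: a sparse dot product only needs the positions where both operands are nonzero, which hardware can find by ANDing the operands' bitmasks. Real hardware stores compressed nonzeros plus bitmasks; dense NumPy arrays keep the sketch simple.

```python
import numpy as np

# Minimal sketch (ours) of a bitmask-style sparse inner join: multiply
# only where BOTH the filter and the feature map are nonzero.
def sparse_inner_join(filt, fmap):
    both = (filt != 0) & (fmap != 0)              # AND of the two bitmasks
    return float(np.dot(filt[both], fmap[both]))  # MACs on survivors only

filt = np.array([0.0, 1.5, 0.0, -2.0, 0.0, 0.5])
fmap = np.array([3.0, 0.0, 0.0,  4.0, 1.0, 2.0])
# Only indices 3 and 5 are nonzero in both vectors: (-2)*4 + 0.5*2 = -7.0
print(sparse_inner_join(filt, fmap))
```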
69 |
HAEC News
06 September 2013 (has links) (PDF)
No description available.
70 |
HAEC News
January 2013 (has links)
No description available.