51 |
Hardware Acceleration of Video analytics on FPGA using OpenCL
January 2019
abstract: With the exponential growth in video content over the last few years, video analysis is becoming more crucial for many applications such as self-driving cars, healthcare, and traffic management. Most of these video analysis applications use deep learning algorithms such as convolutional neural networks (CNNs) because of their high accuracy in object detection. Enhancing the performance of CNN models therefore becomes crucial for video analysis. CNN models are computationally expensive and often require high-end graphics processing units (GPUs) for acceleration. However, for real-time applications in energy- and thermal-constrained environments such as traffic management, GPUs are less preferred because of their high power consumption and limited energy efficiency, and because they are difficult to fit in a small space.
To enable real-time video analytics in emerging large-scale Internet of Things (IoT) applications, the computation must happen at the network edge (near the cameras) in a distributed fashion. Thus, edge computing must be adopted. Recent studies have shown that field-programmable gate arrays (FPGAs) are highly suitable for edge computing due to their architectural adaptiveness, high computational throughput for streaming processing, and high energy efficiency.
This thesis presents a generic OpenCL-defined CNN accelerator architecture optimized for FPGA-based real-time video analytics at the edge. The proposed CNN OpenCL kernel adopts a highly pipelined and parallelized 1-D systolic array architecture, which exploits both spatial and temporal parallelism for energy-efficient CNN acceleration on FPGAs. The large fan-in and fan-out of computational units to the memory interface are identified as the limiting factor that causes scalability issues in existing designs, and solutions are proposed to resolve the issue with compiler automation. The proposed CNN kernel is highly scalable and parameterized by three architecture parameters, namely pe_num, reuse_fac, and vec_fac, which can be adapted to achieve 100% utilization of the coarse-grained computation resources (e.g., DSP blocks) for a given FPGA. The proposed CNN kernel is generic and can be used to accelerate a wide range of CNN models without recompiling the FPGA kernel hardware. The performance of AlexNet, ResNet-50, RetinaNet, and Light-weight RetinaNet has been measured with the proposed CNN kernel on an Intel Arria 10 GX1150 FPGA. The measurements show that the proposed CNN kernel, when mapped with 100% utilization of computation resources, achieves latencies of 11 ms, 84 ms, 1614.9 ms, and 990.34 ms for AlexNet, ResNet-50, RetinaNet, and Light-weight RetinaNet, respectively, when the input feature maps and weights are represented using the 32-bit floating-point data type. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2019
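As a rough illustration of what a 1-D systolic array does, the following Python sketch simulates a chain of processing elements (PEs) cycle by cycle. It is not the thesis's OpenCL kernel: the abstract does not specify the dataflow, so the output-stationary scheme here is an assumption, as is the reading of pe_num as the number of PEs and vec_fac as the width of each input chunk. The reuse_fac parameter is left out because its meaning cannot be inferred from the abstract. The function name `systolic_1d_matvec` is hypothetical.

```python
def systolic_1d_matvec(weights, x, vec_fac):
    """Cycle-level sketch of a 1-D systolic array computing weights @ x.

    Assumed dataflow (illustrative only): PE i keeps row i of `weights`
    and its own accumulator; chunks of `x` of width `vec_fac` enter PE 0
    and move one PE further down the chain every cycle.
    """
    pe_num = len(weights)
    if len(x) % vec_fac != 0:
        raise ValueError("len(x) must be a multiple of vec_fac")
    chunks = [x[i:i + vec_fac] for i in range(0, len(x), vec_fac)]

    pipeline = [None] * pe_num          # input register in front of each PE
    acc = [0.0] * pe_num                # one accumulator per PE
    total_cycles = len(chunks) + pe_num - 1

    for cycle in range(total_cycles):
        # Shift every chunk to the next PE, then feed a new chunk into PE 0.
        pipeline = [None] + pipeline[:-1]
        pipeline[0] = (cycle, chunks[cycle]) if cycle < len(chunks) else None

        # All PEs work in parallel on whatever chunk currently sits in front of them.
        for pe, item in enumerate(pipeline):
            if item is None:
                continue
            k, chunk = item
            base = k * vec_fac
            w_slice = weights[pe][base:base + vec_fac]
            acc[pe] += sum(w * v for w, v in zip(w_slice, chunk))

    return acc, total_cycles


result, cycles = systolic_1d_matvec(
    [[1, 2, 3, 4], [5, 6, 7, 8], [0, 1, 0, 1]], [1, 1, 2, 2], vec_fac=2
)
print(result, cycles)
```

In this sketch each input chunk is read from memory once and then reused by every PE as it travels down the chain, which is the property that keeps the fan-out from the memory interface small.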
|
52 |
Advanced techniques for improving radar performance
Shoukry, Mohammed Adel 03 December 2019
Wideband beamforming has been widely used in modern radar systems. One powerful wideband beamforming technique capable of achieving high selectivity over a wide bandwidth is the nested array (NA) beamformer. Such a beamformer consists of nested antenna arrays, 2-D spatio-temporal filters, and multirate filterbanks. Its speed of operation is bounded by the speed of the hardware implementation.
This dissertation presents the use of a systematic methodology for design space exploration of the basic building blocks of the NA beamformer. For each block, the systolic array design with the highest achievable clock speed was selected for hardware implementation. The proposed systolic array designs and the conventional designs were implemented in FPGA hardware to verify their functionality and compare their performance. The implementation results confirm that the proposed systolic array implementations are faster and require fewer hardware resources than the published designs. The overall beamformer FPGA implementation is constructed based on the analysis of efficient systolic array designs of the beamformer building blocks. The implemented overall structure is then validated to ensure its proper operation. Further, the implementation performance is evaluated in terms of accuracy and error analysis in comparison to the MATLAB simulations. A new methodology, built on the systematic methodology, aims to close the gap between modern wideband radar I/O rates and the silicon operating speed. This new methodology is applied to the interpolator block as an example. The proposed methodology is simulated and tested using MATLAB object-oriented programming (OOP) to ensure proper operation. / Graduate / 2020-11-17
|
53 |
Lack of Osteopontin Induces Systolic and Diastolic Dysfunction in the Heart Following Myocardial Ischemia/Reperfusion Injury
James, Caytlin 01 May 2020
Ischemic heart disease is a leading cause of death worldwide. Osteopontin (OPN), a cell-secreted extracellular matrix protein, is suggested to play a cardioprotective role in mouse models of ischemic heart disease. The objective of this study was to examine the role of OPN in modulating systolic and diastolic functional parameters of the heart following ischemia/reperfusion (I/R) injury in mice. For this, wild-type (WT) and OPN-knockout (KO) mice aged approximately 4 months were subjected to cardiac ischemia for 45 minutes by ligation of the left anterior descending coronary artery (LAD), followed by reperfusion of the LAD by snipping the ligature. Heart function was measured using echocardiography at baseline and at 1, 3, 7, 14, and 27 days post-I/R injury. M-mode echocardiographic images were used to calculate % fractional shortening [%FS], % ejection fraction [%EF], end-systolic volume [ESV], and end-diastolic volume [EDV], while pulsed wave Doppler images were used to measure aortic ejection time [AET], isovolumic relaxation time [IVRT], and total systolic time [TST]. Velocity of circumferential fiber shortening (Vcf) was calculated using FS and AET. I/R injury significantly decreased %EF and %FS in both WT and KO groups at all time points (1, 3, 7, 14, and 27 days post-I/R) versus baseline. However, the decrease in %EF and %FS was significantly greater in the KO-I/R group versus WT-I/R at 3, 7, 14, and 27 days post-I/R. I/R-mediated increases in ESV and EDV were significantly greater in the KO-MI group versus WT-MI at 3 days post-I/R. AET was significantly higher in the WT-I/R group 27 days post-I/R versus baseline. However, AET was significantly lower in the KO-I/R group 3 and 27 days post-I/R versus WT-I/R. IVRT was significantly higher in the KO-I/R group 27 days post-I/R versus baseline. However, IVRT was significantly lower in the KO-I/R group 1 day post-I/R versus WT-I/R. TST remained unchanged in WT and KO groups post-I/R versus their respective baseline groups. However, TST was significantly lower in the KO-I/R group versus WT-I/R at 3 days post-I/R. Vcf was significantly higher at basal levels in the KO versus WT mice. I/R injury decreased Vcf in both groups versus their baseline at all time points. These data provide evidence that lack of OPN deteriorates systolic and diastolic functional parameters of the heart following I/R injury, suggesting a cardioprotective role of OPN in myocardial remodeling post-I/R.
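The abstract does not give the formulas behind its indices, so the sketch below uses the commonly cited textbook definitions, which may differ in detail from the variant used in the study. The function and parameter names (`lvidd_mm`, `lvids_mm`, and so on) are hypothetical, and the input values are made up for illustration rather than taken from the thesis.

```python
def fractional_shortening(lvidd_mm, lvids_mm):
    """%FS from left-ventricular internal diameters at end-diastole and end-systole."""
    return 100.0 * (lvidd_mm - lvids_mm) / lvidd_mm


def ejection_fraction(edv, esv):
    """%EF from end-diastolic and end-systolic volumes (same unit for both)."""
    return 100.0 * (edv - esv) / edv


def mean_vcf(fs_percent, aet_s):
    """Mean velocity of circumferential fiber shortening, in circumferences per second.

    Assumes Vcf = FS (as a fraction) divided by the aortic ejection time in seconds.
    """
    return (fs_percent / 100.0) / aet_s


# Illustrative values only, not data from the study.
fs = fractional_shortening(4.0, 2.0)
ef = ejection_fraction(60.0, 24.0)
print(fs, ef, mean_vcf(fs, 0.05))
```

Under these definitions, a shorter AET at the same FS gives a higher Vcf, so a change in Vcf can reflect a change in either quantity.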
|
54 |
Microalbuminuria, Macroalbuminuria and Uncontrolled Blood Pressure Among Diagnosed Hypertensive Patients: The Aspect of Racial Disparity in the NHANES Study
Liu, Xuefeng, Wang, Kesheng, Wang, Liang, Tsilimingras, Dennis 01 December 2013
Accumulating evidence reveals that albuminuria may exacerbate uncontrolled blood pressure (BP) in hypertensive patients. However, racial differences in the associations of albuminuria with uncontrolled BP among diagnosed hypertensives have not been evaluated. A total of 6147 diagnosed hypertensive subjects aged ≥18 years were selected from the National Health and Nutrition Examination Survey 1999-2008, which uses a stratified multistage sampling design. Odds ratios (ORs), relative ORs, and 95% confidence intervals (CIs) for uncontrolled BP, and the different effects of microalbuminuria and macroalbuminuria on continuous BP, were estimated using weighted logistic models and linear regression models. Hypertensive subjects with microalbuminuria and macroalbuminuria were more likely to have uncontrolled BP and higher average systolic BP (SBP) in all individual racial groups. Microalbuminuria was associated with isolated uncontrolled SBP in non-Hispanic blacks and whites, and macroalbuminuria was associated with isolated uncontrolled SBP and diastolic BP (DBP) and high average DBP only in non-Hispanic blacks. Compared with non-Hispanic whites, non-Hispanic blacks and Mexicans had lower associations of microalbuminuria with uncontrolled BP (relative OR=0.68, 95% CI=0.48-0.97 for blacks vs. whites; relative OR=0.62, 95% CI=0.42-0.93 for Mexicans vs. whites) and isolated uncontrolled SBP (relative OR=0.62, 95% CI=0.43-0.90 for blacks vs. whites; relative OR=0.45, 95% CI=0.29-0.71 for Mexicans vs. whites). The association of microalbuminuria with uncontrolled BP was lower in non-Hispanic blacks and Mexicans than in non-Hispanic whites. Health providers need to improve care for mildly elevated albumin excretion rates in non-Hispanic white hypertensive patients while maintaining the quality of care in non-Hispanic blacks and Mexicans.
|
56 |
Prevalence and Trends of Isolated Systolic Hypertension Among Untreated Adults in the United States
Liu, Xuefeng, Rodriguez, Carlos J., Wang, Kesheng 01 January 2015
The prevalence and long-term trends of isolated systolic hypertension (ISH) among untreated adults have not been reported. Data from 24,653 participants aged ≥18 years were selected from the National Health and Nutrition Examination Survey 1999-2010. The prevalence and 95% confidence intervals (CIs) of untreated ISH were estimated by conducting the independent survey t-test. The prevalence of untreated ISH was 9.4% and decreased from 10.3% in 1999-2004 to 8.5% in 2005-2010 (P =.00248). Older persons, females, and non-Hispanic blacks had a higher prevalence of untreated ISH. Compared with 1999-2004, the prevalence of untreated ISH in 2005-2010 decreased among older (33.6%; 95% CI, 30.9%-36.3% vs. 25.1%; 95% CI, 22.7%-27.5%) and female individuals (8.3%; 95% CI, 7.5%-9.2% vs. 11.4%; 95% CI, 10.4%-12.3%). The stratified prevalence of untreated ISH declined in 2005-2010 (vs. 1999-2004) for older non-Hispanic whites (24.6% vs. 32.8%; P <.0001) and blacks (27.7% vs. 40.8%; P =.0013), non-Hispanic white females (7.5% vs. 10.8%; P <.0001), older individuals with higher education (21.0% vs. 30.6%; P =.0024), and females with lower education (10.1% vs. 13.1%; P =.006). Untreated ISH is more prevalent in older adults and females. Significant decreases in untreated ISH prevalence over time among these groups suggest that public health measures and/or treatment patterns are trending in the right direction.
|
57 |
Analysis of Field Programmable Gate Array-Based Kalman Filter Architectures
Sudarsanam, Arvind 01 December 2010
A Field Programmable Gate Array (FPGA)-based Polymorphic Faddeev Systolic Array (PolyFSA) architecture is proposed to accelerate an Extended Kalman Filter (EKF) algorithm. A system architecture comprising a software processor as the host processor, a hardware controller, a cache-based memory sub-system, and the proposed PolyFSA as co-processor is presented. The PolyFSA-based system architecture is implemented on the Xilinx Virtex 4 family of FPGAs. Results indicate significant speed-ups for the proposed architecture when compared against a space-based software processor. This dissertation proposes a comprehensive architecture analysis comprising (i) error analysis, (ii) performance analysis, and (iii) area analysis. Results are presented in the form of 2-D Pareto plots (area versus error, area versus time) and a 3-D plot (area versus time versus error). These plots indicate the area savings obtained by varying the design constraints for the PolyFSA architecture. The proposed performance model can be reused to estimate the execution time of EKF on other conventional hardware architectures. In this dissertation, the performance of the proposed PolyFSA is compared against the performance of two conventional hardware architectures. The proposed architecture outperforms the other two in most test cases.
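For readers unfamiliar with the Faddeev algorithm that gives the array its name, the sketch below shows the underlying idea in plain Python; it says nothing about how PolyFSA maps it to hardware. The algorithm evaluates D + C·A⁻¹·B by Gaussian elimination on the compound matrix [[A, B], [-C, D]]: once the -C block has been eliminated using the rows of A, the lower-right block holds the result. Kalman filter steps such as the gain computation can be cast in this form. The function name `faddeev` and the row pivoting within the A block are choices made for this example.

```python
def faddeev(A, B, C, D):
    """Return D + C * inv(A) * B via elimination on [[A, B], [-C, D]].

    A is n x n, B is n x m, C is p x n, D is p x m (lists of lists).
    """
    n, p = len(A), len(C)
    # Build the compound matrix row by row.
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    M += [[-c for c in C[i]] + list(D[i]) for i in range(p)]

    for k in range(n):
        # Pivot only among the rows of the A block so the bottom rows stay in place.
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        if M[piv][k] == 0:
            raise ValueError("A is singular")
        M[k], M[piv] = M[piv], M[k]
        # Eliminate column k from every row below, including the -C rows.
        for r in range(k + 1, n + p):
            f = M[r][k] / M[k][k]
            if f != 0:
                M[r] = [a - f * b for a, b in zip(M[r], M[k])]

    # The lower-right p x m block now holds D + C * inv(A) * B.
    return [row[n:] for row in M[n:]]


# With B = I, C = I, D = 0 the same routine yields inv(A).
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
print(faddeev([[4.0, 0.0], [0.0, 2.0]], identity, identity, zeros))
```

Because one elimination pattern covers inversion, multiplication, and addition depending on how A, B, C, and D are chosen, a single array structure can serve several matrix operations.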
|
58 |
Biofeedback Treatment of Systolic and Diastolic Blood Pressure Under Stress and No-Stress Conditions
Dafter, Roger E. (Roger Edwin) 05 1900
This study compares the relative efficacy of systolic and diastolic biofeedback in lowering the systolic and diastolic blood pressures of normotensives. The importance of testing these biofeedback procedures lies in assessment of their potential as blood pressure self-control techniques for the treatment of essential hypertension.
|
59 |
Physiological Role of the α₂-Isoform of the Na, K-ATPase in the Regulation of Cardiovascular Function
Rindler, Tara N. January 2012
No description available.
|
60 |
Temperature-aware 3D-integrated systolic array DNN accelerators
Shukla, Prachi 17 January 2023
Deep neural networks (DNNs) are extensively used for inference in a wide range of emerging mobile and edge application domains, including autonomous vehicles, drones, augmented and virtual reality (AR/VR), etc. Due to the increasing popularity of these applications, there has been an increasing demand for mobile/edge DNN accelerators to achieve low inference latency and high efficiency. Furthermore, these mobile/edge applications also need to execute multi-DNN workloads, where multiple independent DNNs execute subtasks to complete one large task.
This thesis aims to optimize the efficiency of systolic arrays for DNN acceleration because they are among the most popular architectures for DNN inference in mobile/edge systems due to their straightforward design and dataflow. Systolic arrays provide several degrees of freedom to co-optimize performance, power, area, and temperature, namely die/chiplet architecture (number of processing elements, on-chip memory capacity and its architecture), quantity, placement, and dataflow.
While recent works have focused on 2D DNN systolic arrays, 2D scaling has been saturating and, thus, improving the performance and power characteristics of computing systems is becoming increasingly challenging. To overcome traditional scaling bottlenecks, 3D integration has emerged as a promising integration technology. 3D technology provides several benefits over 2D systems such as high integration density, high bandwidth, high energy efficiency, and footprint savings. This thesis focuses on two 3D integration technologies: (i) die-stacked 3D (TSV3D), and (ii) monolithic 3D (MONO3D).
Both of these 3D technologies provide significant performance and power benefits over 2D systems and thus, are potent technologies for energy efficient design of systolic arrays for DNNs. However, the dense integration in 3D causes high power densities and inter-tier thermal coupling, further escalating thermal issues and resulting in hot spots across tiers. Furthermore, mobile/edge devices have tight area, power, and thermal constraints due to the absence of heat sinks and fans. Thus, temperature is a critical design concern in 3D DNN accelerators for mobile/edge devices.
This thesis states that to glean the benefits of 3D technology in mobile/edge devices to improve energy efficiency and satisfy performance and power constraints, it is imperative to design thermally-aware 3D systolic arrays for DNNs. To realize this statement, this thesis makes the following contributions: (i) it designs a thermally-aware optimization flow to select a near-optimal MONO3D DNN systolic array for a given DNN and an optimization goal under a performance constraint. The optimizer is facilitated by circuit and architecture-level cross-layer performance/power models that are developed as part of this thesis. (ii) It introduces thermal awareness in tuning a given TSV3D systolic array chiplet architecture and the chiplet’s placement in a multi-chip module (MCM) executing a multi-DNN workload to balance both cost and power of the MCM, while satisfying latency, area, power, thermal packaging, and workload constraints. (iii) It optimizes a dataflow implementation by utilizing the massive bandwidth available in MONO3D systolic arrays with a dense on-chip resistive RAM to improve energy efficiency while satisfying the thermal and performance constraints. Results demonstrate 81% improvement in inference per second per watt over 2D systolic arrays due to high-density and high-bandwidth resistive RAM interface using monolithic inter-tier vias (MIVs). We also demonstrate up to 44% MCM cost savings and 63% DRAM power savings over temperature-unaware optimization at iso-frequency and iso-MCM area for TSV3D MCMs. In addition, we show that optimization without thermal awareness leads to over-estimation of efficiency gains and thermal violations in both MONO3D and TSV3D systolic arrays. / 2025-01-16T00:00:00Z
|