661

Optimalizace sledování síťových toků / Optimization of network flow monitoring

Žádník, Martin January 2013
The thesis deals with optimization of network flow monitoring. Flow-based network traffic processing, that is, processing packets based on state information associated with the flows to which the packets belong, is a key enabler for a variety of network services and applications. The number of simultaneous flows increases with the growing number of new services and applications, and it has become a challenge to keep state for each flow in a network device processing high-speed traffic. A flow table, a structure holding flow states, must be stored in a memory hierarchy; the memory closest to the processing is known as a flow cache. Flow cache management plays an important role in its effective utilization, which affects the performance of the whole system. This thesis focuses on the automated design of a cache replacement policy optimized for deployment on particular networks. A genetic algorithm is proposed to automate this process: it generates replacement policies and evaluates them by simulation on captured traffic traces. The proposed algorithm is evaluated by designing replacement policies for two variations of the cache management problem. The first variation is the evolution of a replacement policy with an overall low number of state evictions from the flow cache. The second is the evolution of a replacement policy with a low number of evictions of large flows only. Optimized replacement policies for both variations are found while experimenting with various encodings of the replacement policy and genetic operators. The newly evolved replacement policies achieve better results than the other policies tested: the evolved policy lowers the overall number of evictions by ten percent in comparison with the best competing policy, and the evolved policy focusing on large flows halves the number of their evictions. Moreover, no eviction occurs for most (over 90%) of the large flows. The evolved replacement policy also offers better resilience against flooding of the flow cache with a large number of short flows, a typical side effect of scanning or distributed denial-of-service activity. An extension of the replacement policy is also proposed, complementing it with additional information extracted from packet headers. The results show a further decrease in the number of evictions when the extension is used.
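As an illustration of the approach described above (a sketch with hypothetical encoding and parameters, not the thesis's actual implementation), a genetic algorithm can evolve a replacement policy encoded as a small priority table indexed by a flow's bucketed packet count, scoring each candidate by the evictions it causes in a simulated flow cache:

```python
import random

CACHE_SIZE = 64
POLICY_LEN = 8  # policy = priority table indexed by (bucketed) packet count

def simulate(policy, trace):
    """Replay a trace of flow IDs; evict the lowest-priority flow when full.
    Returns the number of evictions (lower is better)."""
    cache = {}  # flow_id -> packets seen so far
    evictions = 0
    for flow in trace:
        if flow in cache:
            cache[flow] += 1
        else:
            if len(cache) >= CACHE_SIZE:
                victim = min(cache,
                             key=lambda f: policy[min(cache[f], POLICY_LEN - 1)])
                del cache[victim]
                evictions += 1
            cache[flow] = 1
    return evictions

def evolve(trace, pop_size=20, generations=50):
    pop = [[random.random() for _ in range(POLICY_LEN)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: simulate(p, trace))  # fewer evictions first
        parents = pop[:pop_size // 2]               # truncation selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = random.sample(parents, 2)
            cut = random.randrange(POLICY_LEN)      # one-point crossover
            child = a[:cut] + b[cut:]
            child[random.randrange(POLICY_LEN)] = random.random()  # mutation
            children.append(child)
        pop = parents + children
    return pop[0]
```

The fitness here is simply the total eviction count; the large-flow variation described in the abstract would instead count only evictions of flows above a size threshold.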
662

FFRU: A Time- and Space-Efficient Caching Algorithm

Garrett, Benjamin, 0000-0003-1587-6585 January 2021
Cache replacement policies have applications that are nearly ubiquitous in technology. Among these is an interesting subset that occurs when referentially transparent functions are memoized, e.g., in compilers, in dynamic programming, and in other software caches. In many applications the least recently used (LRU) approach likely preserves the items most needed by memoized function calls. However, despite its popularity LRU is expensive to implement, which has caused a spate of research initiatives aimed at approximating its cache miss performance in exchange for faster and more memory-efficient implementations. We present a novel caching algorithm, Far From Recently Used (FFRU), which offers a simple but highly configurable mechanism for providing lower bounds on the usage recency of items evicted from the cache. This algorithm preserves the constant amortized time cost of insertions and updates and minimizes the memory overhead needed to administer the eviction guarantees. We study the cache miss performance of several memoized optimization problems which vary in the number of subproblems generated and the access patterns exhibited by their recursive calls. We study their cache miss performance under LRU replacement, then show the performance of FFRU in the same problem scenarios. We show that for equivalent minimum eviction age guarantees, FFRU incurs fewer cache misses than LRU, and does so using less memory. We also present variations of the algorithms studied (Fibonacci, KMP, LCS, and Euclidean TSP) which exploit the characteristics of the cache replacement algorithms being employed, further improving cache miss performance. Finally, we present a novel implementation of a well-known approximation algorithm for the Euclidean Traveling Salesman Problem due to Sanjeev Arora. Our implementation outperforms the currently known implementations of the same algorithm. It has long remained an open question whether algorithms relying on geometric divisions of space can be implemented as practical tools, and our implementation of Arora's algorithm establishes a new benchmark in that arena. / Computer and Information Science
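FFRU's exact bookkeeping is the subject of the thesis; purely as a sketch of the general family of techniques it belongs to, the following cache avoids LRU's per-access list reordering by appending timestamped queue entries and lazily skipping stale ones at eviction time (hypothetical code, assuming a simple key-value interface):

```python
from collections import deque

class LazyLRUCache:
    """O(1) amortized operations: an access appends a timestamped entry
    instead of reordering a linked list; eviction lazily discards entries
    whose key was touched again later. Illustrative only -- FFRU's actual
    mechanism and its eviction-age guarantees differ."""

    def __init__(self, capacity):
        assert capacity > 0
        self.capacity = capacity
        self.clock = 0
        self.data = {}        # key -> value
        self.stamp = {}       # key -> time of last touch
        self.queue = deque()  # (key, touch_time) in touch order

    def _touch(self, key):
        self.clock += 1
        self.stamp[key] = self.clock
        self.queue.append((key, self.clock))

    def get(self, key):
        if key not in self.data:
            return None
        self._touch(key)
        return self.data[key]

    def put(self, key, value):
        if key not in self.data:
            while len(self.data) >= self.capacity:
                old_key, t = self.queue.popleft()
                if self.stamp.get(old_key) == t:  # not touched since: evict
                    del self.data[old_key], self.stamp[old_key]
        self.data[key] = value
        self._touch(key)
```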
663

A Direct-Read, A Posteriori Golden Copy Method for Measuring SoC Cache Upsets

Poff, Evan D. 02 June 2022
A method for measuring system-on-a-chip (SoC) cache upsets is presented and evaluated. In contrast to methods that predict cache contents through analysis or memory access patterns, this method uses system registers to read cache memories directly, thereby creating and checking golden copies to detect individual memory upsets during operation. The test method is driven by the device under test itself and does not require a user to set or know a priori the cache contents. A bare-metal implementation of this “direct golden method” on a Zynq UltraScale+ MPSoC logged upsets in the device’s data cache, data tag, and TLB RAM memories during a neutron radiation beam test. For each of these memories, this direct golden method yields cache upset bit cross sections, such as 7.115 × 10^−16 cm^2 for the data cache. Confidence intervals for these bit cross sections overlap such intervals for three other methods, supporting this method’s validity and candidacy for future use.
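For reference, a per-bit cross section of the kind quoted above is conventionally computed as upsets per unit fluence per bit; a quick illustration of the arithmetic with made-up beam-test numbers (not the thesis's data):

```python
def bit_cross_section(upsets, fluence_n_per_cm2, bits):
    """sigma_bit = upsets / (fluence * bits), in cm^2 per bit.
    Fluence is the integrated particle flux seen by the device (n/cm^2)."""
    return upsets / (fluence_n_per_cm2 * bits)

# Hypothetical numbers, chosen only to show the computation:
sigma = bit_cross_section(upsets=150, fluence_n_per_cm2=2.0e11, bits=1 << 20)
print(f"{sigma:.3e} cm^2/bit")  # ~7.2e-16, the order of magnitude reported above
```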
664

Critical Words Cache Memory

Gieske, Edmund Joseph 28 August 2008
No description available.
665

Practical Support for Strong, Serializability-Based Memory Consistency

Biswas, Swarnendu 30 December 2016
No description available.
666

Integrated Mobility and Service Management for Network Cost Minimization in Wireless Mesh Networks

Li, Yinan 04 June 2012
In this dissertation research, we design and analyze integrated mobility and service management for network cost minimization in Wireless Mesh Networks (WMNs). We first investigate the problem of mobility management in WMNs, for which we propose two efficient per-user mobility management schemes based on pointer forwarding, and then a third one that integrates routing-based location update and pointer forwarding for further performance improvement. We further study integrated mobility and service management, for which we propose protocols that support efficient mobile data access services with cache consistency management, and mobile multicast services. We also investigate reliable and secure integrated mobility and service management in WMNs, and apply the idea to the design of a protocol for secure and reliable mobile multicast. The most salient feature of our protocols is that they are optimal on a per-user basis (or on a per-group basis for mobile multicast); that is, the overall network communication cost incurred is minimized for each individual user (or group). Per-user optimization is critical because mobile users normally have vastly different mobility and service characteristics, so the overall cost saving due to per-user optimization grows cumulatively significant with an increasing mobile user population. To evaluate the performance of our proposed protocols, we develop mathematical models and computational procedures for computing the network communication cost incurred, and build simulation systems for validating the results obtained from analytical modeling. We identify optimal design settings under which the network cost is minimized for our mobility and service management protocols in WMNs. Extensive comparative performance studies show that our protocols significantly outperform existing protocols under identical environmental and operational settings. We extend the design notion of integrated mobility and service management for cost minimization to MANETs and propose a scalable dual-region mobility management scheme for location-based routing. The basic design concept is to use local regions to complement home regions and to have mobile nodes in the home region of a mobile node serve as location servers for that node. We develop a mathematical model to derive the optimal home region size and local region size under which the overall network cost incurred is minimized. Through a comparative performance study, we show that dual-region mobility management outperforms existing schemes based on static home regions. / Ph. D.
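To make the per-user optimization concrete, here is a toy cost model for pointer forwarding (illustrative costs and parameter names, not the dissertation's actual model): resetting the forwarding chain every K moves trades update signaling against lookup chain traversal, and the best K depends on each user's service-to-mobility ratio:

```python
def cost_per_move(K, smr, update_cost=10.0, hop_cost=1.0):
    """Average signaling + lookup cost per movement when the forwarding
    chain is reset (full location update) every K moves; `smr` is the
    service-to-mobility ratio (lookups per movement)."""
    avg_chain = (K - 1) / 2.0  # mean chain length between resets
    return update_cost / K + hop_cost + smr * avg_chain * hop_cost

# Users with many lookups per move (high SMR) prefer shorter chains:
for smr in (0.1, 1.0, 10.0):
    best_K = min(range(1, 21), key=lambda K: cost_per_move(K, smr))
    print(f"SMR={smr}: optimal chain-length threshold K={best_K}")
```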
667

Optimizing Virtual Machine I/O Performance in Cloud Environments

Lu, Tao 01 January 2016
Maintaining closeness between data sources and data consumers is crucial for workload I/O performance. In cloud environments, this closeness can be violated by system administrative events and storage architecture barriers. VM migration events are frequent in cloud environments; migration changes a VM's runtime interconnections or cache contexts, significantly degrading its I/O performance. Virtualization is the backbone of cloud platforms, but I/O virtualization adds hops to the workload data access path, prolonging I/O latencies. I/O virtualization overheads cap the throughput of high-speed storage devices and impose high CPU utilization and energy consumption on cloud infrastructures. To maintain the closeness between data sources and workloads during VM migration, we propose Clique, an affinity-aware migration scheduling policy, to minimize the aggregate wide-area communication traffic during storage migration in virtual cluster contexts. In host-side caching contexts, we propose Successor, which recognizes warm pages and prefetches them into the caches of destination hosts before migration completes. To bypass the I/O virtualization barriers, we propose VIP, an adaptive I/O prefetching framework that uses a virtual I/O front-end buffer for prefetching, avoiding the on-demand involvement of I/O virtualization stacks and accelerating I/O response. Analysis of the traffic trace of a virtual cluster containing 68 VMs demonstrates that Clique can reduce inter-cloud traffic by up to 40%. Tests of the MPI Reduce_scatter benchmark show that Clique can keep VM performance during migration at up to 75% of the non-migration scenario, more than three times that of a random VM selection policy. In host-side caching environments, Successor performs better than existing cache warm-up solutions and achieves zero VM-perceived cache warm-up time at low resource cost. At the system level, we conducted a comprehensive quantitative analysis of I/O virtualization overheads. Our trace-replay-based simulation demonstrates the effectiveness of VIP for data prefetching with negligible additional cache resource costs.
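As a rough sketch of what an affinity-aware scheduling policy such as Clique might do (hypothetical code, not the actual Clique algorithm), one can greedily batch VMs by pairwise traffic so that heavily communicating VMs migrate together and their chatter stays local after migration:

```python
def schedule_migrations(vms, traffic, batch_size):
    """Greedy affinity-aware batching: repeatedly seed a batch with the most
    talkative remaining VM, then add the VMs with the most traffic to the
    batch so far. `traffic[(a, b)]` is observed bytes/s from VM a to VM b."""
    remaining, batches = set(vms), []
    while remaining:
        seed = max(remaining,
                   key=lambda v: sum(traffic.get((v, u), 0) +
                                     traffic.get((u, v), 0) for u in remaining))
        batch = [seed]
        remaining.remove(seed)
        while len(batch) < batch_size and remaining:
            nxt = max(remaining,
                      key=lambda v: sum(traffic.get((v, u), 0) +
                                        traffic.get((u, v), 0) for u in batch))
            batch.append(nxt)
            remaining.remove(nxt)
        batches.append(batch)
    return batches
```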
668

Adaptive Prefetching for Visual Data Exploration

Doshi, Punit Rameshchandra 31 January 2003
Loading data from slow persistent memory (disk storage) into main memory is a bottleneck for current interactive visual data exploration applications, especially when they are applied to huge volumes of data. Semantic caching of queries at the client side is a recently emerging technology that can significantly improve the performance of such systems, though it may not in all cases fully achieve the near real-time responsiveness these interactive applications require. We hence propose to augment semantic caching with prefetching: the system predicts the user's next requested data and loads it into the cache as a background process before the next user request is made. Our experimental studies confirm that prefetching indeed improves performance for interactive visual data exploration. However, a given prefetching technique cannot always correctly predict changes in a user's navigation pattern, especially since different users may have different navigation patterns, so a strategy that works for one user may fail for another. In this research, we tackle this shortcoming by applying the adaptation concept of strategy selection, allowing the choice of prefetching strategy to change over time, both across and within user sessions. While other adaptive prefetching research has focused on refining a single strategy, we have instead developed a framework that facilitates strategy selection. For this, we explored various metrics for measuring the performance of prefetching strategies in action, which guide the adaptive selection process. This work is the first to study caching and prefetching in the context of visual data exploration. In particular, we have implemented and evaluated our proposed approach within XmdvTool, a freeware visualization system for visually exploring hierarchical multivariate data. We have tested our technique on real user traces gathered by our system's logging tool as well as on synthetic user traces. Our results confirm that our adaptive approach improves system performance by selecting a good combination of prefetching strategies that adapts to the user's changing navigation patterns.
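A minimal sketch of metric-guided strategy selection (hypothetical and simplified, not the framework's actual metrics): track each strategy's recent prefetch hit rate and pick the best, with occasional exploration so the choice can adapt as navigation patterns change:

```python
import random
from collections import deque

class StrategySelector:
    """Epsilon-greedy selection among prefetching strategies by recent
    hit rate; a simplified stand-in for metric-guided adaptive selection."""

    def __init__(self, strategies, window=50, epsilon=0.1):
        self.strategies = strategies
        self.hits = {s: deque(maxlen=window) for s in strategies}  # recent 0/1 outcomes
        self.epsilon = epsilon

    def choose(self):
        if random.random() < self.epsilon:  # occasionally explore alternatives
            return random.choice(self.strategies)
        def hit_rate(s):
            h = self.hits[s]
            return sum(h) / len(h) if h else 0.5  # optimistic default for untried
        return max(self.strategies, key=hit_rate)

    def record(self, strategy, was_hit):
        """Call after each user request: did `strategy` prefetch the right data?"""
        self.hits[strategy].append(1 if was_hit else 0)
```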
669

Heuristisk profilbaserad optimering av instruktionscache i en online Just-In-Time kompilator / Heuristic Online Profile Based Instruction Cache Optimisation in a Just-In-Time Compiler

Eng, Stefan January 2004
This master’s thesis examines the possibility of heuristically optimising instruction cache performance in a Just-In-Time (JIT) compiler.

Programs that do not fit inside the cache all at once may suffer from cache misses as a result of frequently executed code segments competing for the same cache lines. A new heuristic algorithm, LHCPA, was created to place frequently executed code segments so as to avoid cache conflicts between them, reducing overall cache misses and performance bottlenecks. Set-associative caches are taken into consideration, not only direct-mapped caches.

In Ahead-Of-Time (AOT) compilers, the problem of frequent cache misses is often avoided by using call graphs derived from profiling together with more or less complex algorithms to estimate the performance of different placement approaches. This often requires heavy computation during compilation, which is not acceptable in a JIT compiler.

A case study is presented on an Alpha processor and a JIT compiler developed at Ericsson. The results show that cache performance can be improved using this technique, but also that many other factors influence cache performance, such as whether the cache is set-associative and, especially, its size.
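The conflict such a placement heuristic must avoid is easy to state: hot code segments clash when more of their lines map to the same cache set than the associativity allows. A hypothetical helper (not LHCPA itself) that a placement algorithm could use to score candidate layouts:

```python
def cache_set(addr, line_size=64, num_sets=256):
    """Which set a code address maps to in a set-associative cache."""
    return (addr // line_size) % num_sets

def conflicting_sets(placement, ways=2, line_size=64, num_sets=256):
    """Count sets where hot segments overflow the associativity.
    `placement` maps segment start addresses to segment sizes in bytes."""
    load = {}
    for start, size in placement.items():
        first = start // line_size
        last = (start + size - 1) // line_size
        for line in range(first, last + 1):
            s = line % num_sets
            load[s] = load.get(s, 0) + 1
    return sum(1 for n in load.values() if n > ways)

# Two hot 4 KiB segments exactly one cache-size apart collide in every set:
print(conflicting_sets({0x0000: 4096, 0x4000: 4096}))  # 64 overloaded sets
```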
670

Software Techniques for Distributed Shared Memory

Radovic, Zoran January 2005
In large multiprocessors, access to shared memory is often nonuniform and may vary by as much as ten times in some distributed shared-memory architectures (DSMs). This dissertation identifies another important nonuniform property of DSM systems: nonuniform communication architecture, NUCA. High-end hardware-coherent machines built from large nodes, or from chip multiprocessors, are typical NUCA systems, since they have a lower penalty for reading recently written data from a neighbor's cache than from a remote cache. This dissertation identifies node affinity as an important property for scalable general-purpose locks. Several software-based hierarchical lock implementations exploiting NUCAs are presented and evaluated. NUCA-aware locks are shown to be almost twice as efficient for contended critical sections as traditional lock implementations.

The shared-memory “illusion” provided by some large DSM systems may be implemented in hardware, software, or a combination thereof. A software-based implementation can enable cheap cluster hardware to be used, but typically suffers from poor and unpredictable performance.

This dissertation advocates a new software-hardware trade-off design point based on a new combination of techniques. Two low-level techniques, fine-grain deterministic coherence and synchronous protocol execution, together with profile-guided protocol flexibility, are evaluated in isolation as well as in a combined setting using all-software implementations. Finally, a minimum of hardware trap support is suggested to further improve the performance of coherence protocols across cluster nodes. It is shown that all of these techniques combined can result in fairly stable performance on par with hardware-based coherence.
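A minimal sketch of the hierarchical, NUCA-aware locking idea (a hypothetical simplification, not the dissertation's actual lock designs): threads contend locally within their node first, and only each node's winner contends globally, so the lock and the data it protects tend to stay in nearby caches:

```python
import threading

class HierarchicalLock:
    """Two-level lock: a per-node lock filters local contention before the
    global lock. Real NUCA-aware locks also bias handoff toward waiters on
    the holder's node; this sketch shows only the two-level structure."""

    def __init__(self, num_nodes):
        self.node_locks = [threading.Lock() for _ in range(num_nodes)]
        self.global_lock = threading.Lock()

    def acquire(self, node_id):
        self.node_locks[node_id].acquire()  # contend with neighbors first
        self.global_lock.acquire()          # node winner contends globally

    def release(self, node_id):
        self.global_lock.release()
        self.node_locks[node_id].release()
```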
