31

A scalability evaluation on CockroachDB

Lifhjelm, Tobias January 2021 (has links)
Databases are a cornerstone of data storage: they store and organize large amounts of data while allowing users to access specific parts of it easily. Databases must, however, adapt to an increasing number of users without negatively affecting end-users. CockroachDB (CRDB) is a distributed SQL database that combines the consistency associated with relational database management systems with the scalability to handle more user requests simultaneously. This paper presents a study that evaluates the scalability properties of CRDB by measuring how latency is affected by the addition of more nodes to a CRDB cluster. The findings show that latency can decrease as nodes are added to a cluster; however, there are cases in which more nodes increase the latency.
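A latency evaluation of this kind can be sketched as a small timing harness. The workload below is a stand-in, not the thesis's actual benchmark; a real run would issue SQL queries against CRDB clusters of varying size via a PostgreSQL-compatible driver (CRDB speaks the PostgreSQL wire protocol):

```python
import statistics
import time

def measure_latency(op, n_samples=1000):
    """Time repeated invocations of `op` and report latency statistics in ms."""
    samples = []
    for _ in range(n_samples):
        start = time.perf_counter()
        op()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "mean": statistics.mean(samples),
        "p50": samples[len(samples) // 2],
        "p99": samples[int(len(samples) * 0.99)],
    }

# Stand-in workload; a real evaluation would time the same query against
# clusters of 3, 5, 7, ... nodes and compare the latency distributions.
stats = measure_latency(lambda: sum(range(1000)))
```

Comparing the per-cluster-size distributions (rather than single timings) is what lets the study observe both the decreases and the increases in latency reported above.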
32

Nutrient and Carbon-Dioxide Requirements for Large-Scale Microalgae Biofuel Production

Shurtz, Benjamin K. 01 August 2013 (has links)
Growing demand for energy worldwide has increased interest in the production of renewable fuels, with microalgae representing a promising feedstock. The large-scale feasibility of microalgae-based biofuels has previously been evaluated through technoeconomic and environmental impact assessments, with limited work performed on resource requirements. This study presents the use of a modular engineering system process model, founded on the literature, to evaluate the nutrient (nitrogen and phosphorus) and carbon dioxide resource demand of five large-scale microalgae-to-biofuels production systems. The baseline scenario, representative of a near-term large-scale production system, includes process models for growth, dewatering, lipid extraction, anaerobic digestion, and biofuel conversion. Optimistic and conservative process scenarios are simulated to represent practical best- and worst-case system performance and to bound the total resource demand of large-scale production. Baseline modeling results, combined with current US nutrient availability from fertilizer and wastewater, are used to perform a scalability assessment. Results show that nutrient requirements represent a major barrier to the development of microalgae-based biofuels to meet the US Department of Energy 2030 renewable fuel goal of 30% of transportation fuel, or 60 billion gallons per year. Specifically, results from the baseline and optimistic fuel production systems show that wastewater sources can provide sufficient nutrients to produce 3.8 billion gallons and 13 billion gallons of fuel per year, corresponding to 6% and 22% of the DOE goal, respectively. High resource demand necessitates nutrient recovery from the lipid-extracted algae, thus limiting its use as a value-added co-product. Discussion focuses on system scalability, comparison of results to previous resource assessments, and the sensitivity of modeled nutrient and carbon dioxide requirements to system parameter inputs.
33

Scalable Hybrid Data Dissemination for Internet Hot Spots

Zhang, Wenhui 17 January 2007 (has links)
No description available.
34

Scalable Analysis of Large Dynamic Dependence Graphs

Singh, Shashank 01 September 2015 (has links)
No description available.
35

Scalable Robust Models Under Adversarial Data Corruption

Zhang, Xuchao 04 April 2019 (has links)
The presence of noise and corruption in real-world data can be caused by accidental outliers, transmission loss, or even adversarial data attacks. Unlike traditional random noise, which is usually assumed to follow a specific distribution with a low corruption ratio, data collected from crowdsourcing or labeled by weak annotators can contain adversarial corruption. More challenging still, adversarial data corruption can be arbitrary and unbounded, and need not follow any specific distribution. In addition, in the era of data explosion, the fast-growing amount of data makes it harder for robust models to handle large-scale data sets. This thesis focuses on the development of methods for scalable robust models under adversarial data corruption. Four methods are proposed: robust regression via heuristic hard-thresholding, online and distributed robust regression with adversarial noise, self-paced robust learning for leveraging clean labels in noisy data, and robust regression via online feature selection with adversarial noise. Moreover, I extended the self-paced robust learning method to a distributed version for scalability, named distributed self-paced learning via the alternating direction method of multipliers. Last, a robust multi-factor personality prediction model is proposed to handle correlated data noise. For the first method, existing solutions for robust regression lack a rigorous recovery guarantee for regression coefficients under adversarial data corruption when no prior knowledge of the corruption ratio is available. The contributions of our work include: (1) efficient algorithms to address the robust least-squares regression problem; (2) effective approaches to estimate the corruption ratio; (3) a rigorous robustness guarantee for regression coefficient recovery; and (4) extensive experiments for performance evaluation.
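The hard-thresholding idea behind the first method can be sketched as follows. This is a simplified illustration rather than the thesis's exact algorithm (in particular, the corruption ratio is taken as given here, whereas the thesis estimates it heuristically): alternate between fitting least squares on a presumed-clean subset and re-selecting the samples with the smallest residuals.

```python
import numpy as np

def robust_regression_ht(X, y, corruption_ratio, n_iters=50):
    """Iterative hard-thresholding: alternate between (a) ordinary least
    squares on the presumed-clean subset and (b) re-selecting the k samples
    with the smallest absolute residuals."""
    n = X.shape[0]
    k = int(n * (1 - corruption_ratio))   # presumed-clean sample count
    clean = np.arange(n)                  # start by trusting every sample
    beta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        beta, *_ = np.linalg.lstsq(X[clean], y[clean], rcond=None)
        resid = np.abs(y - X @ beta)
        clean = np.argsort(resid)[:k]     # hard-threshold: keep k smallest
    return beta, clean
```

Because adversarially corrupted points cannot be fit by any single coefficient vector that also fits the clean majority, their residuals stay large and they are excluded after a few iterations.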
For the second method, existing robust learning methods typically focus on modeling the entire dataset at once; however, they may hit memory and computation bottlenecks as datasets become too large to be handled in their entirety. The contributions of our work for this task include: (1) a framework for the scalable robust least-squares regression problem; (2) online and distributed algorithms to handle adversarial corruption; (3) a rigorous robustness guarantee for regression coefficient recovery; and (4) extensive experiments for performance evaluation. For the third method, leveraging prior knowledge of clean labels in noisy data is a crucial issue in practice, but existing robust learning methods typically focus on eliminating noisy data. However, data collected by "weak annotators" or crowdsourcing can be too noisy for existing robust methods to train an accurate model. Moreover, existing work that utilizes additional clean labels is usually designed for specific problems such as image classification. These methods typically exploit clean labels in large-scale noisy data based on additional domain knowledge; however, they struggle with extremely noisy data and rely heavily on that domain knowledge, which makes them hard to apply to more general problems. The contributions of our work for this task include: (1) a framework to leverage clean labels in noisy data; (2) a self-paced robust learning algorithm to train models under the supervision of clean labels; (3) a theoretical analysis of the convergence of the proposed algorithm; and (4) extensive experiments for performance evaluation.
For the fourth method, the presence of data corruption in user-generated streaming data, such as social media, motivates a new fundamental problem: learning reliable regression coefficients when features are not entirely accessible at one time. Until now, several important challenges could not be handled concurrently: (1) corrupted-data estimation when only partial features are accessible; (2) online feature selection when the data contains adversarial corruption; and (3) scaling to massive datasets. This work proposes a novel RObust regression algorithm via Online Feature Selection (RoOFS) that addresses all of these challenges concurrently. Specifically, the algorithm iteratively updates the regression coefficients and the uncorrupted set via a robust online feature substitution method. We also prove that our algorithm has a restricted error bound compared to the optimal solution. Extensive empirical experiments on both synthetic and real-world data sets demonstrate that our new method is superior to existing methods in recovering both the feature selection and the regression coefficients, with very competitive efficiency. For the fifth method, existing self-paced learning approaches typically focus on modeling the entire dataset at once; however, this may introduce memory and computation bottlenecks, as today's fast-growing datasets are becoming too large to be handled in their entirety. The contributions of our work for this task include: (1) reformulating the self-paced learning problem in a distributed setting; (2) a distributed self-paced learning algorithm based on consensus ADMM to solve the SPL problem in a distributed setting; (3) a theoretical analysis of the convergence of our proposed DSPL algorithm; and (4) extensive experiments utilizing both synthetic and real-world data based on a robust regression task.
For the last method, personality prediction across multiple factors, such as openness and agreeableness, is of growing interest, especially in the context of social media, which contains massive numbers of online posts and likes that can potentially reveal an individual's personality. However, data collected from social media inevitably contains massive amounts of noise and corruption. Here, traditional robust methods still suffer from several important challenges, including (1) the existence of correlated corruption among multiple factors, (2) the difficulty of estimating the corruption ratio in multi-factor data, and (3) scalability to massive datasets. This work proposes a novel robust multi-factor personality prediction model that concurrently addresses all of these challenges by developing a distributed robust regression algorithm. Specifically, the algorithm optimizes the regression coefficients of each factor in parallel with a heuristically estimated corruption ratio, and then consolidates the uncorrupted set from multiple factors using two strategies: global consensus and majority voting. We also prove that our algorithm enjoys strong guarantees in terms of convergence rates and coefficient recovery, and it can be used as a generic framework for multi-factor robust regression with correlated corruption. Extensive experiments on synthetic and real datasets demonstrate that our algorithm is superior to existing methods in both effectiveness and efficiency. / Doctor of Philosophy / Social media has experienced rapid growth during the past decade. Millions of users of sites such as Twitter have been generating and sharing a wide variety of content including texts, images, and other metadata. In addition, social media can be treated as a social sensor that reflects different aspects of our society.
Event analytics in social media has enormous significance for applications like disease surveillance, business intelligence, and disaster management. Social media data possesses a number of important characteristics, including dynamics, heterogeneity, noisiness, timeliness, big volume, and network properties. These characteristics pose various new challenges and hence open many interesting research topics, which are addressed here. This dissertation focuses on the development of five novel methods for social media-based spatiotemporal event detection and forecasting. The first of these is a novel unsupervised approach for detecting the dynamic keywords of spatial events in targeted domains; this method has been deployed in a practical project for monitoring civil unrest events in several Latin American regions. The second builds on this by discovering the underlying development progress of events, jointly considering structural contexts and spatiotemporal burstiness. The third seeks to forecast future events using social media data. The basic idea here is to search for subtle patterns in specific cities as indicators of ongoing or future events, where each pattern is defined as a burst of context features (keywords) relevant to a specific event. For instance, an initial expression of discontent over gas price increases could be a precursor to a more general protest about government policies. Beyond social media data, the fourth method proposed here leverages multiple data sources that reflect different aspects of society for event forecasting. This addresses several important problems, including the common phenomenon that different sources may come from different geographical levels and have different available time periods. The fifth study is a novel flu forecasting method based on epidemic modeling and social media mining.
A new framework is proposed to integrate prior knowledge of disease propagation mechanisms and real-time information from social media.
36

Controlling Scalability in Distributed Virtual Environments

Singh, Hermanpreet 01 May 2013 (has links)
A Distributed Virtual Environment (DVE) system provides a shared virtual environment where physically separated users can interact and collaborate over a computer network. More simultaneous DVE users can result in intolerable degradation of system performance. We address the three major challenges to improving DVE scalability: effective DVE system performance measurement, understanding the factors controlling system performance and quality, and determining the consequences of DVE system changes. We propose a DVE Scalability Engineering (DSE) process that addresses these three major challenges for DVE design. DSE allows us to identify, evaluate, and leverage trade-offs among DVE resources, the DVE software, and the virtual environment. DSE has three stages. First, we show how to simulate different numbers and types of users on DVE resources; collected user-study data is used to identify representative user types. Second, we describe a modeling method to discover the major trade-offs between quality of service and DVE resource usage. The method makes use of a new instrumentation tool called ppt, which collects atomic blocks of developer-selected instrumentation at high rates and saves them for offline analysis. Finally, we integrate our load simulation and modeling method into a single process to explore the effects of changes in DVE resources. We use the simple Asteroids DVE as a minimal case study to describe the DSE process. The larger, commercial Torque and Quake III DVE systems provide realistic case studies and demonstrate DSE usage. The Torque case study shows the impact of many users on a DVE system; we apply the DSE process to significantly enhance the Quality of Experience given the available DVE resources. The Quake III case study shows how to identify DVE network needs and evaluate network characteristics when using a mobile phone platform; here we analyze the trade-offs between power consumption and quality of service.
The case studies demonstrate the applicability of DSE for discovering and leveraging tradeoffs between Quality of Experience and DVE resource usage. Each of the three stages can be used individually to improve DVE performance. The DSE process enables fast and effective DVE performance improvement. / Ph. D.
37

Towards a Scalable Docker Registry

Littley, Michael Brian 29 June 2018 (has links)
Containers are an alternative to virtual machines and are rapidly increasing in popularity due to their minimal overhead. To help facilitate their adoption, containers use management systems with central registries to store and distribute container images. However, these registries rely on other, preexisting services for load balancing and storage, which limits their scalability. This thesis introduces a new registry design for Docker, the most prevalent container management system. The new design coalesces all of these services into a single, highly scalable registry. By increasing the scalability of the registry, the new design greatly decreases the distribution time for container images. This work also describes a new Docker registry benchmarking tool, the trace player, which uses real Docker registry workload traces to test the performance of new registry designs and setups. / Master of Science
38

Scalability of Stepping Stones and Pathways

Venkatachalam, Logambigai 30 May 2008 (has links)
Information Retrieval (IR) plays a key role in serving large communities of users who need relevant answers to their search queries. IR encompasses various search models to address different requirements and has introduced a variety of supporting tools to improve effectiveness and efficiency. "Search" is the key focus of IR. The classic search methodology takes an input query, processes it, and returns the result as a ranked list of documents. However, this approach is not the most effective way to support the task of finding document associations (relationships between concepts or queries), for either direct or indirect relationships. The Stepping Stones and Pathways (SSP) retrieval methodology supports retrieval of ranked chains of documents that support valid relationships between any two given concepts. SSP has many potential practical and research applications that need a tool to find connections between two concepts. The early SSP "proof-of-concept" implementation could handle only 6000 documents, whereas commercial search applications have to deal with millions of documents. Hence, addressing this scalability limitation in the current SSP implementation is essential to overcome its limits on handling large datasets. Research on various commercial search applications and their scalability indicates that the Lucene search toolkit is widely used due to its scalability, performance, and extensibility. Many web-based and desktop applications have used this toolkit to great success, including Wikipedia search, job search sites, digital libraries, e-commerce sites, and the Eclipse Integrated Development Environment (IDE). The goal of this research is to re-implement SSP in a scalable way, so that it can work on larger datasets and can be deployed commercially.
This work explains the approach adopted for the re-implementation, focusing on scalable indexing and searching components, new ways to process citations (references), a new approach to query expansion, document clustering, and document similarity calculation. Experiments testing factors such as runtime and storage showed that the system can scale to handle millions of documents. / Master of Science
39

Multicore Scalability Through Asynchronous Work

Mathew, Ajit 13 January 2020 (has links)
With the end of Moore's Law, computer architects have turned to multicore architectures to provide high performance. Unfortunately, to achieve higher performance, multicores require programs to be parallelized, which is an untamed problem. Amdahl's law states that the maximum theoretical speedup of a program is dictated by the size of its non-parallelizable section. Hence, to achieve higher performance, programmers need to reduce the amount of sequential code in the program. This thesis explores asynchronous work as a means of reducing the sequential portions of a program. Using asynchronous work, a programmer can remove tasks that do not affect data consistency from the critical path and perform them using a background thread. Using this idea, the thesis introduces two systems. First, a synchronization mechanism, Multi-Version Read-Log-Update (MV-RLU), which extends Read-Log-Update (RLU) through multi-versioning. At the core of the MV-RLU design is a concurrent garbage collection algorithm that reclaims obsolete versions asynchronously, reducing the blocking of threads. Second, a concurrent and highly scalable index structure for multicores called Hydralist. The key idea behind the design of Hydralist is that an index structure can be divided into two components (a search layer and a data layer): updates to the data layer can be done synchronously, while updates to the search layer can be propagated asynchronously using background threads. / Master of Science / Until the mid-2000s, Moore's law predicted that CPU performance doubled every two years. Improvements in transistor technology allowed smaller transistors that could switch at higher frequencies, leading to faster CPU clocks. But faster clocks lead to higher heat dissipation, and as chips reached their thermal limits, computer architects could no longer increase clock speeds. Hence, they moved to multicore architectures, wherein a single die contains multiple CPUs, to allow higher performance.
Now programmers are required to parallelize their code to take advantage of all the CPUs in a chip, which is a non-trivial problem. The theoretical speedup achieved by a program on a multicore architecture is dictated by Amdahl's law, which identifies the non-parallelizable code in a program as the limiting factor for speedup. For example, a program with 99% parallelizable code can achieve a speedup of 20, whereas a program with 50% parallelizable code can only achieve a speedup of 2. Therefore, to achieve high speedup, programmers need to reduce the size of the serial section of their program. One way to do so is to remove non-critical tasks from the sequential section and perform them asynchronously using a background thread. This thesis explores this technique in two systems. The first is a synchronization mechanism used to coordinate access to shared resources, called Multi-Version Read-Log-Update (MV-RLU). MV-RLU achieves high performance by removing garbage collection from the critical path and performing it asynchronously using a background thread. The second is an index structure, Hydralist, based on the insight that an index structure can be decomposed into two components, a search layer and a data layer, and that decoupling updates to the two layers allows higher performance: updates to the data layer are done synchronously, while updates to the search layer are done asynchronously using background threads. Evaluation shows that both systems perform better than state-of-the-art competitors on a variety of workloads.
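The worked numbers in this lay abstract can be checked directly against Amdahl's law, speedup(p, n) = 1 / ((1 − p) + p/n): with 50% parallelizable code the speedup limit is exactly 2, while the quoted speedup of 20 for 99% parallelizable code corresponds to a finite core count of roughly 24 (the limit for unbounded cores is 100). A minimal sketch:

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Amdahl's law: speedup on n cores when only `parallel_fraction`
    of the program's work can be parallelized."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

def max_speedup(parallel_fraction):
    """Limit of amdahl_speedup as the core count grows without bound."""
    return 1.0 / (1.0 - parallel_fraction)
```

For example, `amdahl_speedup(0.99, 24)` is about 19.5, matching the abstract's figure of 20, and `max_speedup(0.5)` is 2.0 regardless of core count.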
40

POSEIDON: The First Safe and Scalable Persistent Memory Allocator

Demeri, Anthony K. 20 May 2020 (has links)
With the advent of byte-addressable Non-Volatile Main Memory (NVMM), the need for a safe, scalable, and high-performing memory allocator is inevitable. A slow memory allocator can bottleneck the entire application stack, while an insecure memory allocator can render the underlying systems and applications inconsistent upon program bugs or system failure. Unlike DRAM-based memory allocators, an NVMM allocator must guarantee the safety of its heap metadata against both internal and external errors. An effective NVMM memory allocator should be (1) safe, (2) scalable, and (3) high performing. Unfortunately, none of the existing persistent memory allocators achieve all three requisites; critically, we also note that the de facto NVMM allocator, Intel's Persistent Memory Development Kit (PMDK), is vulnerable to silent data corruption and persistent memory leaks as a result of a simple heap overflow. We closely investigate the existing de facto NVMM memory allocators, especially PMDK, to study their vulnerability to metadata corruption and the reasons for their poor performance and scalability. We propose Poseidon, which is safe, fast, and scalable. The premise of Poseidon revolves around providing a user application with per-CPU sub-heaps for scalability, while managing the heap metadata in a segregated fashion and efficiently protecting it using a scalable hardware-based protection scheme, Intel's Memory Protection Keys (MPK). We evaluate Poseidon with a wide array of microbenchmarks and real-world benchmarks, noting that Poseidon outperforms state-of-the-art allocators by a significant margin, showing improved scalability and performance while also guaranteeing metadata safety. / Master of Science / Since the dawn of time, civilization has revolved around effective communication. From smoke signals to telegraphs and beyond, communication has continued to be a cornerstone of successful societies.
Today, communication and collaboration occur daily on a global scale, such that even sub-second units of time are critical to successful societal operation. Naturally, many forms of modern communication revolve around our digital systems, such as personal computers, email servers, and social networking database applications. There is thus a never-ending surge of digital system development, constantly striving toward increased performance. For some time, increasing a system's dynamic random-access memory, or DRAM, was able to provide performance gains; unfortunately, due to thermal and power constraints, such increases are no longer feasible. Additionally, loss of power on a DRAM system causes bothersome loss of data, since the memory storage is volatile. Now, we are at the advent of an entirely new physical memory technology, termed non-volatile main memory (NVMM), which has near-identical performance properties to DRAM but is available in much larger quantities, thus allowing increased overall system speed. Alas, such a system also imposes additional requirements upon software developers: since, for NVMM, all memory updates are permanent, a failed update can cause persistent memory corruption. Regrettably, the existing software standard, led by Intel's Persistent Memory Development Kit (PMDK), is insecure (allowing permanent memory corruption with ease), low performing, and a bottleneck for multicore systems. Here, we present a secure, high-performing solution, termed Poseidon, which harnesses the full potential of NVMM.
