Global ETD Search

261	Resource optimization of edge servers dealing with priority-based workloads by utilizing service level objective-aware virtual rebalancing. Shahid, Amna. 08 August 2023. IoT enables profitable communication between sensor/actuator devices and the cloud. Slow networks leave edge data without timely cloud analytics, which hinders the adoption of real-time analytics. VRebalance addresses the performance of priority-based workloads for stream processing at the edge. VRebalance uses BO to prioritize workloads and find optimal resource configurations for efficient resource management. The Apache Storm platform was used with the RIoTBench IoT benchmark tool for real-time stream processing. Tools were used to evaluate VRebalance. The study shows that VRebalance is more effective than traditional methods, meeting SLO targets despite system changes. Compared to a hill climbing algorithm, VRebalance decreased SLO violation rates by almost 30% for static priority-based workloads and by 52.2% for dynamic priority-based workloads. Compared to Apache Storm's default allocation, it decreased SLO violations by 66.1%. SLO-aware; Apache Storm; edge servers; CPU optimization; priority-based workloads; stream processing
262	Analyzing Parameter Sets For Apache Kafka and RabbitMQ On A Cloud Platform. Rabiee, Amir. January 2018. Applications found in both large and small enterprises need a communication method in order to meet requirements of scalability and durability. Many communication methods exist, but the most widely used are message queues and message brokers. The problem is that there are many different types of message queues and message brokers, each with its own design and implementation choices. These choices result in different parameter sets, which can be configured to meet requirements such as high durability, throughput, and availability. This thesis tests two message brokers, Apache Kafka and RabbitMQ, with the purpose of discussing and showing the impact on throughput and latency when using a variety of parameters. The experiments focus on two primary metrics, latency and throughput, with secondary metrics such as disk and CPU usage. The parameters chosen for both RabbitMQ and Kafka are optimized for maximum throughput and decreased latency. The experiments were run on a cloud platform, Amazon Web Services. The results show that Kafka outperforms RabbitMQ in throughput and latency. RabbitMQ is the most efficient in terms of the quantity of data written, while being more CPU-heavy than Kafka. Kafka performs better than RabbitMQ in terms of the number of messages sent and has the shortest one-way latency. Kafka; RabbitMQ; throughput; latency; cloud platform; testing; Computer and Information Sciences
263	A Shared-Memory Coupled Architecture to Leverage Big Data Frameworks in Prototyping and In-Situ Analytics for Data Intensive Scientific Workflows. Lemon, Alexander Michael. 01 July 2019. There is a pressing need for creative new data analysis methods which can sift through scientific simulation data and produce meaningful results. The types of analyses and the amount of data handled by current methods are still quite restricted, and new methods could provide scientists with a large productivity boost. New methods could be simple to develop in big data processing systems such as Apache Spark, which is designed to process many input files in parallel while treating them logically as one large dataset. This distributed model, combined with the large number of analysis libraries created for the platform, makes Spark ideal for processing simulation output. Unfortunately, the filesystem becomes a major bottleneck in any workflow that uses Spark in such a fashion. Faster transports are not intrinsically supported by Spark, and its interface almost denies the possibility of maintainable third-party extensions. By leveraging the semantics of Scala and Spark's recent scheduler upgrades, we force co-location of Spark executors with simulation processes and enable fast local inter-process communication through shared memory. This provides a path for bulk data transfer into the Java Virtual Machine, removing the current Spark ingestion bottleneck. Besides showing that our system makes this transfer feasible, we also demonstrate a proof-of-concept system integrating traditional HPC codes with bleeding-edge analytics libraries. This provides scientists with guidance on how to apply our libraries to gain a new and powerful tool for developing new analysis techniques in large scientific simulation pipelines. Apache Spark; Data-Intensive Science; High-Performance Computing; In-Situ Analytics; Parameter Sweep; State-Space Pruning; Computer Sciences; Physical Sciences and Mathematics
264	High Performance and Scalable Matching and Assembly of Biological Sequences. Abu Doleh, Anas. 21 December 2016. No description available. Computer Engineering; Bioinformatics; sequence similarity; indexing; graphical processing unit; Apache Spark; de Bruijn graph; de novo assembly; metagenomics
265	Page connection representation: An object-oriented and dynamic language for complex web applications. Zhou, Yin. January 2001. No description available. Page Connection Representation; Database-Backed Web Applications; 3GL Language; 4GL Language; C++ Objects; Linux Platform; Redhat 6.0; Apache Web Server
266	The gray wolf and Native American self-determination: a comparative study of the White Mountain Apache and Nez Perce Tribe. Block, Kelci A. M. 01 May 2009. No description available. Political Science
267	Auto-Tuning Apache Spark Parameters for Processing Large Datasets / Auto-Optimering av Apache Spark-parametrar för bearbetning av stora datamängder. Zhou, Shidi. January 2023. Apache Spark is a popular open-source distributed processing framework that enables efficient processing of large amounts of data. Apache Spark has a large number of configuration parameters that are strongly related to performance. Selecting an optimal configuration for an Apache Spark application deployed in a cloud environment is a complex task. A poor choice may not only result in poor performance but also increase costs. Manually adjusting the Apache Spark configuration parameters can take a lot of time and may not lead to the best outcomes, particularly in a cloud environment where computing resources are allocated dynamically and workloads can fluctuate significantly. The focus of this thesis project is the development of an auto-tuning approach for Apache Spark configuration parameters. Four machine learning models are formulated and evaluated to predict Apache Spark's performance. Additionally, two models for Apache Spark configuration parameter search are created and evaluated to identify the most suitable parameters, resulting in the shortest execution time. The results demonstrate that, with the developed auto-tuning approach and adjusted configuration parameters, Apache Spark applications can achieve a shorter execution time than with the default parameters. The developed auto-tuning approach gives improved cluster utilization and shorter job execution time, with an average performance improvement of 49.98%, 53.84%, and 64.16% for the three different types of Apache Spark applications benchmarked. Apache Spark; Cloud Environment; Spark Configuration Parameter; Resource Utilization; Ridge Regression; Elastic Net; Random Forest; Deep Neural Network; Bayesian Optimization; Particle Swarm Optimization; Computer and Information Sciences
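268	物聯網與大數據平台之通訊架構設計與實作 / Design and Implementation of the Communication Architecture for IoT & Big Data Platform. 胡學賓, Hu, Hsueh Pin. Unknown Date. This study addresses the differing communication requirements of the Internet of Things (IoT) and cloud-based big data analytics by designing a four-layer communication architecture for an IoT and big data platform, built on a microservice architecture. For the real-time communication needs of the IoT, the study adopts the MQTT protocol; for the communication needs of cloud big data analytics, it adopts Apache Kafka. The "device agent" proposed in this study, based on the Actor Model, comprehensively resolves the complexity arising from heterogeneous communication protocols in the IoT. It also resolves the system complexity and performance bottlenecks caused by centralized IoT gateways, so that IoT gateways can be deployed in a distributed manner and share computing resources. Internet of Things; cloud computing; big data; communication architecture; microservices; MQTT; Apache Kafka; Actor Model; Akka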
269	Étude et amélioration de la performance des serveurs de données pour les architectures multi-cœurs. Gaud, Fabien. 02 December 2010. This thesis focuses on the performance of data servers on multi-core architectures. We chose to study this problem from two different angles. First, we study an event-driven runtime. We show in particular that the work-stealing mechanism, used to balance load across cores, can penalize the performance of a Web server. We therefore propose various optimizations to improve the performance of this mechanism on multi-core processors. Second, we study the performance of the Apache Web server, which uses both a set of threads and a set of processes, on a NUMA multi-core architecture. We show in particular that, under a realistic load, this Web server does not scale ideally. Through a detailed cost analysis, we identify the reasons for this lack of scalability and present a set of proposals to improve the server's performance on a NUMA architecture. Multi-core architectures; NUMA architectures; Data server performance; Event-driven programming; Work stealing; Web servers; File server; Apache; Performance analysis
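270	以雲端平行運算建立期貨走勢預測模型-Logistic Regression之應用 / Prediction Model of Futures Trend by Cloud and Parallel Computing - Application of Logistic Regression. 呂縩正, Lu, Tsai Cheng. Unknown Date. In an era of continuing technological progress, financial markets have kept growing and becoming more active as society evolves. Financial products have expanded from simple domestic deposits and loans and foreign-currency investments to interest-rate instruments such as bills and bonds. In addition, as capital markets have expanded, investment targets such as stocks, funds, futures, and options have proliferated. Many people have consequently turned to data mining tools to predict when to buy and sell in the market. For example, Baba N., Asakawa H. and Sato K. (1999) used a back-propagation neural network to predict future rises and falls of the stock market, and in a 2000 study added a genetic algorithm to determine the weights of the back-propagation network, yielding the final neural network model. In data mining, a fixed set of rules must be defined in advance for the prediction target, which reduces the model's flexibility and predictive power. This study aims to use data mining tools to make the rules for the prediction target more flexible and thereby improve the model's final prediction accuracy. The sample consists of Taiwan index futures data from 2010 to 2015, and the study combines logistic regression with particle swarm optimization to build a more flexible prediction model. It finds that when a rise of more than 0.1114% over the next 10 minutes is used as the buy signal, the resulting model reaches a prediction accuracy of 84%. Logistic regression; particle swarm optimization; optimization; prediction; Taiwan stock index futures; Apache Spark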
