Global ETD Search

11	Performance Evaluation of Cassandra Scalability on Amazon EC2
Srinadhuni, Siddhartha
January 2018
Context. In communication systems and computer science, Infrastructure as a Service (IaaS) provides the building blocks for cloud computing along with robust network features. AWS is one such IaaS offering; among its services, Elastic Compute Cloud (EC2) is used to deploy virtual machines across several data centers and provides fault-tolerant storage for applications across the cloud. Apache Cassandra is one of the many NoSQL databases that provide fault tolerance and elasticity across servers. Its ring structure makes communication between the nodes in a cluster effective. Cassandra is robust, meaning that new Cassandra nodes can be added to an existing cluster without downtime.
Objectives. This study quantifies the latency of adding Cassandra nodes on Amazon EC2 instances and assesses the impact of replication factors (RF) and consistency levels (CL) on autoscaling.
Methods. First, a literature review is conducted on how an experiment under the above-mentioned constraints can be carried out. An experiment is then conducted to address the latency and the effects of autoscaling. A 3-node Cassandra cluster runs on Amazon EC2 with Ubuntu 14.04 LTS as the operating system. A threshold value is identified for each Cassandra-specific configuration, and the cluster is scaled out to five nodes on AWS, using the Cassandra stress benchmarking tool. This procedure is repeated for a 5-node Cassandra cluster and for each of the configurations, with a mixed workload of equal reads and writes.
Results. The latency of adding Cassandra nodes on Amazon EC2 instances has been identified, and the impact of replication factors and consistency levels on autoscaling has been quantified.
Conclusions. Latency decreases after autoscaling for all Cassandra configurations, and changing the replication factors and consistency levels also changes Cassandra's performance.
Cassandra Amazon EC2 Performance Evaluation Scalability Cloud Benchmarking Data Scalability Infrastructure as a service Cassandra-stress Computer Sciences Datavetenskap (datalogi)
12	AES - kryptering med cuda : Skillnader i beräkningshastighet mellan AES-krypteringsmetoderna ECB och CTR vid implementering med Cuda-ramverket / AES Encryption with CUDA: Differences in Computational Speed Between the AES Encryption Modes ECB and CTR When Implemented with the CUDA Framework
Vidén, Pontus, Henningsson, Viktor
January 2020
Purpose – The purpose of this study is partly to illustrate how the AES encryption modes ECB and CTR affect computational speed when using the GPGPU framework CUDA, and partly to clarify the advantages and disadvantages of the two modes.
Method – A preliminary study was conducted to obtain empirical data on the AES encryption modes ECB and CTR. It was carried out systematically by searching databases for scientific works on the subject. The data from the study were analyzed and compared to characterize the two modes and to create a basis for determining their advantages and disadvantages. An experiment was used to extract execution-time data for the GPGPU framework CUDA when processing the AES encryption modes. An experiment was chosen as the method in order to gain control over the variables included in the study and to see how these variables change when they are deliberately manipulated.
Findings – The preliminary study shows that CTR is more secure than ECB, but also considerably more complex, which can lead to integrity risks when it is implemented incorrectly. The experiment measures how long it takes to transfer data from CPU memory to GPU memory, to run the encryption on the GPU, and to transfer data from GPU memory back to CPU memory. This is done for both CTR and ECB, in encryption and decryption. The analysis shows that ECB is faster than CTR in both encryption and decryption.
Implications – The experiment shows that CTR is slower than ECB. However, for both modes most of the time is spent on the transfers between CPU memory and GPU memory.
Limitations – The files tested only go up to about 1 gigabyte in size, which gave short computation times.
GPGPU CTR ECB Cuda AES parallellisering GPGPU-ramverk processorer AES-krypteringsmetod Amazon EC2 P3 AWS Engineering and Technology Teknik och teknologier
13	透過Spark平台實現大數據分析與建模的比較：以微博為例 / Accomplish Big Data Analytic and Modeling Comparison on Spark: Weibo as an Example
潘宗哲, Pan, Zong Jhe
Unknown Date
The rapid growth and change of data, together with fast-evolving analytics tools, make data analysis increasingly challenging. This study presents a complete machine learning pipeline as a reference blueprint for academic institutions and companies adopting big data analytics. We use Apache Spark as the big data computing framework and build machine learning models with the two MLlib packages, Spark.ml and Spark.mllib, to address problems that can arise in traditional data analysis. Along the way, we compare the scenarios for which the different Spark analysis modules are suited. We first develop on a local cluster and then submit the jobs to AWS EC2 clusters to speed up modeling and analysis. We demonstrate the pipeline on the 2012 mainland China Weibo dataset provided by the Journalism and Media Studies Centre of the University of Hong Kong. We use RDDs, Spark SQL, and GraphX to extract features from Weibo users' posts, and we build a Random Forest model for the binary classification task of predicting whether a user is officially verified.
Big data analytics machine learning Weibo analytics pipeline Amazon EC2
14	Empirical Performance Analysis of High Performance Computing Benchmarks Across Variations in Cloud Computing
Mani, Sindhu
01 January 2012
High Performance Computing (HPC) applications are data-intensive scientific software requiring significant CPU and data storage capabilities. Researchers have examined the performance of the Amazon Elastic Compute Cloud (EC2) environment across several HPC benchmarks; however, an extensive HPC benchmark study and a comparison between Amazon EC2 and Windows Azure (Microsoft's cloud computing platform), with metrics such as memory bandwidth, Input/Output (I/O) performance, and communication and computational performance, are largely absent. The purpose of this study is to perform an exhaustive HPC benchmark comparison on the EC2 and Windows Azure platforms. We implement existing benchmarks to evaluate and analyze the performance of two public clouds spanning both IaaS and PaaS types. We use Amazon EC2 and Windows Azure as platforms for hosting HPC benchmarks, with variations such as instance types, number of nodes, hardware, and software. This is accomplished by running benchmarks including STREAM, IOR, and NPB on these platforms on varying numbers of nodes for small and medium instance types. These benchmarks measure memory bandwidth, I/O performance, and communication and computational performance. Benchmarking cloud platforms provides useful objective measures of their worthiness for HPC applications, in addition to assessing their consistency and predictability in supporting them.
Thesis University of North Florida UNF Cloud Computing High Performance Computing HPC HPC Benchmarks Windows Azure Amazon EC2 Digital Communications and Networking Other Computer Engineering
