81

Effective and Efficient Methodologies for Social Network Analysis

Pan, Long 16 January 2008 (has links)
Performing social network analysis (SNA) requires a set of powerful techniques to analyze the structural information contained in interactions between social entities. Many SNA technologies and methodologies have been developed and have successfully provided significant insights for small-scale interactions. However, these techniques are not suitable for analyzing large social networks, which are very popular and important in various fields and have special structural properties that cannot be obtained from small networks or their analyses. A number of issues in the design of current SNA techniques need further study, and the key ones can be embodied in three fundamental and critical challenges: long processing time, large computational resource requirements, and network dynamism. In order to address these challenges, we discuss an anytime-anywhere methodology based on a parallel/distributed computational framework to effectively and efficiently analyze large and dynamic social networks. In our methodology, large social networks are decomposed into intra-related smaller parts. A coarse level of network analysis is built by comprehensively analyzing each part, and the partial analysis results are incrementally refined over time. Also, during the analysis process, dynamic changes in the network are accommodated effectively and efficiently based on the results already obtained. In order to evaluate and validate our methodology, we implement it for a set of SNA metrics that are significant for SNA applications and cover a wide range of difficulties. Through rigorous theoretical and experimental analyses, we demonstrate that our anytime-anywhere methodology is effective and efficient for analyzing large and dynamic social networks. / Ph. D.
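To illustrate the anytime-anywhere idea sketched in this abstract, the following Python fragment partitions a small graph, computes a coarse closeness-centrality estimate within each part, and refines the estimate as more parts are merged in. The partitioning, the choice of closeness centrality as the metric, and all function names are illustrative assumptions, not the dissertation's actual algorithms.

```python
# Minimal sketch: coarse per-partition analysis, refined as partitions merge.
from collections import deque

def bfs_distances(adj, source, allowed):
    """Hop distances from source, restricted to the 'allowed' node set."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in allowed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness_within(adj, nodes):
    """Closeness centrality computed only inside one region (coarse pass)."""
    scores = {}
    for n in nodes:
        dist = bfs_distances(adj, n, nodes)
        total = sum(d for d in dist.values() if d > 0)
        scores[n] = (len(dist) - 1) / total if total else 0.0
    return scores

def anytime_closeness(adj, partitions):
    """Yield successively refined estimates as partitions are merged."""
    merged = set()
    for part in partitions:
        merged |= set(part)                  # anywhere: grow the analyzed region
        yield closeness_within(adj, merged)  # anytime: usable result at each step

# Toy example: two triangles joined by a bridge, split into two partitions.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
for estimate in anytime_closeness(adj, [[1, 2, 3], [4, 5, 6]]):
    print(estimate)
```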
82

Distributed Localization for Wireless Distributed Networks in Indoor Environments

Mendoza, Hermie P. 18 August 2011 (has links)
Positioning systems enable location-awareness for mobile devices, computers, and even tactical radios. From the collected location information, location-based services can be realized. One type of positioning system is based on location fingerprints. Unlike the conventional positioning techniques of time of arrival or time difference of arrival (TOA/TDOA), or even angle of arrival (AOA), fingerprinting associates with each location unique characteristics, such as received signal strength (RSS), that differentiate it from other locations. These location-dependent characteristics can then be used to infer a user's location. Furthermore, fingerprinting requires no specialized hardware because of its reliance on an existing communications infrastructure. To estimate a user's position, fingerprint-based positioning systems perform the computation centrally on a mobile computer using either a Euclidean distance algorithm, Bayesian statistics, or neural networks. With large service areas and, subsequently, large radio maps, one mobile computer may not have adequate resources to compute a user's position locally. Wireless distributed computing provides a means for the mobile computer to meet the location-based service requirements and increase its network lifetime. This thesis develops distributed localization algorithms to be used in an indoor fingerprint-based positioning system. Fingerprint calculations are not computed on a single device, but rather on a wireless distributed computing network on Virginia Tech's Cognitive Radio Network Testbed (CORNET). / Master of Science
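As an illustration of the Euclidean-distance fingerprint matching mentioned in this abstract, the sketch below compares an observed RSS vector against a stored radio map and returns the closest location. The radio-map values, access-point names, and location labels are made-up assumptions for illustration only.

```python
# Minimal sketch of nearest-neighbor fingerprint matching in RSS signal space.
import math

radio_map = {
    # location label -> RSS fingerprint in dBm per access point (assumed values)
    "room_101": {"ap1": -45.0, "ap2": -70.0, "ap3": -62.0},
    "room_102": {"ap1": -60.0, "ap2": -52.0, "ap3": -71.0},
    "hallway":  {"ap1": -55.0, "ap2": -63.0, "ap3": -58.0},
}

def euclidean_distance(observed, fingerprint):
    """Distance in signal space over the access points both vectors share."""
    aps = observed.keys() & fingerprint.keys()
    return math.sqrt(sum((observed[a] - fingerprint[a]) ** 2 for a in aps))

def locate(observed, radio_map):
    """Nearest-neighbor match: the smallest signal-space distance wins."""
    return min(radio_map, key=lambda loc: euclidean_distance(observed, radio_map[loc]))

print(locate({"ap1": -47.0, "ap2": -68.0, "ap3": -60.0}, radio_map))  # -> room_101
```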
83

Wireless Distributed Computing in Cloud Computing Networks

Datla, Dinesh 25 October 2013 (has links)
The explosion in growth of smart wireless devices has increased the ubiquitous presence of computational resources and location-based data. This new reality of numerous wireless devices capable of collecting, sharing, and processing information makes possible an avenue for new enhanced applications. Multiple radio nodes with diverse functionalities can form a wireless cloud computing network (WCCN) and collaborate on executing complex applications using wireless distributed computing (WDC). Such a dynamically composed virtual cloud environment can offer services and resources hosted by individual nodes for consumption by user applications. This dissertation proposes an architectural framework for WCCNs and presents the different phases of its development, namely, development of a mathematical system model of WCCNs, simulation analysis of the performance benefits offered by WCCNs, design of decision-making mechanisms in the architecture, and development of a prototype to validate the proposed architecture. The dissertation presents a system model that captures power consumption, energy consumption, and latency experienced by computational and communication activities in a typical WCCN. In addition, it derives a stochastic model of the response time experienced by a user application when executed in a WCCN. Decision-making and resource allocation play a critical role in the proposed architecture. Two adaptive algorithms are presented, namely, a workload allocation algorithm and a task allocation and scheduling algorithm. The proposed algorithms are analyzed for power efficiency, energy efficiency, and the improvement in execution time of user applications that is achieved by workload distribution. Experimental results gathered from a software-defined radio network prototype of the proposed architecture validate the theoretical analysis and show that it is possible to achieve 80% improvement in execution time with the help of just three nodes in the network. / Ph. D.
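The workload-allocation decision described here can be illustrated with a small sketch that splits a divisible workload across collaborating nodes in proportion to their processing rates, so that all nodes finish at about the same time. The node names, rates, and the proportional rule are assumptions, not the dissertation's actual algorithm.

```python
# Minimal sketch: proportional workload allocation across heterogeneous nodes.
def allocate_workload(total_work, node_rates):
    """Return work units per node, proportional to each node's processing rate."""
    total_rate = sum(node_rates.values())
    return {node: total_work * rate / total_rate for node, rate in node_rates.items()}

def estimated_completion_time(allocation, node_rates):
    """Overall execution time is bounded by the slowest node's share."""
    return max(allocation[n] / node_rates[n] for n in allocation)

rates = {"node_a": 4.0, "node_b": 2.0, "node_c": 1.0}   # work units per second (assumed)
shares = allocate_workload(700, rates)
print(shares)                                           # node_a gets the largest share
print(estimated_completion_time(shares, rates))         # all nodes finish at 100 s
```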
84

A Software Framework For the Detection and Classification of Biological Targets in Bio-Nano Sensing

Hafeez, Abdul 08 September 2014 (has links)
Detection and identification of important biological targets, such as DNA, proteins, and diseased human cells, are crucial for early diagnosis and prognosis. The key to discriminating healthy cells from diseased cells is their biophysical properties, which differ radically. Micro- and nanosystems, such as solid-state micropores and nanopores, can measure and translate these properties of biological targets into electrical spikes to decode useful insights. Nonetheless, such approaches result in sizable data streams that are often plagued with inherent noise and baseline wander. Moreover, the extant detection approaches are tedious, time-consuming, and error-prone, and there is no error-resilient software that can analyze large data sets instantly. The ability to effectively process and detect biological targets in larger data sets lies in automated and accelerated data-processing strategies using state-of-the-art distributed computing systems. In this dissertation, we design and develop techniques for the detection and classification of biological targets and a distributed detection framework to support data processing from multiple bio-nano devices. In a distributed setup, the raw data stream collected on a server node is split into data segments and distributed across the participating worker nodes. Each node reduces noise in the assigned data segment using moving-average filtering and detects the electric spikes by comparing them against a statistical threshold (based on the mean and standard deviation of the data), in a Single Program Multiple Data (SPMD) style. Our proposed framework enables the detection of cancer cells in a mixture of cancer cells, red blood cells (RBCs), and white blood cells (WBCs), and achieves a maximum speedup of 6X over a single-node machine by processing 10 gigabytes of raw data using an 8-node cluster in less than a minute, a task that would otherwise take hours using manual analysis. Diseases such as cancer can be mitigated if detected and treated at an early stage. Micro- and nanoscale devices, such as micropores and nanopores, enable the translocation of biological targets at finer granularity. These devices are tiny orifices in silicon-based membranes, and their output is a current signal, measured in nanoamperes. A solid-state micropore is capable of electrically measuring the biophysical properties of human cells when a blood sample is passed through it. The passage of cells through such pores results in an interesting pattern (pulse) in the baseline current, which can be measured at a very high rate, such as 500,000 samples per second, or at an even higher resolution. The pulse is essentially a sequence of temporal data samples that abruptly fall below and then revert to the normal baseline within a predefined time interval, i.e., the pulse width. The pulse features, such as width and amplitude, correspond to the translocation behavior and the extent to which the pore is blocked, under a constant potential. These features are crucial in discriminating diseased cells from healthy cells, such as identifying cancer cells in a mixture of cells. / Ph. D.
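The detection step described in this abstract (moving-average filtering followed by a mean-and-standard-deviation threshold) can be illustrated with the short sketch below; the window size, threshold factor, and synthetic current trace are assumptions made for illustration.

```python
# Minimal sketch: smooth the raw current trace, then flag samples that fall
# more than k standard deviations below the mean as part of a pulse.
import statistics

def moving_average(signal, window=5):
    """Simple moving-average filter to suppress high-frequency noise."""
    half = window // 2
    return [
        statistics.fmean(signal[max(0, i - half): i + half + 1])
        for i in range(len(signal))
    ]

def detect_pulses(signal, k=3.0):
    """Indices where the filtered signal drops below mean - k * stdev."""
    mean = statistics.fmean(signal)
    stdev = statistics.pstdev(signal)
    threshold = mean - k * stdev
    return [i for i, value in enumerate(signal) if value < threshold]

# Synthetic baseline current (nA) with one translocation dip around index 50.
trace = [100.0] * 100
for i in range(48, 53):
    trace[i] = 60.0
filtered = moving_average(trace)
print(detect_pulses(filtered))   # indices inside the dip
```

In the distributed setup the abstract describes, each worker node would run this same filter-and-threshold logic on its own data segment, SPMD-style.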
85

MULTILINGUAL CYBERBULLYING DETECTION SYSTEM

Rohit Sidram Pawar (6613247) 11 June 2019 (has links)
Since the use of social media has evolved, the ability of its users to bully others has increased. One of the prevalent forms of bullying is cyberbullying, which occurs on social media sites such as Facebook©, WhatsApp©, and Twitter©. The past decade has witnessed a growth in cyberbullying, a form of bullying that occurs virtually through the use of electronic devices, such as messaging, e-mail, online gaming, social media, or images and messages sent to a mobile device. This bullying is not limited to the English language and occurs in other languages as well. Hence, it is of the utmost importance to detect cyberbullying in multiple languages. Since current approaches to identifying cyberbullying are mostly focused on English-language texts, this thesis proposes a new approach (called the Multilingual Cyberbullying Detection System) for the detection of cyberbullying in multiple languages (English, Hindi, and Marathi). It uses two techniques, namely, machine learning-based and lexicon-based, to classify the input data as bullying or non-bullying. The aim of this research is to not only detect cyberbullying but also provide a distributed infrastructure to detect bullying. We have developed multiple prototypes (standalone, collaborative, and cloud-based) and carried out experiments with them to detect cyberbullying on different datasets from multiple languages. The outcomes of our experiments show that the machine-learning model outperforms the lexicon-based model in all the languages. In addition, the results of our experiments show that collaboration techniques can help to improve the accuracy of a poor-performing node in the system. Finally, we show that the cloud-based configurations performed better than the local configurations.
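A minimal sketch of the lexicon-based technique mentioned in this abstract is shown below: a message is labeled as bullying when the fraction of its tokens found in a per-language offensive-word lexicon exceeds a threshold. The placeholder lexicon entries and the threshold value are assumptions, not the thesis's actual word lists.

```python
# Minimal sketch of lexicon-based bullying / non-bullying classification.
LEXICONS = {
    "english": {"idiot", "loser", "stupid"},                 # placeholder entries
    "hindi":   {"<offensive-term-1>", "<offensive-term-2>"},  # placeholders
    "marathi": {"<offensive-term-3>", "<offensive-term-4>"},  # placeholders
}

def classify(message, language, threshold=0.15):
    """Label a message 'bullying' or 'non-bullying' from the lexicon-hit ratio."""
    tokens = message.lower().split()
    if not tokens:
        return "non-bullying"
    hits = sum(1 for t in tokens if t in LEXICONS.get(language, set()))
    return "bullying" if hits / len(tokens) >= threshold else "non-bullying"

print(classify("you are such a loser", "english"))        # -> bullying (1/5 = 0.2)
print(classify("see you at practice today", "english"))   # -> non-bullying
```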
86

A framework for automatic optimization of MapReduce programs based on job parameter configurations.

Lakkimsetti, Praveen Kumar January 1900 (has links)
Master of Science / Department of Computing and Information Sciences / Mitchell L. Neilsen / Recently, cost-effective and timely processing of large datasets has been playing an important role in the success of many enterprises and the scientific computing community. Two promising trends ensure that applications will be able to deal with ever-increasing data volumes: first, the emergence of cloud computing, which provides transparent access to a large number of processing, storage and networking resources; and second, the development of the MapReduce programming model, which provides a high-level abstraction for data-intensive computing. MapReduce has been widely used for large-scale data analysis in the Cloud [5]. The system is well recognized for its elastic scalability and fine-grained fault tolerance. However, even to run a single program in a MapReduce framework, a number of tuning parameters have to be set by users or system administrators to increase the efficiency of the program. Users often run into performance problems because they are unaware of how to set these parameters, or because they don't even know that these parameters exist. With MapReduce being a relatively new technology, it is not easy to find qualified administrators [4]. The major objective of this project is to provide a framework that optimizes MapReduce programs that run on large datasets. This is done by executing the MapReduce program on a part of the dataset using stored parameter combinations, configuring the program with the most efficient combination, and then executing this tuned program over different datasets. Many MapReduce programs are used over and over again in applications such as daily weather analysis, log analysis, and daily report generation. So, once the parameter combination is set, it can be used efficiently on a number of data sets. This feature can go a long way towards improving the productivity of users who lack the skills to optimize programs themselves, due to lack of familiarity with MapReduce or with the data being processed.
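The tuning strategy described in this abstract can be illustrated as follows: run the job on a small sample of the dataset once per stored parameter combination, time each run, and keep the fastest combination for full-scale execution. The parameter names and the run_job stub are illustrative assumptions rather than the framework's actual interface.

```python
# Minimal sketch: sample-based search over stored MapReduce parameter combinations.
import time
from itertools import product

STORED_COMBINATIONS = [
    {"mapred.reduce.tasks": r, "io.sort.mb": s}   # parameter names are assumed
    for r, s in product((4, 8, 16), (100, 200))
]

def run_job(sample_path, params):
    """Stand-in for submitting the MapReduce job on the sample with 'params'."""
    time.sleep(0.01)  # placeholder for the actual job run

def best_configuration(sample_path, combinations):
    """Return the parameter combination with the shortest sample run time."""
    timings = []
    for params in combinations:
        start = time.perf_counter()
        run_job(sample_path, params)
        timings.append((time.perf_counter() - start, params))
    return min(timings, key=lambda item: item[0])[1]

print(best_configuration("hdfs://sample/part-00000", STORED_COMBINATIONS))
```

The chosen combination would then be reused on subsequent datasets, matching the abstract's point that recurring jobs (log analysis, daily reports) amortize the one-time tuning cost.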
87

Towards a Framework for DHT Distributed Computing

Rosen, Andrew 12 August 2016 (has links)
Distributed Hash Tables (DHTs) are protocols and frameworks used by peer-to-peer (P2P) systems. They are used as the organizational backbone for many P2P file-sharing systems due to their scalability, fault-tolerance, and load-balancing properties. These same properties are highly desirable in a distributed computing environment, especially one that wants to use heterogeneous components. We show that DHTs can be used not only as the framework to build a P2P file-sharing service, but as a P2P distributed computing platform. We propose creating a P2P distributed computing framework using distributed hash tables, based on our prototype system ChordReduce. This framework would make it simple and efficient for developers to create their own distributed computing applications. Unlike Hadoop and similar MapReduce frameworks, our framework can be used both in the context of a datacenter and as part of a P2P computing platform. This opens up new possibilities for building platforms for distributed computing problems. One advantage our system will have is an autonomous load-balancing mechanism. Nodes will be able to independently acquire work from other nodes in the network, rather than sitting idle. More powerful nodes in the network will be able to use the mechanism to acquire more work, exploiting the heterogeneity of the network. By utilizing the load-balancing algorithm, a datacenter could easily leverage additional P2P resources at runtime on an as-needed basis. Our framework will allow MapReduce-like or distributed machine learning platforms to be easily deployed in a greater variety of contexts.
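The autonomous load-balancing mechanism described here can be sketched as nodes on a Chord-like ring, where an idle node pulls a share of work from its successor instead of waiting for a central scheduler. The ring layout, the half-share transfer rule, and the class names are assumptions, not ChordReduce's actual protocol.

```python
# Minimal sketch: idle nodes on a ring independently acquire work from a neighbor.
from collections import deque

class Node:
    def __init__(self, node_id, capacity=1):
        self.node_id = node_id
        self.capacity = capacity          # relative processing power
        self.tasks = deque()
        self.successor = None             # set when the ring is built

    def acquire_work(self):
        """Idle node pulls tasks from its successor, capped by its capacity."""
        donor = self.successor
        share = min(len(donor.tasks) // 2, 2 * self.capacity)
        for _ in range(share):
            self.tasks.append(donor.tasks.pop())
        return share

# Build a three-node ring where only one node starts with work.
nodes = [Node(10), Node(42, capacity=2), Node(77)]
for i, n in enumerate(nodes):
    n.successor = nodes[(i + 1) % len(nodes)]
nodes[0].tasks.extend(range(12))              # node 10 holds all 12 tasks initially

moved = nodes[2].acquire_work()               # node 77's successor is node 10
print(moved, [len(n.tasks) for n in nodes])   # -> 2 tasks moved: [10, 0, 2]
```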
88

An Investigation of Run-time Operations in a Heterogeneous Desktop Grid Environment: The Texas Tech University Desktop Grid Case Study

Perez, Jerry Felix 01 January 2013 (has links)
The goal of the dissertation study was to evaluate the existing desktop grid (DG) scheduling algorithm. The evaluation built on simulated analyses of DGs previously performed by researchers in the field of DG scheduling optimization, with the aim of improving the current run-time (RT) framework of the DG at Texas Tech University (TTU). The author analyzed the RT of an actual DG, thereby enabling other investigators to compare theoretical results with the results of this dissertation case study. Two statistical methods were used to formulate and validate predictive models: multiple linear regression and graphical exploratory data analysis techniques. Using both statistical methods, the author determined that the theoretical model could predict the significance of four independent variables (resource fragmentation, computational volatility, resource management, and grid job scheduling) on the dependent variables of quality of service and job performance, which affect RT. After an experimental case-study analysis of the DG variables, the author identified the best DG resources for optimizing the run-time performance of the DG at TTU. The projected outcome of this investigation is improved job scheduling techniques for the DG at TTU.
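As an illustration of the first statistical method named in this abstract, the sketch below fits a multiple linear regression of run time on the four independent variables the study examined; the observations and any resulting coefficients are synthetic assumptions, not the case study's measured data.

```python
# Minimal sketch: multiple linear regression of run time on four predictors.
import numpy as np

# Columns: resource fragmentation, computational volatility,
#          resource management overhead, job-scheduling delay (arbitrary units).
X = np.array([
    [0.2, 0.1, 1.0, 0.5],
    [0.4, 0.3, 1.2, 0.6],
    [0.6, 0.2, 0.8, 1.0],
    [0.8, 0.5, 1.5, 0.9],
    [0.3, 0.4, 1.1, 0.7],
])
run_time = np.array([12.0, 16.5, 17.0, 23.5, 15.0])   # synthetic RT in minutes

# Add an intercept column and solve the least-squares problem.
design = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(design, run_time, rcond=None)
print(coeffs)             # intercept and one weight per independent variable
print(design @ coeffs)    # fitted RT for each observation
```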
89

Computing resources sensitive parallelization of neural networks for large scale diabetes data modelling, diagnosis and prediction

Qi, Hao January 2011 (has links)
Diabetes has become one of the most severe diseases due to an increasing number of diabetic patients globally. A large amount of digital data on diabetes has been collected through various channels. How to utilize these data sets to help doctors make decisions on the diagnosis, treatment and prediction of diabetic patients poses many challenges to the research community. The thesis investigates mathematical models, with a focus on neural networks, for large scale diabetes data modelling and analysis by utilizing modern computing technologies such as grid computing and cloud computing. These computing technologies provide users with an inexpensive way to access extensive computing resources over the Internet for solving data- and computationally intensive problems. This thesis evaluates the performance of seven representative machine learning techniques in classification of diabetes data, and the results show that the neural network produces the best accuracy in classification but incurs a high overhead in data training. As a result, the thesis develops MRNN, a parallel neural network model based on the MapReduce programming model, which has become an enabling technology in support of data-intensive applications in the clouds. By partitioning the diabetic data set into a number of equally sized data blocks, the workload in training is distributed among a number of computing nodes for speedup in data training. MRNN is first evaluated in small scale experimental environments using 12 mappers and subsequently is evaluated in large scale simulated environments using up to 1000 mappers. Both the experimental and simulation results have shown the effectiveness of MRNN in classification and its high scalability in data training. MapReduce does not have a sophisticated job scheduling scheme for heterogeneous computing environments in which the computing nodes may have varied computing capabilities. For this purpose, this thesis develops a load balancing scheme based on genetic algorithms with an aim to balance the training workload among heterogeneous computing nodes. The nodes with more computing capacity will receive more MapReduce jobs for execution. Divisible load theory is employed to guide the evolutionary process of the genetic algorithm with an aim to achieve fast convergence. The proposed load balancing scheme is evaluated in large scale simulated MapReduce environments with varied levels of heterogeneity using different sizes of data sets. All the results show that the genetic algorithm based load balancing scheme significantly reduces the makespan in job execution in comparison with the time consumed without load balancing.
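A plausible reading of the data-parallel training pattern described here is sketched below: the training set is split into equally sized blocks, each mapper trains a copy of the model on its block, and a reducer averages the resulting weights. A single logistic neuron stands in for the neural network, and the weight-averaging rule is an assumption rather than MRNN's exact update scheme.

```python
# Minimal sketch: map-side per-block training, reduce-side parameter averaging.
import math
import random

def train_block(block, epochs=50, lr=0.1):
    """Mapper: train one logistic neuron on a single data block."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in block:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

def reduce_models(models):
    """Reducer: average the per-block weights into one global model."""
    n = len(models)
    return sum(w for w, _ in models) / n, sum(b for _, b in models) / n

random.seed(0)
data = [(x, 1 if x > 0 else 0) for x in (random.uniform(-3, 3) for _ in range(400))]
blocks = [data[i::4] for i in range(4)]            # four equally sized blocks
w, b = reduce_models([train_block(blk) for blk in blocks])
print(w, b)                                        # averaged global parameters
```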
90

Research and development of accounting system in grid environment

Chen, Xiaoyn January 2010 (has links)
The Grid has been recognised as the next-generation distributed computing paradigm by seamlessly integrating heterogeneous resources across administrative domains as a single virtual system. There are an increasing number of scientific and business projects that employ Grid computing technologies for large-scale resource sharing and collaborations. Early adoptions of Grid computing technologies involved custom middleware implemented to bridge gaps between heterogeneous computing backbones. These custom solutions form the basis of the emerging Open Grid Service Architecture (OGSA), which aims at addressing common concerns of Grid systems by defining a set of interoperable and reusable Grid services. One of the common concerns defined in OGSA is the Grid accounting service. The main objective of the Grid accounting service is to ensure that resources are shared within a Grid environment in an accountable manner, by metering and logging accurate resource usage information. This thesis discusses the origins and fundamentals of Grid computing and the accounting service in the context of the OGSA profile. A prototype was developed and evaluated based on OGSA accounting-related standards, enabling accounting data to be shared in a multi-Grid environment, the Worldwide LHC Computing Grid (WLCG). Based on this prototype and the lessons learned, a generic middleware solution was also implemented as a toolkit that eases the migration of existing accounting systems towards standards compliance.
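The metering-and-logging role described in this abstract can be sketched as follows: each job run produces a usage record that an accounting service logs and later aggregates per user. The field names are loosely inspired by common Grid usage-record attributes but are assumptions, not the OGSA or OGF schemas.

```python
# Minimal sketch: log per-job usage records and aggregate CPU time per user.
from dataclasses import dataclass, asdict
from collections import defaultdict
import json

@dataclass
class UsageRecord:
    job_id: str
    user: str
    site: str
    cpu_seconds: float
    wall_seconds: float

def log_record(record, ledger):
    """Append one record to the accounting ledger (a JSON-lines style log)."""
    ledger.append(json.dumps(asdict(record)))

def cpu_usage_by_user(ledger):
    """Aggregate logged CPU time per user across all records."""
    totals = defaultdict(float)
    for line in ledger:
        rec = json.loads(line)
        totals[rec["user"]] += rec["cpu_seconds"]
    return dict(totals)

ledger = []
log_record(UsageRecord("job-001", "alice", "site-a", 3600.0, 4000.0), ledger)
log_record(UsageRecord("job-002", "alice", "site-b", 1800.0, 2100.0), ledger)
log_record(UsageRecord("job-003", "bob", "site-a", 900.0, 950.0), ledger)
print(cpu_usage_by_user(ledger))   # -> {'alice': 5400.0, 'bob': 900.0}
```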
