About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Assessment of Factors Influencing Intent-to-Use Big Data Analytics in an Organization: A Survey Study

Madhlangobe, Wayne 01 January 2018 (has links)
The central question was how the relationship between Trust-in-Technology and intent-to-use Big Data Analytics in an organization is mediated by both Perceived Risk and Perceived Usefulness. Big Data Analytics is quickly becoming a critically important driver of business success, and many organizations are increasing their Information Technology budgets to build Big Data Analytics capabilities. The Technology Acceptance Model stands out as a critical theoretical lens, primarily due to its assessment approach and its predictive capacity to explain individual behavior in the adoption of technology. Use of Big Data Analytics was considered a voluntary act in this study and is therefore well aligned with the Theory of Reasoned Action and the Technology Acceptance Model; both theories have validated the relationships between beliefs, attitudes, intentions, and usage behavior. Predicting intent-to-use Big Data Analytics is a broad phenomenon covering multiple disciplines in the literature, so a robust methodology was employed to explore the richness of the topic. A deterministic philosophical approach was applied using a survey method in an exploratory study, a variant of the mixed-methods sequential exploratory design. The research consisted of two phases: instrument development and quantitative analysis. The instrument development phase was anchored by a systematic literature review and ended with a pilot study; the pilot study was instrumental in improving the instrument and in switching from a planned covariance-based SEM approach to PLS-SEM for data analysis. A total of 277 valid observations were collected. PLS-SEM was leveraged for data analysis because of the prediction focus of the study and the requirement to assess both reflective and formative measures in the same research model. The measurement and structural models were tested using the PLS algorithm, with R², f², and Q² as the basis for acceptable fit. Based on the valid structural model, and after running the bootstrapping procedure, Perceived Risk was found to have no mediating effect on the relationship between Trust-in-Technology and Intent-to-Use, whereas Perceived Usefulness has a full mediating effect. Level of education, training, experience, and the perceived capability of analytics within an organization are good predictors of Trust-in-Technology.
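The bootstrapped mediation test at the heart of this analysis can be sketched in a few lines. The snippet below is illustrative only — synthetic data and plain least squares stand in for the study's survey data and PLS-SEM software — but it shows how a bootstrap confidence interval for the indirect (a×b) effect decides the mediation question:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for survey scores (n = 277 in the study).
n = 277
trust = rng.normal(size=n)                                     # Trust-in-Technology
usefulness = 0.6 * trust + rng.normal(size=n)                  # Perceived Usefulness (mediator)
intent = 0.5 * usefulness + 0.1 * trust + rng.normal(size=n)   # Intent-to-Use

def ols_slope(x, y):
    """Slope of y regressed on x (with intercept)."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

def indirect_effect(idx):
    """a*b indirect effect of trust -> usefulness -> intent on one resample."""
    a = ols_slope(trust[idx], usefulness[idx])
    design = np.column_stack([np.ones(len(idx)), usefulness[idx], trust[idx]])
    beta, *_ = np.linalg.lstsq(design, intent[idx], rcond=None)
    return a * beta[1]   # beta[1] is the mediator's coefficient (b path)

# Bootstrap the indirect effect; an interval excluding zero indicates mediation.
boot = np.array([indirect_effect(rng.integers(0, n, n)) for _ in range(2000)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect 95% CI: [{lo:.3f}, {hi:.3f}]")
```

If the 95% interval excludes zero, the mediator carries a significant indirect effect; full mediation additionally requires the direct path to become non-significant once the mediator is included.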
12

Data Science Professionals’ Innovation with Big Data Analytics: The Essential Role of Commitment and Organizational Context

Abouei, Mahdi January 2023 (has links)
Implementing Big Data Analytics (BDA) has been widely recognized as a major source of competitiveness and innovation. While previous research suggests several process models and identifies critical factors for the successful implementation of BDA, there is a lack of understanding of how this organizational process is realized by its primary recipients, that is, Data Science Professionals (DSPs), whose innovation with BDA technologies stands at the core of big data-driven innovation. In particular, far less understood are the motivational and contextual factors that drive DSPs' innovation with BDA technologies. This study proposes that commitment is the force that can attach DSPs to the BDA implementation process and motivate them to engage in innovative behaviors. It also introduces two organizational mechanisms, namely BDA communication reciprocity and BDA leader theme-specific reputation, that can be employed to develop this constructive force in DSPs. Building on this, a theoretical model was developed based on the assertions of Commitment in the Workplace Theory and the literature on creativity in organizations to assess the impact of DSPs' commitment to BDA implementation, and of organizational context, on their innovation with BDA technologies. The study theorizes that communication reciprocity and leader theme-specific reputation influence the three components of DSPs' commitment (affective, continuance, and normative) to BDA implementation through their perceived participation in organizational decision-making and positive uncertainty, which, in turn, drive DSPs' innovation with BDA technologies. To further enrich the theorization, the moderating role of DSPs' competency in the effect of the commitment components on innovation with BDA technologies is investigated. Predictions were tested using an experimental vignette methodology with 240 subjects in which the two organizational mechanisms were manipulated. Results indicate that the organizational mechanisms provoke the mediating psychological perceptions, though with varying strengths. In addition, results suggest that DSPs' innovation with BDA technologies is primarily rooted in their affective and continuance commitments, and that DSPs' competency interacts with their affective commitment to affect innovation with BDA technologies. This research enhances the theoretical understanding of the role of commitment and organizational context in fostering DSPs' innovation with BDA technologies. The results also offer suggestions for information systems implementation practitioners on the effectiveness of organizational mechanisms that facilitate big data-driven innovation. / Thesis / Doctor of Philosophy (PhD)
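The moderation result reported above — competency strengthening the link from affective commitment to innovation — is conventionally tested with an interaction term in a regression. A minimal sketch with simulated data; the variable names are illustrative, not the study's instrument:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 240  # matching the study's 240 vignette subjects

# Hypothetical standardized scores.
df = pd.DataFrame({
    "affective": rng.normal(size=n),
    "competency": rng.normal(size=n),
})
df["innovation"] = (0.4 * df.affective + 0.2 * df.competency
                    + 0.3 * df.affective * df.competency   # built-in moderation
                    + rng.normal(size=n))

# 'affective * competency' expands to both main effects plus the interaction;
# a significant interaction coefficient is the evidence for moderation.
model = smf.ols("innovation ~ affective * competency", data=df).fit()
print(model.summary().tables[1])
```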
13

Det binära guldet : en uppsats om big data och analytics [The binary gold: an essay on big data and analytics]

Hellström, Elin, Hemlin, My January 2013 (has links)
The purpose of this study is to investigate the concepts of big data and analytics. Starting from scientific theories about the concepts, the study examines how consulting firms perceive and make use of big data and analytics. To provide a more nuanced picture, a healthcare organization was also interviewed, giving insight into how such an organization can benefit from big data and analytics. A number of important difficulties and success factors connected to both concepts are presented, and each difficulty is linked to a success factor considered able to help solve it. The most relevant success factors identified are the availability of high-quality data, together with the knowledge and expertise needed to handle that data. Finally, the meaning of the concepts is clarified: big data is usually described along the dimensions of volume, variety, and velocity, while analytics in most cases refers to carrying out descriptive and preventive analysis.
14

What are the Potential Impacts of Big Data, Artificial Intelligence and Machine Learning on the Auditing Profession?

Evett, Chantal 01 January 2017 (has links)
To maintain public confidence in the financial system, it is essential that most financial fraud is prevented and that incidents of fraud are detected and punished. The responsibility for uncovering creatively implemented fraud is placed, in large part, on auditors. Recent advancements in technology are helping auditors turn the tide against fraudsters. Big Data, made possible by the proliferation, widespread availability, and amalgamation of diverse digital data sets, has become an important driver of technological change. Big Data analytics are already transforming the traditional audit: testing a limited number of random samples has turned into a much more comprehensive audit that analyzes the entire population of transactions within an account, allowing auditors to flag and investigate all sorts of potentially fraudulent anomalies that were previously invisible. Artificial intelligence (AI) programs, typified by IBM's Watson, can mimic the thought processes of the human mind and will soon be adopted by the auditing profession. Machine learning (ML) programs, with the ability to change when exposed to new data, are developing rapidly and may take over many of the decision-making functions currently performed by auditors. The SEC has already implemented pioneering fraud-detection software based on AI and ML programs. The evolution of the auditor's role has already begun. Current accounting students must understand that the traditional auditing skillset will no longer be sufficient. Facing a future with fewer auditing positions available due to increased automation, auditors will need training for roles that are more data-analytical and computer-science based.
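One classic example of the full-population testing described above — not named in this thesis, but standard in audit analytics — is a first-digit (Benford's law) screen over every transaction in an account. A minimal sketch on synthetic data:

```python
import numpy as np
from scipy.stats import chisquare

def benford_flag(amounts, alpha=0.05):
    """Chi-square test of first significant digits against Benford's law.

    A low p-value means the population of transaction amounts deviates
    from the expected first-digit distribution and may merit review.
    """
    amounts = np.asarray(amounts, dtype=float)
    amounts = amounts[amounts > 0]                        # Benford applies to positive values
    digits = np.array([int(f"{a:e}"[0]) for a in amounts])  # first significant digit, 1..9
    observed = np.bincount(digits, minlength=10)[1:]      # counts of digits 1..9
    expected = np.log10(1 + 1 / np.arange(1, 10)) * len(digits)
    stat, p = chisquare(observed, expected)
    return p < alpha, p                                   # True => flag for investigation

# Hypothetical population of transaction amounts; lognormal data follows Benford well.
rng = np.random.default_rng(1)
amounts = rng.lognormal(mean=5, sigma=2, size=10_000)
print(benford_flag(amounts))
```

The point is the scope: the test runs over every transaction rather than a random sample, which is exactly the shift the abstract describes.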
15

The use of Big Data Analytics to protect Critical Information Infrastructures from Cyber-attacks

Oseku-Afful, Thomas January 2016 (has links)
Cyber-attacks, a consequence of our increasing dependence on digital technology, are unfortunately a phenomenon we have to live with today. As technology becomes more advanced and complex, so does the malware used in these attacks. Currently, targeted cyber-attacks directed at CIIs such as financial institutions and telecom companies are on the rise. A particular class of malware known as APTs (Advanced Persistent Threats), used for targeted attacks, is very difficult to detect and prevent due to its sophisticated and stealthy nature. Such malware can attack and wreak havoc in the targeted system within a matter of seconds, which is very worrying because traditional cyber-security defence systems cannot handle these attacks. The solution, as proposed by some in the industry, is the use of BDA systems. However, whilst it appears that BDA has achieved greater success at large companies, little is known about success at smaller companies, and there is a scarcity of research addressing how BDA is deployed for the purpose of detecting and preventing cyber-attacks on CII. This research examines and discusses the effectiveness of BDA for detecting cyber-attacks and describes how such a system is deployed. To establish the effectiveness of using BDA, a survey by questionnaire was conducted; its target audience was large corporations likely to use such systems for cyber security. The research concludes that a BDA system is indeed a powerful and effective tool, and currently the best method for protecting CIIs against the range of stealthy cyber-attacks. A description of how such a system is deployed is also abstracted into a model of meaningful practice.
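The thesis does not prescribe a specific analytic, but a concrete example of the kind of detection a BDA platform can run at scale over connection logs is beaconing detection: APT implants often call home at near-regular intervals, so abnormally low variance in inter-arrival times is a useful signal. A hedged sketch on synthetic flow data (thresholds and names are illustrative):

```python
import numpy as np

def beaconing_score(timestamps):
    """Coefficient of variation of inter-arrival times for one host pair.

    Values near 0 mean metronome-like callbacks (possible C2 beaconing);
    human-driven traffic is far burstier.
    """
    t = np.sort(np.asarray(timestamps, dtype=float))
    gaps = np.diff(t)
    if len(gaps) < 10 or gaps.mean() == 0:   # too few events to judge
        return None
    return gaps.std() / gaps.mean()

# Hypothetical flows: (src, dst) -> connection timestamps in seconds.
rng = np.random.default_rng(7)
flows = {
    ("10.0.0.5", "evil.example"): 60 * np.arange(100) + rng.normal(0, 1, 100),  # ~every 60 s
    ("10.0.0.9", "news.example"): np.cumsum(rng.exponential(60, 100)),          # bursty
}
for pair, ts in flows.items():
    score = beaconing_score(ts)
    if score is not None and score < 0.1:    # illustrative cutoff
        print(f"possible beaconing: {pair} (CV={score:.3f})")
```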
16

Chromosome 3D Structure Modeling and New Approaches For General Statistical Inference

Rongrong Zhang (5930474) 03 January 2019 (has links)
This thesis consists of two separate topics: the use of piecewise helical models for the inference of 3D spatial organizations of chromosomes, and new approaches for general statistical inference. The recently developed Hi-C technology enables a genome-wide view of chromosome spatial organizations and has shed deep insights into genome structure and genome function. However, multiple sources of uncertainty make downstream data analysis and interpretation challenging. Specifically, statistical models for inferring three-dimensional (3D) chromosomal structure from Hi-C data are far from mature. Most existing methods are highly over-parameterized, lack clear interpretations, and are sensitive to outliers. We propose a parsimonious, easy-to-interpret, and robust piecewise helical curve model for the inference of 3D chromosomal structures from Hi-C data, for both individual topologically associated domains and whole chromosomes. When applied to a real Hi-C dataset, the piecewise helical model not only achieves much better model fitting than existing models, but also reveals that geometric properties of chromatin spatial organization are closely related to genome function.

For potential applications in big data analytics and machine learning, we propose to use deep neural networks to automate the Bayesian model selection and parameter estimation procedures. Two such frameworks are developed under different scenarios. First, we construct a deep neural network-based Bayes estimator for the parameters of a given model. The neural Bayes estimator mitigates the computational challenges faced by traditional approaches for computing Bayes estimators. When applied to generalized linear mixed models, the neural Bayes estimator outperforms existing methods implemented in R packages and SAS procedures. Second, we construct a deep convolutional neural network-based framework to perform simultaneous Bayesian model selection and parameter estimation. We refer to the neural networks for model selection and parameter estimation in this framework as the neural model selector and parameter estimator, respectively; both can be properly trained using labeled data systematically generated from candidate models. A simulation study shows that both the neural selector and estimator demonstrate excellent performance.

The theory of Conditional Inferential Models (CIMs) has been introduced to combine information for efficient inference in the Inferential Models framework for prior-free and yet valid probabilistic inference. While the general theory is subject to further development, the so-called regular CIMs are simple. We establish and prove a necessary and sufficient condition for the existence and identification of regular CIMs. More specifically, it is shown that for inference based on a sample from continuous distributions with unknown parameters, the corresponding CIM is regular if and only if the unknown parameters are generalized location and scale parameters indexing the transformations of an affine group.
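The geometric building block of the piecewise helical model is simple to write down: each segment of the chromatin fiber is a circular helix determined by a radius, a pitch, and a number of turns. The sketch below generates such a piecewise curve; the parameterization is illustrative and omits the per-segment rotations and the Hi-C likelihood that the actual model fits:

```python
import numpy as np

def helix_segment(radius, pitch, turns, n_points=100):
    """3D points along one helical segment:
    x = r cos(t), y = r sin(t), z = pitch * t / (2*pi), t in [0, 2*pi*turns]."""
    t = np.linspace(0, 2 * np.pi * turns, n_points)
    return np.column_stack([radius * np.cos(t),
                            radius * np.sin(t),
                            pitch * t / (2 * np.pi)])

def piecewise_helix(segments):
    """Concatenate helical segments, translating each to start at the last endpoint."""
    pieces, offset = [], np.zeros(3)
    for radius, pitch, turns in segments:
        seg = helix_segment(radius, pitch, turns)
        seg = seg - seg[0] + offset    # stitch segment onto the current endpoint
        pieces.append(seg)
        offset = seg[-1]
    return np.vstack(pieces)

# Three consecutive loci modeled with different local geometry (illustrative values).
curve = piecewise_helix([(1.0, 0.5, 3), (0.4, 1.2, 5), (0.8, 0.3, 2)])
print(curve.shape)  # (300, 3)
```

A real fit would additionally estimate each segment's orientation and choose the parameters that best reproduce the observed Hi-C contact frequencies.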
17

Fast demand response with datacenter loads: a green dimension of big data

McClurg, Josiah 01 August 2017 (has links)
Demand response is one of the critical technologies necessary for allowing large-scale penetration of intermittent renewable energy sources in the electric grid. Data centers are especially attractive candidates for providing flexible, real-time demand response services to the grid because they are capable of fast power ramp-rates, large dynamic range, and finely-controllable power consumption. This thesis makes a contribution toward implementing load shaping with server clusters through a detailed experimental investigation of three broadly-applicable datacenter workload scenarios. We experimentally demonstrate the feasibility of datacenter demand response with a distributed video transcoding application and a simple distributed power controller. We also show that while some software power capping interfaces performed better than others, all the interfaces we investigated had the high dynamic range and low power variance required to achieve high-quality power tracking. Our next investigation presents an empirical performance evaluation of algorithms that replace arithmetic operations with low-level bit operations for power-aware Big Data processing. Specifically, we compare two different data structures in terms of execution time and power efficiency: (a) a baseline design using arrays, and (b) a design using bit-slice indexing (BSI) and distributed BSI arithmetic. Across three different datasets and three popular queries, we show that the bit-slicing queries consistently outperform the array algorithm in both power efficiency and execution time. In the context of datacenter power shaping, this performance optimization enables additional power flexibility, achieving the same or greater performance than the baseline approach even under power constraints. The investigation of read-optimized index queries leads up to an experimental investigation of the tradeoffs among power constraint, query freshness, and update aggregation size in a dynamic big data environment. We compare several update strategies, presenting a bitmap update optimization that allows improved performance over both a baseline approach and an existing state-of-the-art update strategy. Performing this investigation in the context of load shaping, we show that read-only range queries can be served without performance impact under a power cap, and that index updates can be tuned to provide a flexible base load. This thesis concludes with a brief discussion of control implementation and a summary of our findings.
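Bit-slice indexing represents each bit position of a numeric column as one bitmap across all rows, so comparisons become a handful of bitwise AND/OR passes instead of a per-row loop. A minimal sketch using Python integers as bitmaps (one bit per row); the thesis's distributed BSI arithmetic builds on the same idea:

```python
def build_bsi(values, n_bits):
    """One bitmap per bit position: bit j of slices[i] = bit i of values[j]."""
    slices = [0] * n_bits
    for row, v in enumerate(values):
        for i in range(n_bits):
            if (v >> i) & 1:
                slices[i] |= 1 << row
    return slices

def rows_greater_than(slices, c, n_rows):
    """Rows whose value > c, via bitwise ops over the slices (no per-row compare)."""
    eq = (1 << n_rows) - 1      # rows whose prefix still equals c's prefix
    gt = 0                      # rows already known to be greater
    for i in reversed(range(len(slices))):   # scan from the most significant bit
        if (c >> i) & 1:
            eq &= slices[i]                  # need a 1 here to keep matching
        else:
            gt |= eq & slices[i]             # a 1 where c has 0 => strictly greater
            eq &= ~slices[i]
    return [r for r in range(n_rows) if (gt >> r) & 1]

values = [5, 12, 7, 3, 15, 9]
slices = build_bsi(values, n_bits=4)
print(rows_greater_than(slices, 8, len(values)))  # -> [1, 4, 5] (12, 15, 9)
```

Because each pass touches whole machine words of rows at once, the work per query scales with the number of bit positions rather than the number of rows, which is the source of the time and power savings the abstract reports.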
18

Shared and distributed memory parallel algorithms to solve big data problems in biological, social network and spatial domain applications

Sharma, Rahil 01 December 2016 (has links)
Big data refers to information that cannot be processed and analyzed using traditional approaches and tools due to the 4 V's: sheer Volume, the Velocity at which data is received and processed, and data Variety and Veracity. Today, massive volumes of data originate in domains such as geospatial analysis and biological and social networks. Scalable algorithms for efficient processing of this massive data are therefore a significant challenge in the field of computer science. One way to achieve such efficient and scalable algorithms is by using shared- and distributed-memory parallel programming models. In this thesis, we present a variety of such algorithms to solve problems in the domains mentioned above. We solve five problems that fall into two categories. The first group of problems deals with community detection. Detecting communities in real-world networks is of great importance because communities consist of patterns that can be viewed as independent components, each of which has distinct features and can be detected based upon network structure. For example, communities in social networks can help target users for marketing purposes, provide user recommendations to connect with and join communities or forums, and so on. We develop a novel sequential algorithm to accurately detect community structures in biological protein-protein interaction networks, where a community corresponds to a functional module of proteins. Generally, such sequential algorithms are computationally expensive, which makes them impractical for large real-world networks. To address this limitation, we develop a new, highly scalable Symmetric Multiprocessing (SMP)-based parallel algorithm to detect high-quality communities in large subsections of social networks like Facebook and Amazon. Due to the SMP architecture, however, our algorithm cannot process networks whose size exceeds the RAM of a single machine. With the increasing size of social networks, community detection has become even more difficult, since network size can reach hundreds of millions of vertices and edges; processing such massive networks requires several hundred gigabytes of RAM, which is only possible with a distributed infrastructure. To address this, we develop a novel hybrid (shared + distributed memory) parallel algorithm to efficiently detect high-quality communities in massive Twitter and .uk domain networks. The second group of problems deals with efficiently processing spatial Light Detection and Ranging (LiDAR) data. LiDAR data is widely used in forest and agricultural crop studies, landscape classification, 3D urban modeling, and more. Technological advancements in LiDAR sensors have enabled highly accurate and dense LiDAR point clouds, resulting in massive data volumes that pose computing issues for processing and storage. We develop the first published landscape-driven data reduction algorithm, which uses the slope-map of the terrain as a filter to reduce the data without sacrificing its accuracy. Our algorithm is highly scalable and adopts a shared-memory parallel architecture. We also develop a parallel interpolation technique used to generate highly accurate continuous terrains, i.e., Digital Elevation Models (DEMs), from discrete LiDAR point clouds.
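The community-detection algorithms developed in the thesis are more elaborate, but the flavor of purely structure-based detection is captured by label propagation, in which every vertex repeatedly adopts the label most common among its neighbors until labels stabilize. A minimal sequential sketch (the parallel variants would partition the vertex set across threads or machines):

```python
import random
from collections import Counter

def label_propagation(adj, max_iters=100, seed=0):
    """Community detection by label propagation on an adjacency dict.

    Each vertex starts in its own community and repeatedly adopts the
    label held by the majority of its neighbors; densely connected
    regions converge to a shared label.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(max_iters):
        rng.shuffle(nodes)                   # random visit order avoids oscillation
        changed = False
        for v in nodes:
            if not adj[v]:
                continue
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts, key=counts.get)
            if labels[v] != best:
                labels[v] = best
                changed = True
        if not changed:                      # converged
            break
    return labels

# Toy graph: two triangles joined by a single bridge edge.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(label_propagation(adj))
```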
19

CrowdCloud: Combining Crowdsourcing with Cloud Computing for SLO Driven Big Data Analysis

Flatt, Taylor 01 December 2017 (has links)
The evolution of structured data from simple rows and columns on a spreadsheet to more complex unstructured data such as tweets, videos, and voice has resulted in a need for more adaptive analytical platforms. It is estimated that upwards of 80% of data on the Internet today is unstructured, and crowdsourcing platforms drastically need to perform better in the wake of this tsunami of data. We investigated the employment of a monitoring service that allows the system to take corrective action when results are trending away from meeting the accuracy, budget, and time SLOs (service-level objectives). Initial implementation and system validation have shown that taking corrective action generally leads to a better success rate in reaching the SLOs. A system that can dynamically adjust internal parameters in order to perform better can lead to more harmonious interactions between humans and machine algorithms and to more efficient use of resources.
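The corrective-action idea — monitor progress against accuracy, budget, and time SLOs and adjust internal parameters when a target is trending out of reach — is essentially a feedback control loop. A minimal sketch; the knobs and thresholds here are hypothetical, not the system's actual parameters:

```python
from dataclasses import dataclass

@dataclass
class SLO:
    min_accuracy: float   # e.g. fraction of tasks with agreeing worker answers
    max_budget: float     # dollars
    max_seconds: float

def corrective_action(slo, accuracy, spent, elapsed, pay_per_task, redundancy):
    """Return possibly-updated (pay_per_task, redundancy) given current progress.

    The specific adjustments are illustrative: more redundancy improves
    agreement, less redundancy reins in cost, higher pay speeds recruitment.
    """
    if accuracy < slo.min_accuracy:
        redundancy += 1                          # more workers per task
    if spent > 0.8 * slo.max_budget:
        redundancy = max(1, redundancy - 1)      # budget trending toward violation
    if elapsed > 0.8 * slo.max_seconds:
        pay_per_task *= 1.2                      # deadline trending toward violation
    return pay_per_task, redundancy

slo = SLO(min_accuracy=0.9, max_budget=50.0, max_seconds=3600)
print(corrective_action(slo, accuracy=0.84, spent=12.0, elapsed=400,
                        pay_per_task=0.05, redundancy=3))
```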
20

Evaluation of Storage Systems for Big Data Analytics

January 2017 (has links)
Recent trends in big data storage systems show a shift from disk-centric models to memory-centric models. The primary challenges faced by these systems are speed, scalability, and fault tolerance, and it is interesting to investigate the performance of the two models with respect to big data applications. This thesis studies the performance of Ceph (a disk-centric model) and Alluxio (a memory-centric model) and evaluates whether a hybrid model provides any performance benefits for big data applications. To this end, an application, TechTalk, was created that uses Ceph to store data and Alluxio to perform data analytics. The functionalities of the application include offline lecture storage, live recording of classes, content analysis, and reference generation. The knowledge base of videos is constructed by analyzing the offline data using machine learning techniques; this training dataset provides the knowledge to construct the index of an online stream, and the indexed metadata enables students to search, view, and access the relevant content. The performance of the application is benchmarked in different use cases to demonstrate the benefits of the hybrid model. / Dissertation/Thesis / Masters Thesis Computer Science 2017
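A storage-agnostic way to reproduce the kind of benchmarking described — the same workload timed against a disk-centric store, a memory-centric store, and the hybrid — is a small harness that takes the read path as a callable. The reader functions below are placeholders standing in for real client calls to Ceph (e.g. its RADOS interface) and Alluxio:

```python
import time
import statistics

def benchmark(read_fn, keys, repeats=5):
    """Median wall-clock time to read every key through the given backend."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        for key in keys:
            read_fn(key)                       # backend-specific read, injected
        times.append(time.perf_counter() - start)
    return statistics.median(times)

# Placeholder backends over an in-memory dict; swap in real client calls.
store = {f"video-{i}": b"x" * 1024 for i in range(1000)}
def read_ceph(key): return store[key]          # stands in for a disk-centric read
def read_alluxio(key): return store[key]       # stands in for a memory-centric read

keys = list(store)
for name, fn in [("ceph", read_ceph), ("alluxio", read_alluxio)]:
    print(name, f"{benchmark(fn, keys):.4f}s")
```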
