1.
iLORE: A Data Schema for Aggregating Disparate Sources of Computer System and Benchmark Information. Hardy, Nicolas Randell. 08 June 2021.

The era of modern computing has been the stage for numerous innovations that have led to cutting-edge applications and systems. The characteristics of these systems and applications have been described and quantified by many; however, such information is fragmented across various repositories of system and component information. In an effort to collate these disparate collections of information, we propose iLORE, an extensible data framework for representing computer systems and their components. We describe the iLORE framework and the pipeline used to aggregate, clean, and insert system and component information into a database that uses iLORE's framework. Additionally, we demonstrate how the database can be used to analyze trends in computing by validating the collected data against previous works, and by showcasing new analyses created with said data. Analyses and visualizations created via iLORE are available at csgenome.org.

Master of Science

The era of modern computing has been the stage for numerous innovations that have led to cutting-edge applications and computer systems. The characteristics of these systems and applications have been described and quantified by many; however, such information is fragmented among different websites and databases. We propose iLORE, an extensible data framework for representing computer systems and their components. We describe the iLORE framework and the steps taken to create an iLORE database: aggregation, standardization, and insertion. Additionally, we demonstrate how the database can be used to analyze trends in computing by validating the collected data against previous works, and by showcasing new analyses created with said data. Analyses and visualizations created via iLORE are available at csgenome.org.
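To make the data-framework idea concrete, here is a minimal sketch of a relational schema plus a cleaning step in the spirit of iLORE. Every table name, column, and alias mapping below is an illustrative assumption, not the published iLORE schema.

```python
# Illustrative sketch only: table and column names are assumptions,
# not the published iLORE schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE processor (
    id            INTEGER PRIMARY KEY,
    manufacturer  TEXT,
    model         TEXT,
    clock_mhz     REAL,
    core_count    INTEGER
);
CREATE TABLE system (
    id            INTEGER PRIMARY KEY,
    name          TEXT,
    release_date  TEXT
);
-- A system may pair many components, and components recur across
-- systems, so the link is a separate table (extensible to memory,
-- accelerators, etc. by adding parallel component tables).
CREATE TABLE system_processor (
    system_id     INTEGER REFERENCES system(id),
    processor_id  INTEGER REFERENCES processor(id),
    count         INTEGER
);
""")

# Cleaning step of the aggregation pipeline: normalize vendor strings
# that differ between source repositories before insertion.
def clean_manufacturer(raw: str) -> str:
    aliases = {"intel corp.": "Intel", "intel": "Intel",
               "amd inc.": "AMD", "amd": "AMD"}
    return aliases.get(raw.strip().lower(), raw.strip())

conn.execute(
    "INSERT INTO processor (manufacturer, model, clock_mhz, core_count) "
    "VALUES (?, ?, ?, ?)",
    (clean_manufacturer("intel corp."), "Xeon E5-2690", 2900.0, 8),
)
print(conn.execute("SELECT manufacturer, model FROM processor").fetchone())
```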
2.
A strategic approach of value identification for a big data project. Lakoju, Mike. January 2017.

The disruptive nature of innovations and technological advancements presents potentially huge benefits, but caution is critical because they also come with challenges. The author holds to the school of thought that every organisation or society should properly evaluate innovations and their attendant challenges from a strategic perspective before adopting them, or risk being blindsided by the after-effects. Big Data is one such innovation, currently trending in both industry and academia. Organisations are compelled to constantly find new ways to stay ahead of the competition, and for this reason some incoherencies exist in the field of Big Data: on the one hand, some organisations rush into implementing Big Data projects; on the other, in possibly equal measure, many organisations remain sceptical and uncertain of the benefits of Big Data in general and are concerned about implementation costs. This has created a strong focus on Big Data implementation. The literature reveals a good number of challenges around Big Data project implementations; for example, most Big Data projects are either abandoned or fall short of their expected targets. Most IS literature, however, has focused on implementation methodologies centred on the data, resources, Big Data infrastructures, algorithms, and so on. Rather than leaving this incoherent space as it stands, this research seeks to close it and open opportunities to harness and expand knowledge. Consequently, the research takes a slightly different standpoint by approaching Big Data implementation from a strategic perspective. The author emphasises that the focus should shift from going straight into implementing Big Data projects to first implementing a Big Data strategy for the organisation. Before implementation, this strategy step creates the value proposition and identifies deliverables to justify the project. To this end, the researcher combines alignment theory with digital business strategy theory to create a Big Data Strategy Framework that organisations can use to align their business strategy with the Big Data project. The framework was tested in two case studies, generating strategic Big Data goals for both; it aided each organisation in identifying the potential value that could be obtained from its Big Data project. These strategic Big Data goals can now be implemented in Big Data projects.
3.
Managing food security through food waste and loss: Small data to big data. Irani, Zahir; Sharif, Amir M.; Lee, Habin; Aktas, E.; Topaloğlu, Z.; van't Wout, T. 11 March 2017.

This paper provides a management perspective on the organisational factors that contribute to the reduction of food waste, applying design science principles to explore causal relationships between food distribution (organisational) and consumption (societal) factors. Qualitative data were collected, from an organisational perspective, from commercial food consumers along with large-scale food importers, distributors, and retailers. Cause-effect models are built and "what-if" simulations are conducted through the development and application of a Fuzzy Cognitive Map (FCM) approach to elucidate dynamic interrelationships. The simulation models developed provide practical insight into existing and emergent food-loss scenarios, suggesting the need for big data sets that allow generalizable findings to be extrapolated from a more detailed quantitative exercise. This research offers evidence to support policy makers in developing policies that facilitate interventions to reduce food losses. It also contributes to the literature on sustaining, impacting, and potentially improving levels of food security, underpinned by empirically constructed policy models that identify potential behavioural changes. The extension of these simulation models, set against the backdrop of a proposed big data framework for food security, opens avenues for future research into designing and constructing big data studies in food supply chains. This research has therefore sought to provide policymakers with a means to evaluate new and existing policies, whilst also offering a practical basis through which food chains can be made more resilient through the consideration of management practices and policy decisions.
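To illustrate the kind of "what-if" simulation an FCM supports, here is a minimal sketch; the concept names, edge weights, and squashing function are illustrative assumptions, not the paper's calibrated model.

```python
# Hedged sketch of a Fuzzy Cognitive Map "what-if" simulation.
# Concepts and edge weights here are illustrative assumptions only.
import numpy as np

concepts = ["cold-chain quality", "retail over-ordering",
            "food waste", "food security"]
# W[i, j]: causal influence of concept i on concept j, in [-1, 1].
W = np.array([
    [0.0, 0.0, -0.6,  0.3],   # better cold chain -> less waste, more security
    [0.0, 0.0,  0.7,  0.0],   # over-ordering -> more waste
    [0.0, 0.0,  0.0, -0.5],   # waste -> less security
    [0.0, 0.0,  0.0,  0.0],
])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate(state, steps=20):
    # Standard FCM update: a(t+1) = f(a(t) + a(t) @ W), iterated to a fixed point.
    for _ in range(steps):
        state = sigmoid(state + state @ W)
    return state

baseline = simulate(np.array([0.5, 0.5, 0.5, 0.5]))
what_if  = simulate(np.array([0.9, 0.2, 0.5, 0.5]))  # intervention scenario
for name, b, w in zip(concepts, baseline, what_if):
    print(f"{name:22s} baseline={b:.2f}  what-if={w:.2f}")
```

Comparing the converged activations of the baseline and intervention runs is what lets a policy maker read off the modelled effect of, say, improved cold-chain handling on food security.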
4.
Integrative Analysis of Genomic Aberrations in Cancer and Xenograft Models. January 2015.
No two cancers are alike. Cancer is a dynamic and heterogeneous disease; such heterogeneity arises among patients with the same cancer type, among cancer cells within the same individual's tumor, and even among cells within the same sub-clone over time. The recent application of next-generation sequencing and precision medicine techniques is the driving force behind uncovering the complexity of cancer and identifying best clinical practice. The core concept of precision medicine is to move away from crowd-based, best-for-most treatment and take individual variability into account when optimizing prevention and treatment strategies. Next-generation sequencing is the method used to sift through the entire 3 billion letters of each patient's DNA genetic code in a massively parallel fashion.

The deluge of next-generation sequencing data has shifted the bottleneck of cancer research from multiple "-omics" data collection to integrative analysis and data interpretation. In this dissertation, I attempt to address two distinct, but dependent, challenges. The first is to design specific computational algorithms and tools that can process and extract useful information from the raw data in an efficient, robust, and reproducible manner. The second is to develop high-level computational methods and data frameworks for integrating and interpreting these data. Specifically, Chapter 2 presents a tool called Snipea (SNv Integration, Prioritization, Ensemble, and Annotation) to further identify, prioritize, and annotate somatic SNVs (Single Nucleotide Variants) called by multiple variant callers. Chapter 3 describes a novel alignment-based algorithm to accurately and losslessly classify sequencing reads from xenograft models. Chapter 4 describes a direct and biologically motivated framework, and associated methods, for identifying putative aberrations causing survival differences in GBM patients by integrating whole-genome sequencing, exome sequencing, RNA-sequencing, methylation array, and clinical data. Lastly, Chapter 5 explores longitudinal and intratumor heterogeneity studies to reveal the temporal and spatial context of tumor evolution. The long-term goal is to help patients with cancer, particularly those who are in front of us today. Genome-based analysis of a patient's tumor can identify genomic alterations unique to that tumor that are candidate therapeutic targets to decrease therapy resistance and improve clinical outcome.

Doctoral Dissertation, Biomedical Informatics, 2015.
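As a rough illustration of the Chapter 3 setting, the sketch below classifies each read by comparing its best alignment scores against the graft (human) and host (mouse) references. The score margin, labels, and toy data are assumptions for illustration, not the dissertation's actual algorithm.

```python
# Hedged sketch of alignment-score-based read classification for
# xenograft samples (human tumor grown in a mouse host). Thresholds
# and labels are illustrative assumptions, not the dissertation's
# actual method.
def classify_read(human_score, mouse_score, margin=5):
    """Assign a read by comparing its best alignment score against
    the graft (human) and host (mouse) references."""
    if human_score is None and mouse_score is None:
        return "unmapped"
    if mouse_score is None or (human_score is not None and
                               human_score - mouse_score >= margin):
        return "graft"      # confidently human
    if human_score is None or mouse_score - human_score >= margin:
        return "host"       # confidently mouse
    return "ambiguous"      # retained with a label, not discarded

# "Lossless" here means every read receives a label and is retained,
# so downstream analyses can still account for ambiguous reads.
reads = [(60, 20), (18, 55), (40, 42), (None, None)]
print([classify_read(h, m) for h, m in reads])
# ['graft', 'host', 'ambiguous', 'unmapped']
```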
5.
APPLICATION OF BIG DATA ANALYTICS FRAMEWORK FOR ENHANCING CUSTOMER EXPERIENCE ON E-COMMERCE SHOPPING PORTALS. Nimita Shyamsunder Atal. 01 May 2020.
E-commerce organizations, these days, need to keep striving for constant innovation. Customers have a massive impact on the performance of an organization, so industries need solid customer retention strategies. Various big data analytics methodologies are being used by organizations to improve the overall online customer experience. While multiple techniques are available, this research study applied and tested a framework proposed by Laux et al. (2017), which combines Big Data and Six Sigma methodologies, in the e-commerce domain to identify issues faced by customers; this was done by analyzing customers' online product reviews and ratings to provide improvement strategies for enhancing customer experience.

Analysis performed on the data showed that approximately 90% of the customer reviews had positive polarity. Among the factors identified as affecting customers' opinions, the Rating field had the most impact on user sentiment and was found to be statistically significant. Upon further analysis of reviews with lower ratings, the results showed that the major issues faced by customers related to the product itself; most were more specifically about the size/fit of the product, followed by product quality, the material used, how the product looked on the online portal versus in reality, and its price relative to quality.
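A minimal sketch of the review-mining step, under the assumption of an off-the-shelf polarity scorer (TextBlob here); the library and field names are illustrative, not necessarily the study's actual toolchain.

```python
# Hedged sketch: score review polarity and pull out the low-rating
# reviews that get mined for issue categories. TextBlob and the field
# names are illustrative assumptions only.
from textblob import TextBlob

reviews = [
    {"rating": 5, "text": "Great quality, fits perfectly."},
    {"rating": 2, "text": "Looked nothing like the photos, runs small."},
    {"rating": 1, "text": "Cheap material, not worth the price."},
]

for r in reviews:
    # polarity is a float in [-1, 1]; > 0 counts as positive here
    r["polarity"] = TextBlob(r["text"]).sentiment.polarity

positive = sum(r["polarity"] > 0 for r in reviews)
print(f"{positive}/{len(reviews)} reviews have positive polarity")

# Low-rating reviews are the ones examined for recurring issues
# (size/fit, quality, material, appearance vs. listing, price).
low_rating = [r["text"] for r in reviews if r["rating"] <= 2]
print(low_rating)
```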
6.
Closing the Gaps in Professional Development: A Tool for School-based Leadership Teams. Sampayo, Sandra. 01 January 2015.
The field of professional learning in education has been studied extensively and expanded in the last few decades. Because learning in authentic contexts through professional dialogue has become so important, high-quality, school-based professional learning is vital to building capacity at the school level. Unfortunately, the literature on professional development (PD) does not provide much guidance on how to bridge theory and practice at the school level, creating a gap. With the goal of PD ultimately being to improve teacher performance and student learning, the problem with this gap is that school-level professional development is arbitrarily planned, resulting in variable outcomes. I propose that the reason for this is that schools lack a comprehensive framework or tool to guide the design of a quality professional learning plan. This problem was identified in Orange County Public Schools, and this dissertation in practice aims at developing a solution that accounts for the district's specific contextual needs. My proposed solution is the design of an integrative tool that school leaders can use to guide them through the professional development planning process. The School-based Professional Learning Design Tool incorporates the professional development standards in planning, learning, implementing, and evaluating outlined in the Florida Professional Development System Evaluation Protocol. It also guides leaders in taking an inventory of the culture and context of their school in order to plan PD that will be viable given those considerations. The components of the Tool guide teams through assessing teacher performance and student achievement data to help identify focus groups; determining gaps in learning through root cause analysis; creating goals aligned to gaps in performance; and selecting strategies for professional learning, follow-up support, and evaluation. The development of the Tool was informed by the extant literature on professional development, organizational theory, state and national standards for professional development, and principles of design. The Tool is to be completed in four phases. Phases one and two, the focus of this paper, include the literature review, organizational assessment, design specifications, and the first iteration of the Tool. In the next phases, the goals are to solicit feedback from an expert panel review, create a complete version of the Tool, and pilot it in elementary schools. Although development of the Tool through its final phases will refine it considerably, there are limitations that will transcend all iterations. While the Tool incorporates best practices in professional development, the lack of empirical evidence on the effectiveness of specific PD elements in the literature renders the Tool only a best guess at helping schools plan effective professional development. Another limitation is that the Tool is not prescriptive and cannot use school data to make decisions about which strategies to implement. Taking these limitations into consideration, the use of this Tool can significantly impact the quality and effectiveness of professional development in schools.
7.
EXPLOITING THE SPATIAL DIMENSION OF BIG DATA JOBS FOR EFFICIENT CLUSTER JOB SCHEDULING. Akshay Jajoo. 16 December 2020.
With the growing business impact of distributed big data analytics jobs, it has become crucial to optimize their execution and resource consumption. In most cases, such jobs consist of multiple sub-entities called tasks and are executed online in a large shared distributed computing system. The ability to accurately estimate runtime properties and coordinate the execution of a job's sub-entities allows a scheduler to schedule jobs efficiently. This thesis presents the first study that highlights the spatial dimension, an inherent property of distributed jobs, and underscores its importance in efficient cluster job scheduling. We develop two new classes of spatial-dimension-based algorithms to address the two primary challenges of cluster scheduling. First, we propose, validate, and design two complete systems that employ learning algorithms exploiting the spatial dimension. We demonstrate high similarity in runtime properties between sub-entities of the same job through detailed analysis of four different industrial cluster traces, and we identify design challenges and propose principles for a sampling-based learning system in two settings: a coflow scheduler and a cluster job scheduler. Second, we propose, design, and demonstrate the effectiveness of new multi-task scheduling algorithms based on effective synchronization across the spatial dimension. We underline, and validate by experimental analysis, the importance of synchronization between the sub-entities (flows, tasks) of a distributed entity (coflow, data analytics job) for its efficient execution, and we show that scheduling a sub-entity without considering its siblings can lead to sub-optimal overall cluster performance. We propose, design, and implement a full coflow scheduler based on these assertions.
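A minimal sketch of the sampling-based learning idea under a simple pilot-sampling assumption: because sibling tasks of the same job tend to have similar runtime properties, running a small sample of tasks first yields a usable runtime estimate for the rest. The numbers and estimator below are illustrative, not the thesis's actual system.

```python
# Hedged sketch of sampling-based runtime learning across a job's
# spatial dimension. The sample fraction and median estimator are
# illustrative assumptions only.
import random
import statistics

def estimate_job_runtime(task_runtimes, sample_frac=0.1, seed=0):
    """Run a small pilot subset of tasks; since sibling tasks of the
    same job have similar runtimes (the spatial dimension), the pilot
    median serves as an estimate for the remaining tasks."""
    rng = random.Random(seed)
    k = max(1, int(len(task_runtimes) * sample_frac))
    pilot = rng.sample(task_runtimes, k)
    return statistics.median(pilot)

# Toy job: 100 sibling tasks with similar but noisy runtimes (seconds).
rng = random.Random(42)
tasks = [10 + rng.gauss(0, 1) for _ in range(100)]
est = estimate_job_runtime(tasks)
print(f"estimated per-task runtime: {est:.1f}s, "
      f"true median: {statistics.median(tasks):.1f}s")

# A scheduler could then order jobs shortest-job-first using
# est * number_of_remaining_tasks as the estimated residual work.
```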