About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

SYSTEMS SUPPORT FOR DATA ANALYTICS BY EXPLOITING MODERN HARDWARE

Hongyu Miao (11751590) 03 December 2021
A large volume of data is continuously being generated by data centers, humans, and the Internet of Things (IoT). To yield useful insights, such enormous data must be processed in time with high throughput, low latency, and high accuracy. To meet these performance demands, vendors are shipping a large body of new hardware, such as multi-core CPUs, 3D-stacked memory, embedded microcontrollers, and other accelerators.

However, traditional operating systems (OSes) and data analytics frameworks, the key layers that bridge high-level data processing applications and low-level hardware, fail to meet these requirements because of rapidly evolving hardware and explosive data growth. For instance, general-purpose OSes are not aware of the unique characteristics and demands of data processing applications. Data analytics engines for stream processing, e.g., Apache Spark and Beam, always add more machines to deal with more data but leave every single machine underutilized, without fully exploiting the underlying hardware features, which leads to poor efficiency. Data analytics frameworks for machine learning inference on IoT devices cannot run neural networks that exceed SRAM size, which disqualifies many important use cases.

To bridge the gap between the performance demands of data analytics and the features of emerging hardware, this thesis explores runtime system designs that support high-level data processing applications by exploiting low-level modern hardware features. We study two important data analytics applications, real-time stream processing and on-device machine learning inference, on three important hardware platforms across the Cloud and the Edge: multicore CPUs, a hybrid memory system combining 3D-stacked memory and general DRAM, and embedded microcontrollers with limited resources.

To speed up and enable the two data analytics applications on the three hardware platforms, this thesis contributes three related research projects. In project StreamBox, we exploit the parallelism and memory hierarchy of modern multicore hardware on single machines for stream processing, achieving scalable and highly efficient performance. In project StreamBox-HBM, we exploit hybrid memories to balance bandwidth and latency, achieving memory scalability and highly efficient performance. StreamBox and StreamBox-HBM both offer orders-of-magnitude performance improvements over the prior state of the art, opening up new applications with higher data processing needs. In project SwapNN, we investigate a system solution that lets microcontrollers (MCUs) execute neural network (NN) inference out of core without losing accuracy, enabling new use cases and significantly expanding the scope of NN inference on tiny MCUs.

We report the system designs, system implementations, and experimental results. Based on our experience building the above systems, we provide general guidance on designing runtime systems across the hardware/software stack for a wider range of new applications on future hardware platforms.
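The out-of-core inference idea this abstract describes can be sketched as follows. This is a hedged illustration, not the SwapNN implementation: layer weights live in simulated external storage (a dict standing in for flash), and each layer's weights are loaded only while that layer executes, so peak RAM is bounded by the largest layer rather than the whole network. The network, values, and helper names are invented for the example.

```python
# Illustrative sketch of out-of-core, layer-by-layer NN inference.
# Weights are "swapped in" from external storage one layer at a time.

def load_layer(storage, idx):
    """Simulate swapping one layer's weights in from external storage."""
    return storage[idx]  # on a real MCU this would be a flash read

def relu(v):
    return [x if x > 0.0 else 0.0 for x in v]

def infer(storage, n_layers, x):
    for idx in range(n_layers):
        W, b = load_layer(storage, idx)  # swap in this layer only
        x = relu([sum(w * xi for w, xi in zip(row, x)) + bi
                  for row, bi in zip(W, b)])
        # W and b go out of scope here: the layer is "swapped out"
    return x

# Toy 2-layer network kept "out of core" (made-up weights)
storage = {
    0: ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    1: ([[1.0, 1.0]], [0.1]),
}
out = infer(storage, 2, [2.0, 1.0])
```

At no point does more than one layer's worth of weights reside in working memory, which is the property that lets networks larger than SRAM run at all.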

HRD Professionals' Experience Utilizing Data Analytics in the Training Evaluation Process

Anthony E Randolph (11831450) 18 December 2021
In the past, Human Resource Development (HRD) professionals have faced barriers to gaining access to the data they need to conduct higher-level evaluations. However, recent technological innovations have presented opportunities for them to obtain this data and, consequently, to apply new approaches to the training evaluation process. One such approach is the application of data analytics. Because organizations have only recently begun to embrace its use, research in the literature has focused on the promotion of analytics rather than its practical application in the organization. This study investigated how HRD professionals utilize data analytics in the training evaluation process. It contributes to the body of research on the practical application of analytics in determining training effectiveness. The Unified Theory of Acceptance and Use of Technology (UTAUT) and sociomateriality served as the theoretical framework for understanding how HRD professionals use data analytics in the training evaluation process. To address the research objective, a qualitative descriptive design was employed to investigate the lived experience of how HRD professionals use data analytics in the training evaluation process. Data were collected through semi-structured interviews with six participants who were front and center in their organization's transition to the analytics tool Metrics That Matter (MTM) for evaluating training initiatives. A thematic analysis approach was applied. The study findings suggest three factors that influenced HRD professionals to use human resource analytics, while revealing four ways they used those analytics in the training evaluation process. More importantly, the findings provide training departments and HRD professionals with recommendations for expanded job role and function descriptions, as well as best practices for incorporating data analytics in the training evaluation process.

Turbine Generator Performance Dashboard for Predictive Maintenance Strategies

Emily R Rada (11813852) 19 December 2021
Equipment health is the root of productivity and profitability in a company; through the use of machine learning and advancements in computing power, a maintenance strategy known as predictive maintenance (PdM) has emerged. The predictive maintenance approach utilizes performance and condition data to forecast necessary machine repairs. Predicting maintenance needs reduces the likelihood of operational errors, aids in the avoidance of production failures, and allows for preplanned outages. The PdM strategy is based on machine-specific data, which proves to be a valuable tool: the data provides quantitative proof of operation patterns and production while offering machine health insights that might otherwise go unnoticed.

Purdue University's Wade Utility Plant is responsible for providing reliable utility services for the campus community. The plant has invested in an equipment monitoring system for a thirty-megawatt turbine generator. The monitoring system records operational and performance data as the turbine generator supplies the campus with electricity and high-pressure steam. Unplanned and surprise maintenance needs in the turbine generator hinder utility production and lessen the dependability of the system.

This study leverages the turbine generator data the Wade Utility Plant records and stores to justify equipment care and provide early error detection at an in-house level. The research collects and aggregates operational, monitoring, and performance-based data for the turbine generator in Microsoft Excel, creating a dashboard that visually displays and statistically monitors variables for discrepancies. The dashboard records 90 days of data, tracked hourly, computing averages and extrema and alerting the user as data approaches recommended warning levels. Microsoft Excel offers a low-cost and accessible platform for data collection and analysis, providing an adaptable and comprehensible collection of data from a turbine generator. The dashboard offers visual trends, simple statistics, and status updates using 90 days of user-selected data, and supports forecasting maintenance needs, planning work outages, and adjusting operations while continuing to provide reliable services that meet Purdue University's utility demands.
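The dashboard's core statistics can be sketched in a few lines. This is a Python analogue of the Excel logic described above, not the thesis artifact; the readings, the warning level, and the 95% alert margin are invented for illustration.

```python
# Minimal sketch of dashboard statistics: average, extrema, and an alert
# list for readings approaching a recommended warning level.

def summarize(readings, warn_level, margin=0.95):
    """Summarize a window of hourly readings against a warning threshold."""
    avg = sum(readings) / len(readings)
    # flag indices whose reading is within `margin` of the warning level
    alerts = [i for i, r in enumerate(readings) if r >= margin * warn_level]
    return {"avg": avg, "min": min(readings),
            "max": max(readings), "alerts": alerts}

# e.g. four hourly turbine readings (made-up values), warning level 100
stats = summarize([80.0, 82.0, 97.0, 85.0], warn_level=100.0)
```

In the real dashboard the window would hold 90 days of hourly rows, but the aggregation and threshold check are the same shape.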

Data-Driven Decision Support Systems for Product Development - A Data Exploration Study Using Machine Learning

Aeddula, Omsri January 2021
Modern product development is a complex chain of events and decisions. The ongoing digital transformation of society and increasing demand for innovative solutions put pressure on organizations to maintain or increase competitiveness. As a consequence, a major challenge in product development is the search for information, its analysis, and the building of knowledge. This is even more challenging when the design element comprises a complex structural hierarchy and limited data generation capabilities, and more pronounced still in the conceptual stage of product development, where information is scarce, vague, and potentially conflicting. The ability to explore useful high-level information using machine learning in the conceptual design stage would hence be of importance in supporting design decision-makers, since the decisions made at this stage impact the success of the overall product development process. The thesis investigates the conceptual stage of product development, proposing methods and tools to support the decision-making process through the building of data-driven decision support systems. The study highlights how data can be utilized and visualized to extract useful information in design exploration studies at the conceptual stage of product development. The ability to build data-driven decision support systems in the early phases facilitates more informed decisions. The thesis presents initial descriptive findings from the empirical studies, showing the capabilities of machine learning approaches in extracting useful information and building data-driven decision support systems. It first describes how linear regression models and artificial neural networks extract useful information in design exploration, providing support for decision-makers to understand the consequences of design choices through cause-and-effect relationships at a detailed level. Furthermore, the presented approach provides input to a novel visualization construct intended to enhance comprehensibility within cross-functional design teams. The thesis further studies how data can be augmented and analyzed to extract the necessary information from an existing design element to support the decision-making process in an oral healthcare context.
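The linear-regression side of the design exploration described above can be sketched as follows. This is a hedged illustration, not the thesis implementation: a closed-form least-squares fit whose slope is the cause-and-effect coefficient a decision-maker would read; the design variable, response, and sample values are invented.

```python
# Sketch: fit y = slope * x + intercept by ordinary least squares so the
# slope exposes the effect of one design variable on one response.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x), both around the means
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx  # (slope, intercept)

# e.g. wall thickness vs. measured stiffness (made-up design samples)
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [2.1, 4.0, 6.1, 7.9])
```

Here the slope answers the decision-maker's question directly: each unit of the design variable changes the response by roughly that amount, under the linearity assumption.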

Methods in intelligent transportation systems exploiting vehicle connectivity, autonomy and roadway data

Zhang, Yue 29 September 2019
Intelligent transportation systems involve a variety of information and control systems methodologies, from cooperative systems that aim at traffic flow optimization by means of vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication, to information fusion from multiple traffic sensing modalities. This thesis addresses three problems in intelligent transportation systems: optimal control of connected automated vehicles, discrete-event and hybrid traffic simulation modeling, and sensing and classifying roadway obstacles in smart cities. The first problem relates to optimally controlling connected automated vehicles (CAVs) crossing an urban intersection without any explicit traffic signaling. A decentralized optimal control framework is established whereby, under proper coordination among CAVs, each CAV can jointly minimize its energy consumption and travel time subject to hard safety constraints. A closed-form analytical solution is derived that takes speed, control, and safety constraints into consideration. The analytical solution of each such problem, when it exists, yields the optimal CAV acceleration/deceleration. The framework is capable of accommodating turns and ensures the absence of collisions, and a measure of passenger comfort is taken into account while vehicles make turns. In addition to the first-in-first-out (FIFO) ordering structure, the concept of dynamic resequencing is introduced, which aims at further increasing traffic throughput. The thesis also studies the impact of CAVs and shows the benefit that can be achieved by incorporating CAVs into conventional traffic. To validate the effectiveness of the proposed solution, a discrete-event and hybrid simulation framework based on SimEvents is proposed, which facilitates safety and performance evaluation of an intelligent transportation system. The traffic simulation model enables traffic studies at the microscopic level, including new control algorithms for CAVs under different traffic scenarios, the event-driven aspects of transportation systems, and the effects of communication delays. The framework spans multiple toolboxes, including MATLAB, Simulink, and SimEvents. In another direction, an unsupervised anomaly detection system is developed based on data collected through the Street Bump smartphone application. The system, built on signal processing techniques and the concept of information entropy, generates a prioritized list of roadway obstacles such that the higher-ranked entries are most likely to be actionable bumps (e.g., potholes) requiring immediate attention, while those ranked lower are most likely to be non-actionable bumps (e.g., flat castings, cobblestone streets, speed bumps) for which no immediate action is needed. This system enables the city to efficiently prioritize repairs. Results on an actual data set provided by the City of Boston illustrate the feasibility and effectiveness of the system in practice.
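The entropy-based ranking idea in the third problem can be sketched as follows. This is an invented illustration, not the Street Bump system: each candidate bump signal is reduced to a histogram, scored by Shannon entropy, and signals are ranked on the assumption that irregular, information-rich signals are more likely to be actionable anomalies. The signals, bin count, and names are made up.

```python
import math

# Sketch: rank bump signals by the Shannon entropy of a normalized
# magnitude histogram; higher entropy -> more likely actionable.

def entropy(signal, bins=4):
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins or 1.0   # guard constant signals
    counts = [0] * bins
    for s in signal:
        counts[min(int((s - lo) / width), bins - 1)] += 1
    probs = [c / len(signal) for c in counts if c]
    return -sum(p * math.log2(p) for p in probs)

def rank_bumps(signals):
    """Return signal indices, highest-entropy (most anomalous) first."""
    return sorted(range(len(signals)), key=lambda i: -entropy(signals[i]))

flat = [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0, 0.1]    # e.g. cobblestone
spiky = [0.0, 0.1, 2.0, 0.3, 1.5, 0.05, 0.8, 0.2]  # e.g. pothole hit
order = rank_bumps([flat, spiky])
```

The regular cobblestone-like signal concentrates into few histogram bins and scores low, while the irregular pothole-like signal spreads across bins and rises to the top of the prioritized list.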

Data Analytics Maturity Model for Financial Sector Companies

Perales Manrique, Jonathan Hernán, Molina Chirinos, Jorge Alonso 02 March 2020
Data analytics allows organizations in the financial sector to gain a competitive advantage through processes aimed at obtaining data, processing it, and displaying it as valuable information, both to understand the behavior of their clients and to be prepared against risks such as money laundering and credit fraud. However, organizations cannot easily identify the gaps related to personnel, information systems, and business processes that hinder the improvement of their data analytics environment. In this context, maturity models evaluate, based on defined criteria, the current state of an organization and identify its maturity level in order to improve based on the findings. In this work, a maturity model is proposed to identify gaps in the analytics environments of financial companies and guide their reduction. The model includes artifacts and evaluation criteria focused on technology, governance, data management, culture, and analytics itself, which provides a broader and more structured diagnostic process for the analytics environment. The proposed model was tested in three companies of the Peruvian financial sector, and the results suggest that the specialists gained a clearer perspective on the state of their companies' analytics environments than they initially held.
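A dimension-based maturity assessment of the kind described above can be sketched in a few lines. This is an invented illustration, not the thesis's actual instrument: it assumes a 1-5 level per dimension and averages across the model's five focus areas; all scores are made up.

```python
# Sketch of a maturity assessment: average assessed levels (1-5) across
# the model's five evaluation dimensions into an overall maturity score.

DIMENSIONS = ["technology", "governance", "data management",
              "culture", "analytics"]

def maturity_level(scores):
    """Overall maturity as the mean of the per-dimension levels."""
    assert set(scores) == set(DIMENSIONS), "one score per dimension"
    return sum(scores.values()) / len(scores)

# e.g. one company's hypothetical assessment
level = maturity_level({"technology": 3, "governance": 2,
                        "data management": 3, "culture": 2, "analytics": 4})
```

A per-dimension breakdown, rather than the single average, is what exposes the gaps (here, governance and culture lag the rest).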

Big Data Analytics of City Wide Building Energy Declarations

Ma, Yixiao January 2015
This thesis explores the building energy performance of the domestic sector in the city of Stockholm based on the building energy declaration database. The aims of this master thesis are to analyze the big data sets of around 20,000 buildings in the Stockholm region and to explore the correlation between building energy performance and different internal and external factors affecting building energy consumption, such as building energy systems and building vintages. By using clustering methods, buildings with different energy consumption levels can be easily identified. Thereafter, energy saving potential is estimated by setting step-by-step targets, while feasible energy saving solutions can also be proposed in order to drive building energy performance at the city level. A brief introduction to several key concepts (energy consumption in buildings, building energy declarations, and big data) serves as the background information, which helps to clarify the necessity of conducting this master thesis. The methods used in this thesis include data processing, descriptive analysis, regression analysis, clustering analysis, and energy saving potential analysis. The provided building energy declaration data is first processed in MS Excel and then reorganized in MS Access. For the data analysis process, IBM SPSS is further introduced for descriptive analysis and graphical representation. By defining different energy performance indicators, the descriptive analysis presents the energy consumption and its composition for different building classifications. The results also give the application details of different ventilation systems in different building types. Thereafter, the correlation between building energy performance and five different independent variables is analyzed using a linear regression model. Clustering analysis is further performed on the studied buildings for the purpose of targeting low-energy-efficiency groups, and buildings with various energy consumption levels are well identified and grouped based on their energy performance. This shows that clustering is quite useful in big data analysis; however, some parameters in the clustering process need further adjustment in order to achieve more satisfactory results. Energy saving potential for the studied buildings is calculated as well. The conclusion shows that the maximal potential for energy savings in the studied buildings is estimated at 43% (2.35 TWh) for residential buildings and 54% (1.68 TWh) for non-residential premises; the saving potential is calculated for different building categories and different clusters as well.
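The clustering step described above can be sketched with a one-dimensional k-means over specific energy use. This is a hedged illustration, not the thesis pipeline (which used SPSS on the full declaration data); the kWh/m2 values and initial centers are invented.

```python
# Sketch: 1-D k-means grouping buildings by specific energy use so that
# low-efficiency groups can be targeted for savings measures.

def kmeans_1d(values, centers, iters=20):
    groups = [[] for _ in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        # assign each building to the nearest cluster center
        for v in values:
            groups[min(range(len(centers)),
                       key=lambda i: abs(v - centers[i]))].append(v)
        # move each center to the mean of its assigned buildings
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups

# e.g. specific energy use per building, kWh/m2 (made-up values)
energy_use = [85, 90, 95, 150, 160, 155, 240, 250]
centers, groups = kmeans_1d(energy_use, centers=[80.0, 150.0, 260.0])
```

The highest-center cluster is the low-efficiency group; setting a step-by-step target then means, for example, moving that cluster's buildings toward the next center down.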

A Hybrid Infrastructure of Enterprise Architecture and Business Intelligence & Analytics to Empower Knowledge Management in Education

Moscoso-Zea, Oswaldo 09 May 2019
The large volumes of data (big data) generated on a global scale and within organizations, along with the knowledge that resides in people and in business processes, make organizational knowledge management (KM) very complex. Done right, KM can be a source of opportunities and competitive advantage for organizations that use their data intelligently and subsequently generate knowledge from it. Two of the fields that support KM and that have grown rapidly in recent years are business intelligence (BI) and enterprise architecture (EA). On the one hand, BI allows taking advantage of the information stored in data warehouses using operations such as slice, dice, roll-up, and drill-down; this information is obtained from operational databases through an extraction, transformation, and loading (ETL) process. On the other hand, EA allows institutions to establish methods that support the creation, sharing, and transfer of knowledge that resides in people and processes through the use of blueprints and models. One of the objectives of KM is to create a culture where tacit knowledge (knowledge that resides in a person) stays in the organization when qualified and expert personnel leave the institution, or when changes are required in the organizational structure, in computer applications, or in the technological infrastructure. In higher education institutions (HEIs), not having an adequate KM approach to handle data is an even greater problem due to the nature of this industry. Generally, HEIs have very little interdependence between departments and faculties; in other words, there is low standardization, redundancy of information, and constant duplication of applications and functionality across departments, which results in inefficient organizations.

That is why the research performed within this dissertation has focused on finding an adequate KM method and identifying the right technological infrastructure to support the management of information across all the knowledge dimensions: people, processes, and technology. All of this has the objective of discovering innovative mechanisms to improve education and the service that HEIs offer to their students and teachers by improving their processes. Despite the existence of some initiatives and papers on KM frameworks, we were not able to find a standard framework that supports or guides KM initiatives. In addition, the KM frameworks found in the literature do not present practical mechanisms to gather and analyze all the knowledge dimensions in a way that facilitates the implementation of KM projects. The core contribution of this thesis is a hybrid infrastructure for KM based on EA and BI, developed through research using an empirical approach and taking as reference the framework developed for KM. The proposed infrastructure will help HEIs improve education in a general way by analyzing reliable, cleaned data and integrating analytics from the perspective of EA. EA analytics takes into account the interdependence between the objects that make up the organization: people, processes, applications, and technology. The presented infrastructure opens the door to research projects that expand the types of knowledge generated, by integrating the information in the applications found in data warehouses with the information about people and organizational processes found in EA repositories. To validate the proposal, a case study was carried out within a university, with promising initial results. As future work, it is planned that different HEI activities can be automated through a software development methodology based on EA models. In addition, it is desired to develop a KM system that allows the generation of new types of analytics, which would be impossible to obtain with only transactional or multidimensional databases.
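Two of the OLAP operations mentioned above, slice and roll-up, can be sketched on a toy fact table. This is a hedged illustration of the concepts, not the thesis's data warehouse; the table, column names, and values are invented for an HEI-flavored example.

```python
# Sketch: slice and roll-up over a tiny fact table represented as dicts.

facts = [
    {"faculty": "Engineering", "year": 2018, "enrolled": 120},
    {"faculty": "Engineering", "year": 2019, "enrolled": 140},
    {"faculty": "Medicine",    "year": 2018, "enrolled": 80},
    {"faculty": "Medicine",    "year": 2019, "enrolled": 90},
]

def slice_cube(rows, **fixed):
    """Slice: fix one or more dimensions to single values."""
    return [r for r in rows
            if all(r[k] == v for k, v in fixed.items())]

def roll_up(rows, dim, measure):
    """Roll-up: aggregate the measure along one remaining dimension."""
    totals = {}
    for r in rows:
        totals[r[dim]] = totals.get(r[dim], 0) + r[measure]
    return totals

eng_2019 = slice_cube(facts, faculty="Engineering", year=2019)
by_faculty = roll_up(facts, "faculty", "enrolled")
```

Dice and drill-down are the same moves in reverse: dice fixes ranges over several dimensions at once, and drill-down re-expands an aggregate back to finer-grained rows.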

Marketing Research in the 21st Century: Opportunities and Challenges

Hair, Joe F., Harrison, Dana E., Risher, Jeffrey J. 01 October 2018
The role of marketing is evolving rapidly, and the design and analysis methods used by marketing researchers are changing with it. These changes are emerging from transformations in management skills, technological innovations, and continuously evolving customer behavior. But perhaps the most substantial driver of these changes is the emergence of big data and the analytical methods used to examine and understand the data. To remain relevant, marketing research must stay as dynamic as the markets themselves and adapt accordingly to the following: data will continue increasing exponentially; data quality will improve; analytics will be more powerful, easier to use, and more widely used; management and customer decisions will increasingly be knowledge-based; privacy issues and challenges will be both a problem and an opportunity as organizations develop their analytics skills; data analytics will become firmly established as a competitive advantage, both in the marketing research industry and in academia; and for the foreseeable future, the demand for highly trained data scientists will exceed the supply.

Finding co-workers with similar competencies through data clustering

Skoglund, Oskar January 2022
In this thesis, data clustering techniques are applied to a competence database from the company Combitech. The goal of the clustering is to connect co-workers with similar competencies and competence areas in order to enable more skill sharing. This is accomplished by implementing and evaluating three clustering algorithms: k-modes, DBSCAN, and ROCK. The algorithms are fine-tuned using three internal validity indices: the Dunn, Silhouette, and Davies-Bouldin scores. Finally, a questionnaire about the resulting clusterings was sent to the co-workers on whose data the clustering is based, in order to obtain external validation by calculating clustering accuracy. The internal validity indices show that ROCK and DBSCAN create the most separated and dense clusters. The questionnaire results show that ROCK is the most accurate of the three algorithms, with an accuracy of 94%, followed by k-modes at 58% and DBSCAN at 40%. However, visualization of the clusters shows that both ROCK and DBSCAN create one very large cluster, which is not desirable. This was not the case for k-modes, where the clusters are more evenly sized while still being fairly well separated. In general, the results show that it is possible to use data clustering techniques to connect people with similar competencies and that the predicted clusters agree fairly well with the gold-standard data from the co-workers. However, the results depend strongly on the choice of algorithm and parameter values, which therefore have to be made carefully.
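The categorical dissimilarity underlying k-modes, which this thesis applies to competence profiles, can be sketched as follows. This is a hedged illustration of the distance and assignment step only, not the thesis code; the profiles, skills, and mode values are invented.

```python
# Sketch of the k-modes building blocks: simple matching distance on
# categorical attributes, plus assignment to the nearest cluster mode.

def matching_distance(a, b):
    """Number of attributes on which two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))

def assign(profile, modes):
    """Assign a profile to the cluster whose mode it matches best."""
    return min(range(len(modes)),
               key=lambda i: matching_distance(profile, modes[i]))

# Toy competence profiles: (primary skill, secondary skill, domain)
modes = [("python", "ml", "cloud"), ("c", "embedded", "rtos")]
alice = ("python", "ml", "embedded")
bob   = ("c", "embedded", "cloud")

cluster_alice = assign(alice, modes)  # distance 1 vs 3 -> cluster 0
cluster_bob   = assign(bob, modes)    # distance 2 vs 1 -> cluster 1
```

A full k-modes run iterates this assignment with a mode-update step (taking the most frequent value per attribute in each cluster), analogous to the centroid update in k-means.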
