  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Advanced Data Analytics Methodologies for Anomaly Detection in Multivariate Time Series Vehicle Operating Data

Alizadeh, Morteza 06 August 2021 (has links)
Early detection of faults in vehicle operating systems is a research domain of high significance for sustaining full control of the systems, since anomalous behaviors usually cause performance loss for a long time before they are detected as critical failures. In other words, operating systems exhibit degradation when failure begins to occur. Indeed, repeated occurrences of failures in system performance are not only signals of anomalous behavior but also show that taking maintenance actions to preserve system performance is vital. Maintaining the systems at nominal performance over their lifetime with the lowest maintenance cost is extremely challenging, and it is important to be aware of imminent failure before it arises and to implement the best countermeasures to avoid extra losses. In this context, timely anomaly detection in the performance of the operating system is worthy of investigation. Early detection of imminent anomalous behaviors of the operating system is difficult without appropriate modeling, prediction, and analysis of the time series records of the system. Data-based technologies have laid a strong foundation for developing advanced methods for modeling and predicting time series data streams. In this research, we propose novel methodologies to predict the patterns of multivariate time series operational data of the vehicle and recognize the second-wise unhealthy states. These approaches help with the early detection of abnormalities in the behavior of the vehicle based on multiple data channels that provide second-wise records for different functional working groups in the operating systems of the vehicle. Furthermore, a real case study data set is used to validate the accuracy of the proposed prediction and anomaly detection methodologies.
42

Sample Size Determination for Subsampling in the Analysis of Big Data, Multiplicative models for confidence intervals and Free-Knot changepoint models

Sheng Zhang (18468615) 11 June 2024 (has links)
We studied the relationship between subsample size and the accuracy of the resulting estimates in a big data setting. We also proposed a novel approach to the construction of confidence intervals based on improved concentration inequalities. Lastly, we studied irregular change-point models using free-knot splines.
43

Online Denoising Solutions for Forecasting Applications

Khadivi, Pejman 08 September 2016 (has links)
Dealing with noisy time series is a crucial task in many data-driven real-time applications. Due to inaccuracies in data acquisition, time series suffer from noise and instability, which leads to inaccurate forecasting results. Therefore, in order to improve the performance of time series forecasting, an important pre-processing step is the denoising of data before performing any action. In this research, we propose various approaches to tackle noisy time series in forecasting applications. For this purpose, we use different machine learning methods and information theoretical approaches to develop online denoising algorithms. In this dissertation, we propose four categories of time series denoising methods that can be used in different situations, depending on the noise and time series properties. In the first category, a seasonal regression technique is proposed for the denoising of time series with seasonal behavior. In the second category, multiple discrete universal denoisers are developed that can be used for the online denoising of discrete-valued time series. In the third category, we develop a noisy channel reversal model based on the similarities between time series forecasting and data communication and use that model to deploy out-of-band noise filtering in forecasting applications. The last category of the proposed methods is deep-learning based denoisers. We use information theoretic concepts to analyze a general feed-forward deep neural network and to prove theoretical bounds on the behavior of deep neural networks. Furthermore, we propose a denoising deep neural network method for the online denoising of time series. Real-world and synthetic time series are used for numerical experiments and performance evaluations. Experimental results show that the proposed methods can efficiently denoise the time series and improve their quality. / Ph. D.
44

On the Use of Grouped Covariate Regression in Oversaturated Models

Loftus, Stephen Christopher 11 December 2015 (has links)
As data collection techniques improve, the number of covariates often exceeds the number of observations. When this happens, regression models become oversaturated and, thus, inestimable. Many classical and Bayesian techniques have been designed to combat this difficulty, each with its own means of addressing the oversaturation. However, these techniques can be tricky to implement well, difficult to interpret, and unstable. What is proposed is a technique that takes advantage of the natural clustering of variables that can often be found in biological and ecological datasets known as omics datasets. Generally speaking, omics datasets attempt to classify host species structure or function by characterizing a group of biological molecules, such as genes (Genomics), proteins (Proteomics), and metabolites (Metabolomics). By clustering the covariates and regressing on a single value for each cluster, the model becomes both estimable and stable. In addition, the technique can account for the variability within each cluster, allow for the inclusion of expert judgment, and provide a probability of inclusion for each cluster. / Ph. D.
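The idea of regressing on a single value per covariate cluster can be illustrated with a minimal sketch. It is not the author's method: it assumes the cluster labels are already known, uses the cluster mean as the single value, and fits ordinary least squares, so it leaves out the within-cluster variability, expert judgment, and inclusion probabilities the abstract mentions. The function name and the simulated data are hypothetical.

```python
import numpy as np

def grouped_covariate_fit(X, y, labels):
    """Regress y on one summary value (the mean) per covariate cluster."""
    labels = np.asarray(labels)
    groups = np.unique(labels)
    # One column per cluster: the mean of that cluster's covariates.
    Z = np.column_stack([X[:, labels == g].mean(axis=1) for g in groups])
    # Add an intercept column and solve ordinary least squares.
    design = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return groups, coef

# Simulated (hypothetical) data: 60 covariates but only 20 observations.
rng = np.random.default_rng(0)
n, p = 20, 60
labels = np.repeat([0, 1, 2], 20)      # assumed known: three clusters of 20
latent = rng.normal(size=(n, 3))       # one latent signal per cluster
X = latent[:, labels] + 0.1 * rng.normal(size=(n, p))
y = 2.0 * latent[:, 0] - 1.0 * latent[:, 2] + 0.1 * rng.normal(size=n)

groups, coef = grouped_covariate_fit(X, y, labels)
print(coef.shape)  # (4,): an intercept plus one coefficient per cluster
```

With 60 raw covariates and 20 observations the full model cannot be estimated, while the reduced design has only four columns, which is the sense in which grouping restores estimability.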
45

Efficient Spatio-Temporal Network Analytics in Epidemiological Studies using Distributed Databases

Khan, Mohammed Saquib Akmal 26 January 2015 (has links)
Real-time Spatio-Temporal Analytics has become an integral part of Epidemiological studies. The size of the spatio-temporal data has been increasing tremendously over the years, gradually evolving into Big Data. The processing in such domains is highly data and compute intensive. High performance computing resources are actively being used to handle such workloads over massive datasets. This confluence of High performance computing and datasets with Big Data characteristics poses great challenges pertaining to data handling and processing. The resource management of supercomputers is in conflict with the data-intensive nature of spatio-temporal analytics. This is further exacerbated by the fact that the data management is decoupled from the computing resources. Problems of this nature have provided great opportunities for the growth and development of tools and concepts centered around MapReduce based solutions. However, we believe that advanced relational concepts can still be employed to provide an effective solution to handle these issues and challenges. In this study, we explore distributed databases to efficiently handle spatio-temporal Big Data for epidemiological studies. We propose DiceX (Data Intensive Computational Epidemiology using supercomputers), which couples high-performance, Big Data and relational computing by embedding distributed data storage and processing engines within the supercomputer. It is characterized by scalable strategies for data ingestion, a unified framework to set up and configure various processing engines, and the ability to pause, materialize and restore images of a data session. In addition, we have successfully configured DiceX to support approximation algorithms from MADlib Analytics Library [54], primarily Count-Min Sketch or CM Sketch [33][34][35]. DiceX enables a new style of Big Data processing, which is centered around the use of clustered databases and exploits supercomputing resources. 
It can effectively exploit the cores, memory and compute nodes of supercomputers to scale processing of spatio-temporal queries on datasets of large volume. Thus, it provides a scalable and efficient tool for data management and processing of spatio-temporal data. Although DiceX has been designed for computational epidemiology, it can be easily extended to different data-intensive domains facing similar issues and challenges. We thank our external collaborators and members of the Network Dynamics and Simulation Science Laboratory (NDSSL) for their suggestions and comments. This work has been partially supported by DTRA CNIMS Contract HDTRA1-11-D-0016-0001, DTRA Validation Grant HDTRA1-11-1-0016, NSF - Network Science and Engineering Grant CNS-1011769, NIH and NIGMS - Models of Infectious Disease Agent Study Grant 5U01GM070694-11. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government. / Master of Science
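For readers unfamiliar with the Count-Min Sketch named in the abstract above, the following is a minimal standalone sketch of the data structure in pure Python. It is not MADlib's implementation or API, and it says nothing about how DiceX invokes it; the class name, default sizes, and region identifiers are hypothetical.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counter using a fixed-size table of counters."""

    def __init__(self, width=272, depth=5):
        # Standard sizing: width ~ e/epsilon, depth ~ ln(1/delta).
        # The defaults correspond to epsilon = 0.01 and delta = 0.01.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # A different salt per row gives one hash function per row.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Hash collisions can only inflate a counter, so the minimum
        # across rows is the tightest available upper bound.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )

cms = CountMinSketch()
for region in ["county-A", "county-B", "county-A", "county-A"]:
    cms.add(region)
print(cms.estimate("county-A"))  # never below the true count of 3
```

The estimate can exceed the true count when items collide, but it never falls below it, which is why the structure suits approximate counting over data too large to tally exactly.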
46

Data-Driven Park Planning: Comparative Study of Survey with Social Media Data

Sim, Jisoo 05 May 2020 (has links)
The purpose of this study was (1) to identify visitors’ behaviors in and perceptions of linear parks, (2) to identify social media users’ behaviors in and perceptions of linear parks, and (3) to compare small data with big data. This chapter discusses the main findings and their implications for practitioners such as landscape architects and urban planners. It has three sections. The first addresses the main findings in the order of the research questions at the center of the study. The second describes implications and recommendations for practitioners. The final section discusses the limitations of the study and suggests directions for future work. This study compares two methods of data collection, focused on activities and benefits. The survey asked respondents to check all the activities they did in the park. Social media users’ activities were detected by term frequency in social media data. Both results ordered the activities similarly. For example, social interaction and art viewing were most popular on the High Line, then the 606, then the High Bridge according to both methods. Both methods also reported that High Line visitors engaged in viewing from overlooks the most. As for benefits, according to both methods visitors to the 606 were more satisfied than High Line visitors with the parks’ social and natural benefits. These results suggest social media analytics can replace surveys when the textual information is sufficient for analysis. Social media analytics also differ from surveys in accuracy of results. For example, social media revealed that 606 users were interested in events and worried about housing prices and crimes, but the pre-designed survey could not capture those facts. Social media analytics can also catch hidden and more general information: through cluster analysis, we found possible reasons for the High Line’s success in the arts and in New York City itself. 
These results involve general information that would be hard to identify through a survey. On the other hand, surveys provide specific information and can describe visitors’ demographics, motivations, travel information, and specific benefits. For example, 606 users tend to be young, high-income, well educated, white, and female. These data cannot be collected through social media. / Doctor of Philosophy / Turning unused infrastructure into green infrastructure, such as linear parks, is not a new approach to managing brownfields. In the last few decades, changes in the industrial structure and the development of transportation have had a profound effect on urban spatial structure. As the need for infrastructure, which played an important role in the development of past industry, has decreased, many industrial sites, power plants, and military bases have become unused. This study identifies new ways of collecting information about a new type of park, linear parks, using a new method, social media analytics. The results are then compared with survey results to establish the credibility of social media analytics. Lastly, shortcomings of social media analytics are identified. This study is meaningful in helping us understand the users of new types of parks and suggesting design and planning strategies. Regarding methodology, this study also involves evaluating the use of social media analytics and its advantages, disadvantages, and reliability.
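Detecting activities by term frequency, as described in the abstract above, can be sketched roughly as follows. This is only an illustration, not the study's pipeline: the term-to-activity dictionary and the sample posts are invented for the example, and each post is counted at most once per activity.

```python
from collections import Counter
import re

# Hypothetical mapping from terms to park activities.
ACTIVITY_TERMS = {
    "walk": "walking",
    "walking": "walking",
    "friends": "social interaction",
    "art": "art viewing",
    "mural": "art viewing",
    "view": "viewing from overlooks",
}

def activity_counts(posts):
    """Count how many posts mention each activity at least once."""
    counts = Counter()
    for post in posts:
        tokens = set(re.findall(r"[a-z]+", post.lower()))
        activities = {ACTIVITY_TERMS[t] for t in tokens if t in ACTIVITY_TERMS}
        counts.update(activities)
    return counts

# Invented sample posts, for illustration only.
posts = [
    "Walking the trail with friends",
    "Loved the new mural and the art",
    "Great view from the overlook, amazing art",
]
counts = activity_counts(posts)
print(counts.most_common(1))  # [('art viewing', 2)]
```

Ranking activities by these counts gives an ordering that can then be set beside the ordering obtained from survey checkboxes.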
47

The impact of big data analytics on firms’ high value business performance

Popovic, A., Hackney, R., Tassabehji, Rana, Castelli, M. 2016 October 1928 (has links)
Big Data Analytics (BDA) is an emerging phenomenon with the reported potential to transform how firms manage and enhance high value business performance. The purpose of our study is to investigate the impact of BDA on operations management in the manufacturing sector, which is acknowledged to be an infrequently researched context. Using an interpretive qualitative approach, this empirical study leverages a comparative case study of three manufacturing companies with varying levels of BDA usage (experimental, moderate and heavy). The information technology (IT) business value literature and a resource based view informed the development of our research propositions and the conceptual framework that illuminated the relationships between BDA capability and organizational readiness and design. Our findings indicate that BDA capability (in terms of data sourcing, access, integration, and delivery, analytical capabilities, and people’s expertise) along with organizational readiness and design factors (such as BDA strategy, top management support, financial resources, and employee engagement) facilitated better utilization of BDA in manufacturing decision making, and thus enhanced high value business performance. Our results also highlight important managerial implications related to the impact of BDA on empowerment of employees, and how BDA can be integrated into organizations to augment rather than replace management capabilities. Our research will be of benefit to academics and practitioners in furthering our understanding of BDA utilization in transforming operations and production management. It adds to the limited body of empirically based knowledge by highlighting the real business value resulting from applying BDA in manufacturing firms, thus encouraging beneficial economic and societal changes.
48

Critical analysis of Big Data challenges and analytical methods

Sivarajah, Uthayasankar, Kamal, M.M., Irani, Zahir, Weerakkody, Vishanth J.P. 08 October 2016 (has links)
Big Data (BD), with their potential to ascertain valued insights for enhanced decision-making processes, have recently attracted substantial interest from both academics and practitioners. Big Data Analytics (BDA) is increasingly becoming a trending practice that many organizations are adopting with the purpose of constructing valuable information from BD. The analytics process, including the deployment and use of BDA tools, is seen by organizations as a tool to improve operational efficiency, though it also has strategic potential to drive new revenue streams and gain competitive advantages over business rivals. However, there are different types of analytic applications to consider. Therefore, before hastily using and buying costly BD tools, organizations first need to understand the BDA landscape. Given the significant nature of BD and BDA, this paper presents a state-of-the-art review that offers a holistic view of the BD challenges and BDA methods theorized/proposed/employed by organizations to help others understand this landscape with the objective of making robust investment decisions. In doing so, it systematically analyses and synthesizes the extant research published in the BD and BDA area. More specifically, the authors seek to answer the following two principal questions: Q1: What are the different types of BD challenges theorized/proposed/confronted by organizations? and Q2: What are the different types of BDA methods theorized/proposed/employed to overcome BD challenges? This systematic literature review (SLR) is carried out through observing and understanding the past trends and extant patterns/themes in the BDA research area, evaluating contributions, summarizing knowledge, thereby identifying limitations, implications and potential further research avenues to support the academic community in exploring research themes/patterns. 
Thus, to trace the implementation of BD strategies, a profiling method is employed to analyze articles (published in English-language peer-reviewed journals between 1996 and 2015) extracted from the Scopus database. The analysis presented in this paper has identified relevant BD research studies that have contributed both conceptually and empirically to the expansion and accrual of intellectual wealth to the BDA in technology and organizational resource management discipline.
49

Big Data Analytics and Business Failures in Data-Rich Environments: An Organizing Framework

Amankwah-Amoah, J., Adomako, Samuel 2018 December 1924 (has links)
Despite the burgeoning scholarly works on big data and big data analytical capabilities, there remains limited research on how different access to big data and different big data analytic capabilities possessed by firms can generate diverse conditions leading to business failure. To fill this gap in the existing literature, an integrated framework was developed that entailed two approaches to big data as an asset (i.e. threshold resource and distinctive resource) and two types of competences in big data analytics (i.e. threshold competence and distinctive/core competence). The analysis provides insights into how ordinary big data analytic capability and mere possession of big data are more likely to create conditions for business failure. The study extends the existing streams of research by shedding light on decisions and processes that facilitate or hamper firms’ ability to harness big data to mitigate the causes of business failures. The analysis led to the categorization of a number of fruitful avenues for research on data-driven approaches to business failure.
50

Possibilities and Limitations of Analytics for Efficiencies in Project Management

Sarosh Iqbal (11828870) 18 December 2021 (has links)
This study aimed to identify if data and analytics are, or can be, meaningfully and extensively used for improving efficiencies in project management. The research problem was addressed using a survey, involving capture, collection, analysis and interpretation of qualitative (and some quantitative) data obtained from industry practitioners of project management. The study was completed in two important parts. First was the laying of groundwork which involved questionnaire planning and design to ensure coverage, completeness, relevance, usefulness, and logical (and where pertinent, statistical) validity of the answers for performing analysis and drawing inferences. The second part was the actual analysis of the survey results, and compilation of the research details into this written report with a conclusion (my M.S. thesis). The survey was mainly in the form of multiple-choice questions, along with two free form text boxes to glean additional insights from comments and notes that structured questions with fixed choices for answers could not have easily elicited.
