11. A framework for efficiently mining the organisational perspective of business processes
Schönig, Stefan; Cabanillas Macias, Cristina; Jablonski, Stefan; Mendling, Jan. 23 June 2016.
Process mining aims at discovering processes by extracting knowledge from event logs. Such knowledge may refer to different business process perspectives. The organisational perspective deals, among other things, with the assignment of human resources to process activities. Information about the resources that are involved in process activities can be mined from event logs in order to discover resource assignment conditions, which is valuable for process analysis and redesign. Prior process mining approaches in this context present one of the following issues: (i) they are limited to discovering a restricted set of resource assignment conditions; (ii) they do not aim at providing efficient solutions; or (iii) the discovered process models are difficult to read due to the number of assignment conditions included. In this paper we address these problems and develop an efficient and effective process mining framework that provides extensive support for the discovery of patterns related to resource assignment. The framework is validated in terms of performance and applicability.
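To illustrate the kind of resource assignment condition meant here (a sketch, not the paper's algorithm): given an event log that records the role of the performer of each activity, simple "activity X is performed by role Y" rules can be mined with a confidence threshold. The log format and threshold below are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Toy event log: one entry per executed activity, with the performing
# resource and its organisational role (format assumed for illustration).
event_log = [
    {"activity": "Approve invoice", "resource": "anna",  "role": "Manager"},
    {"activity": "Approve invoice", "resource": "bob",   "role": "Manager"},
    {"activity": "Enter invoice",   "resource": "carol", "role": "Clerk"},
    {"activity": "Enter invoice",   "resource": "dave",  "role": "Clerk"},
    {"activity": "Enter invoice",   "resource": "anna",  "role": "Manager"},
]

def mine_role_conditions(log, min_confidence=0.8):
    """Return rules 'activity is performed by role' whose observed
    confidence (share of the activity's events supporting the rule)
    reaches the threshold."""
    per_activity = defaultdict(Counter)
    for event in log:
        per_activity[event["activity"]][event["role"]] += 1
    rules = []
    for activity, role_counts in per_activity.items():
        total = sum(role_counts.values())
        role, count = role_counts.most_common(1)[0]
        confidence = count / total
        if confidence >= min_confidence:
            rules.append((activity, role, confidence))
    return rules

for activity, role, conf in mine_role_conditions(event_log):
    print(f"'{activity}' is performed by role '{role}' (confidence {conf:.2f})")
# Only 'Approve invoice' -> Manager survives; 'Enter invoice' is mixed (2/3 Clerk).
```

Real approaches in this space support far richer conditions (capabilities, separation of duties, history-based rules); this only shows the basic count-and-threshold idea.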
12. Problem Determination In Message-Flow Internet Services Based On Statistical Analysis of Event Logs
Xu, Yu. January 2009.
In a message-flow Internet service, where messages travel through multiple nodes, event log analysis is one of the most important methods for identifying the root causes of problems. Traditional approaches to event log analysis have largely been based on expert systems that build static dependency models from rules and patterns defined by human experts. However, the semantic complexity and varied formats of event logs make them difficult to model, and it is time-consuming to maintain such a static model for constantly evolving Internet services. Recent research has focused on building statistical models, but these models rely on the trace information provided by J2EE or .NET frameworks, which is not available to all Internet services.
In this thesis, we propose a framework for problem determination based on statistical analysis of event logs. We assume that a unique message ID is logged in multiple log lines, which allows the message flow to be traced through the system. A generic log adaptor is defined to extract valuable information from the log entries, and we develop algorithms for log event clustering and log pattern clustering. Frequency analysis is then performed on the log patterns to build a statistical model of the system's behavior. Once the system is modeled, problems can be determined by running a chi-square goodness-of-fit test over a sliding window. As event logs are available on all major operating systems, we believe our framework is a generic solution for problem determination in message-flow Internet services.
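To make the statistical step concrete, here is a minimal sketch of such a sliding-window chi-square goodness-of-fit test, assuming pattern frequencies have already been extracted upstream; the baseline distribution and the use of SciPy are illustrative, not taken from the thesis.

```python
from collections import Counter
from scipy.stats import chisquare

# Baseline: expected relative frequencies of log patterns, learned from
# healthy historical logs (values here are made up for illustration).
baseline = {"recv": 0.40, "route": 0.30, "deliver": 0.25, "error": 0.05}

def window_is_anomalous(window_patterns, alpha=0.01):
    """Chi-square goodness-of-fit test of one sliding window of observed
    pattern labels against the baseline distribution. Windows are assumed
    to contain only patterns known to the baseline."""
    observed = Counter(window_patterns)
    n = len(window_patterns)
    patterns = list(baseline)
    f_obs = [observed.get(p, 0) for p in patterns]
    f_exp = [baseline[p] * n for p in patterns]
    stat, p_value = chisquare(f_obs, f_exp)
    return p_value < alpha  # reject 'normal behavior' at level alpha

# A window where 'deliver' events vanish and errors spike.
window = ["recv"] * 40 + ["route"] * 30 + ["error"] * 30
print(window_is_anomalous(window))  # True: the flow deviates from baseline
```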
Our solution has been validated on log data collected from the BlackBerry Internet Service (BIS) engine [4], a wireless email service that serves millions of users across the world. According to the test results, our solution achieves high accuracy of problem determination.
14. Integrated geological and petrophysical investigation on carbonate rocks of the middle early to late early Canyon high frequency sequence in the Northern Platform area of the SACROC Unit
Isdiken, Batur. 18 February 2014.
The SACROC Unit is an isolated carbonate platform style of reservoir that typifies a peak icehouse system. Icehouse carbonate platforms are among the least well understood and documented carbonate reservoir styles due to the reservoir heterogeneities they embody. The current study attempts to recognize carbonate rock types, defined on the basis of rock fabrics, by integrating log- and core-based petrophysical analysis within a high-frequency cycle (HFC) scale sequence stratigraphic framework, and to improve our ability to understand the static and dynamic petrophysical properties of these reservoir rock types, thereby improving our understanding of heterogeneity in the middle early to late early Canyon (Canyon 2) high frequency sequence (HFS) in the Northern Platform of the SACROC Unit. Based on core descriptions, four different subtidal depositional facies were defined in the Canyon 2 HFS. The identified depositional facies were grouped into three reservoir rock types with respect to their rock fabrics for the HFC-scale analysis of petrophysical reservoir rock type characteristics. Twenty different HFCs, each composed of a succession of the identified reservoir rocks, were determined within the HFC-scale sequence stratigraphic framework. The overall trend in the HFCs demonstrates systematic coarsening-upward cycles, with high reservoir quality at the cycle tops and low reservoir quality at the cycle bottoms. In terms of the systems tracts described within the cycle-scale framework, the overall stacking pattern for highstand systems tracts (HST) and transgressive systems tracts (TST) is aggradational, and the reservoir rocks representing the HST are more porous and permeable than those of the TST. In addition, the diagenetic overprint on the HST reservoir rocks is greater than on those of the TST. According to the overall petrophysical observations, the grain-dominated packstone deposited during the HST was interpreted as the best reservoir rock. In the well log analysis, specific log responses were attributed to the identified reservoir rocks as their characteristic log signatures.
15. Exploring Event Log Analysis with Minimum Apriori Information
Makanju, Adetokunbo. 02 April 2012.
The continued increase in the size and complexity of modern computer systems has led to a commensurate increase in the size of their logs. System logs are an invaluable resource to systems administrators during fault resolution, which is a time-consuming and knowledge-intensive process. Much of the time in fault resolution is spent sifting through large volumes of information, including event logs, to find the root cause of the problem. The ability to analyze log files automatically and accurately will therefore lead to significant savings in the time and cost of downtime events for any organization. The automatic analysis and search of system logs for fault symptoms, otherwise called alerts, is the primary motivation for the work carried out in this thesis. The proposed log alert detection scheme is a hybrid framework that incorporates anomaly detection and signature generation. Unlike previous work, minimum a priori knowledge of the system being analyzed is assumed, which enhances the platform portability of the framework. The anomaly detection component works in a bottom-up manner on historical system log data to detect regions of the log that contain anomalous (alert) behaviour. The identified anomalous regions are then passed to the signature generation component, which mines them for patterns; future occurrences of the underlying alert can then be detected on a production system using the discovered pattern. The combination of anomaly detection and signature generation, which is novel compared to previous work, yields a framework that is accurate while still able to detect new and unknown alerts.
The framework was evaluated on log data from High Performance Cluster (HPC), distributed and cloud systems, which together cover a good range of the types of computer systems used in the real world today. The results indicate that the framework can generate signatures for detecting alerts that achieve, on average, a recall of approximately 83% and a false positive rate of approximately 0%.
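A toy sketch of the signature generation stage under strong simplifications (the line format and whitespace tokenisation are assumptions, not the thesis's algorithm): lines from an anomalous log region are aligned token by token, and positions that vary across lines become wildcards in the mined signature.

```python
import re

# Lines flagged as anomalous by an upstream detector (assumed input).
anomalous_lines = [
    "node17 link failure on port 3 retry 1",
    "node02 link failure on port 11 retry 4",
    "node31 link failure on port 7 retry 2",
]

def generate_signature(lines):
    """Build a regex template: keep tokens shared by every line, and use a
    wildcard where the lines disagree at that position."""
    token_rows = [line.split() for line in lines]
    width = min(len(row) for row in token_rows)
    parts = []
    for i in range(width):
        column = {row[i] for row in token_rows}
        if len(column) == 1:
            parts.append(re.escape(column.pop()))  # constant token
        else:
            parts.append(r"\S+")                   # variable token -> wildcard
    return re.compile(r"\s+".join(parts))

signature = generate_signature(anomalous_lines)
# Future occurrences of the same alert match the mined signature.
print(bool(signature.search("node09 link failure on port 2 retry 9")))  # True
print(bool(signature.search("node09 heartbeat ok")))                    # False
```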
16. Techniques and Tools for Mining Pre-Deployment Testing Data
Chan, Brian. 17 September 2009.
Pre-deployment field testing is the process of testing software to uncover unforeseen problems before it is released to the market. It is commonly conducted by recruiting users to experiment with the software in as natural a setting as possible; information regarding the software is then sent to the developers as logs. Log data helps developers fix bugs and better understand user behavior so they can refine functionality to user needs. More importantly, logs record specific problems as well as call traces that developers can use to trace a problem's origins. However, developers focus their analysis on post-deployment data such as bug reports and CVS data, which has the disadvantage that software is released before it can be optimized. More techniques are therefore needed to harness field testing data to reduce post-deployment problems.
We propose techniques to process log data generated by users in order to resolve problems in the application before its deployment. We introduce a metric system to predict the user-perceived quality of the software if it were released to market in its current state. We also provide visualization techniques that identify the state of problems and patterns of problem-user interaction, offering insight into solving the problems; these techniques can be extended to determine a problem's point of origin so it can be resolved more efficiently. Additionally, we devise a method to determine the priority of reported problems.
The results were generated from case studies on mobile software applications. The metric results showed a strong ability to predict the number of reported bugs in the software after its release. The visualization techniques uncovered problem patterns that gave developers insight into the relationship between problems and the users themselves. Our analysis of problem characteristics determined the highest-priority problems and their distribution among users. Thesis (Master, Electrical & Computer Engineering), Queen's University, 2009.
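The abstract does not spell out the prioritization method, so as a purely hypothetical illustration: one simple scheme weights how often a problem occurs by how many distinct users it affects.

```python
from collections import defaultdict

# (problem_id, user_id) pairs parsed from pre-deployment test logs (assumed).
reports = [
    ("crash_on_save", "u1"), ("crash_on_save", "u2"), ("crash_on_save", "u1"),
    ("slow_startup", "u3"), ("slow_startup", "u3"), ("slow_startup", "u3"),
    ("ui_glitch", "u2"),
]

def prioritise(reports):
    """Score = occurrences * distinct users affected (assumed weighting,
    not the thesis's actual metric)."""
    counts = defaultdict(int)
    users = defaultdict(set)
    for problem, user in reports:
        counts[problem] += 1
        users[problem].add(user)
    scores = {p: counts[p] * len(users[p]) for p in counts}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for problem, score in prioritise(reports):
    print(problem, score)
# crash_on_save 6, slow_startup 3, ui_glitch 1
```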
17. Mining team compositions for collaborative work in business processes
Schönig, Stefan; Cabanillas Macias, Cristina; Di Ciccio, Claudio; Jablonski, Stefan; Mendling, Jan. 22 October 2016.
Process mining aims at discovering processes by extracting knowledge about their different perspectives from event logs. The resource perspective (or organisational perspective) deals, among other things, with the assignment of resources to process activities. Mining in relation to this perspective aims to extract rules on resource assignments for the process activities. Prior research in this area is limited by the assumption that only one resource is responsible for each process activity, and hence collaborative activities are disregarded. In this paper, we drop this assumption by developing a process mining approach that is able to discover team compositions for collaborative process activities from event logs. We evaluate our novel mining approach in terms of computational performance and practical applicability.
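As an illustration of the underlying idea, not the authors' algorithm: if each event records the full set of resources involved in a collaborative activity instance, candidate teams can be mined as resource sets that co-occur frequently for an activity. The log format and support threshold are assumptions.

```python
from collections import Counter

# Event log where each collaborative activity instance lists all involved
# resources (format assumed for illustration).
events = [
    {"activity": "Review contract", "resources": frozenset({"legal_1", "sales_2"})},
    {"activity": "Review contract", "resources": frozenset({"legal_1", "sales_2"})},
    {"activity": "Review contract", "resources": frozenset({"legal_3", "sales_2"})},
    {"activity": "Ship order",      "resources": frozenset({"wh_1"})},
]

def mine_teams(events, min_support=2):
    """Per activity, keep exact resource sets observed at least min_support times."""
    teams = Counter((e["activity"], e["resources"]) for e in events)
    return {key: n for key, n in teams.items() if n >= min_support}

for (activity, team), support in mine_teams(events).items():
    print(activity, sorted(team), "support:", support)
# Review contract ['legal_1', 'sales_2'] support: 2
```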
18. Log Analysis for Failure Diagnosis and Workload Prediction in Cloud Computing
Hunt, Kristian. January 2016.
The size and complexity of cloud computing systems make runtime errors inevitable. These errors may be caused by the system having insufficient resources or by an unexpected failure in the system. In order to provide highly available cloud computing services, it is necessary to automate the resource provisioning and failure diagnosing processes as much as possible. Log files are often a good source of information about the current status of the system. In this thesis, methods for diagnosing failures and predicting system workload using log file analysis are presented, and the performance of different machine learning algorithms using the proposed methods is compared. Our experimental results show that classification tree and random forest algorithms are both suitable for diagnosing failures, and that Support Vector Regression outperforms linear regression and regression trees when predicting disk availability and memory usage. However, we conclude that predicting CPU utilization requires further study.
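As a sketch of the regression side, the snippet below fits scikit-learn's SVR to synthetic per-interval features; the feature choice, the linear-plus-noise relation and the data are all assumptions standing in for the thesis's actual setup.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Synthetic training data: per-interval feature vectors extracted from logs
# (e.g. request count, error count) and the memory usage observed in the
# next interval (assumed relation, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 2))                      # [requests, errors]
y = 0.6 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 3, 200)   # memory usage

# Scaling matters for RBF-kernel SVR, hence the pipeline.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))
model.fit(X, y)

next_interval = np.array([[80.0, 5.0]])
print("predicted memory usage:", model.predict(next_interval)[0])
```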
19. Data Analysis of Minimally-Structured Heterogeneous Logs: An experimental study of log template extraction and anomaly detection based on Recurrent Neural Network and Naive Bayes
Liu, Chang. January 2016.
Nowadays, the ideas of continuous integration and continuous delivery are in heavy use in order to achieve rapid software development and quick delivery of high-quality products to customers. In modern software development, the testing stage is of great significance in ensuring that the delivered software meets all requirements with high quality, maintainability, sustainability and scalability. The key assignment of software testing is to find bugs in every test and resolve them. The developers and test engineers at Ericsson, who work on a large-scale software architecture, rely mainly on the logs generated during testing, which contain important information regarding system behavior and software status, to debug the software. However, the volume of this data is large and its variety complex and unpredictable, so manually locating and resolving bugs in such a vast amount of log data is very time-consuming and labour-intensive. The objective of this thesis project is to explore a way to conduct log analysis efficiently and effectively by applying relevant machine learning algorithms, in order to help people quickly detect test failures and their possible causes. In this project, a method for preprocessing and clustering the original logs is designed and implemented in order to obtain useful data that can be fed to machine learning algorithms. A comparative log analysis, based on two machine learning algorithms, Recurrent Neural Network and Naive Bayes, is conducted to detect the location of system failures and anomalies. Finally, relevant experimental results are provided and analyzed.
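A compressed sketch of the two pipeline halves discussed above, under strong simplifying assumptions: variable fields are masked into templates, and a Naive Bayes classifier over token counts flags failing runs. The log lines and labels are invented, and the thesis compares this kind of classifier against a Recurrent Neural Network rather than using it alone.

```python
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

def to_template(line):
    """Mask hex IDs and numbers so structurally identical lines collapse
    into one template (a much-simplified form of template extraction)."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)  # hex first, it contains digits
    return re.sub(r"\d+", "<NUM>", line)

# Toy labelled test logs: each sample is one run's log line, label 1 = failed run.
runs = [
    "connect 10.0.0.1 ok session 42 started",
    "connect 10.0.0.9 ok session 7 started",
    "timeout waiting for handle 0xdeadbeef retry 3",
    "timeout waiting for handle 0x1f retry 9",
]
labels = [0, 0, 1, 1]

templated = [to_template(r) for r in runs]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(templated)
clf = MultinomialNB().fit(X, labels)

new_run = to_template("timeout waiting for handle 0xbeef retry 1")
print(clf.predict(vectorizer.transform([new_run]))[0])  # 1: flagged as failure
```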
20. Anomaly detection for automated security log analysis: Comparison of existing techniques and tools
Fredriksson Franzén, Måns; Tyrén, Nils. January 2021.
Logging security-related events is becoming increasingly important for companies. Log messages can be used for surveillance of a system or to assess the damage caused in the event of, for example, an infringement. Typically, large quantities of log messages are produced, making manual inspection for traces of unwanted activity quite difficult. It is therefore desirable to automate the process of analysing log messages. One way of finding suspicious behaviour within log files is to set up rules that trigger alerts when certain log messages fit the criteria. However, this requires prior knowledge about the system and the kinds of security issues that can be expected, meaning that novel attacks will not be detected with this approach. It can also be very difficult to determine what constitutes normal and abnormal behaviour. A potential solution to this problem is machine learning and anomaly-based detection, the process of finding patterns that do not conform to a defined notion of normal behaviour. This thesis examines the process of going from raw log data to finding anomalies, using both existing log analysis tools and our own proof-of-concept implementation. With the use of labelled log data, our implementation was able to reach a precision of 73.7% and a recall of 100%. The advantages and disadvantages of creating our own implementation, as opposed to using an existing tool, are presented and discussed, along with several insights from the field of anomaly detection for log analysis.
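For reference, the reported figures correspond to standard precision and recall over labelled samples; a small sketch with made-up labels (not the thesis's data) shows how they are computed.

```python
from sklearn.metrics import precision_score, recall_score

# Ground-truth labels for log windows (1 = attack) and detector output (assumed).
y_true = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 1, 1, 0, 1, 0]

# Precision: share of raised alerts that were real attacks.
# Recall: share of real attacks that were caught.
print("precision:", precision_score(y_true, y_pred))  # 4/6 ~ 0.667
print("recall:   ", recall_score(y_true, y_pred))     # 4/4 = 1.0
```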