21

Analysis of Diameter Log Files with Elastic Stack / Analysering av Diameter log filer med hjälp av Elastic Stack

Olars, Sebastian January 2020 (has links)
There is a growing need for more efficient tools and services for log analysis, a need driven by the ever-growing use of digital services and applications, each one generating thousands of lines of log event messages for auditing and troubleshooting. This thesis was initiated on behalf of one of the departments of the IT consulting company TietoEvry in Karlstad. The purpose of the project was to investigate whether the log analysis service Elastic Stack would be a suitable solution for TietoEvry's need for a more efficient method of log event analysis. As part of this investigation, a small-scale deployment of Elastic Stack was created and used as a proof of concept. The investigation showed that Elastic Stack would be a suitable tool for the monitoring and analysis needs of TietoEvry. The final version of the deployment was, however, not able to fulfill all of the requirements initially set out by TietoEvry; this was mainly due to a lack of time rather than to limitations of Elastic Stack.
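To make the kind of deployment described above concrete, here is a minimal sketch of indexing and querying parsed log events with the official Elasticsearch Python client. The index name, field names, and query are illustrative assumptions, not details taken from the thesis:

```python
# Minimal sketch: index a parsed Diameter log event into Elasticsearch
# and search for failures. Assumes a local single-node cluster; the
# index and field names are hypothetical, not taken from the thesis.
from datetime import datetime, timezone
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index one parsed log event (in a real pipeline, Logstash or Filebeat
# would ship these continuously).
es.index(
    index="diameter-logs",
    document={
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "command": "Credit-Control-Request",
        "result_code": 5012,  # DIAMETER_UNABLE_TO_COMPLY
        "message": "CCR failed: DIAMETER_UNABLE_TO_COMPLY",
    },
)

# Query for events whose result code indicates a failure (codes >= 3000
# cover protocol, transient, and permanent failures in Diameter).
resp = es.search(
    index="diameter-logs",
    query={"range": {"result_code": {"gte": 3000}}},
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["message"])
```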
22

Log Frequency Analysis for Anomaly Detection in Cloud Environments

Bendapudi, Prathyusha January 2024 (has links)
Background: Log analysis has been proven to be highly beneficial in monitoring system behaviour, detecting errors and anomalies, and predicting future trends in systems and applications. However, with the continuous evolution of these systems and applications, the amount of log data generated is increasing rapidly, and so is the manual effort invested in log analysis for error detection and root cause analysis. While there is continuous research into reducing this manual effort, this thesis introduces a new approach to automated log analysis based on the temporal patterns of logs in a particular system environment, which can help reduce manual effort to a great extent. Objectives: The main objective of this research is to identify temporal patterns in logs using clustering algorithms, extract the outlier logs that do not adhere to any time pattern, and further analyse them to check whether these outlier logs are helpful in detecting errors and identifying their root cause. Methods: Design Science Research was implemented to fulfil the objectives of the thesis, as the work required the generation of intermediary results and an iterative, responsive approach. The initial part of the thesis consisted of building an artifact that identifies temporal patterns in logs of different log types using the DBSCAN clustering algorithm. After the patterns were identified and the outlier logs extracted, interviews were conducted in which system experts manually analysed the outlier logs, provided insights, and validated the log frequency analysis. Results: The clustering algorithm, run on logs of different log types, produced clusters representing temporal patterns in most of the files. Some log files have no time patterns, indicating that not all log types contain logs that adhere to a fixed time pattern. The interviews conducted with system experts on the outlier logs yielded promising results, indicating that log frequency analysis is indeed helpful in reducing the manual effort involved in log analysis for error detection and root cause analysis. Conclusions: The results of the thesis show that most of the logs in the given cloud environment adhere to time frequency patterns, and that analysing these patterns and their outliers leads to easier error detection and root cause analysis in the given cloud environment.
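As an illustration of the core technique, a minimal sketch of clustering log timestamps with DBSCAN and treating noise points as outliers follows. The parameter values and the idea of clustering on time-of-day are assumptions for the example, not settings reported in the thesis:

```python
# Minimal sketch: find temporal patterns in log timestamps with DBSCAN
# and flag logs outside any cluster as outliers. Parameter values are
# illustrative assumptions, not those used in the thesis.
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data: seconds-of-day at which log lines of one log type appeared.
# Two dense bursts (periodic jobs) plus a few stray events.
rng = np.random.default_rng(0)
timestamps = np.concatenate([
    rng.normal(3600 * 2, 60, 200),   # nightly job around 02:00
    rng.normal(3600 * 14, 60, 200),  # afternoon job around 14:00
    rng.uniform(0, 86400, 5),        # stray events with no pattern
])

# DBSCAN on the 1-D time axis: points within eps seconds of at least
# min_samples neighbours form a cluster; label -1 marks noise.
labels = DBSCAN(eps=120, min_samples=10).fit_predict(
    timestamps.reshape(-1, 1)
)

outliers = timestamps[labels == -1]
print(f"{len(set(labels) - {-1})} temporal clusters found")
print(f"{len(outliers)} outlier log events for manual review")
```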
23

Analysis and Modeling of World Wide Web Traffic

Abdulla, Ghaleb 30 April 1998 (has links)
This dissertation deals with monitoring, collecting, analyzing, and modeling of World Wide Web (WWW) traffic and client interactions. The rapid growth of WWW usage has not been accompanied by an overall understanding of models of information resources and their deployment strategies. Consequently, the current Web architecture often faces performance and reliability problems. Scalability, latency, bandwidth, and disconnected operations are some of the important issues that should be considered when attempting to adjust for the growth in Web usage. The WWW Consortium launched an effort to design a new protocol that will be able to support future demands. Before doing that, however, we need to characterize current users' interactions with the WWW and understand how it is being used. We focus on proxies since they provide a good medium for caching, filtering information, payment methods, and copyright management. We collected proxy data from our environment over a period of more than two years. We also collected data from other sources such as schools, information service providers, and commercial sites. Sampling times range from days to years. We analyzed the collected data looking for important characteristics that can help in designing a better HTTP protocol. We developed a modeling approach that considers Web traffic characteristics such as self-similarity and long-range dependency. We developed an algorithm to characterize users' sessions. Finally, we developed a high-level Web traffic model suitable for sensitivity analysis. As a result of this work we developed statistical models of parameters such as arrival times, file sizes, file types, and locality of reference. We describe an approach to model long-range dependent Web traffic and we characterize activities of users accessing a digital library courseware server or Web search tools. Temporal and spatial locality of reference within examined user communities is high, so caching can be an effective tool to help reduce network traffic and to help solve the scalability problem. We recommend utilizing our findings to promote a smart distribution or push model to cache documents when there is likelihood of repeat accesses. / Ph. D.
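Self-similarity and long-range dependence of the kind modeled in this dissertation are commonly quantified with the Hurst exponent; a minimal rescaled-range (R/S) sketch follows. This is a generic illustration of the concept, not the author's modeling approach, and the synthetic data and window sizes are assumptions:

```python
# Minimal sketch: estimate the Hurst exponent of a traffic series via
# rescaled-range (R/S) analysis. H near 0.5 suggests short-range
# behaviour; H clearly above 0.5 suggests long-range dependence, as
# has been reported for Web traffic. Generic illustration only.
import numpy as np

def rs_statistic(series: np.ndarray) -> float:
    """Rescaled range R/S of one window."""
    z = np.cumsum(series - series.mean())  # cumulative deviation from mean
    r = z.max() - z.min()                  # range of the cumulative sums
    s = series.std()                       # standard deviation of the window
    return r / s if s > 0 else 0.0

def hurst(series: np.ndarray, window_sizes=(16, 32, 64, 128, 256)) -> float:
    """Slope of log(R/S) vs log(n) over several window sizes."""
    log_n, log_rs = [], []
    for n in window_sizes:
        windows = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        rs_values = [rs_statistic(w) for w in windows if w.std() > 0]
        if rs_values:
            log_n.append(np.log(n))
            log_rs.append(np.log(np.mean(rs_values)))
    return float(np.polyfit(log_n, log_rs, 1)[0])

# Toy example: i.i.d. noise gives H near 0.5 (small samples bias it
# slightly high); real Web-traffic byte counts per interval typically
# give noticeably higher values.
rng = np.random.default_rng(1)
print(f"H ~= {hurst(rng.normal(size=4096)):.2f}")
```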
24

Machine Learning-Assisted Log Analysis for Uncovering Anomalies

Rurling, Samuel January 2024 (has links)
Logs, which are semi-structured records of system runtime information, contain a lot of valuable insights. By looking at the logs, developers and operators can analyse their system's behavior. This is especially necessary when something in the system goes wrong, as nonconforming logs may indicate a root cause. With the growing complexity and size of IT systems, however, millions of logs are generated hourly, and reviewing them manually can become an all-consuming task. A potential solution to aid in log analysis is machine learning: by leveraging their ability to learn automatically from experience, machine learning algorithms can be modeled to analyse logs automatically. In this thesis, machine learning is used to perform anomaly detection, which is the discovery of such nonconforming logs. An experiment is created in which four feature extraction methods (that is, four ways of creating data representations from the logs) are tested in combination with three machine learning models: LogCluster, PCA and SVM. Additionally, an LSTM network, a neural network architecture that can craft its own features and analyse them, is explored as well. The results show that the LSTM performed best in terms of precision, recall and F1-score, followed by SVM, LogCluster and PCA in combination with a feature extraction method using word embeddings.
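As a hedged sketch of one such feature-extraction/model pairing, the example below combines TF-IDF features with a linear SVM; the log lines, labels, and hyperparameters are illustrative assumptions and do not reproduce the thesis's actual setup:

```python
# Minimal sketch: one feature-extraction + model combination for log
# anomaly detection (TF-IDF features, linear SVM). The log lines,
# labels, and hyperparameters are illustrative assumptions only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Toy training data: log messages labeled normal (0) or anomalous (1).
train_logs = [
    "connection established to node-1",
    "heartbeat ok from node-2",
    "connection established to node-3",
    "unhandled exception in request handler",
    "disk write failed: I/O error",
]
labels = [0, 0, 0, 1, 1]

# TF-IDF turns each log line into a sparse weighted-term vector,
# which the linear SVM then separates into normal vs anomalous.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_logs, labels)

print(model.predict(["heartbeat ok from node-4",
                     "disk write failed: timeout"]))  # expected: [0 1]
```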
25

Korektorské vlastnosti sedimentárních hornin z karotážních měření / Well log analysis for sedimentary formation evaluation

Šálek, Ondřej January 2013 (has links)
The work focuses on the analysis of five structural well profiles penetrating sediments of the Bohemian Cretaceous Basin and the underlying Upper Palaeozoic continental basins down to the crystalline basement. The objectives of the well profile analysis are sedimentary formation evaluation from well logs and the statistical analysis and evaluation of selected physical properties of sedimentary rocks determined by measurements on drill cores. The aim of the work is to verify the possibility of evaluating porosity from well log analysis in the Bohemian Cretaceous Basin and the underlying Upper Palaeozoic continental basins. A further aim is to compare different geological environments with respect to the physical properties of rocks. The work involves presentation of well log curves, computation of porosity values, and comparison between the porosity values derived from the resistivity log, acoustic log and neutron-neutron log and those from laboratory measurements on drill core samples. Data from five deep structural wells are used. Different geological environments were compared by statistical methods with respect to physical properties of rocks measured on well core samples from these five wells. Porosity evaluation from well log analysis is difficult but it is possible provided that...
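For context, one standard way to compute porosity from an acoustic (sonic) log is the Wyllie time-average equation; the sketch below is a textbook illustration with assumed matrix and fluid transit times, not the thesis's actual workflow:

```python
# Minimal sketch: sonic-log porosity via the Wyllie time-average
# equation, phi = (dt_log - dt_matrix) / (dt_fluid - dt_matrix).
# Matrix/fluid values below are common textbook defaults (quartz
# sandstone, fresh water), assumed for illustration only.
def sonic_porosity(dt_log: float,
                   dt_matrix: float = 55.5,   # us/ft, quartz sandstone
                   dt_fluid: float = 189.0    # us/ft, fresh water
                   ) -> float:
    """Fractional porosity from interval transit time (us/ft)."""
    phi = (dt_log - dt_matrix) / (dt_fluid - dt_matrix)
    return min(max(phi, 0.0), 1.0)  # clamp to the physical range

# A reading of 80 us/ft in a clean sandstone gives roughly 18% porosity.
print(f"phi = {sonic_porosity(80.0):.2f}")
```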
26

Knowledge Driven Search Intent Mining

Jadhav, Ashutosh 31 May 2016 (has links)
No description available.
27

Evaluation of Automotive Data mining and Pattern Recognition Techniques for Bug Analysis

Gawande, Rashmi 02 February 2016 (has links) (PDF)
In an automotive infotainment system, while analyzing bug reports, developers have to spend significant time reading log messages and trying to locate anomalous behavior before identifying its root cause. The log messages have to be viewed in a Traceviewer tool to be read in a human-readable form, and have to be extracted to text files by applying manual filters in order to analyze the behavior further. There is a need to evaluate machine learning/data mining methods that could potentially assist in error analysis. One such method could be learning patterns of "normal" messages. "Normal" can even mean that a message contains keywords like "exception", "error" or "failed" but is harmless or not relevant to the bug currently being analyzed. These patterns could then be applied as a filter, leaving behind only truly anomalous messages that are interesting for analysis. A successful application of the filter would reduce the noise, leaving only a few "anomalous" messages. After evaluation of the researched candidate algorithms, two algorithms, GSP and FP-Growth, were found useful and implemented together in a prototype. The prototype implementation includes processes such as pre-processing, creation of input, execution of the algorithms, creation of the training set, and analysis of new trace logs. Execution of the prototype reduced the manual effort, thus achieving the objective of this thesis work.
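To illustrate the pattern-mining step, a minimal FP-Growth sketch using the mlxtend library follows; the tokenized log messages and support threshold are illustrative assumptions, not the prototype's actual configuration:

```python
# Minimal sketch: mine frequent keyword patterns from known-harmless
# ("normal") log messages with FP-Growth (mlxtend), then use them as a
# filter. Messages and the support threshold are assumptions.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

# Each "transaction" is the set of tokens in one known-harmless message.
normal_messages = [
    ["audio", "service", "restart", "error", "recovered"],
    ["audio", "service", "restart", "error", "recovered"],
    ["bluetooth", "pairing", "failed", "retry", "ok"],
    ["bluetooth", "pairing", "failed", "retry", "ok"],
    ["audio", "service", "restart", "error", "recovered"],
]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit_transform(normal_messages), columns=te.columns_)

# Frequent itemsets = keyword patterns of harmless messages, even those
# containing words like "error" or "failed".
patterns = fpgrowth(onehot, min_support=0.4, use_colnames=True)
pattern_sets = [set(s) for s in patterns["itemsets"]]

def is_noise(message_tokens: set) -> bool:
    """A message matching any learned pattern is filtered as harmless."""
    return any(p <= message_tokens for p in pattern_sets)

print(is_noise({"audio", "service", "restart", "error", "recovered"}))  # True
print(is_noise({"kernel", "panic", "watchdog", "timeout"}))             # False
```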
28

Usage-driven unified model for user profile and data source profile extraction / Model unifié dérigé par l'usage pour l'extraction du profile de l'utilisateur et de la source de donnée

Limam, Lyes 24 June 2014 (has links)
This thesis addresses a problem related to usage analysis in information retrieval systems: we exploit the history of search queries as the support of analysis from which a usage profile is extracted. The objective is to characterize the user and the data source that interact in a system, so as to allow different types of comparison (user-to-user, source-to-source, user-to-source). According to the study we conducted of existing work on profile models, the large majority of contributions are strongly tied to the applications within which they were proposed. As a result, the proposed profile models are not reusable and suffer from several weaknesses. For instance, these models do not consider the data source, they lack semantic mechanisms, and they do not deal with scalability (in terms of complexity). We therefore propose a generic model of user and data source profiles based on usage analysis. The characteristics of this model are the following. First, it is generic, able to represent both the user and the data source. Second, it enables the profiles to be constructed implicitly from histories of search queries. Third, it defines the profile as a set of topics of interest, each topic corresponding to a semantic cluster of keywords extracted by a specific clustering algorithm. Finally, the profile is represented according to the vector space model. The model is composed of several components organized in the form of a framework, in which we assess the complexity of each component. The main components of the framework are: a method for keyword query disambiguation; a method for semantically representing search query logs in the form of a taxonomy; a clustering algorithm that allows fast and efficient identification of topics of interest as semantic clusters of keywords; and a method to compute user and data source profiles according to the generic model. The framework makes it possible to perform various tasks related to the usage-based structuring of a distributed environment; as examples of application, it is used for the discovery of user communities and the categorization of data sources. To validate the framework, we conducted a series of experiments on real logs from the search engine AOL Search, which demonstrate the efficiency of the disambiguation method on short queries and show the relation between quality-based clustering and structure-based clustering.
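As a rough illustration of representing a profile as keyword clusters in a vector space, the sketch below uses k-means over TF-IDF vectors as an assumed stand-in for the thesis's own clustering algorithm; the query history is a toy example:

```python
# Minimal sketch: derive "topics of interest" as clusters of query
# keywords in a vector space. K-means over TF-IDF vectors is an
# assumed stand-in for the thesis's own clustering algorithm.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy query history of one user (or one data source's incoming queries).
queries = [
    "python pandas dataframe merge",
    "pandas groupby aggregate",
    "numpy array broadcasting",
    "marathon training plan",
    "running shoes for marathon",
    "half marathon pace chart",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(queries)

# Two clusters ~ two topics of interest; the profile is the set of
# top keywords per cluster, represented in the vector space model.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
terms = vectorizer.get_feature_names_out()
for c in range(2):
    top = [terms[i] for i in km.cluster_centers_[c].argsort()[::-1][:3]]
    print(f"topic {c}: {top}")
```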
29

Mobilní webová analytika - Nástroj pro analýzu návštěvnosti mobilního portálu / Mobile web analytics - tool for analysis of the mobile portal traffic

Joha, Miroslav January 2011 (has links)
This thesis analyses tools for processing mobile web portal logs that could be used for reporting the required metrics. The area covers not only web analytics but also the mobile web and the capabilities of mobile handsets. The objective is to analyse the possibilities of existing products that can be used to process web logs and to choose an appropriate existing solution or propose a proprietary one; this solution should comply with the requirements and limitations given by the portal platform. The theoretical part introduces the web analytics domain and the mobile web environment: basic concepts and metrics are described, as well as the fundamental data collection methods and the basic specifics of the mobile internet. Knowledge of web analytics and mobile web basics is crucial for assessing the capabilities of the tools introduced in the second part of the thesis. The practical part describes the technical details of the mobile portal's operating environment, defines the requirements for the tool's outputs, and surveys prospective ready-to-use solutions. The final part of the text addresses the resulting solution for the mobile portal reporting tool, including operational data processing. The secondary output of this thesis is an explanation of the mobile web analytics area for a broader audience. The main output of the practical part is the analysis and specification of requirements for the mobile portal reporting tool and the design of the solution, with supervision of the implementation and deployment within the production environment.
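As a small illustration of the kind of log processing such a reporting tool performs, here is a sketch that parses combined-format access log lines and counts visits from mobile user agents; the log format and the user-agent heuristic are simplified assumptions for the example:

```python
# Minimal sketch: parse Apache combined-format access log lines and
# count mobile vs desktop visits. The regex and the user-agent
# heuristic are simplified assumptions for illustration.
import re
from collections import Counter

LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
MOBILE_HINTS = ("Mobile", "Android", "iPhone", "Opera Mini")

lines = [
    '1.2.3.4 - - [10/Oct/2011:13:55:36 +0200] "GET /news HTTP/1.1" 200 '
    '512 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 4_3 like Mac OS X) Mobile"',
    '5.6.7.8 - - [10/Oct/2011:13:56:01 +0200] "GET /news HTTP/1.1" 200 '
    '512 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/7.0"',
]

counts = Counter()
for line in lines:
    m = LOG_RE.match(line)
    if m:
        agent = m.group("agent")
        device = "mobile" if any(h in agent for h in MOBILE_HINTS) else "desktop"
        counts[device] += 1

print(counts)  # Counter({'mobile': 1, 'desktop': 1})
```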
30

Enterprise Users and Web Search Behavior

Lewis, April Ann 01 May 2010 (has links)
This thesis describes an analysis of user web query behavior associated with Oak Ridge National Laboratory's (ORNL) Enterprise Search System (hereafter, ORNL Intranet). The ORNL Intranet provides users a means to search all kinds of data stores for relevant business and research information using a single query. The Global Intranet Trends for 2010 report suggests the biggest current obstacle for corporate intranets is "findability and siloed content". Intranets differ from the Internet in the way they create, control, and share content, which can make it difficult and sometimes impossible for users to find information. Stenmark (2006) first noted that studies of corporate internal search behavior are lacking, and appealed for more published research on the subject. This study employs mature web query transaction log analysis (TLA), as developed for internet search, to examine how corporate intranet users at ORNL search for information. The focus of the study is to better understand general search behaviors and to identify unique trends associated with query composition and vocabulary; the results are compared to published intranet studies. A literature review suggests only a handful of intranet-based web search studies exist, each focusing largely on a single aspect of intranet search. This implies that the ORNL study is the first to comprehensively analyze a corporate intranet user web query corpus and make the results public. This study analyzes 65,000 user queries submitted to the ORNL Intranet from September 17, 2007 through December 31, 2007. A granular relational data model first introduced by Wang, Berry, and Yang (2003) for web query analysis was adopted and modified for data mining and analysis of the ORNL query corpus. The corpus is characterized using Zipf distributions, descriptive word statistics, and mutual information; user search vocabulary is analyzed using frequency distributions and probability statistics. The results showed that ORNL users search for unique types of information, are uncertain how best to formulate queries, and do not use search interface tools to narrow the search scope. Special domain language comprised 38% of the queries. The average number of results returned per query was too high, and 16.34% of queries returned no hits.
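To make the Zipf characterization concrete, a minimal sketch of checking a query corpus against Zipf's law (term frequency roughly proportional to 1/rank) follows; the toy corpus is an assumption, not ORNL data:

```python
# Minimal sketch: check a query corpus against Zipf's law, under which
# term frequency is roughly proportional to 1 / rank. The toy queries
# stand in for a real corpus such as the ORNL query log.
import numpy as np
from collections import Counter

queries = [
    "travel reimbursement form", "travel policy", "travel office",
    "timesheet", "timesheet codes", "badge renewal", "travel",
    "cafeteria menu", "badge office hours", "timesheet approval",
]

# Rank terms by frequency.
freqs = Counter(term for q in queries for term in q.split())
ranked = freqs.most_common()

# Under Zipf's law, log(frequency) vs log(rank) is close to a line
# with slope about -1; the fitted slope is a quick diagnostic.
ranks = np.arange(1, len(ranked) + 1)
counts = np.array([c for _, c in ranked])
slope = np.polyfit(np.log(ranks), np.log(counts), 1)[0]

for rank, (term, count) in enumerate(ranked[:5], start=1):
    print(f"{rank:>2}  {term:<14} {count}")
print(f"fitted log-log slope: {slope:.2f}  (Zipf predicts about -1)")
```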
