1

Clustering Generic Log Files Under Limited Data Assumptions / Klustring av generiska loggfiler under begränsade antaganden

Eriksson, Håkan January 2016 (has links)
Complex computer systems are often prone to anomalous or erroneous behavior, which can lead to costly downtime as the systems are diagnosed and repaired. One source of information for diagnosing the errors and anomalies is log files, which are often generated in vast and diverse amounts. However, the log files' size and semi-structured nature make manual analysis generally infeasible, so some automation is desirable to sift through the log files and find the source of the anomalies or errors. This project aimed to develop a generic algorithm that could cluster diverse log files in accordance with domain expertise. The results show that the developed algorithm performs well relative to manual clustering, even under more relaxed data assumptions.
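The abstract does not spell out the clustering algorithm itself. As a rough illustration of the general idea (grouping semi-structured log lines by token similarity), here is a minimal Python sketch; the tokenization, similarity measure and threshold are illustrative assumptions, not the thesis's actual method.

```python
# Minimal sketch: greedy clustering of log lines by token overlap.
# The threshold and tokenization are illustrative choices only.

def tokens(line: str) -> set[str]:
    """Split a log line into a set of whitespace-delimited tokens."""
    return set(line.split())

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(lines: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Assign each line to the first cluster whose representative
    (its first line) is similar enough; otherwise start a new cluster."""
    clusters: list[list[str]] = []
    for line in lines:
        for c in clusters:
            if jaccard(tokens(line), tokens(c[0])) >= threshold:
                c.append(line)
                break
        else:
            clusters.append([line])
    return clusters

logs = [
    "ERROR disk /dev/sda1 full",
    "ERROR disk /dev/sdb2 full",
    "INFO user alice logged in",
    "INFO user bob logged in",
]
for group in cluster(logs):
    print(group)
```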
2

Log File Categorization and Anomaly Analysis Using Grammar Inference

Memon, Ahmed Umar 28 May 2008 (has links)
In the information age of today, vast amounts of sensitive and confidential data are exchanged over an array of different mediums. Accompanying this phenomenon is a comparable increase in the number and types of attacks to acquire this information. Information security and data consistency have hence become critically important. Log file analysis has proven to be a good defense mechanism, as logs provide an accessible record of network activities in the form of server-generated messages. However, manual analysis is tedious and prohibitively time consuming. Traditional log analysis techniques, based on pattern matching and data mining approaches, are ad hoc and cannot readily adapt to different kinds of log files. The goal of this research is to explore the use of grammar inference for log file analysis in order to build a more adaptive, flexible and generic method for message categorization, anomaly detection and reporting. The grammar inference process employs robust parsing, island grammars and source transformation techniques. We test the system by using three different kinds of log file training sets as input, and infer a grammar and generate message categories for each set. We detect anomalous messages in new log files using the inferred grammar as a catalog of valid traces, and present a reporting program to extract the instances of specified message categories from the log files. / Thesis (Master, Computing) -- Queen's University, 2008-05-22 14:12:30.199
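As a loose illustration of the catalog idea described above (flagging log messages whose inferred shape was never seen in training), here is a regex-based stand-in; the thesis's actual pipeline uses robust parsing, island grammars and source transformation, none of which is reproduced here.

```python
import re

# Sketch: abstract variable fields out of training log lines to build a
# catalog of message "shapes", then flag unseen shapes as anomalous.
# This regex abstraction only mimics the grammar-inference idea.

def shape(line: str) -> str:
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)  # hex constants
    line = re.sub(r"\b\d+\b", "<NUM>", line)             # decimal numbers
    return line

def train(lines):
    return {shape(l) for l in lines}

def anomalies(catalog, lines):
    return [l for l in lines if shape(l) not in catalog]

known = train(["connection from 10.0.0.1 port 5555",
               "connection from 10.0.0.2 port 6666"])
print(anomalies(known, ["connection from 10.0.0.9 port 7777",
                        "kernel panic at 0xdeadbeef"]))
# -> ['kernel panic at 0xdeadbeef']
```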
3

Naive Bayesian Spam Filters for Log File Analysis

Havens, Russel William 13 July 2011 (has links) (PDF)
As computer system usage grows in our world, system administrators need better visibility into the workings of computer systems, especially when those systems have problems or go down. Most system components, from hardware, through OS, to application server and application, write log files of some sort, be it system-standardized logs such as syslog or application-specific logs. These logs very often contain valuable clues to the nature of system problems and outages, but their verbosity can make them difficult to utilize. Statistical data mining methods could help in filtering and classifying log entries, but these tools are often out of the reach of administrators. This research tests three off-the-shelf Bayesian spam email filters (SpamAssassin, SpamBayes and Bogofilter) for effectiveness as log entry classifiers. A simple scoring system, the Filter Effectiveness Scale (FES), is proposed and used to compare these filters. The filters are tested in three stages: 1) the filters were tested with the SpamAssassin corpus, with various manipulations made to the messages; 2) the filters were tested for their ability to differentiate two types of log entries taken from actual production systems; and 3) the filters were trained on log entries from actual system outages and then tested for effectiveness at finding similar outages via the log files. For stage 1, messages were tested with normalized bodies, with normalized headers, and with each sentence from each message body split out as a separate, standardized message. The impact of each manipulation is presented. For stages 2 and 3, log entries were tested with digits normalized to zeros and with words chained together to various lengths, using one or all levels of word chains together. The impacts of these manipulations are presented. In each of these stages, it was found that these widely available Bayesian content filters were effective in differentiating log entries. Tables of correct match percentages or score graphs, according to the nature of the tests and the number of entries, are presented, and FES scores are assigned to the filters according to the attributes impacting their effectiveness. This research leads to the suggestion that simple, off-the-shelf Bayesian content filters can be used to assist system administrators and log mining systems in sifting log entries to find entries related to known conditions (for which there are example log entries), and to exclude outages which are not related to specific known entry sets.
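A minimal sketch of the preprocessing this abstract describes (digits normalized to zeros, words chained together into n-grams) feeding a small naive Bayes classifier; the thesis itself used the off-the-shelf filters named above, so everything below is a simplified stand-in.

```python
from collections import Counter
import math, re

# Sketch: digit normalization plus word chains, classified with a tiny
# multinomial naive Bayes. Not SpamAssassin/SpamBayes/Bogofilter, just
# an illustration of the underlying idea.

def features(entry: str, chain: int = 2):
    entry = re.sub(r"\d", "0", entry)          # normalize digits to zeros
    words = entry.split()
    feats = list(words)
    for n in range(2, chain + 1):              # word chains of length n
        feats += ["_".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return feats

class NaiveBayes:
    def fit(self, entries, labels):
        self.classes = set(labels)
        self.counts = {c: Counter() for c in self.classes}
        self.priors = Counter(labels)
        for e, y in zip(entries, labels):
            self.counts[y].update(features(e))
        self.vocab = {f for c in self.counts.values() for f in c}
        return self

    def predict(self, entry):
        def logpost(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            lp = math.log(self.priors[c])
            for f in features(entry):
                lp += math.log((self.counts[c][f] + 1) / total)  # Laplace
            return lp
        return max(self.classes, key=logpost)

nb = NaiveBayes().fit(
    ["disk error on sda1", "disk error on sdb2",
     "user login ok", "user logout ok"],
    ["outage", "outage", "normal", "normal"])
print(nb.predict("disk error on sdc3"))   # -> "outage"
```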
4

ML enhanced interpretation of failed test result

Pechetti, Hiranmayi January 2023 (has links)
This master thesis addresses the problem of classifying test failures in Ericsson AB's BAIT test framework, specifically distinguishing between environment faults and product faults. The project aims to automate the initial defect classification process, reducing manual work and facilitating faster debugging. The significance of this problem lies in the potential time and cost savings it offers to Ericsson and other companies utilizing similar test frameworks. By automating the classification of test failures, developers can quickly identify the root cause of an issue and take appropriate action, leading to improved efficiency and productivity. To solve this problem, the thesis employs machine learning techniques. A dataset of test logs is utilized to evaluate the performance of six classification models: logistic regression, support vector machines, k-nearest neighbors, naive Bayes, decision trees, and XGBoost. Precision and macro F1 scores are used as evaluation metrics to assess the models' performance. The results demonstrate that all models perform well in classifying test failures, achieving high precision values and macro F1 scores. The decision tree and XGBoost models exhibit perfect precision scores for product faults, while the naive Bayes model achieves the highest macro F1 score. These findings highlight the effectiveness of machine learning in accurately distinguishing between environment faults and product faults within the BAIT framework. Developers and organizations can benefit from the automated defect classification system, reducing manual effort and expediting the debugging process. The successful application of machine learning in this context opens up opportunities for further research and development in automated defect classification algorithms.
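As an illustration of the evaluation setup described here (several classifiers compared on precision and macro F1 over test-log features), a minimal scikit-learn sketch follows; the TF-IDF features and the toy data are assumptions, not Ericsson's actual dataset.

```python
# Sketch: vectorize failure logs, train a few of the listed classifiers,
# and compare precision and macro F1. TF-IDF features and the tiny toy
# dataset are assumptions; the thesis's real features and data differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, f1_score

logs = ["node unreachable during setup", "timeout waiting for testbed",
        "assertion failed in module X", "unexpected response code 500"]
labels = ["environment", "environment", "product", "product"]

X = TfidfVectorizer().fit_transform(logs)
for model in (LogisticRegression(), MultinomialNB(), DecisionTreeClassifier()):
    pred = model.fit(X, labels).predict(X)   # toy: evaluated on train set
    print(type(model).__name__,
          "precision:", precision_score(labels, pred, pos_label="product"),
          "macro F1:", f1_score(labels, pred, average="macro"))
```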
5

Outpatient Portal (OPP) Use Among Pregnant Women: Cross-Sectional, Temporal, and Cluster Analysis of Use

Morgan, Evan M. 09 November 2021 (has links)
No description available.
6

Detecting Synchronisation Problems in Networked Lockstep Games / Upptäcka synkroniseringsproblem i nätverksuppkopplade lockstep-spel

Liljekvist, Hampus January 2016 (has links)
The complexity associated with development of networked video games creates a need for tools for verifying a consistent player experience. Some networked games achieve consistency through the lockstep protocol, which requires identical execution of sent commands for players to stay synchronised. In this project, a method for testing networked multiplayer lockstep games for synchronisation problems related to nondeterministic behaviour is formulated and evaluated. An integrated fuzzing AI is constructed which tries to cause desynchronisation in the tested game and generates data for analysis via log files. Scripts are used for performing semi-automated test runs and parsing the data. The results show that the test system has potential for finding synchronisation problems if the fuzzing AI is used in conjunction with the regular AI in the tested game, but not for finding the origins of said problems.
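One common way to detect lockstep desynchronisation from log files, not necessarily the one used in this thesis, is to have each client log a per-tick checksum of its simulation state and compare the logs offline. A minimal sketch:

```python
import hashlib

# Sketch: each client logs a checksum of its simulation state every
# tick; an offline script reports the first tick where the checksums
# disagree. The thesis's fuzzing AI and harness are not reproduced here.

def state_hash(state: dict) -> str:
    """Deterministic checksum of a game state snapshot."""
    blob = repr(sorted(state.items())).encode()
    return hashlib.sha1(blob).hexdigest()[:8]

def first_divergence(client_logs: list[list[str]]) -> int | None:
    """client_logs[i] is client i's list of per-tick checksums."""
    for tick, hashes in enumerate(zip(*client_logs)):
        if len(set(hashes)) > 1:
            return tick
    return None

ticks_a = [{"x": 1}, {"x": 2}, {"x": 3}]
ticks_b = [{"x": 1}, {"x": 2}, {"x": 4}]    # nondeterminism at tick 2
log_a = [state_hash(s) for s in ticks_a]
log_b = [state_hash(s) for s in ticks_b]
print(first_divergence([log_a, log_b]))     # -> 2
```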
7

Country and language level differences in multilingual digital libraries

Gäde, Maria 07 April 2014 (has links)
While the importance of multilingual access to information systems is unquestioned, it remains unclear if and to what extent system functionalities, interfaces or interaction patterns need to be adapted according to country- or language-specific user behaviors. This dissertation postulates that the identification of country and language level differences in user interactions is a crucial step for designing effective multilingual digital libraries. Due to the lack of comparable studies and analysis approaches, the research in this dissertation identifies indicators that could show differences in the interactions of users from different countries or languages. A customized logging format and logger (the Europeana Language Logger, ELL) are developed in order to trace these variables in a digital library. For each investigated variable, differences between country groups are presented and discussed. Country profiles are developed as a tool to visualize different characteristics in comparison. To generalize the findings from the case study, the individual variables are prioritized on the basis of a cluster analysis, determining which ones show the most significant country and language level differences.
8

Touching the Essence of Life : Haptic Virtual Proteins for Learning

Bivall, Petter January 2010 (has links)
This dissertation presents research in the development and use of a multi-modal visual and haptic virtual model in higher education. The model, named Chemical Force Feedback (CFF), represents molecular recognition through the example of protein-ligand docking, and enables students to simultaneously see and feel representations of the protein and ligand molecules and their force interactions. The research efforts have been divided between educational research aspects and development of haptic feedback techniques. The CFF model was evaluated in situ through multiple data collections in a university course on molecular interactions. To isolate possible influences of haptics on learning, half of the students ran CFF with haptics, and the others used the equipment with force feedback disabled. Pre- and post-tests showed a significant learning gain for all students. A particular influence of haptics was found on students' reasoning, discovered through an open-ended written probe in which students' responses contained elaborate descriptions of the molecular recognition process. Students' interactions with the system were analyzed using customized information visualization tools. Analysis revealed differences between the groups, for example in their use of the visual representations on offer and in how they moved the ligand molecule. Differences in representational and interactive behaviours showed relationships with aspects of the learning outcomes. The CFF model was improved in an iterative evaluation and development process. A focus was placed on force model design, where one significant challenge was conveying information from data with large force differences, ranging from very weak interactions to the extreme forces generated when atoms collide. Therefore, a History Dependent Transfer Function (HDTF) was designed, which adapts the translation of forces derived from the data to output forces according to the properties of the recently derived forces. Evaluation revealed that the HDTF improves the ability to haptically detect features in volumetric data with large force ranges. To further enable force models with high fidelity, an investigation was conducted to determine the perceptual Just Noticeable Difference (JND) in force for detection of interfaces between features in volumetric data. Results showed that JNDs vary depending on the magnitude of the forces in the volume and on where in the workspace the data is presented.
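As a rough sketch of the idea behind a history-dependent transfer function (rescaling the force mapping according to recently seen magnitudes, so that both weak interactions and atomic collisions stay within the device's useful range), assuming a simple peak-tracking window; the thesis's actual HDTF design is not reproduced here:

```python
from collections import deque

# Sketch of the *idea* behind a History Dependent Transfer Function:
# the mapping from data forces to output forces is rescaled based on
# magnitudes seen recently. The window size, normalization and clamping
# are assumptions, not the thesis's actual HDTF design.

class HDTF:
    def __init__(self, window: int = 100, max_out: float = 1.0):
        self.history = deque(maxlen=window)
        self.max_out = max_out

    def __call__(self, raw_force: float) -> float:
        self.history.append(abs(raw_force))
        scale = max(self.history) or 1.0        # adapt to recent peaks
        out = raw_force / scale * self.max_out  # normalize to device range
        return max(-self.max_out, min(self.max_out, out))

tf = HDTF(window=5)
for f in [0.02, 0.03, 0.01, 250.0, 180.0]:      # weak probing, then a clash
    print(f"{f:8.2f} -> {tf(f):+.3f}")
```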
