Global ETD Search

31	Session-based Intrusion Detection System To Map Anomalous Network Traffic Caulkins, Bruce 01 January 2005 Computer crime is a large problem (CSI, 2004; Kabay, 2001a; Kabay, 2001b). Security managers have a variety of tools at their disposal -- firewalls, Intrusion Detection Systems (IDSs), encryption, authentication, and other hardware and software solutions to combat computer crime. Many IDS variants exist which allow security managers and engineers to identify attack network packets primarily through the use of signature detection; i.e., the IDS recognizes attack packets by their well-known "fingerprints" or signatures as those packets cross the network's gateway threshold. Anomaly-based ID systems, on the other hand, determine what normal traffic within a network looks like and report abnormal traffic behavior. This paper will describe a methodology for developing a more robust Intrusion Detection System through the use of data-mining techniques and anomaly detection. These data-mining techniques will dynamically model what a normal network should look like and reduce the false positive and false negative alarm rates in the process. We will use classification-tree techniques to accurately predict probable attack sessions. Overall, our goal is to model network traffic as network sessions and identify those sessions that have a high probability of being an attack and can be labeled as a "suspect session." Subsequently, we will use these techniques in concert with signature detection methods and known signatures and patterns in order to present a better model for detection and protection of networks and systems. Data Mining Intrusion Detection Systems Anomaly Detection Network Modeling Categorical Data Analysis
32	Exploration and Statistical Modeling of Profit Gibson, Caleb 01 December 2023 For any company involved in sales, maximization of profit is the driving force that guides all decision-making. Many factors can influence how profitable a company can be, including external factors like changes in inflation or consumer demand and internal factors like pricing and product cost. By understanding specific trends in its own internal data, a company can readily identify problem areas or potential growth opportunities to help increase profitability. In this discussion, we use an extensive data set to examine how a company might analyze its own data to identify potential changes it might investigate to drive better performance. Based upon general trends in the data, we recommend potential actions the company could take. Additionally, we examine how a company can use predictive modeling to adapt its decision-making process as the trends identified in the initial analysis of the data evolve over time. Applied Mathematics Applied Statistics Categorical Data Analysis Data Science Probability Statistical Methodology Statistical Models Statistical Theory
33	Time Series Forecasting and Analysis: A Study of American Clothing Retail Sales Data Huang, Weijun 01 January 2019 This paper addresses the effect of time on clothing retail sales from 2010 to May 2019. The data were retrieved from the US Census; N=113 observations were used and plotted to observe their trends. Once outliers were handled and transformations were performed, the best model was fit and a diagnostic review was carried out. The data were also inspected for seasonality, and forecasting was conducted. The final model was an ARIMA(2,0,1). Slight seasonality was present, but not enough to drastically influence the trends. Our results highlight the economic growth of clothing retail sales over the past 8 years, cementing the significance of the production economy's stability. Quarterly GDP data were collected in order to examine the relationship with the differenced clothing data. Some observations of GDP data were affected by the clothing data before removing the seasonality. After removing the seasonality, the clothing expense is white noise and not predictable from the historical GDP. ARIMA economy GDP seasonality forecasting Clothing Retail Sales Categorical Data Analysis Statistics and Probability
34	A Contrast Pattern based Clustering Algorithm for Categorical Data Fore, Neil Koberlein 13 October 2010 No description available. Computer Science clustering contrast pattern frequent pattern frequent itemset categorical data categorical attributes discrete data
35	Bayesian Probit Regression Models for Spatially-Dependent Categorical Data Berrett, Candace 02 November 2010 No description available. Statistics spatial statistics latent variable methods binary data categorical data data augmentation MCMC classification
36	Visualizing Categorical Time Series Data with Applications to Computer and Communications Network Traces Ribler, Randy L. 04 April 1997 Visualization tools allow scientists to comprehend very large data sets and to discover relationships which are otherwise difficult to detect. Unfortunately, not all types of data can be visualized easily using existing tools. In particular, long sequences of nonnumeric data cannot be visualized adequately. Examples of this type of data include trace files of computer performance information, the nucleotides in a genetic sequence, a record of stocks traded over a period of years, and the sequence of words in this document. The term categorical time series is defined and used to describe this family of data. When visualizations designed for numerical time series are applied to categorical time series, the distortions which result from the arbitrary conversion of unordered categorical values to totally ordered numerical values can be profound. Examples of this phenomenon are presented and explained. Several new, general purpose techniques for visualizing categorical time series data have been developed as part of this work and have been incorporated into the Chitra performance analysis and visualization system. All of these new visualizations can be produced in O(n) time. The new visualizations for categorical time series provide general purpose techniques for visualizing aspects of categorical data which are commonly of interest. These include periodicity, stationarity, cross-correlation, autocorrelation, and the detection of recurring patterns. The effective use of these visualizations is demonstrated in a number of application domains, including performance analysis, World Wide Web traffic analysis, network routing simulations, document comparison, pattern detection, and the analysis of the performance of genetic algorithms. / Ph. D. visualization categorical data time series data mining performance analysis information visualization
37	Enhancing NFL Game Insights: Leveraging XGBoost For Advanced Football Data Analytics To Quantify Multifaceted Aspects Of Gameplay Schoborg, Christopher P 01 January 2024 XGBoost, renowned for its efficacy in various statistical domains, offers enhanced precision and efficiency. Its versatility extends to both regression and categorization tasks, rendering it a valuable asset in predictive modeling. In this dissertation, I aim to harness the power of XGBoost to forecast and rank performances within the National Football League (NFL). Specifically, my research focuses on predicting the next play in NFL games based on pre-snap data; optimizing the draft ranking process by integrating data from the NFL combine and collegiate statistics; creating a player rating system that can be compared across all positions; and evaluating strategic decisions for NFL teams when crossing the 50-yard line, including the feasibility of attempting a first down conversion versus opting for a field goal attempt. NFL Analytics XGBoost Prediction Fourth Down Categorical Data Analysis Data Science
38	The Strucplot Framework: Visualizing Multi-way Contingency Tables with vcd Hornik, Kurt, Zeileis, Achim, Meyer, David 10 1900 This paper describes the "strucplot" framework for the visualization of multi-way contingency tables. Strucplot displays include hierarchical conditional plots such as mosaic, association, and sieve plots, and can be combined into more complex, specialized plots for visualizing conditional independence, GLMs, and the results of independence tests. The framework's modular design allows flexible customization of the plots' graphical appearance, including shading, labeling, spacing, and legend, by means of "graphical appearance control" functions. The framework is provided by the R package vcd.
39	Statistické usuzování v analýze kategoriálních dat / Statistical inference for categorical data analysis Kocáb, Jan January 2010 This thesis introduces statistical methods for categorical data. These methods are used especially in social sciences such as sociology, psychology, and political science, but their importance has also grown in the medical and technical sciences. The first part covers statistical inference for a proportion, including classical, exact, and Bayesian methods for estimation and hypothesis testing. With a large sample, the exact distribution can be approximated by the normal distribution; with a small sample this approximation cannot be used, and the discrete distribution must be used instead, which makes inference more complicated. The second part deals with the analysis of two categorical variables in contingency tables. It explains measures of association for 2 x 2 contingency tables, such as the difference of proportions and the odds ratio, and shows how to test independence for both large and small samples. With a small sample, the classical chi-squared tests are not appropriate and alternative methods are needed. This part also covers a variety of exact tests of independence and the Bayesian approach for the 2 x 2 table. It closes with the table for two dependent samples, where the question is whether the two variables give identical results, which occurs when the marginal proportions are equal. In the last part, the methods are applied to data and the results are discussed.
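The 2 x 2 measures named in the abstract above (difference of proportions, odds ratio, and an exact test of independence for small samples) can be sketched in a few lines. This is only an illustration, not the thesis's own code: the function names and the example counts are hypothetical, and the two-sided p-value follows one common convention (summing the probabilities of all tables no more likely than the observed one).

```python
from math import comb


def two_by_two_summary(a, b, c, d):
    """Table [[a, b], [c, d]]: rows are groups, columns are success/failure.

    Returns (difference of proportions, sample odds ratio).
    """
    p1 = a / (a + b)  # success proportion in row 1
    p2 = c / (c + d)  # success proportion in row 2
    odds_ratio = (a * d) / (b * c) if b * c != 0 else float("inf")
    return p1 - p2, odds_ratio


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value with all margins held fixed."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over tables that are no more probable than the observed table;
    # the small tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Hypothetical small-sample table: 3 of 4 successes vs. 1 of 4 successes.
diff, odds_ratio = two_by_two_summary(3, 1, 1, 3)
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With counts this small the chi-squared approximation is unreliable, which is the situation the abstract describes as calling for exact methods.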
40	Modelos para dados categorizados ordinais com efeito aleatório: uma aplicação à análise sensorial / Models for ordinal categorical data with random effects: an application to the sensory analysis Fatoretto, Maíra Blumer 12 January 2016 Models for ordinal categorical data are extensions of Generalized Linear Models, and their assumptions and inferences are based on this class of models. Cumulative Logit Models, in which the link function is built from cumulative probabilities, are widely used for this type of variable. One simplification is the Proportional Odds Model, in which, for all covariates in the model, there is a linear growth in the odds ratios; in this case, however, the parallelism assumption must be checked. Other models, such as the Partial Proportional Odds Model, the Adjacent-Categories Logit Model, and the Continuation-Ratio Logit Model, can also be used. In many such studies, mixed models are required, either because of the type of a factor or because of dependence between observations of the response variable. The aim of this work was to study models for an ordinal response variable that include one or more random effects. These models are illustrated with real sensory analysis data in which the response variable is an ordinal scale and the question is which of two varieties of dried tomatoes (Italian and Sweet Grape) was better accepted by consumers. In this experiment, each panelist evaluated each variety once, and the replicates consisted of the ratings given by different tasters. In this case, a random effect per taster is required so that the model can capture the differences between these untrained tasters. The Proportional Odds Model fitted the data satisfactorily, so the estimated probabilities and odds ratios can be used to interpret the results, leading to the conclusion that the flavor of the Sweet Grape variety pleased the tasters most, regardless of sex. Categorical Data Cumulative Logit Models Dados categorizados Generalized Linear Mixed Models Modelos de logitos cumulativos Modelos lineares generalizados mistos
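As a rough illustration of the proportional odds structure described in this abstract, the sketch below computes category probabilities from a cumulative logit model with a per-taster random intercept. Everything numeric here is hypothetical (thresholds, the variety effect `beta`, the taster effect `u`), and the parameterization logit P(Y <= j) = theta_j - (beta * x + u) is an assumed convention, not necessarily the one used in the thesis.

```python
from math import exp


def logistic(z):
    return 1.0 / (1.0 + exp(-z))


def category_probs(thresholds, eta):
    """Category probabilities under logit P(Y <= j) = theta_j - eta."""
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j-1)
        previous = c
    return probs


def cumulative_odds(probs):
    """Odds of Y <= j for every category except the last."""
    odds, running = [], 0.0
    for p in probs[:-1]:
        running += p
        odds.append(running / (1.0 - running))
    return odds


thresholds = [-1.0, 0.0, 1.5]  # hypothetical cut points for a 4-point scale
beta = 0.8                     # hypothetical effect of variety (x = 1 vs. x = 0)
u = 0.3                        # hypothetical random intercept for one taster

probs_x0 = category_probs(thresholds, beta * 0 + u)
probs_x1 = category_probs(thresholds, beta * 1 + u)

# Under proportional odds, this ratio is the same at every cut point.
ratios = [o0 / o1 for o0, o1 in
          zip(cumulative_odds(probs_x0), cumulative_odds(probs_x1))]
```

The constant ratio across cut points is the parallelism assumption in concrete form; in this sketch it equals exp(beta) whatever the taster effect `u` is.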
