Global ETD Search

471	Klasifikace příspěvků ve webových diskusích / Classification of Web Forum Entries Margold, Tomáš January 2008 (has links) This thesis deals with text ranking in the context of the Internet. It describes the available methods for classifying and splitting text reports. Part of the thesis is an implementation of the naive Bayes algorithm and of a classifier based on neural networks. The selected methods are compared by their error rate and other ranking features.
472	Bayesian Model Selections for Log-binomial Regression Zhou, Wei January 2018 (has links) No description available. Statistics Log-binomial Regression Bayesian Model Selection Bayesian Variable Selection Monte Carlo methods Bayes factor Relative Risk
473	The Safety Impact of Raising Speed Limit on Rural Freeways In Ohio Olufowobi, Oluwaseun Temitope 01 September 2020 (has links) No description available. Transportation Transportation Planning Civil Engineering Speed Limit Traffic Safety Rural Freeways Empirical Bayes Negative Binomial Safety Performance Function
474	Approximations of Bayes Classifiers for Statistical Learning of Clusters Ekdahl, Magnus January 2006 (has links) It is rarely possible to use an optimal classifier. Often the classifier used for a specific problem is an approximation of the optimal classifier. Methods are presented for evaluating the performance of an approximation in the model class of Bayesian Networks. Specifically, for the approximation of class conditional independence, a bound for the performance is sharpened. The class conditional independence approximation is connected to the minimum description length principle (MDL), which is connected to Jeffreys’ prior through commonly used assumptions. One algorithm for unsupervised classification is presented and compared against other unsupervised classifiers on three data sets. Report code: LiU-TEK-LIC 2006:11. Pattern Recognition Stochastic Complexity Naïve Bayes Bayesian Network Classification Clustering Chow-Liu trees Probability Theory and Statistics
475	Variational Inference for Data-driven Stochastic Programming Prateek Jaiswal (11210091) 30 July 2021 (has links) Stochastic programs are standard models for decision-making under uncertainty and have been extensively studied in the operations research literature. In general, stochastic programming involves minimizing an expected cost function, where the expectation is with respect to fully specified stochastic models that quantify the aleatoric or 'inherent' uncertainty in the decision-making problem. In practice, however, the stochastic models are unknown but can be estimated from data, introducing an additional epistemic uncertainty into the decision-making problem. The Bayesian framework provides a coherent way to quantify the epistemic uncertainty through the posterior distribution by combining prior beliefs of the decision-makers with the observed data. Bayesian methods have been used for data-driven decision-making in various applications such as inventory management, portfolio design, machine learning, optimal scheduling, and staffing.
Bayesian methods are challenging to implement, mainly because the posterior is computationally intractable, necessitating the computation of approximate posteriors. Broadly speaking, there are two methods in the literature for approximate posterior inference. First are sampling-based methods such as Markov Chain Monte Carlo. Sampling-based methods are theoretically well understood, but they suffer from various issues such as high variance, poor scalability to high-dimensional problems, and complex diagnostics. Consequently, we propose to use optimization-based methods, collectively known as variational inference (VI), that use information projections to compute an approximation to the posterior. Empirical studies have shown that VI methods are computationally faster and easily scalable to higher-dimensional problems and large datasets. However, the theoretical guarantees of these methods are not well understood. Moreover, VI methods are empirically and theoretically less explored in the decision-theoretic setting.
In this thesis, we first propose a novel VI framework for risk-sensitive data-driven decision-making, which we call risk-sensitive variational Bayes (RSVB). In RSVB, we jointly compute a risk-sensitive approximation to the 'true' posterior and the optimal decision by solving a minimax optimization problem. The RSVB framework includes the naive approach of first computing a VI approximation to the true posterior and then using it in place of the true posterior for decision-making. We show that the RSVB approximate posterior and the corresponding optimal value and decision rules are asymptotically consistent, and we also compute their rate of convergence. We illustrate our theoretical findings in both parametric and nonparametric settings with the help of three examples: the single and multi-product newsvendor model and Gaussian process classification. Second, we present the Bayesian joint chance-constrained stochastic program (BJCCP) for modeling decision-making problems with epistemically uncertain constraints. We discover that using VI methods for posterior approximation can ensure the convexity of the feasible set in (BJCCP), unlike any sampling-based methods, and thus propose a VI approximation for (BJCCP). We also show that the optimal value computed using the VI approximation of (BJCCP) is statistically consistent. Moreover, we derive the rate of convergence of the optimal value and compute the rate at which a VI approximate solution of (BJCCP) is feasible under the true constraints. We demonstrate the utility of our approach on an optimal staffing problem for an M/M/c queue. Finally, this thesis also contributes to the growing literature on the statistical performance of VI methods. In particular, we establish the frequentist consistency of an approximate posterior computed using a well-known VI method that computes an approximation to the posterior distribution by minimizing the Rényi divergence from the 'true' posterior. Statistics Operations Research Applied Statistics Stochastic programming Variational Inference Variational Bayes Decision making Chance-Constrained Optimization
476	Crash Prediction Models on Truck-Related Crashes on Two-lane Rural Highways with Vertical Curves Vavilikolanu, Srutha January 2008 (has links) No description available. Civil Engineering Transportation Vertical curves truck crashes full bayes approach
477	Polar Sea Ice Mapping for SeaWinds Anderson, Hyrum Spencer 30 May 2003 (has links) (PDF) In recent years, the scientific community has expressed interest in the ability to observe global climate indicators such as polar sea ice. Advances in microwave remote sensing technology have allowed a large-scale and detailed study of sea ice characteristics. This thesis provides the analysis and development of sea ice mapping algorithms for the SeaWinds scatterometer. First, an in-depth analysis of the Remund Long (RL) algorithm for SeaWinds is performed. From this study, several improvements are made to the RL algorithm which enhance its performance. In addition, a new method for automated polar sea ice mapping is developed for the SeaWinds instrument. This method is rooted in Bayes decision theory, and incorporates an adaptive model for seasonally fluctuating sea ice and ocean microwave signatures. The new approach is compared to the RL algorithm, to passive microwave data, and to high-resolution SAR imagery for validation. polar sea ice mapping SeaWinds scatterometer classification remote sensing QSCAT QuikSCAT Bayes Brigham Young electrical engineering Electrical and Computer Engineering
478	Improving Filtering of Email Phishing Attacks by Using Three-Way Text Classifiers Trevino, Alberto 13 March 2012 (has links) (PDF) The Internet has been plagued with endless spam for over 15 years. However, in the last five years spam has morphed from an annoying advertising tool to a social engineering attack vector. Much of today's unwanted email tries to deceive users into replying with passwords or bank account information, or into visiting malicious sites which steal login credentials and spread malware. These email-based attacks are known as phishing attacks. Much has been published about these attacks, which try to appear real not only to users but also to spam filters. Several sources indicate traditional content filters have a hard time detecting phishing attacks because the emails lack the traditional features and characteristics of spam messages. This thesis tests the hypothesis that by separating the messages into three categories (ham, spam and phish) content filters will yield better filtering performance. Even though experimentation showed three-way classification did not improve performance, several additional premises were tested, including the validity of the claim that phishing emails are too much like legitimate emails and the ability of Naive Bayes classifiers to properly classify emails. email spam filtering phish phishing attacks support vector machines maximum entropy naive bayes bayesian filters Information Security
479	Crash Prediction Modeling for Curved Segments of Rural Two-Lane Two-Way Highways in Utah Knecht, Casey Scott 01 December 2014 (has links) (PDF) This thesis contains the results of the development of crash prediction models for curved segments of rural two-lane two-way highways in the state of Utah. The modeling effort included the calibration of the predictive model found in the Highway Safety Manual (HSM) as well as the development of Utah-specific models using negative binomial regression. The data for these models came from randomly sampled curved segments in Utah, with crash data from 2008 to 2012. The total number of randomly sampled curved segments was 1,495. The HSM predictive model for rural two-lane two-way highways consists of a safety performance function (SPF), crash modification factors (CMFs), and a jurisdiction-specific calibration factor. For this research, two sample periods were used: a three-year period from 2010 to 2012 and a five-year period from 2008 to 2012. The calibration factor for the HSM predictive model was determined to be 1.50 for the three-year period and 1.60 for the five-year period. These factors are to be used in conjunction with the HSM SPF and all applicable CMFs. A negative binomial model was used to develop Utah-specific crash prediction models based on both the three-year and five-year sample periods. A backward stepwise regression technique was used to isolate the variables that would significantly affect highway safety. The independent variables used for negative binomial regression included the same set of variables used in the HSM predictive model along with other variables, such as speed limit and truck traffic, that were considered to have a significant effect on potential crash occurrence. The significant variables at the 95 percent confidence level were found to be average annual daily traffic, segment length, total truck percentage, and curve radius.
The main benefit of the Utah-specific crash prediction models is that they provide a reasonable level of accuracy for crash prediction yet only require four variables, thus requiring much less effort in data collection compared to using the HSM predictive model. Highway Safety Manual safety performance functions crash modification factors negative binomial empirical Bayes safety horizontal curvature Civil and Environmental Engineering
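The structure of the HSM predictive model described in entry 479 can be illustrated with a minimal sketch. It assumes the usual HSM form, in which the predicted crash frequency is the SPF value multiplied by all applicable CMFs and by the calibration factor, and in which the calibration factor is the ratio of total observed crashes to total uncalibrated predicted crashes. The function names and all numeric inputs below are hypothetical and are not taken from the thesis.

```python
import math
from typing import Iterable


def calibration_factor(observed: Iterable[float], predicted: Iterable[float]) -> float:
    """Ratio of total observed crashes to total uncalibrated predicted crashes."""
    total_predicted = sum(predicted)
    if total_predicted <= 0:
        raise ValueError("total predicted crashes must be positive")
    return sum(observed) / total_predicted


def predicted_crashes(n_spf: float, cmfs: Iterable[float], calibration: float) -> float:
    """Predicted crash frequency: SPF value times all CMFs times the calibration factor."""
    return n_spf * math.prod(cmfs) * calibration


# Hypothetical inputs for three sampled segments (not data from the thesis).
observed_counts = [3, 1, 2]          # observed crashes per segment
uncalibrated = [1.0, 1.5, 1.5]       # SPF x CMFs per segment, before calibration

c = calibration_factor(observed_counts, uncalibrated)
n = predicted_crashes(n_spf=2.0, cmfs=[1.1, 0.9], calibration=c)
print(f"calibration factor = {c:.2f}, predicted crashes = {n:.2f}")
```

A calibration factor above 1, such as the 1.50 and 1.60 reported in the abstract, means the sampled segments experienced more crashes than the uncalibrated model predicts.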
480	Evaluating Statistical Machine Learning and Deep Learning Algorithms for Anomaly Detection in Chat Messages / Utvärdering av statistiska maskininlärnings- och djupinlärningsalgoritmer för anomalitetsdetektering i chattmeddelanden Freberg, Daniel January 2018 (has links) Automatically detecting anomalies in text is of great interest for surveillance entities, as vast amounts of data can be analysed to find suspicious activity. In this thesis, three distinct machine learning algorithms are evaluated as a chat message classifier is implemented for the purpose of market surveillance. Naive Bayes and Support Vector Machine belong to the statistical class of machine learning algorithms evaluated in this thesis, and both require feature selection. A side objective of the thesis is thus to find a suitable feature selection technique to ensure that these algorithms achieve high performance. Long Short-Term Memory network is the deep learning algorithm evaluated in the thesis. Rather than depending on feature selection, the deep neural network is trained using word embeddings. Each of the algorithms achieved high performance, but the findings of the thesis suggest that the Naive Bayes algorithm in conjunction with a feature-counting feature selection technique is the most suitable choice for this particular learning problem. machine learning NLP deep learning word vectors naive bayes support vector machine LSTM Computer Sciences
