651 |
Open source software maturity model based on linear regression and Bayesian analysis
Zhang, Dongmin 15 May 2009
Open Source Software (OSS) is widely used and is becoming a significant and
irreplaceable part of the software engineering community. Today a huge number of OSS
exist. This becomes a problem if one needs to choose from such a large pool of OSS
candidates in the same category. An OSS maturity model that facilitates the software
assessment and helps users to make a decision is needed. A few maturity models have
been proposed in the past. However, the parameters in these models are assigned based
not on experimental data but on human experience, feelings and judgment. These models
are subjective and can provide only limited guidance for users at best.
This dissertation proposes a quantitative and objective model built
from a statistical perspective. In this model, seven metrics are chosen as criteria for
OSS evaluation. A linear multiple-regression model is created to assign a final score
based on these seven metrics. This final score provides a convenient and objective way
for the users to make a decision. The coefficients in the linear multiple-regression model
are calculated from 43 OSS. From the statistical perspective, these coefficients are considered random variables. The joint distribution of the coefficients is discussed based
on Bayesian statistics. More importantly, an updating rule is established through
Bayesian analysis to improve the joint distribution, and thus the objectivity of the
coefficients in the linear multiple-regression model, according to new incoming data.
The updating rule gives the model the ability to learn and improve itself continually.
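As an illustration of the kind of updating rule described above, the following minimal Python sketch treats the regression coefficients as jointly Gaussian and refines their mean and covariance one observation at a time. It assumes a Gaussian prior and a known noise variance, which the abstract does not state. The two-metric data, the prior values and the function name `update_coefficients` are hypothetical and are not taken from the dissertation.

```python
def update_coefficients(mean, cov, x, y, noise_var):
    """Update Gaussian beliefs about regression coefficients with one observation.

    mean, cov : current mean vector and covariance matrix of the coefficients
    x, y      : metric values of a newly scored item and its observed score
    noise_var : assumed (known) variance of the observation noise
    """
    n = len(mean)
    # cov_x = cov @ x
    cov_x = [sum(cov[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Predictive variance of y given the current beliefs.
    denom = noise_var + sum(x[i] * cov_x[i] for i in range(n))
    gain = [c / denom for c in cov_x]
    # Difference between the observed score and the current prediction.
    residual = y - sum(x[i] * mean[i] for i in range(n))
    new_mean = [mean[i] + gain[i] * residual for i in range(n)]
    new_cov = [[cov[i][j] - gain[i] * cov_x[j] for j in range(n)]
               for i in range(n)]
    return new_mean, new_cov


# Hypothetical data: two metrics, scores generated by y = 2*x1 + 3*x2.
observations = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], 3.0),
    ([1.0, 1.0], 5.0),
    ([2.0, 1.0], 7.0),
]

# Vague prior: zero mean, large independent variances (assumed values).
mean = [0.0, 0.0]
cov = [[100.0, 0.0], [0.0, 100.0]]

for x, y in observations:
    mean, cov = update_coefficients(mean, cov, x, y, noise_var=0.01)
```

Each call shrinks the covariance, so the joint distribution of the coefficients sharpens as new data arrive.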
|
652 |
Probabilistic Analysis of the Compressibility of Soils
Jung, Byoung C. 2009 May 1900
Geotechnical engineers are always faced with uncertainties and spatial variations in
material parameters. In this work, we propose to develop a framework able to account
for different types of uncertainties in a formal and logical manner, to incorporate all
available sources of information, and to integrate the uncertainty in an estimate of the
probability.
In geotechnical engineering, current soil classification charts based on CPT data
may not provide an accurate prediction of soil type, even though soil classification is an
essential component in the design process. As a cheaper and faster alternative to sample
retrieval and testing, field methods such as the cone penetration test (CPT) can be used.
A probabilistic soil classification approach is proposed here to improve soil
classification based on CPT. The proposed approach provides a simple and
straightforward tool that allows updating the soil classification charts based on site-specific
data.
In general, settlements can be the result of surface loads or variable soil deposits.
In current practice, the analysis to determine settlements is deterministic. It assumes that the soil profile at a site is uniform from location to location, and only allows limited
consideration of the variations of the material properties and initial conditions within soil
layers in spite of the wide range of compositions, gradations, and water contents in
natural soils. A Bayesian methodology is used to develop an unbiased probabilistic
model that accurately predicts the settlements and accounts for all the prevailing
uncertainties. The proposed probabilistic model is used to estimate the settlements of
the foundation of a structure in the Venice Lagoon, Italy. The conditional probability
(fragility) of exceeding a specified settlement threshold for a given vertical pressure is
estimated. A predictive fragility and confidence intervals are developed with special
attention given to the treatment and quantification of aleatory and epistemic
uncertainties. Sensitivity and importance measures are computed to identify the key
parameters and random variables in the model.
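The fragility mentioned above, the conditional probability of exceeding a settlement threshold at a given vertical pressure, can be sketched as follows. This is only an illustration under a strong assumption: settlement is taken to be lognormally distributed with a median that grows as a power of pressure. The coefficients `a`, `b` and `sigma` are hypothetical placeholders, not values from the dissertation, and the sketch does not separate aleatory from epistemic uncertainty.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fragility(pressure_kpa, threshold_mm, a=-1.0, b=0.8, sigma=0.4):
    """P(settlement > threshold | pressure) under a hypothetical lognormal model.

    ln(settlement in mm) is assumed normal with mean a + b*ln(pressure)
    and standard deviation sigma; a, b and sigma are illustrative only.
    """
    mu = a + b * math.log(pressure_kpa)
    return 1.0 - normal_cdf((math.log(threshold_mm) - mu) / sigma)


# Fragility curve for a 20 mm threshold over a range of pressures (kPa).
curve = [(p, fragility(p, 20.0)) for p in (50.0, 100.0, 150.0, 200.0)]
```

In a full Bayesian treatment, the model parameters would themselves carry posterior distributions, and the predictive fragility would average this curve over them.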
|
653 |
Bayesian Nonparametric Methods for Protein Structure Prediction
Lennox, Kristin Patricia 2010 August 1900
The protein structure prediction problem consists of determining a protein’s three-dimensional
structure from the underlying sequence of amino acids. A standard approach for predicting
such structures is to conduct a stochastic search of conformation space in an attempt to find
a conformation that optimizes a scoring function. For one subclass of prediction protocols,
called template-based modeling, a new protein is suspected to be structurally similar to
other proteins with known structure. The solved related proteins may be used to guide the
search of protein structure space.
There are many potential applications for statistics in this area, ranging from the development
of structure scores to improving search algorithms. This dissertation focuses on
strategies for improving structure predictions by incorporating information about closely
related “template” protein structures into searches of protein conformation space. This is
accomplished by generating density estimates on conformation space via various simplifications
of structure models. By concentrating a search for good structure conformations
in areas that are inhabited by similar proteins, we improve the efficiency of our search and
increase the chances of finding a low-energy structure.
In the course of addressing this structural biology problem, we present a number of advances to the field of Bayesian nonparametric density estimation. We first develop a
method for density estimation with bivariate angular data that has applications to characterizing
protein backbone conformation space. We then extend this model to account for
multiple angle pairs, thereby addressing the problem of modeling protein regions instead
of single sequence positions. In the course of this analysis we incorporate an informative
prior into our nonparametric density estimate and find that this significantly improves performance
for protein loop prediction. The final piece of our structure prediction strategy is
to connect side-chain locations to our torsion angle representation of the protein backbone.
We accomplish this by using a Bayesian nonparametric model for dependence that can link
together two or more multivariate marginal distributions. In addition to its application for
our angular-linear data distribution, this dependence model can serve as an alternative to
nonparametric copula methods.
|
654 |
The empirical study of applying Technical Analysis on DJI, HSI and Taiwan Stock Market
Ieong, KuongCheong 20 June 2007
The stock market plays a central role in modern capital markets and has become one of the most popular investment vehicles. With the globalization of capital markets, capital moves faster and more easily. The development of capital markets has drawn the interest of scholars, and stock market prediction attracts researchers from many backgrounds. There are two approaches to predicting the stock market: fundamental analysis and technical analysis. The purpose of this work is to predict three stock market indices, namely the Taiwan Weighted Index (IDXWT), the Hong Kong Hang Seng Index (HSI) and the Dow Jones Industrial Average (DJI), using technical analysis and a Dynamic Bayesian Network (DBN). This thesis builds on Wang’s thesis [Wan05], “Investment Decision Support with Dynamic Bayesian Networks”. Because the three markets have different characteristics, we treat them in three separate experiments, and for each market we seek the best indicators and trading signals. The first experiment uses the Taiwan Weighted Index as the prediction target, the second uses the Hong Kong Hang Seng Index, and the third uses the Dow Jones Industrial Average. For the Taiwan stock market, both the 15-day and the 20-day Moving Average (MA) yield higher returns than buy-and-hold, RSI_6 and KD, and the same conclusion holds for the Hang Seng Index and the Dow Jones Industrial Average. The best returns from the 15-day MA and the 20-day MA on the Taiwan stock market are 47.95% and 60.21%, respectively. The best result is 60.06% over 4 years for the Hang Seng Index and 25.83% for the Dow Jones Industrial Average. All of these best results exceed the corresponding buy-and-hold, RSI_6 and KD returns. In conclusion, this work can offer direction to investors who use these technical indicators to predict these particular stock markets.
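To make the comparison between a moving-average signal and buy-and-hold concrete, here is a minimal Python sketch. It assumes a simple rule (hold the index for the next day whenever the close is above its n-day moving average, otherwise stay in cash) and ignores transaction costs. This rule, the function names and the price series are hypothetical illustrations; the thesis itself derives its signals with a Dynamic Bayesian Network, which is not reproduced here.

```python
def moving_average(prices, window):
    """Trailing moving average; None until a full window is available."""
    return [
        None if i + 1 < window else sum(prices[i + 1 - window:i + 1]) / window
        for i in range(len(prices))
    ]


def ma_strategy_return(prices, window):
    """Total return of holding only on days after the close exceeds its MA."""
    ma = moving_average(prices, window)
    growth = 1.0
    for t in range(window - 1, len(prices) - 1):
        # The signal uses information available at the close of day t.
        if prices[t] > ma[t]:
            growth *= prices[t + 1] / prices[t]
    return growth - 1.0


def buy_and_hold_return(prices):
    """Total return of buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


# Hypothetical closing prices, used only to exercise the functions.
prices = [10.0, 11.0, 12.0, 13.0, 12.0, 11.0, 10.0, 11.0, 12.0, 13.0]
strategy = ma_strategy_return(prices, window=3)
benchmark = buy_and_hold_return(prices)
```

With real index data, the same two functions could be run with windows of 15 and 20 days to mirror the comparison reported in the abstract.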
|
655 |
Constructing Bayesian Networks with Sequential Patterns for Hemodialysis
Wang, Woei-Ru 05 August 2002
In this thesis, I introduce a multivariate discretization algorithm to discretize the continuous variables of hemodialysis clinical pathways and use a clustering algorithm to shift time stamps, reducing the number of nodes in the Bayesian networks. The generalized sequential patterns algorithm is used to find candidate patterns that have a far-reaching effect on subsequent nodes of the Bayesian networks for hemodialysis. A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest and readily incorporates new instances to keep rules up to date. Bayesian networks are used to represent knowledge of frequent state transitions in medical logs. Bayesian networks and sequential pattern algorithms can handle only discrete or categorical data. Therefore, we discretize the continuous variables with a suitable technique to generalize the nodes, and shift the time stamps of nodes to reduce variation in time. With these generalizations, we reduce over-fitting of the Bayesian networks for hemodialysis. We expect the discovered patterns to give more information to medical professionals and help them build a reciprocal cycle of knowledge management for hemodialysis.
|
656 |
Bayesian model-based approaches with MCMC computation to some bioinformatics problems
Bae, Kyounghwa 29 August 2005
Bioinformatics applications can address the transfer of information at several stages
of the central dogma of molecular biology, including transcription and translation.
This dissertation focuses on using Bayesian models to interpret biological data in
bioinformatics, using Markov chain Monte Carlo (MCMC) for the inference method.
First, we use our approach to interpret data at the transcription level. We propose
a two-level hierarchical Bayesian model for variable selection on cDNA Microarray
data. A cDNA microarray quantifies the mRNA levels of thousands of genes
simultaneously in one sample. By observing the expression patterns of genes under
various treatment conditions, important clues about gene function can be obtained.
We consider a multivariate Bayesian regression model and assign priors that favor
sparseness in terms of number of variables (genes) used. We introduce the use of
different priors to promote different degrees of sparseness using a unified two-level
hierarchical Bayesian model. Second, we apply our method to a problem related to
the translation level. We develop hidden Markov models to model linker/non-linker
sequence regions in a protein sequence. We use a linker index to exploit differences
in amino acid composition between regions from sequence information alone. A goal
of protein structure prediction is to take an amino acid sequence (represented as
a sequence of letters) and predict its tertiary structure. The identification of linker
regions in a protein sequence is valuable in predicting the three-dimensional structure.
Because of the complexity of both models encountered in practice, we employ the
Markov chain Monte Carlo (MCMC) method, particularly Gibbs sampling (Gelfand
and Smith, 1990), for parameter estimation.
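For readers unfamiliar with Gibbs sampling, the sketch below shows the mechanics on a deliberately simple target: a standard bivariate normal with correlation `rho`, where each coordinate is drawn in turn from its conditional distribution given the other. The target, the function names and the settings are assumptions chosen for illustration; the hierarchical regression and hidden Markov models of the dissertation are far richer and are not reproduced here.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    # For this target, x | y ~ N(rho*y, 1 - rho^2), and symmetrically for y | x.
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if step >= burn_in:  # discard the initial burn-in draws
            samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Pearson correlation of the sampled (x, y) pairs."""
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    sxy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples)
    sxx = sum((s[0] - mean_x) ** 2 for s in samples)
    syy = sum((s[1] - mean_y) ** 2 for s in samples)
    return sxy / math.sqrt(sxx * syy)


draws = gibbs_bivariate_normal(rho=0.6, n_samples=20000)
estimated_rho = sample_correlation(draws)
```

The same alternating pattern carries over to larger models whenever each block of parameters has a conditional distribution that can be sampled directly.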
|
657 |
Ecosystem health at the Texas coastal bend: a spatial analysis of exposure and response
Bissett, Wesley Thurlow, Jr. 10 October 2008
This dissertation investigated locational risks to ecosystem health associated with
proximity to industrial complexes. The study was performed at the behest of ranchers
and citizens living and working down-prevailing wind from the Formosa Plastics, Inc.
and ALCOA facilities located in Calhoun County, Texas. Concerns expressed were for
potential genotoxicity resulting from exposure to complex chemical mixtures released by
the facilities. Exposure assessment of the marine environment was performed with
sediments and oysters from Lavaca Bay being analyzed. Numerous chemicals were
found to be present at concentrations considered likely to result in adverse responses in
exposed populations. Bayesian geostatistical analysis was performed to determine if the
concentrations were affected by a spatial process. Mercury and polycyclic aromatic
hydrocarbons were the most notable of the chemicals found to be present at elevated
concentrations and affected by a spatial process. Evaluation of maps generated from
spatial modeling revealed that proximity to ALCOA resulted in elevated risks for
exposure to harmful concentrations of pollutants. Genotoxicity was measured in two
sentinel species. Oysters (Crassostrea virginica) were utilized for evaluation of the
marine environment and cattle (Bos taurus and Bos taurus crossbred cattle) were chosen
for evaluation of the terrestrial environment. Chromosomal aberration analysis was
performed on oyster hematocytes. Analysis of the results failed to demonstrate the
presence of an important generalized spatial process but some specific locations close to
the ALCOA plant had elevations in this measure of genotoxicity. The lysosomal
destabilization assay, a measure of stress, was also performed on oyster hematocytes. These results were found to be affected by a significant spatial process with the highest degree
of destabilization occurring in close proximity to ALCOA. Genotoxicity in cattle was
evaluated with the single cell gel electrophoresis assay and chromosomal aberration
analysis. Bayesian geostatistical analysis revealed the presence of important spatial
processes. DNA-protein cross-linkage was the most notable with a strong indication of
increased damage down-prevailing wind from the industrial complexes. Results
indicated that proximity to industrial facilities increased the risk for harmful exposures,
genotoxicity, and lysosomal destabilization.
|
658 |
Robust manufacturing system design using Petri nets and Bayesian methods
Sharda, Bikram 10 October 2008
Manufacturing system design decisions are costly and involve significant
investment in terms of allocation of resources. These decisions are complex, due to
uncertainties related to uncontrollable factors such as processing times and part
demands. Designers often need to find a robust manufacturing system design that meets
certain objectives under these uncertainties. Failure to find a robust design can lead to
expensive consequences in terms of lost sales and high production costs. In order to find
a robust design configuration, designers need accurate methods to model various
uncertainties and efficient ways to search for feasible configurations.
The dissertation work uses a multi-objective Genetic Algorithm (GA) and Petri net
based modeling framework for a robust manufacturing system design. The Petri nets are
coupled with Bayesian Model Averaging (BMA) to capture uncertainties associated with
uncontrollable factors. BMA provides a unified framework to capture model, parameter
and stochastic uncertainties associated with representation of various manufacturing
activities. The BMA based approach overcomes limitations associated with uncertainty representation using classical methods presented in literature. Petri net based modeling is
used to capture interactions among various subsystems, operation precedence and to
identify bottleneck or conflicting situations. When coupled with Bayesian methods, Petri
nets provide accurate assessment of manufacturing system dynamics and performance in
presence of uncertainties. A multi-objective Genetic Algorithm (GA) is used to search
manufacturing system designs, allowing designers to consider multiple objectives. The
dissertation work provides algorithms for integrating Bayesian methods with Petri nets.
Two manufacturing system design examples are presented to demonstrate the proposed
approach. The results obtained using Bayesian methods are compared with classical
methods and the effect of choosing different types of priors is evaluated.
In summary, the dissertation provides a new, integrated Petri net based modeling
framework coupled with BMA based approach for modeling and performance analysis
of manufacturing system designs. The dissertation work allows designers to obtain
accurate performance estimates of design configurations by considering model,
parameter and stochastic uncertainties associated with representation of uncontrollable
factors. A multi-objective GA coupled with Petri nets provides a flexible and time-saving
approach for searching and evaluating alternative manufacturing system designs.
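The idea behind Bayesian Model Averaging can be sketched in a few lines of Python: instead of committing to one candidate model, predictions are averaged with weights proportional to each model's posterior probability. The sketch assumes equal prior model probabilities and uses the common BIC approximation to the posterior weights. That approximation, the function names and all numbers are hypothetical; the abstract does not say how the dissertation computes its weights.

```python
import math


def bma_weights(bic_values):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every model; weights are
    proportional to exp(-BIC/2).
    """
    best = min(bic_values)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (bic - best)) for bic in bic_values]
    total = sum(raw)
    return [r / total for r in raw]


def bma_mean(predictions, weights):
    """Model-averaged prediction: weighted sum of per-model predictions."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical candidate models for a processing time (minutes):
# BIC of each fitted model and the mean processing time it predicts.
bic_values = [210.0, 204.0, 206.0]
predicted_means = [5.2, 4.8, 5.0]

weights = bma_weights(bic_values)
averaged_mean = bma_mean(predicted_means, weights)
```

The averaged estimate always lies between the smallest and largest individual predictions, with the better-supported models pulling it more strongly.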
|
659 |
Risk Based Maintenance Optimization using Probabilistic Maintenance Quantification Models of Circuit Breaker
Natti, Satish 14 January 2010
This dissertation studies new maintenance techniques for circuit breakers by proposing a probabilistic maintenance model and a new methodology that assesses circuit breaker condition from control circuit data. A risk-based decision approach that builds on this methodology is proposed at the system level to optimize maintenance schedules and the allocation of resources.
This dissertation is focused on developing optimal maintenance strategies for circuit breakers, both at component and system level. A probabilistic maintenance model is proposed using an approach similar to one recently introduced for power transformers. Probabilistic models give better insight into the interplay among monitoring techniques, failure modes and maintenance techniques of the component. The model is based on the concept of representing the component lifetime by several deterioration stages. Inspection and maintenance are introduced at each stage and model parameters are defined. A sensitivity analysis is carried out to understand the importance of model parameters in obtaining optimal maintenance strategies. The analysis covers the effect of the inspection rate calculated for each stage and its impact on failure probability, inspection cost, maintenance cost and failure cost. This maintenance model is best suited for long-term maintenance planning. All simulations are carried out in MATLAB, and the use of the analysis results to achieve optimal maintenance schedules is discussed.
A new methodology is proposed to convert data from the control circuit of a breaker into the condition of the breaker by defining several performance indices for breaker assemblies. Control circuit signal timings are extracted and a probability distribution is fitted to each timing parameter. Performance indices for various assemblies, such as the trip coil, close coil and auxiliary contacts, are defined based on the probability distributions. These indices are updated using a Bayesian approach as new data arrive. This process can be made practical by approximating the Bayesian approach so that the indices are calculated on-line. The quantification of maintenance is achieved by computing the indices after a maintenance action and comparing them with the previously estimated ones.
A risk-based decision approach to maintenance planning is proposed based on the new methodology developed for maintenance quantification. A list of events is identified for the test system under consideration, and event probability, event consequence, and hence the risk associated with each event is computed. Optimal maintenance decisions are taken based on the computed risk levels for each event.
Two case studies are presented to evaluate the performance of the proposed new methodology for maintenance quantification. The risk-based decision approach is tested on the IEEE Reliability Test System. All simulations are carried out in MATLAB, and the results are discussed.
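The risk computation described above, where each event's risk combines its probability and its consequence, can be illustrated with a short Python sketch. It assumes risk is the product of probability and consequence, a common convention that the abstract implies but does not spell out. The event names, probabilities and costs are hypothetical and do not come from the IEEE Reliability Test System study.

```python
def rank_events_by_risk(events):
    """Return (event name, risk) pairs sorted from highest to lowest risk.

    Each event is (name, probability, consequence); risk is taken to be
    probability * consequence, which is an assumption of this sketch.
    """
    scored = [(name, probability * consequence)
              for name, probability, consequence in events]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical events: (name, probability, consequence in cost units).
events = [
    ("breaker fails to open on fault", 0.02, 500_000.0),
    ("breaker slow to close", 0.10, 40_000.0),
    ("breaker fails to close", 0.005, 1_000_000.0),
]

ranking = rank_events_by_risk(events)
```

Maintenance effort would then be directed first at the breakers involved in the events at the top of the ranking.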
|
660 |
Bayesian Unit Root Test - Application for Exchange Rate Market
Liao, Siang-kai 24 June 2008
Professional analysts ought to present more interpretations derived from data.
Empirical rules and prior knowledge help when making statistical inferences in econometrics.
Classical statistical approaches base their judgments solely on historical data.
The advantage of such analysis is its objectivity, but it has a serious drawback: it ignores additional information that is logically relevant.
This paper applies the Bayesian approach, whose essential characteristic is that it admits subjective views, and uses empirical rules to study unit root tests on the exchange rate market.
Furthermore, the distribution of the data can directly affect classical statistical inference, such as the Dickey-Fuller and Phillips-Perron tests. To account for these defects, this paper does not take the assumption of normally distributed disturbances for granted.
For instance, heavy-tailed distributions are common in time series data on stocks and other investment targets. Hence, we apply a more general model to study the Bayesian unit root test.
The aim of this paper is to revise the unit root test using the model of Schotman and Van Dijk (1991) with disturbances assumed to follow independent Student-t distributions, and then to apply it to the exchange rate market.
|