Global ETD Search

651	Open source software maturity model based on linear regression and Bayesian analysis Zhang, Dongmin 15 May 2009 Open Source Software (OSS) is widely used and is becoming a significant and irreplaceable part of the software engineering community. Today a huge number of OSS projects exist. This becomes a problem when one needs to choose from such a large pool of OSS candidates in the same category. An OSS maturity model that facilitates software assessment and helps users make a decision is needed. A few maturity models have been proposed in the past. However, the parameters in these models are assigned based not on experimental data but on human experience, feelings and judgment. These models are subjective and can provide only limited guidance for users at best. This dissertation proposes a quantitative and objective model built from a statistical perspective. In this model, seven metrics are chosen as criteria for OSS evaluation. A linear multiple-regression model is created to assign a final score based on these seven metrics. This final score provides a convenient and objective way for users to make a decision. The coefficients in the linear multiple-regression model are calculated from 43 OSS projects. From the statistical perspective, these coefficients are considered random variables. The joint distribution of the coefficients is discussed based on Bayesian statistics. More importantly, an updating rule is established through Bayesian analysis to improve the joint distribution, and thus the objectivity of the coefficients in the linear multiple-regression model, as new data arrive. The updating rule gives the model the ability to learn and improve itself continually. Open Source Software Maturity Model Regression Bayesian Analysis
652	Probabilistic Analysis of the Compressibility of Soils Jung, Byoung C. May 2009 Geotechnical engineers are always faced with uncertainties and spatial variations in material parameters. In this work, we propose a framework that accounts for different types of uncertainty in a formal and logical manner, incorporates all available sources of information, and integrates the uncertainty into an estimate of the probability. Soil classification is an essential component of the geotechnical design process. As a cheaper and faster alternative to sample retrieval and testing, field methods such as the cone penetration test (CPT) can be used, but current soil classification charts based on CPT data may not provide an accurate prediction of soil type. A probabilistic soil classification approach is proposed here to improve soil classification based on CPT. The proposed approach provides a simple and straightforward tool for updating the soil classification charts with site-specific data. In general, settlements can be the result of surface loads or variable soil deposits. In current practice, the analysis to determine settlements is deterministic. It assumes that the soil profile at a site is uniform from location to location, and allows only limited consideration of the variations in material properties and initial conditions within soil layers, in spite of the wide range of compositions, gradations, and water contents in natural soils. A Bayesian methodology is used to develop an unbiased probabilistic model that accurately predicts the settlements and accounts for all the prevailing uncertainties. The proposed probabilistic model is used to estimate the settlements of the foundation of a structure in the Venice Lagoon, Italy. The conditional probability (fragility) of exceeding a specified settlement threshold for a given vertical pressure is estimated. 
A predictive fragility and confidence intervals are developed with special attention given to the treatment and quantification of aleatory and epistemic uncertainties. Sensitivity and importance measures are computed to identify the key parameters and random variables in the model. Bayesian soil classification compressibility of soils statistical analysis settlement
653	Bayesian Nonparametric Methods for Protein Structure Prediction Lennox, Kristin Patricia August 2010 The protein structure prediction problem consists of determining a protein’s three-dimensional structure from the underlying sequence of amino acids. A standard approach for predicting such structures is to conduct a stochastic search of conformation space in an attempt to find a conformation that optimizes a scoring function. For one subclass of prediction protocols, called template-based modeling, a new protein is suspected to be structurally similar to other proteins with known structure. The solved related proteins may be used to guide the search of protein structure space. There are many potential applications for statistics in this area, ranging from the development of structure scores to improving search algorithms. This dissertation focuses on strategies for improving structure predictions by incorporating information about closely related “template” protein structures into searches of protein conformation space. This is accomplished by generating density estimates on conformation space via various simplifications of structure models. By concentrating a search for good structure conformations in areas that are inhabited by similar proteins, we improve the efficiency of our search and increase the chances of finding a low-energy structure. In the course of addressing this structural biology problem, we present a number of advances to the field of Bayesian nonparametric density estimation. We first develop a method for density estimation with bivariate angular data that has applications to characterizing protein backbone conformation space. We then extend this model to account for multiple angle pairs, thereby addressing the problem of modeling protein regions instead of single sequence positions. 
In the course of this analysis we incorporate an informative prior into our nonparametric density estimate and find that this significantly improves performance for protein loop prediction. The final piece of our structure prediction strategy is to connect side-chain locations to our torsion angle representation of the protein backbone. We accomplish this by using a Bayesian nonparametric model for dependence that can link together two or more multivariate marginal distributions. In addition to its application for our angular-linear data distribution, this dependence model can serve as an alternative to nonparametric copula methods. Bayesian statistics Nonparametric statistics Density estimation Angular data
654	The empirical study of applying Technical Analysis on DJI, HSI and Taiwan Stock Market Ieong, KuongCheong 20 June 2007 The stock market has always played a central role in modern capital markets, and stocks are becoming one of the most popular investment tools. Because of the globalization of capital markets, capital moves faster and more easily. The development of capital markets has attracted the interest of scholars, and the field of stock market prediction attracts researchers from different backgrounds. There are two approaches to predicting the stock market: fundamental analysis and technical analysis. The purpose of this work was to predict three stock market indices, namely the Taiwan Weighted Index (IDXWT), the Hong Kong Hang Seng Index (HSI) and the Dow Jones Industrial Average (DJI), using technical analysis and a Dynamic Bayesian Network (DBN). This thesis is based on Wang’s thesis [Wan05], “Investment Decision Support with Dynamic Bayesian Networks”. Because the three markets have different characteristics, we treat them in three separate experiments. For each market, we aim to find the best indicators and trading signals. The first experiment uses the Taiwan Weighted Index as the prediction target, the second uses the Hong Kong Hang Seng Index, and the third uses the Dow Jones Industrial Average. For the Taiwan stock market, both the 15-day and 20-day Moving Average (MA) yield higher returns than buy-and-hold, RSI_6 and KD. We reach the same conclusion for the Hang Seng Index and the Dow Jones Industrial Average. The best returns from the 15-day MA and 20-day MA for the Taiwan stock market are 47.95% and 60.21%, respectively. Moreover, the best result for the Hang Seng Index is 60.06% over 4 years, and the best result for the Dow Jones Industrial Average is 25.83%. All of the best results yield higher returns than their respective buy-and-hold, RSI_6 and KD benchmarks. 
In conclusion, this work can provide direction to investors who use these technical indicators to predict these particular stock markets. stock market prediction technical analysis dynamic bayesian network data mining
655	Constructing Bayesian Networks with Sequential Patterns for Hemodialysis Wang, Woei-Ru 05 August 2002 In this thesis, I introduce a multivariate discretization algorithm to discretize the continuous variables of hemodialysis clinical pathways and use a clustering algorithm to shift time stamps, reducing the number of nodes in the Bayesian networks. The generalized sequential patterns algorithm is used to find possible patterns that have a far-reaching effect on subsequent nodes of the hemodialysis Bayesian networks. A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest and readily incorporates new instances to keep rules up to date. Bayesian networks are used to represent knowledge of frequent state transitions in medical logs. Bayesian networks and sequential pattern algorithms can only handle discrete or categorical data. Therefore, we have to discretize the continuous variables with a suitable technique to generalize the nodes, and shift the time stamps of nodes to reduce variation in time. With these generalizations, we mitigate the over-fitting of the hemodialysis Bayesian networks. We expect that the discovered patterns can give more information to medical professionals and help them build the reciprocal cycle of knowledge management for hemodialysis. knowledge management data mining Bayesian network Hemodialysis clustering sequential pattern
656	Bayesian model-based approaches with MCMC computation to some bioinformatics problems Bae, Kyounghwa 29 August 2005 Bioinformatics applications can address the transfer of information at several stages of the central dogma of molecular biology, including transcription and translation. This dissertation focuses on using Bayesian models to interpret biological data in bioinformatics, with Markov chain Monte Carlo (MCMC) as the inference method. First, we use our approach to interpret data at the transcription level. We propose a two-level hierarchical Bayesian model for variable selection on cDNA microarray data. A cDNA microarray quantifies the mRNA levels of thousands of genes simultaneously in one sample. By observing the expression patterns of genes under various treatment conditions, important clues about gene function can be obtained. We consider a multivariate Bayesian regression model and assign priors that favor sparseness in terms of the number of variables (genes) used. We introduce the use of different priors to promote different degrees of sparseness using a unified two-level hierarchical Bayesian model. Second, we apply our method to a problem related to the translation level. We develop hidden Markov models to model linker/non-linker sequence regions in a protein sequence. We use a linker index to exploit differences in amino acid composition between regions from sequence information alone. A goal of protein structure prediction is to take an amino acid sequence (represented as a sequence of letters) and predict its tertiary structure. The identification of linker regions in a protein sequence is valuable in predicting the three-dimensional structure. Because of the complexity of both models encountered in practice, we employ MCMC, particularly Gibbs sampling (Gelfand and Smith, 1990), for parameter estimation.
657	Ecosystem health at the Texas coastal bend: a spatial analysis of exposure and response Bissett, Wesley Thurlow, Jr. 10 October 2008 This dissertation investigated locational risks to ecosystem health associated with proximity to industrial complexes. The study was performed at the behest of ranchers and citizens living and working down-prevailing wind from the Formosa Plastics, Inc. and ALCOA facilities located in Calhoun County, Texas. Concerns expressed were for potential genotoxicity resulting from exposure to complex chemical mixtures released by the facilities. Exposure assessment of the marine environment was performed by analyzing sediments and oysters from Lavaca Bay. Numerous chemicals were found to be present at concentrations considered likely to result in adverse responses in exposed populations. Bayesian geostatistical analysis was performed to determine if the concentrations were affected by a spatial process. Mercury and polycyclic aromatic hydrocarbons were the most notable of the chemicals found to be present at elevated concentrations and affected by a spatial process. Evaluation of maps generated from spatial modeling revealed that proximity to ALCOA resulted in elevated risks for exposure to harmful concentrations of pollutants. Genotoxicity was measured in two sentinel species. Oysters (Crassostrea virginica) were utilized for evaluation of the marine environment and cattle (Bos taurus and Bos taurus crossbred cattle) were chosen for evaluation of the terrestrial environment. Chromosomal aberration analysis was performed on oyster hematocytes. Analysis of the results failed to demonstrate the presence of an important generalized spatial process, but some specific locations close to the ALCOA plant had elevations in this measure of genotoxicity. Stress, as measured by the lysosomal destabilization assay, was also evaluated in oyster hematocytes. 
These results were found to be affected by a significant spatial process, with the highest degree of destabilization occurring in close proximity to ALCOA. Genotoxicity in cattle was evaluated with the single cell gel electrophoresis assay and chromosomal aberration analysis. Bayesian geostatistical analysis revealed the presence of important spatial processes. DNA-protein cross-linkage was the most notable, with a strong indication of increased damage down-prevailing wind from the industrial complexes. Results indicated that proximity to industrial facilities increased the risk for harmful exposures, genotoxicity, and lysosomal destabilization. spatial analysis genotoxicity Bayesian analysis sentinel species biomarkers
658	Robust manufacturing system design using petri nets and bayesian methods Sharda, Bikram 10 October 2008 Manufacturing system design decisions are costly and involve significant investment in terms of allocation of resources. These decisions are complex, due to uncertainties related to uncontrollable factors such as processing times and part demands. Designers often need to find a robust manufacturing system design that meets certain objectives under these uncertainties. Failure to find a robust design can lead to expensive consequences in terms of lost sales and high production costs. In order to find a robust design configuration, designers need accurate methods to model various uncertainties and efficient ways to search for feasible configurations. The dissertation work uses a multi-objective Genetic Algorithm (GA) and a Petri net based modeling framework for robust manufacturing system design. The Petri nets are coupled with Bayesian Model Averaging (BMA) to capture uncertainties associated with uncontrollable factors. BMA provides a unified framework to capture model, parameter and stochastic uncertainties associated with the representation of various manufacturing activities. The BMA based approach overcomes limitations associated with uncertainty representation using classical methods presented in the literature. Petri net based modeling is used to capture interactions among various subsystems and operation precedence, and to identify bottleneck or conflicting situations. When coupled with Bayesian methods, Petri nets provide accurate assessment of manufacturing system dynamics and performance in the presence of uncertainties. The multi-objective GA is used to search manufacturing system designs, allowing designers to consider multiple objectives. The dissertation work provides algorithms for integrating Bayesian methods with Petri nets. Two manufacturing system design examples are presented to demonstrate the proposed approach. 
The results obtained using Bayesian methods are compared with classical methods and the effect of choosing different types of priors is evaluated. In summary, the dissertation provides a new, integrated Petri net based modeling framework coupled with a BMA based approach for modeling and performance analysis of manufacturing system designs. The dissertation work allows designers to obtain accurate performance estimates of design configurations by considering model, parameter and stochastic uncertainties associated with the representation of uncontrollable factors. A multi-objective GA coupled with Petri nets provides a flexible and time-saving approach for searching and evaluating alternative manufacturing system designs. Petri nets manufacturing systems Bayesian methods Robust design
659	Risk Based Maintenance Optimization using Probabilistic Maintenance Quantification Models of Circuit Breaker Natti, Satish 14 January 2010 New maintenance techniques for circuit breakers are studied in this dissertation by proposing a probabilistic maintenance model and a new methodology to assess circuit breaker condition utilizing its control circuit data. A risk-based decision approach that makes use of the new methodology is proposed at the system level for optimizing maintenance schedules and the allocation of resources. This dissertation is focused on developing optimal maintenance strategies for circuit breakers, both at the component and the system level. A probabilistic maintenance model is proposed using an approach similar to one recently introduced for power transformers. Probabilistic models give better insight into the interplay among monitoring techniques, failure modes and maintenance techniques of the component. The model is based on the concept of representing the component lifetime by several deterioration stages. Inspection and maintenance are introduced at each stage and model parameters are defined. A sensitivity analysis is carried out to understand the importance of model parameters in obtaining optimal maintenance strategies. The analysis covers the effect of the inspection rate calculated for each stage and its impact on failure probability, inspection cost, maintenance cost and failure cost. This maintenance model is best suited for long-term maintenance planning. All simulations are carried out in MATLAB, and the use of the analysis results to achieve optimal maintenance schedules is discussed. A new methodology is proposed to translate data from the control circuit of a breaker into an assessment of the breaker's condition by defining several performance indices for breaker assemblies. Control circuit signal timings are extracted and a probability distribution is fitted to each timing parameter. 
Performance indices for various assemblies, such as the trip coil, close coil, and auxiliary contacts, are defined based on the probability distributions. These indices are updated using a Bayesian approach as new data arrive. This process can be made practical by approximating the Bayesian approach so that the indices are calculated on-line. The quantification of maintenance is achieved by computing the indices after a maintenance action and comparing them with the previously estimated ones. A risk-based decision approach to maintenance planning is proposed based on the new methodology developed for maintenance quantification. A list of events is identified for the test system under consideration, and the event probability, the event consequence, and hence the risk associated with each event are computed. Optimal maintenance decisions are taken based on the computed risk levels for each event. Two case studies are presented to evaluate the performance of the proposed methodology for maintenance quantification. The risk-based decision approach is tested on the IEEE Reliability Test System. All simulations are carried out in MATLAB and the results are discussed.
660	Bayesian Unit Root Test – Application for Exchange Rate Market Liao, Siang-kai 24 June 2008 Professional analysts should offer more interpretations derived from data. Empirical rules and knowledge help in making statistical inferences in econometrics. Classical statistical analysis draws its conclusions solely from historical data. The advantage of this approach is its objectivity, but it has a serious drawback: it ignores logically relevant extra information. This paper applies Bayesian methods, whose essential characteristic is the acceptance of a subjective outlook, and uses empirical rules to study unit root tests on the exchange rate market. Furthermore, the distribution of the data may directly affect classical statistical inference, such as the Dickey-Fuller and Phillips-Perron tests. To take these defects into consideration, this paper does not take for granted the assumption that disturbances are normally distributed. For instance, heavy-tailed distributions are quite common in time series data related to stocks and other investment targets. Hence, we apply a more general model to study the Bayesian unit root test. We use the model of Schotman and Van Dijk (1991), assume that the disturbances follow independent Student-t distributions to revise the unit root test, and then apply it to the exchange rate market. This is the motivation of this paper. t Distribution Empirical Rule Unit Root Bayesian Exchange Rate Market
