231

Penalized mixed-effects ordinal response models for high-dimensional genomic data in twins and families

Gentry, Amanda E. 01 January 2018
The Brisbane Longitudinal Twin Study (BLTS) is being conducted in Australia and is funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed to analyze a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects). The primary goal was to determine what combination of SNPs and additional covariates may predict cannabis use, measured on an ordinal scale as: “never tried,” “used moderately,” or “used frequently”. To conduct this analysis, we extended the ordinal Generalized Monotone Incremental Forward Stagewise (GMIFS) method for mixed models. This extension allows an unpenalized set of covariates to be coerced into the model, as well as flexibility for user-specified correlation patterns between twins in a family. The proposed methods are applicable to high-dimensional (genomic or otherwise) data with an ordinal response and a specific, known covariance structure within clusters.
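The cumulative-logit (proportional-odds) layer underlying such ordinal models can be sketched in a few lines. This is only the basic likelihood component, not the penalized GMIFS extension or the family random effects the dissertation develops; the thresholds and linear predictor below are illustrative values, not fitted ones:

```python
import math

def cumulative_logit_probs(eta, thresholds):
    """Proportional-odds model: P(Y <= k) = logistic(alpha_k - eta).
    Returns the probability of each ordinal category for one subject."""
    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))
    cum = [logistic(a - eta) for a in thresholds] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Three categories ("never tried" < "used moderately" < "used frequently")
# need two thresholds; eta = x'beta is the subject's linear predictor.
probs = cumulative_logit_probs(eta=0.5, thresholds=[-1.0, 1.0])
```

The GMIFS idea then grows the coefficients in eta by tiny monotone increments, one predictor at a time, which is what makes the method usable when the SNP count far exceeds the sample size.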
232

On the Performance of some Poisson Ridge Regression Estimators

Zaldivar, Cynthia 28 March 2018
Multiple regression models play an important role in analyzing and making predictions about data. Prediction accuracy suffers when two or more explanatory variables in the model are highly correlated. One solution is ridge regression. The purpose of this thesis is to study the performance of available ridge regression estimators for Poisson regression models in the presence of moderately to highly correlated explanatory variables. As performance criteria we use mean square error (MSE), mean absolute percentage error (MAPE), and the percentage of times the maximum likelihood (ML) estimator produces a higher MSE than the ridge regression estimator. A Monte Carlo simulation study was conducted to compare the performance of the estimators under three experimental conditions: correlation, sample size, and intercept. The simulation results show that all ridge estimators performed better than the ML estimator. Based on these results, we propose new estimators, which performed very well compared to the original ones. Finally, the estimators are illustrated using data on recreational habits.
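As a rough illustration of the ridge idea for Poisson regression (not the specific estimators compared in the thesis), a penalized iteratively reweighted least squares (IRLS) fit can be sketched as below. The ridge constant k, the simulated design, and the fixed iteration count are all arbitrary choices for the sketch:

```python
import numpy as np

def poisson_ridge_irls(X, y, k=1.0, n_iter=50):
    """Fit a Poisson log-linear model with a ridge penalty k via IRLS.
    Penalized update: beta <- (X'WX + kI)^{-1} X'W z, where W = diag(mu)
    and z is the working response. For simplicity the intercept is
    penalized too, which a careful implementation would avoid."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        W = mu                          # Poisson: Var(y) = mu
        z = X @ beta + (y - mu) / mu    # working response
        A = X.T @ (W[:, None] * X) + k * np.eye(p)
        beta = np.linalg.solve(A, X.T @ (W * z))
    return beta

# Two highly correlated predictors, as in the simulation study.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)   # correlation near 1
X = np.column_stack([np.ones(200), x1, x2])
y = rng.poisson(np.exp(0.5 + 0.3 * x1 + 0.3 * x2))
beta_ridge = poisson_ridge_irls(X, y, k=0.5)
```

With k = 0 this reduces to ordinary ML fitting, whose variance blows up under near-collinearity; the k·I term keeps the weighted normal-equations matrix well conditioned at the cost of some bias, which is the MSE trade-off the thesis studies.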
233

Geographic Factors of Residential Burglaries - A Case Study in Nashville, Tennessee

Hall, Jonathan A. 01 November 2010
This study examines geographic patterns and geographic factors of residential burglary in the Nashville, TN area over a twenty-year period at five-year intervals starting in 1988. The purpose of this study is to identify which geographic factors have influenced residential burglary rates, and whether the geographic patterns of residential burglary changed over the study period. Several criminological theories guide this study, the most prominent being Social Disorganization Theory and Routine Activities Theory; both focus on the relationships of place and crime. A number of spatial analysis methods are therefore adopted to analyze residential burglary rates at the block-group level for each of the study years. Spatial autocorrelation approaches, particularly the Global and Local Moran's I statistics, are utilized to detect hotspots of residential burglary. To understand the underlying geographic factors of residential burglary, both OLS and GWR regression analyses are conducted to examine the relationships between residential burglary rates and various geographic factors, such as Percentages of Minorities, Singles, Vacant Housing Units, Renter Occupied Housing Units, and Persons below Poverty Line. The findings indicate that residential burglaries exhibit clustered patterns, forming various hotspots around the study area, especially in the central city, and that these hotspots tended to move in a northeasterly direction over the 1988-2008 study period. Overall, four of the five geographic factors under examination show positive correlations with the rate of residential burglary at the block-group level. Percentages of Vacant Housing Units and Persons below Poverty Line (both indicators of neighborhood economic well-being) are strong indicators of crime, while Percentages of Minorities (an ethnic-heterogeneity indicator) and Renter Occupied Housing Units (a residential-turnover indicator) show only modest correlations.
Counter-intuitively, Percentage of Singles (another indicator of residential turnover) is in fact a deterrent to residential burglary; however, the reason for this deterrence is not entirely clear.
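The Global Moran's I statistic used to detect such clustering has a compact closed form; a minimal sketch on a toy set of block groups (the weights matrix and burglary rates below are made up for illustration):

```python
import numpy as np

def morans_i(values, W):
    """Global Moran's I for a vector of rates and a spatial weights
    matrix W (W[i, j] > 0 when areas i and j are neighbours).
    I = (n / S0) * (z' W z) / (z' z), with z the deviations from the mean;
    values near +1 indicate clustering, near -1 dispersion."""
    z = values - values.mean()
    n = len(values)
    S0 = W.sum()
    return (n / S0) * (z @ W @ z) / (z @ z)

# Four block groups in a row; rook-contiguity weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rates = np.array([10.0, 9.0, 2.0, 1.0])   # high rates cluster at one end
I = morans_i(rates, W)                     # positive -> spatial clustering
```

The Local Moran's I the study also uses decomposes this global value into one term per block group, which is what lets individual hotspots be mapped.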
234

Computational identification of genes: ab initio and comparative approaches

Parra Farré, Genís 03 December 2004
The work presented here studies the recognition of the signals that delimit and define protein-coding genes, as well as their applicability in gene prediction programs. This thesis also explores the use of comparative genomics to improve gene identification in several species simultaneously, and describes the development of two computational gene prediction programs: geneid and sgp2. geneid identifies the genes encoded in an anonymous DNA sequence based on their intrinsic properties (mainly splicing signals and differential codon usage). sgp2 uses the comparison between two genomes, which must lie at a certain optimal evolutionary distance, to improve gene prediction, under the hypothesis that coding regions are more conserved than regions that do not code for proteins. / The motivation of this thesis is to give some insight into how genes are encoded and recognized by the cell machinery and to use this information to find genes in unannotated genomic sequences. One objective is the development of tools to identify eukaryotic genes through the modeling and recognition of their intrinsic signals and properties. This thesis also addresses another problem: how the sequences of related genomes can contribute to the identification of genes. The value of comparative genomics is illustrated by the sequencing of the mouse genome for the purpose of annotating the human genome. Comparative gene prediction programs exploit these data under the assumption that regions conserved between related species correspond to functional regions (coding genes among them). Thus, this thesis also describes a gene prediction program that combines ab initio gene prediction with comparative information between two genomes to improve the accuracy of the predictions.
235

Conditional Streamflow Probabilities

Roefs, T. G., Clainos, D. M. 23 April 1971
From the Proceedings of the 1971 Meetings of the Arizona Section - American Water Resources Assn. and the Hydrology Section - Arizona Academy of Science - April 22-23, 1971, Tempe, Arizona / Streamflows of monthly or shorter time periods are, in most parts of the world, conditionally dependent. In studies of planning, commitment, and operation decisions concerning reservoirs, it is probably most computationally efficient to use simulation routines for decisions of low dimension, such as planning and commitment, and optimization routines for the highly dimensional operating-rule decisions. This presents the major problem of combining the 2 routines, since streamflow dependencies in simulation routines are continuous while those in direct stochastic optimization routines are discrete. A stochastic streamflow synthesis routine is described consisting of 2 parts: a streamflow probability distribution and dependency analysis, and a streamflow generation using the relationships developed. A discrete dependency matrix between streamflow amounts was then sought. Setting as the limits of interest the class 400-500 thousand acre-ft in January and 500-600 thousand acre-ft in February, and using the transforms specified, the appropriate normal deviates were determined. The next serious problem was calculating the conditional dependency based on the bivariate normal distribution. Calculating the joint probability exactly would require double integrations, which use too much computer time. For the problem addressed, therefore, the use of 1-dimensional conditional probabilities based on the flow-interval midpoint is an adequate and effective procedure.
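The 1-dimensional shortcut the paper settles on follows from the standard conditional distribution of a bivariate normal. A sketch, assuming the flows have already been transformed to a normal scale and using hypothetical means, standard deviations, and correlation:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function (stdlib only)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def conditional_interval_prob(x_mid, a, b, mu_x, mu_y, sd_x, sd_y, rho):
    """P(a < Y < b | X = x_mid) for bivariate-normal (X, Y):
    Y | X = x  ~  N(mu_y + rho*(sd_y/sd_x)*(x - mu_x), sd_y^2*(1 - rho^2)).
    Conditioning on the interval midpoint x_mid replaces the costly
    double integration the paper describes with two 1-D CDF lookups."""
    m = mu_y + rho * (sd_y / sd_x) * (x_mid - mu_x)
    s = sd_y * math.sqrt(1.0 - rho ** 2)
    return normal_cdf((b - m) / s) - normal_cdf((a - m) / s)

# Hypothetical numbers: P(500 < Feb flow < 600 | Jan flow = 450),
# flows in thousands of acre-feet, treated as jointly normal.
p = conditional_interval_prob(450, 500, 600, mu_x=400, mu_y=450,
                              sd_x=120, sd_y=130, rho=0.7)
```

Each cell of the discrete dependency matrix is one such conditional probability, evaluated at the midpoint of the conditioning flow class.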
236

A Solution to Small Sample Bias in Flood Estimation

Metler, William 06 May 1972
From the Proceedings of the 1972 Meetings of the Arizona Section - American Water Resources Assn. and the Hydrology Section - Arizona Academy of Science - May 5-6, 1972, Prescott, Arizona / In order to design culverts and bridges, it is necessary to compute an estimate of the design flood. Regionalization of flows by regression analysis is currently the method advocated by the U.S. Geological Survey to provide an estimate of the culvert and bridge design floods. In the regression analysis a set of simultaneous equations is solved for the regression coefficients, which are then used to compute a design-flood prediction for a construction site. The dependent variables in the set of simultaneous equations are the historical estimates of the design flood computed from the historical records of gaged sites in a region. If a log-normal distribution of the annual peak flows is assumed, then the historical estimate of the design flood for site i may be computed from the normal distribution as log Q(d,i) = x̄(i) + k(d) s(i). However, because of the relatively small samples of peak flows commonly used in this problem, this paper shows that the historical estimate should instead be computed as log Q(d,i) = x̄(i) + t(d,n-1) √((n+1)/n) s(i), where t(d,n-1) is obtained from tables of Student's t. This t-estimate, when used as input to the regression analysis, provides a more realistic prediction in light of the small sample size than the estimate yielded by the normal.
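The two estimates can be written out directly. The record length, log-mean, and log-standard deviation below are illustrative, with the 99th-percentile quantiles k(0.99) ≈ 2.326 and t(0.99, 9) ≈ 2.821 taken from standard normal and Student's t tables:

```python
import math

def design_flood_estimates(xbar, s, n, k_normal, t_quantile):
    """Historical design-flood estimates for a gaged site, on the log scale.
    Normal estimate:  log Q = xbar + k_d * s
    t estimate:       log Q = xbar + t_{d,n-1} * sqrt((n+1)/n) * s
    The t version inflates the estimate to reflect the extra uncertainty
    in xbar and s when the peak-flow record is short."""
    normal_est = xbar + k_normal * s
    t_est = xbar + t_quantile * math.sqrt((n + 1) / n) * s
    return normal_est, t_est

# Hypothetical 10-year record of log annual peaks.
normal_est, t_est = design_flood_estimates(xbar=3.0, s=0.25, n=10,
                                           k_normal=2.326, t_quantile=2.821)
```

As the record length n grows, t(d,n-1) approaches k(d) and √((n+1)/n) approaches 1, so the two estimates converge; the correction matters only for the short records the paper is concerned with.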
237

The Arizona Water Commission's Central Arizona Project Water Allocation Model System

Briggs, Philip C. 16 April 1977
From the Proceedings of the 1977 Meetings of the Arizona Section - American Water Resources Assn. and the Hydrology Section - Arizona Academy of Science - April 15-16, 1977, Las Vegas, Nevada / The purpose and operation of the Central Arizona Project water allocation model system are described, based on a systems-analysis approach developed over the past 30 years into an interdisciplinary science for the study and resolution of complex technical management problems. The system utilizes mathematical and other simulation models designed for computer operation to effectively solve such problems as the CAP faces, including those concerned with social and economic considerations. The model is composed of two major components: (1) a linear program designed to determine the optimal allocation of all sources of water to all demands, and (2) a hydrologic simulator capable of reflecting the impact of distribution alternatives on the per-unit cost of delivery. The model, currently in use, has substantially contributed to a greater understanding of water usage potential in Arizona.
238

Statistical Models and Methods for Rivers in the Southwest

Hagan, Robert M. 16 April 1977
From the Proceedings of the 1977 Meetings of the Arizona Section - American Water Resources Assn. and the Hydrology Section - Arizona Academy of Science - April 15-16, 1977, Las Vegas, Nevada / Riverflow modeling is believed useful for purposes of decision making with respect to reservoir control, irrigation planning, and flood forecasting and design of structures to contain floods. This author holds the view that present riverflow models in vogue are unsatisfactory because, for one thing, sample simulations according to these models do not resemble observed southwestern river records. The purpose of this paper is to outline a general Markov model which assumes only that rivers have a finite memory. We show how to calibrate the model from river records and then present evidence to support our contention that some success has been realized in mimicking typical flows by our simulation procedure.
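A minimal version of the finite-memory idea — a first-order Markov chain calibrated from a discretized river record and then used to simulate flows — might look like the sketch below. The record and class count are made up, and the paper's model is more general (longer memories, calibrated to southwestern records):

```python
import random
from collections import defaultdict

def fit_transition_probs(flows, n_classes=3):
    """Estimate a first-order (memory-one) Markov chain from a river
    record: discretize flows into equal-width classes, count observed
    class-to-class transitions, and row-normalize the counts."""
    lo, hi = min(flows), max(flows)
    width = (hi - lo) / n_classes or 1.0
    classes = [min(int((f - lo) / width), n_classes - 1) for f in flows]
    counts = defaultdict(lambda: [0] * n_classes)
    for a, b in zip(classes, classes[1:]):
        counts[a][b] += 1
    # Assumes every class occurs as a source state in the record.
    return {a: [c / sum(row) for c in row] for a, row in counts.items()}

def simulate(P, start, steps, rng):
    """Generate a synthetic class sequence from the fitted chain."""
    state, path = start, [start]
    for _ in range(steps):
        state = rng.choices(range(len(P[state])), weights=P[state])[0]
        path.append(state)
    return path

record = [5, 7, 20, 22, 6, 4, 18, 25, 30, 8, 6, 21, 19, 5]  # made-up flows
P = fit_transition_probs(record)
path = simulate(P, start=0, steps=10, rng=random.Random(0))
```

Whether such simulations resemble observed records — the paper's complaint about models then in vogue — can then be checked by comparing run lengths and transition frequencies of the synthetic and historical sequences.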
239

A Utility Criterion for Real-time Reservoir Operation

Duckstein, Lucien, Krzysztofowicz, Roman 16 April 1977
From the Proceedings of the 1977 Meetings of the Arizona Section - American Water Resources Assn. and the Hydrology Section - Arizona Academy of Science - April 15-16, 1977, Las Vegas, Nevada / A dual purpose reservoir control problem can logically be modelled as a game against nature. The first purpose of the reservoir is flood control under uncertain inflow, which corresponds to short -range operation (SRO); the second purpose, which the present model imbeds into the first one, is water supply after the flood has receded, and corresponds to long-range operation (LRO). The reservoir manager makes release decisions based on his SRO risk. The trade-offs involved in his decision are described by a utility function, which is constructed within the framework of Keeney's multiattribute utility theory. The underlying assumptions appear to be quite natural for the reservoir control problem. To test the model, an experiment assessing the utility criterion of individuals has been performed; the results tend to confirm the plausibility of the approach. In particular, most individuals appear to have a risk-averse attitude for small floods and a risk-taking attitude for large ones.
240

Vers l'intégration de post-éditions d'utilisateurs pour améliorer les systèmes de traduction automatiques probabilistes / Towards the integration of users' post-editions to improve phrase-based machine translation systems

Potet, Marion 09 April 2013
Existing machine translation technologies are now seen as a promising approach to help produce translations efficiently and at reduced cost. However, the current state of the art does not yet allow full automation of the process, and human/machine cooperation remains essential to produce quality results. A common practice is to post-edit the results provided by the system, that is, to manually check and, where necessary, correct its erroneous outputs. This post-editing work performed by users on machine translation results is a valuable source of data for analyzing and adapting systems. Our work addresses the problem of developing an approach able to take advantage of this user feedback (the post-editions) to improve, in turn, machine translation systems. The experiments exploit a corpus of about 10,000 translation hypotheses from a reference statistical system, post-edited by volunteers through an online platform. The results of the first experiments, which integrated the post-editions into the translation model on the one hand and applied statistical automatic post-editing on the other, allowed us to assess the complexity of the task. A more detailed study of statistical post-editing systems allowed us to evaluate their usability as well as the benefits and limits of the approach. We also show that the collected post-editions can be used successfully to estimate the confidence to be placed in a machine translation result. Our results show the difficulty, but also the potential, of using user post-editions of machine translation hypotheses as a source of information to improve the quality of current statistical systems. 
/ Nowadays, machine translation technologies are seen as a promising approach to help produce low-cost translations. However, the current state of the art does not allow full automation of the process, and human intervention remains essential to produce high-quality results. To ensure translation quality, a system's results are commonly post-edited: the outputs are manually checked and, if necessary, corrected by the user. This post-editing work can be a valuable source of data for system analysis and improvement. Our work focuses on developing an approach able to take advantage of this user feedback to improve and update a statistical machine translation (SMT) system. The experiments exploit a corpus of about 10,000 SMT translation hypotheses post-edited by volunteers through a crowdsourcing platform. The first experiments, which integrated the post-editions into the translation model on the one hand and applied automatic post-editing to the system outputs on the other, allowed us to evaluate the complexity of the task. A more detailed study of statistical automatic post-editing systems evaluates their usability as well as the benefits and limitations of the approach. We also show that the collected post-editions can be used successfully to estimate the confidence of a given machine translation result. The results obtained show that using post-editions of machine translation hypotheses as a source of information is a difficult but promising way to improve the quality of current statistical systems.
