661. Analysis of repayment ability for agricultural loans in Virginia using a qualitative choice model
Park, William N. January 1986
Agricultural loans issued to farmers in Virginia from 1980 to 1985 are examined to determine the factors that significantly predict repayment ability. Through a review of the literature, extension meetings, conferences, and informal conversations with agricultural lenders in the state, a list of financial variables and operation characteristics is compiled and analyzed. Results of the analysis are considered in terms of their immediate and potential assistance to lenders in making loan decisions.
Using data from various commercial banks, Production Credit Associations, and Farmers Home Administration offices throughout Virginia, a model is developed to determine the repayment ability of a borrower. Results indicate that several factors are significant in this determination. Financial ratios such as percent equity, current debt, cash flow I, and the cash expense-cash receipt ratio are important in determining whether a borrower will repay a loan as scheduled. A number of operation characteristics were also found significant, including the number of creditors of the borrower, the degree of diversification of the operation, and the amount of non-farm income. The results of the study should prove a significant aid to lenders and imply a need for further research in the loan repayment area. / M.S.
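As a rough illustration of the kind of qualitative choice model described above, the sketch below fits a logistic regression of repayment status on a few of the named financial measures. All data, variable names, and coefficients are simulated for illustration; they are not the study's estimates.

```python
# Hypothetical sketch of a qualitative choice (logit) model of loan repayment.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500

# Illustrative borrower measures loosely mirroring the ratios named above.
percent_equity = rng.uniform(0.1, 0.9, n)
current_debt = rng.uniform(0.0, 1.0, n)
cash_flow = rng.normal(0.5, 0.2, n)
n_creditors = rng.integers(1, 6, n)

# Simulate repayment (1 = repaid as scheduled) from a known logit model.
lin = -1.0 + 3.0 * percent_equity - 2.0 * current_debt + 2.0 * cash_flow - 0.3 * n_creditors
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))

X = sm.add_constant(np.column_stack([percent_equity, current_debt, cash_flow, n_creditors]))
fit = sm.Logit(y, X).fit(disp=False)
print(fit.params)          # recovered coefficients
print(fit.predict(X[:5]))  # predicted repayment probabilities
```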
662. Cell parameter systematics of the binary silicate olivines: methods for the determination of composition and intracrystalline cation ordering
Miller, Mark L. January 1985
Multiple linear regression analysis has been used to determine the relationship between the unit cell parameters and the average radii, rM1 and rM2, of the octahedral cations in Fe-Mn, Ni-Fe, Mg-Mn, and Mg-, Fe-, and Mn-Ca binary olivines, using data from the literature. The resulting regression equations are given below. The coefficients of correlation exceed 0.997 in all cases.
Fe-Mn binary olivines
a = 3.527 + 1.341 rM1 + 0.317 rM2
b = 8.586 - 1.856 rM1 + 4.281 rM2
Mg-Mn binary olivines
a = 3.798 + 1.014 rM1 + 0.314 rM2
b = 7.552 + 0.745 rM1 + 2.922 rM2
Ni-Fe binary olivines
a = 4.007 + 1.052 rM1 - 0.012 rM2
b = 7.331 + 1.375 rM1 + 2.636 rM2
Mg-Ca binary olivines
a = 3.919 + 0.814 rM1 + 0.245 rM2
b = 7.546 + 0.418 rM1 + 3.261 rM2
Fe-Ca binary olivines
a = 4.153 + 0.781 rM1 + 0.144 rM2
b = 7.551 + 0.142 rM1 + 2.532 rM2
Mn-Ca binary olivines
a = 4.048 + 0.905 rM1 + 0.124 rM2
b = 7.494 + 0.352 rM1 + 3.378 rM2
The Mg-Fe binary yielded inconsistent regression results, most likely due to ambiguous site assignments.
Because in each of these systems the a cell edge depends mainly on rM1 and b mainly on rM2, these equations may be used to construct a vs. b diagrams which may be contoured for bulk composition, M2 site occupancy, and the distribution coefficient, K_D. Similar determinative diagrams may be drawn using calculated d-spacings of the 130 peak, which is sensitive to cation order in the M1 and M2 sites, and the 112 peak, which is composition dependent.
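For example, the pair of Fe-Mn equations above can be inverted to recover the mean octahedral site radii, and hence the cation ordering, from measured cell edges. The sketch below solves the 2x2 linear system; the input cell edges are illustrative values, not data from the study.

```python
import numpy as np

# Coefficients of the Fe-Mn equations above:
#   a = 3.527 + 1.341*rM1 + 0.317*rM2
#   b = 8.586 - 1.856*rM1 + 4.281*rM2
A = np.array([[1.341, 0.317],
              [-1.856, 4.281]])
c = np.array([3.527, 8.586])

def mean_site_radii(a_obs, b_obs):
    """Solve the 2x2 system for (rM1, rM2), in angstroms."""
    return np.linalg.solve(A, np.array([a_obs, b_obs]) - c)

# Hypothetical Fe-Mn olivine cell edges (angstroms):
rM1, rM2 = mean_site_radii(4.82, 10.48)
print(rM1, rM2)   # near-equal radii imply little M1/M2 ordering
```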
Tests of these diagrams, using Shinno's (1980) d₁₃₀-spacings and cation site occupancy data for numerous synthetic Fe-Mn olivines, indicate that this model is reliable; agreement of predicted and observed Mn content of the M2 site is within 0.03 atoms on the average. Eleven natural olivines (Fa₄₅₋₈₅ Te₅₀₋₈ Fo₂₋₁₁), for which cell dimensions and, in several cases, site refinements were available, were critically evaluated using the equations and diagrams for the Fe-Mn binary, with mixed results.
A preliminary investigation was undertaken to assess the use of Rietveld analysis to determine unit cell parameters, atomic positional and thermal parameters, and site distribution of divalent cations in the Mg-Mn binary olivine system. Results indicate that the method can be used successfully; however, systematic errors inherent in the diffractometer prevented refinement of useful (error-free) data. / M.S.
663. Statistical methods for transcriptomics: From microarrays to RNA-seq
Tarazona Campos, Sonia. 30 March 2015
Transcriptomics studies the expression levels of genes under different experimental conditions in order to identify the genes associated with a given phenotype as well as the regulatory relationships among genes. Omics data are characterized by containing information on thousands of variables measured on samples with few observations. The most common high-throughput technologies for measuring the expression of thousands of genes simultaneously are microarrays and, more recently, RNA sequencing (RNA-seq).
This thesis addresses the evaluation, adaptation, and development of statistical models for the analysis of gene expression data, whether measured with microarrays or with RNA-seq. The problem is approached with both univariate and multivariate methods. / Tarazona Campos, S. (2014). Statistical methods for transcriptomics: From microarrays to RNA-seq [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/48485 / Premios Extraordinarios de tesis doctorales
664. Growth Mixture Modeling with Non-Normal Distributions - Implications for Class Imbalance
Han, Lu. January 2024
Previous simulation studies of non-normal growth mixture models (GMMs) have been very limited in examining the effects of a high degree of class imbalance. To extend them, the present study examines, through Monte Carlo simulation, the impact of a highly imbalanced class proportion (0.90/0.10) on the performance of different distribution methods (normal, t, skew-normal, and skew-t) in estimating non-normal GMMs.
To fulfill this purpose, a Monte Carlo simulation was based on a two-class skew-t growth mixture model under different conditions of sample size (1000, 3000), class proportions (0.90/0.10, 0.50/0.50), intercept skewness (1, 4), kurtosis (2, 6), and class separation (high, low), using the four distributions (normal, t, skew-normal, and skew-t). A further aim of the present study was to assess the ability of various model fit indices and LRT-based tests (AIC, BIC, sample size-adjusted BIC, LMR-LRT, LMR-adjusted LRT, and entropy) to detect non-normal GMMs under a high degree of class imbalance (0.90/0.10).
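A minimal sketch of the data-generating side of such a design is given below, using a skew-normal intercept factor in place of the skew-t for simplicity. The class proportions (0.90/0.10), sample size, and separation are chosen loosely in the spirit of the conditions above, not to replicate them.

```python
import numpy as np
from scipy.stats import skewnorm

rng = np.random.default_rng(1)
n, times = 1000, np.arange(5.0)
cls = rng.binomial(1, 0.10, n)                 # 0.90/0.10 class proportions

# Skewed intercepts (skewness a=4) with class-separated means; normal slopes.
icept = skewnorm.rvs(a=4, loc=3.0 * cls, scale=1.0, size=n, random_state=rng)
slope = rng.normal(0.5 + 1.0 * cls, 0.3)

# Observed trajectories: y_it = intercept_i + slope_i * t + residual noise.
Y = icept[:, None] + slope[:, None] * times[None, :] + rng.normal(0, 0.5, (n, 5))
print(Y.shape, cls.mean())                     # (1000, 5), roughly 0.10
```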
The results indicate that (1) the skew-t distribution is highly recommended for estimating non-normal GMMs under high class separation with highly imbalanced class proportions of 0.90/0.10, irrespective of sample size, intercept skewness, and kurtosis; (2) for low class separation with high class imbalance (0.90/0.10), the normal distribution is highly recommended based on the AIC, BIC, and sample size-adjusted BIC, while the skew-t distribution is most recommended based on the entropy; (3) poor class separation significantly reduces the performance of every distribution for estimating non-normal GMMs with high class imbalance, especially the skew-t and t GMMs; (4) insufficient sample size significantly reduces the performance of the skew-t and t distributions for estimating non-normal GMMs with high class imbalance; (5) high class imbalance (0.90/0.10) and poor class separation significantly reduce the ability of the LRT-based tests for all distributions across different conditions; (6) excessive intercept skewness significantly decreases the ability of most fit indices for the skew-t (BIC and LRT-based tests), t (AIC, BIC, sBIC, and LRT-based tests), skew-normal (AIC and BIC), and normal (LRT-based tests) distributions when estimating non-normal GMMs with high class imbalance; (7) excessive kurtosis has a partial negative effect on the performance of the skew-t (AIC, BIC, and LRT-based tests) and t (AIC, BIC, sBIC, and LRT-based tests) distributions when intercept skewness is excessive; and (8) for the highly imbalanced class proportions of 0.90/0.10, the sBIC and entropy for the skew-t distribution outperform the other fit indices under high class separation, while the AIC, BIC, and sample size-adjusted BIC for the normal distribution and the entropy for the skew-t distribution are the most reliable fit indices under low class separation.
665. On Modeling Spatial Time-to-Event Data with Missing Censoring Type
Lu, Diane. January 2024
Time-to-event data, common in medical research, are also pertinent in the ecological context, exemplified by leaf desiccation studies using innovative optical vulnerability techniques. Such data can unveil valuable insights into the influence of various factors on the event of interest. Leveraging both spatial and temporal information, spatial survival modeling can unravel the intricate spatiotemporal dynamics governing event occurrences. Existing spatial survival models often assume the availability of the censoring type for censored cases. Various approaches have been employed to address scenarios where a "subset" of cases lacks a known "censoring indicator" (i.e., whether they are right-censored or uncensored); the uncertainty in that subset concerns missing censoring status. Our study, however, centers on situations where the missing information extends to "all" censored cases, leaving them without a known censoring "type" indicator (i.e., whether they are right-censored or left-censored).
The genesis of this challenge emerged from leaf hydraulic data, specifically embolism data, where the observation of embolism events is limited to instances when leaf veins transition from water-filled to air-filled during the observation period. Although it is known that all veins eventually embolize when the entire plant dries up, the critical information of whether a censored leaf vein embolized before or after the observation period is absent. In other words, the censoring type indicator is missing.
To address this challenge, we developed a Gibbs sampler for a Bayesian spatial survival model, aiming to recover the missing censoring type indicator. This model incorporates the essential embolism formation mechanism theory, accounting for dynamic patterns observed in the embolism data. The model assumes spatial smoothness between connected leaf veins and incorporates vein thickness information. Our Gibbs sampler effectively infers the missing censoring type indicator, as demonstrated on both simulated and real-world embolism data. In applying our model to real data, we not only confirm patterns aligning with existing phytological literature but also unveil novel insights previously unexplored due to limitations in available statistical tools.
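A heavily simplified sketch of the key augmentation step is given below: conditional on the event not being observed in the window [t0, t1], the censoring-type indicator is sampled from its posterior odds of left- versus right-censoring. The lognormal event-time model and all parameter values are illustrative assumptions; the thesis's model additionally carries spatial random effects across connected veins and vein-thickness information.

```python
import numpy as np
from scipy.stats import lognorm

rng = np.random.default_rng(2)

def sample_censoring_type(t0, t1, mu, sigma, rng):
    """Return 'left' if the event is imputed to precede t0, else 'right'.

    Conditional on the event not occurring in [t0, t1], the posterior
    probability of left-censoring is F(t0) / (F(t0) + 1 - F(t1)),
    where F is the event-time CDF (lognormal here, by assumption).
    """
    F = lognorm(s=sigma, scale=np.exp(mu)).cdf
    p_left = F(t0) / (F(t0) + 1.0 - F(t1))
    return "left" if rng.random() < p_left else "right"

print(sample_censoring_type(t0=2.0, t1=8.0, mu=1.5, sigma=0.6, rng=rng))
```

In a full Gibbs sampler this step would alternate with updates of the survival-model parameters, with mu varying by vein through the spatial prior.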
Additionally, our results suggest the potential for building hierarchical models with species-level parameters focusing solely on the temporal component. Overall, our study illustrates that the proposed Gibbs sampler for the spatial survival model successfully addresses the challenge of missing censoring type indicators, offering valuable insights into the underlying spatiotemporal dynamics.
666. Learning from Optimal Actions: Theory and Empirical Analysis in Digital Platforms
Resende Fonseca, Yuri. January 2024
This thesis focuses on learning from revealed preferences and their implications across operations management problems through an Inverse Problem perspective.
For the first part of the thesis, we focus on decentralized platforms facilitating many-to-many matches between two sides of a marketplace. In the absence of direct matching, inefficiency in market outcomes can easily arise. For instance, popular supply agents may garner many units from the demand side, while other supply units may not receive any match. A central question for the platform is how to manage congestion and improve market outcomes.
In Chapter One, we study the impact of a detail-free lever: the disclosure of information to agents on current competition levels. How large are the effects of this lever, and how do they affect overall market outcomes? We answer this question empirically. We partner with the largest service marketplace in Latin America, which sells non-exclusive labor market leads to workers. The key innovation in our approach is the proposal of a structural model that allows agents (workers) to respond to competitors through beliefs about competition at the lead level, which in turn implies an equilibrium at the platform level under the assumption of rational expectations. In this problem, we observe agents' best responses (actions), and from that, we need to infer their structural parameters. Identification follows from an exogenous intervention that changes agents' contextual information and the platform equilibrium. We then conduct counterfactual analyses to study the impact of signaling competition on workers' lead purchasing decisions, the platform's revenue, and the expected number of matches. We find that signaling competition is a powerful lever for the platform to reduce congestion, redirect demand, and ultimately improve the expected number of matches for the markets we analyze.
For the second part of the thesis, we discuss parametric and modeling approaches to Inverse Problems. In Chapter Two, we focus on Inverse Optimization Problems in a single-agent setting. Specifically, we study offline and online contextual optimization with feedback information, where instead of observing the loss, we observe, after the fact, the optimal action an oracle with full knowledge of the objective function would have taken. We aim to minimize regret, defined as the difference between our losses and those incurred by an all-knowing oracle. In the offline setting, the decision-maker has information available from past periods and needs to make one decision, while in the online setting, the decision-maker optimizes decisions dynamically over time based on a new set of feasible actions and contextual functions in each period. For the offline setting, we characterize the optimal minimax policy, establishing the performance that can be achieved as a function of the underlying geometry of the information induced by the data. In the online setting, we leverage this geometric characterization to optimize the cumulative regret. We develop an algorithm that yields the first regret bound for this problem, which is logarithmic in the time horizon. Furthermore, we show via simulation that our proposed algorithms outperform previous methods from the literature.
Finally, in Chapter Three, we consider data-driven methods for general Inverse Problem formulations under a statistical framework (Statistical Inverse Problems, SIP) and demonstrate how Stochastic Gradient Descent (SGD) algorithms can be used to solve linear SIPs. We provide consistency and finite-sample bounds for the excess risk. We exemplify the algorithm in the functional linear regression setting with an empirical application to predicting illegal activity from Bitcoin wallets. We also discuss additional applications and extensions.
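As a minimal illustration of the Chapter Three setting, the sketch below runs SGD on a linear statistical inverse problem with streaming noisy measurements. The dimensions, step-size schedule, and design distribution are arbitrary choices, not the thesis's.

```python
# Recover x* from streaming noisy pairs (a_i, b_i) with b_i = <a_i, x*> + noise.
import numpy as np

rng = np.random.default_rng(3)
d = 20
x_star = rng.normal(size=d)

x = np.zeros(d)
for t in range(1, 50001):
    a = rng.normal(size=d)                    # random design vector
    b = a @ x_star + rng.normal(scale=0.1)    # noisy linear measurement
    grad = (a @ x - b) * a                    # gradient of the squared loss
    x -= grad / (t ** 0.75)                   # decaying Robbins-Monro step

print(np.linalg.norm(x - x_star))             # estimation error shrinks with t
```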
667. Genetic and phenotypic relationships among fifteen measures of reproduction in dairy cattle
Meland, Ole Mervin. January 1984
Reproductive data from 30 research herds covered 31,132 breeding periods of 11,347 dairy cows. Cows were sired by 1,101 sires and had 66,184 services to 1,320 service sires. Several measures of reproductive performance were calculated, including conception rate, number of services, service period length, days open, age at first breeding, calving interval, days between services, and return-to-estrus lag. First, second, and third service periods were each analyzed separately, while fourth and later service periods were pooled.
Heritability was estimated using the sire component of variance and the estimate of the total variance derived from MIVQUE(0) and maximum likelihood analyses. The data set was restricted to daughters of sires used in multiple herds. Heritability estimates were less than .07 for all traits in the heifer service period except age at first breeding (.2 by maximum likelihood and .13 by MIVQUE(0)). Similarly, with the exception of conception rate, none of the measures of reproduction had heritabilities greater than .05 for the three remaining service period groups. Conception rate measured as a trait of the male (service sire) had heritabilities ranging from .08 to .135 for second and third service periods. Conception rate as a trait of the female (sire) had heritabilities ranging from .09 to .249 for second and third service periods.
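The sire-model arithmetic behind such estimates can be sketched as follows: in a balanced one-way layout, the sire variance component follows from the between- and within-sire mean squares, and heritability is four times the sire component over the total. The simulated, balanced data below are an illustrative simplification of the MIVQUE(0)/maximum likelihood analyses actually used.

```python
import numpy as np

rng = np.random.default_rng(4)
n_sires, daughters = 200, 30
s = rng.normal(0, np.sqrt(0.01), n_sires)                  # sire effects, Vs = .01
y = s[:, None] + rng.normal(0, 1.0, (n_sires, daughters))  # records, Ve = 1

msb = daughters * y.mean(axis=1).var(ddof=1)               # between-sire mean square
msw = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (n_sires * (daughters - 1))
vs = (msb - msw) / daughters                               # sire variance component
h2 = 4 * vs / (vs + msw)                                   # sire-model heritability
print(round(h2, 3))   # near 4*.01/1.01, i.e. a "low heritability" trait
```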
Low heritability estimates obtained in this and other studies suggest that large progeny or service sire groups will be necessary to identify the small genetic differences between bulls.
Many genetic and phenotypic correlations were forced positive by a part-whole relationship or because they were simply different bounds for the same measure. A few correlations were in the range of .50 to .90, but many were not significantly different from zero due to large approximate standard errors.
Repeatabilities based upon pairwise comparisons ranged from 0 to .13. Repeatabilities between the reproductive performance of virgin heifers and first parity ranged from .01 to .06 and were generally smaller than those for later parities. Repeatabilities based upon repeated measures on the same cow ranged from 0 to .12.
Predicted Differences for female (sire) and male (service sire) reproduction were calculated by Best Linear Unbiased Prediction. This analysis included 207 bulls which were in the data both as sire and service sire. Correlations between proofs for male and female reproduction ranged from -.13 to .13. These results suggest limited genetic relationships between male and female fertility. / Ph. D.
668. Nonparametric procedures for process control when the control value is not specified
Park, Changsoon. January 1984
In industrial production processes, control charts have been developed to detect changes in the parameters specifying the quality of production so that rectifying action can be taken to restore the parameters to satisfactory values. Examples are the Shewhart chart and the cumulative sum control chart (CUSUM chart). In designing a control chart, the exact distribution of the observations, e.g. the normal distribution, is usually assumed to be known. When there is not sufficient information for determining the distribution, nonparametric procedures are appropriate. In such cases, the control value for the parameter may not be given because of insufficient information.
To construct a control chart when the control value is not given, a standard sample must be obtained while the process is known to be in control, so that the quality of the product can be maintained at the same level as that of the standard sample. For this purpose, samples of fixed size are observed sequentially, and each time a sample is observed, a two-sample nonparametric statistic is computed from the standard sample and the sequentially observed sample. With these sequentially obtained statistics, the usual process control procedure can be carried out. A truncation point denotes the finite run length, or the time at which sufficient information about the distribution of the observations and/or the control value has been obtained, so that the procedure may be switched to a parametric procedure or to a nonparametric procedure with a control value.
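A minimal sketch of this scheme follows: a standardized Mann-Whitney statistic is computed between each incoming sample and the standard sample and accumulated in a one-sided CUSUM. The reference value k and decision limit h are illustrative choices, not the design values studied in the thesis.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(5)
standard = rng.normal(0, 1, 100)      # standard sample, process in control
m, k, h, S = 10, 0.5, 4.0, 0.0

for t in range(1, 31):
    shift = 0.0 if t <= 15 else 1.0   # process shifts upward at t = 16
    sample = rng.normal(shift, 1, m)
    u = mannwhitneyu(sample, standard, alternative="two-sided").statistic
    # Standardize U by its in-control mean and variance.
    n1, n2 = m, len(standard)
    z = (u - n1 * n2 / 2) / np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    S = max(0.0, S + z - k)           # one-sided CUSUM of the statistics
    if S > h:
        print(f"signal at sample {t}")
        break
```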
To lessen the difficulties posed by the dependence structure of the statistics, we use the fact that, conditioned on the standard sample, the statistics are i.i.d. random variables. Upper and lower bounds of the run length distribution are obtained for the Shewhart chart. A Brownian motion process is used to approximate the discrete-time process of the CUSUM chart. The exact run length distribution of the approximated CUSUM chart is derived using the inverse Laplace transform. Applying an appropriate correction to the boundary improves the approximation. / Ph. D.
669. An investigation of a bivariate distribution approach to modeling diameter distributions at two points in time
Knoebel, Bruce R. January 1985
A diameter distribution prediction procedure for single-species stands was developed based on the bivariate S_B distribution model. The approach not only accounted for and described the relationships between initial and future diameters and their distributions, but also treated future diameter given initial diameter as a random variable. While this method was the most theoretically correct, comparable procedures based on growth equations that treat future diameter given initial diameter as a constant sometimes provided somewhat better results. Both approaches performed as well as, and in some cases better than, the established methods of diameter distribution prediction such as parameter recovery, percentile prediction, and parameter prediction.
The approaches based on growth equations are intuitively and biologically appealing in that the future distribution is determined from an initial distribution and a specified initial-future diameter relationship. In most cases, a linear growth equation was most appropriate. While this result simplified some procedures, it also implied that the initial and future diameter distributions differed only in location and scale, not in shape. This is a somewhat unrealistic assumption; however, due to the relatively short growth periods and the alterations in stand structure and growth caused by the repeated thinnings, the data did not provide evidence against the linear growth equation assumption.
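The location-scale point can be illustrated directly: pushing Johnson S_B diameters through a linear growth equation D2 = b0 + b1*D1 changes only the location and scale parameters, not the shape pair. All parameter values below are hypothetical.

```python
import numpy as np
from scipy.stats import johnsonsb

rng = np.random.default_rng(6)
gamma, delta, loc, scale = 0.5, 1.2, 2.0, 30.0    # initial S_B parameters
d1 = johnsonsb.rvs(gamma, delta, loc, scale, size=5000, random_state=rng)

b0, b1 = 1.5, 1.1                                 # linear growth equation
d2 = b0 + b1 * d1                                 # future diameters

# Exact future parameters: same shape pair, shifted location, scaled spread.
print((gamma, delta, b0 + b1 * loc, b1 * scale))
# Refitted values should be close, up to sampling and optimization error.
print(johnsonsb.fit(d2))
```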
The growth equation procedures not only required the initial and future diameter distributions to be of a particular form, but also restricted the initial-future diameter relationship to be of a particular form. The individual tree model, which required no distributional assumptions or restrictions on the growth equation, proved to be the better approach for predicting future stand tables, as it performed better than all of the distribution-based approaches.
Implied initial-future diameter relationships were defined for the bivariate distribution approach and for the direct fit, parameter recovery, parameter prediction, and percentile prediction techniques. Evaluations revealed that these equations were both accurate and precise, indicating that accurate specification of the initial distribution and the initial-future diameter relationship is critical to accurate prediction of the future diameter distribution. / Ph. D.
670. Forecasting corporate performance
Harrington, Robert P. January 1985
For the past twenty years, the usefulness of accounting information has been emphasized. In 1966 the American Accounting Association, in A Statement of Basic Accounting Theory, asserted that usefulness is the primary purpose of external financial reports. In 1978 the Statement of Financial Accounting Concepts No. 1 affirmed the usefulness criterion: "Financial reporting should provide information that is useful to present and potential investors and creditors and other users..."
Information is useful if it facilitates decision making. Moreover, all decisions are future-oriented; they are based on a prognosis of future events. The objective of this research, therefore, is to examine some factors that affect the decision maker's ability to use financial information to make good predictions and thereby good decisions.
There are two major purposes of the study. The first is to gain insight into the increase in prediction accuracy expected when a model replaces the human decision maker in the selection of cues. The second is to examine the information overload phenomenon, providing research evidence on the point at which additional information may contaminate prediction accuracy.
The research methodology is based on the lens model developed by Egon Brunswik in 1952. Multiple linear regression equations are used to capture the participants' models, and correlation statistics are used to measure prediction accuracy. / Ph. D.
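In outline, the lens-model computation pairs two policy-capturing regressions over the same cues, one for the environment (criterion) and one for the judge, and summarizes accuracy with correlations. The sketch below uses simulated cues and weights; it is not the study's data or model.

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 5))                 # five financial cues
criterion = X @ np.array([0.6, 0.3, 0.1, 0.0, 0.0]) + rng.normal(0, 0.5, 200)
judge = X @ np.array([0.5, 0.2, 0.0, 0.2, 0.0]) + rng.normal(0, 0.7, 200)

def captured(y, X):
    """Linear policy-capturing regression; returns fitted predictions."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

achievement = np.corrcoef(judge, criterion)[0, 1]    # r_a: judgment accuracy
matching = np.corrcoef(captured(judge, X), captured(criterion, X))[0, 1]  # G
print(round(achievement, 3), round(matching, 3))
```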