431

Improved permeability prediction using multivariate analysis methods

Xie, Jiang 15 May 2009 (has links)
Predicting rock permeability from well logs in uncored wells is an important task in reservoir characterization. Because coring and laboratory analysis are expensive, cores are typically acquired in only a few wells. Since most wells are logged, the common practice is to estimate permeability from logs using correlation equations developed from limited core data, most often by statistical regression. For sandstones, the log of permeability can often be correlated with porosity, but in carbonates the porosity-permeability relationship tends to be much more complex and erratic. For this reason, permeability prediction is a critical aspect of reservoir characterization in complex reservoirs such as carbonates. To improve permeability estimation in these reservoirs, several statistical regression techniques have been tested in previous work to correlate permeability with different well logs, and statistical regression has been shown to be quite promising for predicting permeability in complex reservoirs. Using all available well logs to predict permeability is not appropriate, however, because the possibility of spurious correlation increases as more well logs are included. In statistics, variable selection is used to remove unnecessary independent variables and give better predictions, so we apply variable selection to the permeability prediction procedure to further improve the estimates. This research presents three approaches to improving well-log-based permeability prediction through data correlation and variable selection: the first combines a stepwise algorithm with the ACE technique; the second applies tree regression with cross-validation; the third uses multivariate adaptive regression splines. The three methods are tested and compared on two complex carbonate reservoirs in west Texas: the Salt Creek Field Unit (SCFU) and the North Robertson Unit (NRU). The SCFU results show that permeability prediction is improved by applying variable selection to the non-parametric regression ACE, while tree regression is unable to predict permeability because it cannot preserve the continuity of permeability. In the NRU, none of the three methods predicts permeability accurately, owing to the high complexity of the NRU reservoir and limited measurement accuracy; in this reservoir, high permeability values are separated discontinuously from low values, which makes prediction even more difficult. Permeability predictions based on well logs in complex carbonate reservoirs can be further improved by selecting appropriate well logs for data correlation. In comparing the relative predictive performance of the three regression methods, stepwise selection with ACE appears to outperform the other two.
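Editor's note: for readers unfamiliar with the variable-selection step described above, the following Python sketch shows one minimal version of it, greedy forward (stepwise) selection of well logs scored by cross-validated R². It is an illustration under stated assumptions, not the thesis code: the log names (GR, RHOB, NPHI, DT, PHI) and the synthetic data are hypothetical, and the thesis pairs stepwise selection with the non-parametric ACE transform rather than the plain linear regression used here.

```python
# Minimal sketch of forward stepwise selection of well logs for predicting
# log-permeability, scored by cross-validated R^2. Column names are placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def forward_stepwise(df, candidate_logs, target="LOG_PERM", cv=5):
    """Greedily add the well log that most improves cross-validated R^2."""
    selected, best_score = [], -np.inf
    improved = True
    while improved and len(selected) < len(candidate_logs):
        improved = False
        for log in (l for l in candidate_logs if l not in selected):
            trial = selected + [log]
            score = cross_val_score(LinearRegression(), df[trial], df[target],
                                    cv=cv, scoring="r2").mean()
            if score > best_score:
                best_score, best_log, improved = score, log, True
        if improved:
            selected.append(best_log)
    return selected, best_score

# Synthetic data standing in for core-calibrated well logs:
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 5)), columns=["GR", "RHOB", "NPHI", "DT", "PHI"])
df["LOG_PERM"] = 2.0 * df["PHI"] - 0.5 * df["RHOB"] + rng.normal(0, 0.3, 200)
print(forward_stepwise(df, ["GR", "RHOB", "NPHI", "DT", "PHI"]))
```

In a real workflow the same loop can wrap any regression scorer (ACE, tree regression, or MARS), which is exactly the comparison the thesis makes.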
432

Co-relation of Variables Involved in the Occurrence of Crane Accidents in U.S. through Logit Modeling.

Bains, Amrit Anoop Singh 2010 August 1900 (has links)
One of the primary reasons for the escalating rates of injuries and fatalities in the construction industry is the complex, dynamic, and continually changing nature of construction work. The use of cranes has become imperative to overcome technical challenges, which has led to an escalation of danger on construction sites. Data from OSHA show that crane accidents increased rapidly from 2000 to 2004. By analyzing the characteristics of all crane accident inspections, we can better understand the significance of the many variables involved in a crane accident. For this research, data were collected from the U.S. Department of Labor website via the OSHA database. The data encompass crane accident inspections for all states and were divided into categories with respect to accident type, construction operation, degree of accident, fault, contributing factors, crane type, victim's occupation, organs affected, and load. Descriptive analysis was performed to complement previous studies, the only difference being that both fatal and non-fatal accidents were considered. Multinomial regression was applied to derive probability models and correlations between the different accident types and the factors involved in each crane accident type. A log-likelihood test as well as a chi-square test was performed to validate the models. The results show that electrocution, crane tip-over, and being crushed during assembly/disassembly have a higher probability of occurrence than other accident types. Load is not a significant factor in crane accidents, and manual fault is a more probable cause of crane accidents than technical fault. The construction operations identified in the research were found to be significant for all crane accident types, and mobile crawler cranes, mobile truck cranes, and tower cranes were found to be the most susceptible. These probability models are limited as far as the inclusion of unforeseen variables in construction accidents is concerned; in effect, they use the past to portray the future, so significant changes in the variables involved must be incorporated to obtain correct and useful results.
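Editor's note: as a rough illustration of the modeling machinery mentioned in the abstract (a multinomial logit checked with a likelihood-ratio/chi-square test), here is a hedged Python sketch on synthetic data. The variable names and category labels are invented for the example and do not reflect OSHA's actual coding or the thesis's results.

```python
# Multinomial logit of accident type on categorical factors, with a
# likelihood-ratio test for one factor. Data are synthetic placeholders.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "accident_type": rng.choice(["electrocution", "tipover", "crushed"], n),
    "crane_type": rng.choice(["crawler", "truck", "tower"], n),
    "fault": rng.choice(["manual", "technical"], n),
})

# Dummy-code the categorical predictors and fit the multinomial logit.
X = sm.add_constant(pd.get_dummies(df[["crane_type", "fault"]], drop_first=True).astype(float))
y = pd.Categorical(df["accident_type"]).codes
full = sm.MNLogit(y, X).fit(disp=False)

# Likelihood-ratio test for the "fault" factor: refit without it and compare log-likelihoods.
X_red = sm.add_constant(pd.get_dummies(df[["crane_type"]], drop_first=True).astype(float))
reduced = sm.MNLogit(y, X_red).fit(disp=False)
lr_stat = 2 * (full.llf - reduced.llf)
df_diff = (X.shape[1] - X_red.shape[1]) * (len(np.unique(y)) - 1)
print("LR statistic:", lr_stat, "df:", df_diff, "p-value:", stats.chi2.sf(lr_stat, df_diff))
```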
433

SMVCIR Dimensionality Test

Lindsey, Charles D. 2010 May 1900 (has links)
The original SMVCIR algorithm was developed by Simon J. Sheather, Joseph W. McKean, and Kimberly Crimin. The dissertation first presents a new version of this algorithm that uses the scaling standardization rather than the Mahalanobis standardization. The algorithm takes grouped multivariate data as input and outputs a new coordinate space that contrasts the groups in location, scale, and covariance. The central goal of the research is to develop a method for determining the dimension of this space with statistical confidence, and a dimensionality test is developed for this purpose. The new SMVCIR algorithm is compared with two other inverse regression algorithms, SAVE and SIR, in the process of developing and testing the dimensionality test. The test is based on the singular values of the kernel of the spanning set of the vector space. The asymptotic distribution of the spanning set is found using the central limit theorem, the delta method, and finally Slutsky's theorem with a permutation matrix, which yields a mean-adjusted asymptotic distribution of the spanning set. Theory by Eaton, Tyler, and others is then used to show an equivalence between the singular values of the mean-adjusted spanning set statistic and the singular values of the spanning set statistic. The test statistic is a sample-size-scaled sum of squared singular values of the spanning set, and it is asymptotically equivalent in distribution to a linear combination of independent chi-squared random variables, each with one degree of freedom. Simulations are performed to corroborate these theoretical findings. Additionally, based on work by Bentler and Xie, an approximation to the reference distribution of the test statistic is proposed and tested, which is also corroborated with simulations. Examples demonstrate how SMVCIR is used and how the developed dimensionality tests are performed. Finally, further directions of research are suggested for SMVCIR and the dimensionality test; one of the more interesting directions is explored by briefly examining how SMVCIR can be used to identify potentially complex functions that link predictors to a continuous response variable.
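Editor's note: the test statistic described in the abstract, a sample-size-scaled sum of squared singular values of an estimated spanning matrix, can be sketched as follows. This is not the dissertation's SMVCIR implementation: the spanning matrix below is the simpler SIR-style kernel (SMVCIR's matrix also carries scale and covariance contrasts), and the null distribution for the d = 0 test is approximated here by permuting group labels rather than by the chi-square-mixture asymptotics derived in the dissertation.

```python
# Schematic dimensionality test on a SIR-style spanning matrix.
import numpy as np

def sir_spanning_matrix(X, groups):
    """Weighted outer products of standardized within-group means (a SIR-style kernel)."""
    L = np.linalg.cholesky(np.cov(X.T))                # covariance factor, Sigma = L L^T
    Z = (X - X.mean(axis=0)) @ np.linalg.inv(L.T)      # standardize so cov(Z) = I
    M = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        m = Z[groups == g].mean(axis=0)
        M += (np.sum(groups == g) / len(Z)) * np.outer(m, m)
    return M

def dim_test_stat(X, groups, d):
    """Sample-size-scaled sum of squared singular values beyond the first d."""
    s = np.linalg.svd(sir_spanning_matrix(X, groups), compute_uv=False)
    return len(X) * np.sum(s[d:] ** 2)

def permutation_pvalue(X, groups, d=0, B=999, seed=0):
    """Approximate the null distribution of the d = 0 test by permuting group labels."""
    rng = np.random.default_rng(seed)
    obs = dim_test_stat(X, groups, d)
    perm = np.array([dim_test_stat(X, rng.permutation(groups), d) for _ in range(B)])
    return obs, (1 + np.sum(perm >= obs)) / (B + 1)

# Synthetic grouped data: three groups separated in the first coordinate only.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
X[:, 0] += np.repeat([0.0, 1.0, 2.0], 100)
groups = np.repeat([0, 1, 2], 100)
print(permutation_pvalue(X, groups))   # small p-value: the contrast space has dimension > 0
```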
434

Logistic Regression Model Applied to Resignation Factors for Commissioned and Non-Commissioned Officers in the Chinese Marine Corps: Taking the Southern Marine Forces as Examples

Chang, Wei-kuo 18 July 2006 (has links)
High-quality defense personnel have a decisive influence on modern warfare; they benefit national security and are the root of, and guarantee for, enhanced military combat power. For years, a high personnel resignation rate has been an important issue in military human resources management. An abnormal resignation rate not only affects the quality of organizational operations but also disrupts the accumulated experience within the organizational structure; for the military services in particular, it affects national security and combat power as a whole. Most previous studies of resignation have focused on factors such as resignation intention or tendency; few have examined resignation rates systematically. Probing the resignation rate in an appropriate way is therefore a necessary response for human resources policy. In this study, the commissioned and non-commissioned officers of the Chinese Marine Corps stationed in southern Taiwan were taken as the subjects, and the predictive capability of the logistic regression model was used to build a model for estimating the resignation rate. The results confirm that educational level, part-time study, seniority, marriage, rank, branch of service, salary, unit character, welfare, and other factors are all related to resignation, and that it is acceptable to predict the resignation rate with this method.
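Editor's note: as a concrete illustration of the kind of model the abstract describes, here is a hedged sketch of a binary logit of resignation on a few of the listed factors, fitted to synthetic data (the actual personnel records are not public, and the variable names below are placeholders, not the thesis's coding).

```python
# Binary logit of resignation on illustrative personnel factors.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 800
df = pd.DataFrame({
    "resigned": rng.integers(0, 2, n),                 # 1 = resigned, 0 = stayed
    "seniority_yrs": rng.uniform(1, 20, n),
    "married": rng.integers(0, 2, n),
    "rank": rng.choice(["NCO", "officer"], n),
    "salary": rng.normal(45000, 8000, n),
})

# Log-odds of resignation as a linear function of the factors.
model = smf.logit("resigned ~ seniority_yrs + married + C(rank) + salary", data=df).fit(disp=False)
print("Odds ratios:")
print(np.exp(model.params))                            # multiplicative effect on the odds of resigning
print("Predicted resignation probabilities for the first three cases:")
print(model.predict(df.head(3)).round(3))
```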
435

A study on borrowers' background conditions related to the risk of home loans

Lin, Ch-ye 01 August 2006 (has links)
ABSTRACT This research aims at evaluating the risk of home loans. In recent years, financial institutions have generally regarded corporate lending as carrying large risks and have turned, one after another, to consumer finance, so the personal home loan has become a strategic focus of every financial institution. Home loans are characterized by small amounts and large numbers of cases, and they require considerable manpower to maintain; a fast, transparent approval process that avoids subjective judgment and reduces risk has therefore become one of the keys to promoting this business successfully. Financial institutions should accordingly establish an objective and fair way of evaluating risk to help loan officers quickly verify and approve loan amounts, interest rates, and lending terms, achieving quantification, standardization, and automation. This research takes a domestic commercial bank as its main research object, drawing the sample from the bank's current outstanding loans: 2,431 normal cases and 381 bad cases more than three months overdue, 2,812 in total. The borrowers' basic data from the loan application forms and the credit data obtained from the Joint Credit Information Center define the research scope, and the relationship between the backgrounds of different borrowers and overdue loans is analyzed. The empirical results can be summarized as follows: the significant risk parameters of a loan are sex, age, academic credentials, grace period, family annual income, loan amount, number of cash cards, interest rate, guaranteed debts, whether it is a bulk-type housing loan, and the type of collateral.
436

The relationship between information frequency and financial distress prediction

Hung, Chia-ching 20 June 2007 (has links)
This thesis is based on the listed electronics companies on the TSE and OTC markets. The paper has two purposes: first, to understand how failed and non-failed firms differ in financial factors and corporate governance indicators; and second, to compare whether the predictive ability and relevance of the distress-related variables differ under different reporting frequencies. The experimental results show the following. Using independent-sample t tests and logistic regression, we find that under the quarterly financial statements, the profitability index is the most significant factor, followed by the debt ratio. The closer to the time of distress, the more operating-efficiency factors differentiate the two kinds of firms; financially distressed firms have a higher accounts receivable turnover rate. Among the corporate governance factors, the proportion of family members' holdings and the rate of directors' shareholding are the two most important variables. The results from yearly financial reports are similar to those from quarterly statements: the profitability index and liquidity index can serve as leading indicators of whether a firm is heading into financial crisis. In the independent-sample t tests, the cash flow from operations ratio and times interest earned are significant variables in the first and second years before bankruptcy. The differences in traditional financial indices and corporate governance variables between failed and normal firms are most obvious in the first year before failure, and among the governance factors the proportion of family members' holdings and the extent of shares pledged as collateral by the board of directors are the most important variables. Regardless of whether yearly or quarterly statements are used, more variables become distinguishing as the time of distress approaches. The average percentage of correctly classified firms is 80.13% for the 8th to 5th quarters before distress, better than for the 2nd year before distress. Compared with the average prediction accuracy for the 4th to 1st quarters, the 1st yearly financial statement predicts better; however, the accuracy rates for the 1st and 2nd quarters are 92.54% and 93.44%, an average of about 93%. In other words, using quarterly rather than yearly financial statements can overcome the time lag and raise predictive ability.
437

A Test Data Evolution Strategy under Program Changes

Hsu, Chang-ming 23 July 2007 (has links)
Since the cost of software testing continues to account for a large proportion of total software development cost, automatic test data generation has become a hot topic in recent software testing research. These studies attempt to reduce the cost of testing by generating test data automatically, but they address only single-version programs, not programs that need re-testing after changes. Regression testing research, on the other hand, discusses how to re-test programs after changes, but not how to generate test data automatically. We therefore propose an automatic test data evolution strategy in this paper. We use regression testing methods to find the parts of a program that need re-testing, and then automatically evolve the test data with a hybrid genetic algorithm. According to the experimental results, our strategy achieves the same or better testing ability at less cost than the other strategies.
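Editor's note: to make the "evolve test data with a genetic algorithm" idea concrete, here is a toy Python sketch that evolves integer inputs toward a hard-to-reach branch of a stand-in function. The program under test, the fitness function, and the GA settings are all invented for illustration; the thesis combines this idea with regression-test selection and a hybrid genetic algorithm rather than the plain GA shown here.

```python
# Toy genetic algorithm that searches for inputs covering a target branch.
import random

def program_under_test(a, b):
    if a > 100 and b % 7 == 0:          # the hard-to-reach branch we want covered
        return "target branch"
    return "other branch"

def fitness(ind):
    """Smaller is better: distance from satisfying both branch conditions."""
    a, b = ind
    return max(0, 101 - a) + (b % 7)

def evolve(pop_size=40, generations=100, seed=4):
    rng = random.Random(seed)
    pop = [(rng.randint(0, 200), rng.randint(0, 200)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        if fitness(pop[0]) == 0:                       # branch reached
            return pop[0]
        parents = pop[: pop_size // 2]                 # selection: keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            pa, pb = rng.sample(parents, 2)
            child = (pa[0], pb[1])                     # one-point crossover
            if rng.random() < 0.3:                     # mutation
                child = (child[0] + rng.randint(-5, 5), child[1] + rng.randint(-5, 5))
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)

best = evolve()
print(best, "->", program_under_test(*best))
```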
438

Nesting ecology of dickcissels on reclaimed surface-mined lands in Freestone County, Texas

Dixon, Thomas Pingul 17 February 2005 (has links)
Surface mining and subsequent reclamation often result in the establishment of large areas of grassland that can benefit wildlife. Grasslands have declined substantially over the last 150 years, resulting in declines of many grassland birds. The dickcissel (Spiza americana), a neotropical migrant, is one such bird whose numbers have declined in the last 30 years due to habitat loss, increased nest predation and parasitism, and overharvest (it is lethally controlled as an agricultural pest on its wintering range in Central and South America). Reclaimed surface-mined lands have been documented to provide important breeding habitat for dickcissels in the United States, emphasizing the importance of reclamation efforts. Objectives were to understand specific aspects of dickcissel nesting ecology (i.e., nest-site selection, nest success, nest parasitism, and identification of nest predators) on 2 spatial scales on TXU Energy's Big Brown Mine, near Fairfield, Texas, and to subsequently provide TXU Energy with recommendations to improve reclaimed areas as breeding habitat for dickcissels. I examined the influence of nest-site vegetation characteristics and the effects of field-level spatial factors on dickcissel nesting ecology on 2 sites reclaimed as wildlife habitat. Additionally, I developed a novel technique to identify predators at active nests during the 2003 field season. During 2002-2003, 119 nests were monitored. On smaller spatial scales, dickcissels were likely to select nest sites with low vegetation, high densities of bunchgrasses and tall forbs, and higher clover content. Probability of nest success increased with nest height and vegetation height above the nest, characteristics associated with woody nesting substrates. Woody nesting substrates were selected and bunchgrasses were avoided; oak (Quercus spp.) saplings remained an important nesting substrate throughout the breeding season. On a larger scale, nest-site selection was likely to occur farther from wooded riparian areas and closer to recently reclaimed areas, while nest parasitism was likely to occur near roads and wooded riparian areas. Results suggest reclaimed areas could be improved by planting more bunchgrasses, tall forbs (e.g., curly-cup gumweed [Grindelia squarrosa] and sunflower [Helianthus spp.]), clover (Trifolium spp.), and oaks (a preferred nesting substrate associated with higher survival rates). The larger-scale analysis suggests that larger tracts of wildlife areas should be created, with wooded riparian areas comprising a minimal portion of a field's edge.
439

Logistic regression models for predicting trip reporting accuracy in GPS-enhanced household travel surveys

Forrest, Timothy Lee 25 April 2007 (has links)
This thesis presents a methodology for conducting logistic regression modeling of trip and household information obtained from household travel surveys and vehicle trip information obtained from global positioning systems (GPS) to better understand the trip underreporting that occurs. The methodology builds on previous research by adding variables to the logistic regression model that might contribute significantly to underreporting, specifically trip purpose. Understanding trip purpose is crucial in transportation planning because many of the transportation models used today are based on the number of trips in a given area by trip purpose. The methodology was applied to two study areas in Texas, Laredo and Tyler-Longview, where household travel survey data and GPS-based vehicle tracking data were collected over a 24-hour period for 254 households and 388 vehicles. These 254 households made a total of 2,795 trips, averaging 11.0 trips per household. By comparing the trips reported in the household travel survey with those recorded by the GPS unit, trips not reported in the survey were identified. Logistic regression was shown to be effective in determining which household- and trip-related variables significantly contributed to the likelihood of a trip being reported. Although different variables were identified as significant in each of the models tested, one variable was found to be significant in all of them: trip purpose. Household residence type and the use of household vehicles for commercial purposes did not significantly affect reporting rates in any of the models tested. The results support the need for modeling trips by trip purpose, but also indicate that different factors contribute to the level of underreporting from urban area to urban area. An analysis of additional significant variables in each urban area found combinations that yielded trip reporting rates of 0%. Similar to the results of Zmud and Wolf (2003), trip duration and the number of vehicles available were also found to be significant in a full model encompassing both study areas.
440

A Monte Carlo Investigation of Three Different Estimation Methods in Multilevel Structural Equation Modeling Under Conditions of Data Nonnormality and Varied Sample Sizes

Byrd, Jimmy 14 January 2010 (has links)
The purpose of the study was to examine multilevel regression models in the context of multilevel structural equation modeling (SEM) in terms of accuracy of parameter estimates, standard errors, and fit indices in normal and nonnormal data under various sample sizes and differing estimators (maximum likelihood, generalized least squares, and weighted least squares). The findings revealed that the regression coefficients were estimated with little to no bias across the study design conditions investigated. However, the number of clusters (the group level) appeared to have the greatest impact on bias in the standard errors of the parameter estimates at both level 1 and level 2. At small sample sizes (i.e., 300 and 500) the standard errors were negatively biased. When the number of clusters was 30 and cluster size was held at 10, the level-1 standard errors were biased downward by approximately 20% for the maximum likelihood and generalized least squares estimators, while the weighted least squares estimator produced level-1 standard errors that were negatively biased by 25%. The level-2 standard errors were biased downward by approximately 24% in nonnormal data, especially when the correlation among variables was fixed at .5 and kurtosis was held constant at 7. In the same setting (30 clusters with cluster size fixed at 10), when kurtosis was fixed at 4 and the correlation among variables was held at .7, both the maximum likelihood and generalized least squares estimators produced standard errors that were biased downward by approximately 11%. Regarding fit statistics, negative bias was noted for each of the fit indices investigated when the number of clusters ranged from 30 to 50 and cluster size was fixed at 10. The least bias was associated with the maximum likelihood estimator in each of the data normality conditions examined. As sample size increased, bias decreased to near zero once the sample size reached 1,500 or more, with similar results across estimation methods. Recommendations for the substantive researcher are presented, and areas of future research are identified.
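Editor's note: the bias quantities reported above can be illustrated with a small Monte Carlo bookkeeping sketch. The data-generating model below is an ordinary regression with heavy-tailed errors, not the multilevel SEM actually studied; it is only meant to show how relative bias of an estimate and of its standard error are typically computed across replications.

```python
# Relative bias of a coefficient estimate and of its standard error
# across Monte Carlo replications (illustrative single-level model).
import numpy as np
import statsmodels.api as sm

def one_replication(rng, n=300, beta=0.5):
    x = rng.normal(size=n)
    y = beta * x + rng.standard_t(df=5, size=n)        # heavy-tailed (nonnormal) errors
    fit = sm.OLS(y, sm.add_constant(x)).fit()
    return fit.params[1], fit.bse[1]                   # slope estimate and its model SE

rng = np.random.default_rng(5)
reps = np.array([one_replication(rng) for _ in range(1000)])
estimates, model_ses = reps[:, 0], reps[:, 1]

true_beta = 0.5
rel_bias_estimate = (estimates.mean() - true_beta) / true_beta
empirical_se = estimates.std(ddof=1)                   # "true" sampling variability of the estimate
rel_bias_se = (model_ses.mean() - empirical_se) / empirical_se
print(f"relative bias of estimate: {rel_bias_estimate:.3f}")
print(f"relative bias of standard error: {rel_bias_se:.3f}")
```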
