511 |
Analysis of Longitudinal Surveys with Missing ResponsesCarrillo Garcia, Ivan Adolfo January 2008 (has links)
Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. The National Longitudinal Survey of Children and Youth (NLSCY), a large-scale survey with a complex sampling design conducted by Statistics Canada, follows a large group of children and youth over time and collects measurements on various indicators related to their educational, behavioral and psychological development. One of the major objectives of the study is to explore how such development is related to or affected by familial, environmental and economic factors.
The generalized estimating equation approach, commonly known as the GEE method, is the most popular statistical inference tool for longitudinal studies. The vast majority of the existing literature on the GEE method, however, treats non-survey settings, and issues related to complex sampling designs are ignored.
This thesis develops methods for the analysis of longitudinal surveys when the response variable contains missing values. Our methods are built within the GEE framework, with a major focus on using the GEE method when missing responses are handled through hot-deck imputation. We first argue why, and further show how, the survey weights can be incorporated into the so-called pseudo GEE method under a joint randomization framework. The consistency of the resulting pseudo GEE estimators with complete responses is established under the proposed framework.
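The survey-weighted estimating equations referred to here are not written out in the abstract; under standard GEE notation, a plausible form of the pseudo GEE is $\sum_{i \in s} w_i \, D_i^{\top} V_i^{-1} \{ y_i - \mu_i(\beta) \} = 0$, where $s$ is the sample, $w_i$ is the survey weight of unit $i$, $y_i$ is the vector of repeated responses, $\mu_i(\beta)$ is the marginal mean model, $D_i = \partial \mu_i / \partial \beta^{\top}$, and $V_i$ is a working covariance matrix. This is a sketch based on the standard survey-weighted estimating-equation approach, not a quotation of the thesis's exact formulation.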
The main focus of this research is to extend the proposed pseudo GEE method to cover cases where the missing responses are imputed through the hot-deck method. Both weighted and unweighted hot-deck imputation procedures are considered. The consistency of the pseudo GEE estimators under imputation for missing responses is established for both procedures. Linearization variance estimators are developed for the pseudo GEE estimators under the assumption that the finite population sampling fraction is small or negligible, a scenario that typically holds for large-scale population surveys.
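The imputation procedures themselves are not spelled out in the abstract; the following Python sketch illustrates one common form of weighted random hot-deck imputation within imputation classes, consistent with the general idea (the function and its details are illustrative assumptions, not code from the thesis):

    import numpy as np

    def weighted_hot_deck(y, w, classes, seed=None):
        # Impute each missing y by drawing a donor from the same imputation
        # class, with selection probability proportional to the donor's
        # survey weight (use equal probabilities for the unweighted variant).
        rng = np.random.default_rng(seed)
        y = y.astype(float).copy()
        missing = np.isnan(y)
        for c in np.unique(classes):
            donors = (classes == c) & ~missing
            recips = (classes == c) & missing
            if donors.any() and recips.any():
                idx = np.flatnonzero(donors)
                p = w[idx] / w[idx].sum()
                y[recips] = y[rng.choice(idx, size=recips.sum(), p=p)]
        return y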
The finite-sample performance of the proposed estimators is investigated through an extensive simulation study. The results show that the pseudo GEE estimators and the linearization variance estimators perform well under several sampling designs, for both continuous and binary responses.
|
512 |
Training of Template-Specific Weighted Energy Function for Sequence-to-Structure AlignmentLee, En-Shiun Annie January 2008 (has links)
Threading is a protein structure prediction method that uses a library of template protein structures in two steps: first, the target sequence is matched against the template library and the best template structure is selected; second, the target structure is modeled on this selected template structure. The decelerating rate at which new folds are added to the Protein Data Bank suggests that the template structure library is approaching completion. This thesis uses a new set of template-specific weights to improve the energy function for sequence-to-structure alignment in the template selection step of the threading process. The weights are estimated by least squares, with the quality of the modelling step in the threading process as the label. These new weights show an average 12.74% improvement in estimating the label. Further family analysis shows a correlation between the performance of the new weights and the number of seed sequences in Pfam.
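A minimal Python sketch of the least-squares weight training described above (the energy-term matrix and quality labels are invented placeholders for the thesis's actual features and labels):

    import numpy as np

    # Hypothetical data: rows are alignments scored against one template,
    # columns are the individual energy terms of the alignment score.
    E = np.array([[1.2, 0.4, 3.1],
                  [0.8, 1.1, 2.0],
                  [1.5, 0.2, 2.7]])
    # Modelling-quality labels for those alignments (values invented
    # for illustration, e.g. model-to-native similarity scores).
    q = np.array([0.62, 0.81, 0.55])

    # Template-specific weights minimizing ||E @ w - q||^2.
    w, *_ = np.linalg.lstsq(E, q, rcond=None)
    score = E @ w   # weighted energy, used to rank templates at selection time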
|
513 |
Value at Risk: A Standard Tool in Measuring Risk : A Quantitative Study on Stock PortfolioOfe, Hosea, Okah, Peter January 2011 (has links)
The role of risk management has gained momentum in recent years, most notably after the recent financial crisis. This thesis uses a quantitative approach to evaluate the theory of value at risk (VaR), which is considered a benchmark for measuring financial risk. The thesis makes use of both parametric and non-parametric approaches to evaluate the effectiveness of VaR as a standard tool for measuring the risk of a stock portfolio. This study applies the normal distribution, the Student's t-distribution, historical simulation, and the exponentially weighted moving average, at 95% and 99% confidence levels, to the stock returns of Sony Ericsson, the three-month Swedish Treasury bill (STB3M), and Nordea Bank. The evaluations of the VaR models are based on the Kupiec (1995) test. From a general perspective, the results of the study indicate that VaR as a proxy of risk measurement has some imprecision in its estimates. However, this imprecision is not the same across all the approaches. The results indicate that models which assume normality of the return distribution perform worse at both confidence levels than models which assume fatter tails or have leptokurtic characteristics. Another notable finding is that during periods of high volatility, such as the financial crisis of 2008, the imprecision of VaR estimates increases. For the parametric approaches, the t-distribution VaR estimates were accurate at the 95% confidence level, while the normal distribution approach produced inaccurate estimates at the 95% confidence level; both approaches were unable to provide accurate estimates at the 99% confidence level. For the non-parametric approaches, the exponentially weighted moving average outperformed the historical simulation approach at the 95% confidence level, while at the 99% confidence level both approaches performed about equally. The results of this study thus question the reliance on VaR as a standard tool for measuring the risk of a stock portfolio. They also suggest that more research should be done to improve the accuracy of VaR approaches, given that the role of risk management in today's business environment is greater than ever before. The study suggests that VaR should be complemented with other risk measures, such as extreme value theory and stress testing, and that more than one back-testing technique should be used to test the accuracy of VaR.
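A compact Python sketch of the four VaR approaches and the Kupiec back-test used in this evaluation (the decay factor 0.94 and the degrees of freedom are conventional illustrative choices, not values reported by the thesis):

    import numpy as np
    from scipy import stats

    def var_estimates(returns, alpha=0.95, lam=0.94, df=5):
        # One-day VaR (reported as a positive loss fraction) for a NumPy
        # array of daily returns, under the four approaches named above.
        sigma = returns.std(ddof=1)
        s2 = returns[0] ** 2                 # EWMA (RiskMetrics-style) variance
        for r in returns[1:]:
            s2 = lam * s2 + (1 - lam) * r ** 2
        z = stats.norm.ppf(1 - alpha)
        t = stats.t.ppf(1 - alpha, df) * np.sqrt((df - 2) / df)
        return {"normal": -z * sigma,
                "student_t": -t * sigma,
                "historical": -np.quantile(returns, 1 - alpha),
                "ewma": -z * np.sqrt(s2)}

    def kupiec_pof_pvalue(x, n, alpha=0.95):
        # Kupiec (1995) proportion-of-failures test: x VaR violations in n days.
        p = 1 - alpha
        if x == 0 or x == n:
            return np.nan                    # likelihood ratio degenerate at the boundary
        lr = -2 * (x * np.log(p) + (n - x) * np.log(1 - p)
                   - x * np.log(x / n) - (n - x) * np.log(1 - x / n))
        return stats.chi2.sf(lr, df=1)       # small p-value -> reject the VaR model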
|
514 |
New Calibration Approaches in Solid Phase Microextraction for On-Site AnalysisChen, Yong January 2004 (has links)
Calibration methods for quantitative on-site sampling using solid phase microextraction (SPME) were developed based on diffusion mass transfer theory. This was investigated using adsorptive polydimethylsiloxane/divinylbenzene (PDMS/DVB) and Carboxen/polydimethylsiloxane (CAR/PDMS) SPME fiber coatings with volatile aromatic hydrocarbons (BTEX: benzene, toluene, ethylbenzene, and o-xylene) as test analytes. Parameters that affected the extraction process (sampling time, analyte concentration, water velocity, and temperature) were investigated. Very short sampling times (10-300 s) and sorbents with a strong affinity and large capacity were used to ensure a 'zero sink' effect during the calibration process. It was found that the mass uptake of analyte changed linearly with concentration. An increase in water velocity increased mass uptake, though the increase was not linear. Temperature did not affect mass uptake significantly under typical field sampling conditions. To further describe rapid SPME analysis of aqueous samples, a new model translated from heat transfer to a circular cylinder in cross flow was used. An empirical correlation to this model was used to predict the mass transfer coefficient. Findings indicated that the predicted mass uptake compared well with experimental mass uptake. The new model also predicted rapid air sampling accurately. To further integrate the sampling and analysis processes, especially for on-site or in vivo investigations where the composition of the sample matrix is very complicated and/or agitation of the sample matrix is variable or unknown, a new approach for calibration was developed. This involved loading internal standards onto the extraction fiber prior to the extraction step. During sampling, the standard partially desorbs into the sample matrix, and the rate at which this process occurs was used for calibration. The kinetics of the absorption/desorption were investigated, and the isotropy of the two processes was demonstrated, thus validating this approach for calibration. A modified SPME device was used as a passive sampler to determine the time-weighted average (TWA) concentration of volatile organic compounds (VOCs) in air. The sampler collects the VOCs by molecular diffusion and sorption onto a coated fiber as the collection medium. This process was shown to be described by Fick's first law of diffusion, whereby the amount of analyte accumulated over time enables measurement of the TWA concentration to which the sampler was exposed (a sketch of this calculation follows this abstract). TWA passive sampling with an SPME device was shown to be almost independent of face velocity, and to be more tolerant of high and low analyte concentrations and long and short sampling times, because of the ease with which the diffusional path length could be changed. Environmental conditions (temperature, pressure, relative humidity, and ozone) had little or no effect on the sampling rate. When the SPME device was tested in the field and the results were compared with those from National Institute for Occupational Safety and Health (NIOSH) method 1501, good agreement was obtained. To facilitate the use of SPME for field sampling, a new field sampler was designed and tested. The sampler was versatile and user-friendly. The SPME fiber can be positioned precisely inside the needle for TWA sampling, or exposed completely outside the needle for rapid sampling. The needle is protected within a shield at all times, thereby eliminating the risk of operator injury and fiber damage. A replaceable Teflon cap is used to seal the needle to preserve sample integrity. Factors that affect the preservation of sample integrity (sorbent efficiency, temperature, and sealing materials) were studied. The use of a highly efficient sorbent is recommended as the first choice for the preservation of sample integrity. Teflon was a good material for sealing the fiber needle, had little memory effect, and could be used repeatedly. To address the adsorption of high-boiling-point compounds on fiber needles, several kinds of deactivated needles were evaluated. RSC-2 blue fiber needles were the most effective. A preliminary field sampling investigation demonstrated the validity of the new SPME device for field applications.
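The TWA calculation referenced above follows from Fick's first law for a diffusive sampler: the amount of analyte collected through a needle opening of area A over a diffusion path of length Z during time t gives the average concentration as C = nZ/(DAt). A minimal Python sketch, with purely illustrative numbers (none of these values are taken from the thesis):

    def twa_concentration(n_sorbed, Z, D, A, t):
        # Fick's first law rearranged for a fiber-in-needle diffusive sampler:
        # C_TWA = n * Z / (D * A * t)
        # n_sorbed: amount of analyte on the fiber (mol)
        # Z: diffusion path length, fiber tip to needle opening (cm)
        # D: gas-phase diffusion coefficient of the analyte (cm^2/s)
        # A: cross-sectional area of the needle opening (cm^2)
        # t: exposure time (s)
        return n_sorbed * Z / (D * A * t)    # mol/cm^3

    # Illustrative numbers only (benzene-like diffusion coefficient):
    C = twa_concentration(n_sorbed=2.0e-12, Z=0.3, D=0.088, A=5.0e-4, t=3600.0)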
|
517 |
Composition of Tree Series TransformationsMaletti, Andreas 12 November 2012 (has links)
Tree series transformations computed by bottom-up and top-down tree series transducers are called bottom-up and top-down tree series transformations, respectively. (Functional) compositions of such transformations are investigated. It turns out that the class of bottom-up tree series transformations over a commutative and complete semiring is closed under left-composition with linear bottom-up tree series transformations and right-composition with boolean deterministic bottom-up tree series transformations. Moreover, it is shown that the class of top-down tree series transformations over a commutative and complete semiring is closed under right-composition with linear, nondeleting top-down tree series transformations. Finally, the composition of a boolean, deterministic, total top-down tree series transformation with a linear top-down tree series transformation is shown to be a top-down tree series transformation.
|
518 |
Algorithm Design and Analysis for Large-Scale Semidefinite Programming and Nonlinear ProgrammingLu, Zhaosong 24 June 2005 (has links)
The limiting behavior of weighted paths associated with the semidefinite program (SDP) map $X^{1/2}SX^{1/2}$ was studied and some applications to error bound analysis and superlinear convergence of a class of primal-dual interior-point methods were provided. A new approach for solving large-scale well-structured sparse SDPs via a saddle point mirror-prox algorithm with $\mathcal{O}(\epsilon^{-1})$ efficiency was developed based on exploiting sparsity structure and reformulating SDPs into smooth convex-concave saddle point problems. An iterative solver-based long-step primal-dual infeasible path-following algorithm for convex quadratic programming (CQP) was developed. The search directions of this algorithm were computed by means of a preconditioned iterative linear solver. A uniform bound, depending only on the CQP data, on the number of iterations performed by a preconditioned iterative linear solver was established. A polynomial bound on the number of iterations of this algorithm was also obtained. An efficient ``nearly exact'' type of method for solving large-scale ``low-rank'' trust region subproblems was proposed that completely avoids the computation of Cholesky or partial Cholesky factorizations. A computational study of this method was also provided by applying it to solve some large-scale nonlinear programming problems.
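For context, weighted central paths of the kind mentioned above are usually defined by replacing the centrality condition $XS = \nu I$ with a weighted version; a standard formulation (stated here as background, not quoted from the thesis) requires $\mathcal{A}(X) = b$, $\mathcal{A}^{*}(y) + S = C$, and $X^{1/2} S X^{1/2} = \nu W$ with $X, S \succ 0$, where $W \succ 0$ is a fixed weight matrix and the path is followed as $\nu \downarrow 0$; the ordinary central path is recovered at $W = I$.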
|
519 |
Analysis of Taiwan Stock Exchange high frequency transaction dataHsu, Chia-Hao 06 July 2012
The Taiwan Security Market is a typical order-driven market. The electronic trading system of the Taiwan Security Market, launched in 1998, significantly reduces the trade-matching time (the current matching time is around 20 seconds) and promptly provides updated online trading information to traders. In this study, we establish an online transaction simulation system which can be applied to predict trade prices and study market efficiency. Models are established for the times and volumes of the newly added bid/ask orders on the match list. The exponentially weighted moving average (EWMA) method is adopted to update the model parameters. Match prices are predicted dynamically based on the EWMA-updated models. Further, high frequency bid/ask order data are used to find the supply and demand curves as well as the equilibrium prices. Differences between the transaction prices and the equilibrium prices are used to investigate the efficiency of the Taiwan Security Market. Finally, EWMA and cusum control charts are used to monitor the market efficiency. In the empirical study, we analyze the intraday high frequency match data (April 2005) of Uni-president Enterprises Corporation and Formosa Plastics Corporation.
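The EWMA updating of model parameters mentioned above is the standard recursion; a minimal Python sketch (the smoothing constant and the values are illustrative, not from the study):

    def ewma_update(prev, obs, lam=0.1):
        # EWMA recursion: recent observations receive geometrically larger weight.
        return lam * obs + (1 - lam) * prev

    # Dynamically update a model parameter as new matches arrive,
    # e.g. the mean time between matches in seconds (values illustrative):
    theta = 20.0
    for x in (18.5, 21.0, 19.2):
        theta = ewma_update(theta, x)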
|
520 |
A characterization of weight function for construction of minimally-supported D-optimal designs for polynomial regression via differential equationChang, Hsiu-ching 13 July 2006 (has links)
In this paper we investigate (d + 1)-point D-optimal designs for d-th degree polynomial
regression with weight function w(x) > 0 on the interval [a, b]. Suppose that w'(x)/w(x) is a rational function and the information of whether the optimal support
contains the boundary points a and b is available. Then the problem of constructing
(d + 1)-point D-optimal designs can be transformed into a differential equation
problem leading us to a certain matrix with k auxiliary unknown constants. We characterize the weight functions corresponding to the cases when k= 0 and k= 1.
Then, we can solve (d + 1)-point D-optimal designs directly from differential equation
(k = 0) or via eigenvalue problems (k = 1). The numerical results show us an interesting relationship between optimal designs and ordered eigenvalues.
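The thesis solves this problem analytically via the differential equation; for comparison, a direct numerical sketch of the same (d + 1)-point D-optimality criterion (the weight function and degree here are chosen purely for illustration):

    import numpy as np
    from scipy.optimize import minimize

    d, a, b = 3, -1.0, 1.0
    w = lambda x: np.exp(-x ** 2)        # illustrative weight function

    def neg_logdet(x):
        # (d+1)-point design with equal mass 1/(d+1); information matrix
        # M = (1/(d+1)) * sum_i w(x_i) f(x_i) f(x_i)^T, f(x) = (1, x, ..., x^d).
        F = np.vander(x, d + 1, increasing=True)
        M = F.T @ (w(x)[:, None] * F) / (d + 1)
        sign, logdet = np.linalg.slogdet(M)
        return -logdet if sign > 0 else np.inf

    x0 = np.linspace(a, b, d + 1)        # start from equally spaced points
    res = minimize(neg_logdet, x0, bounds=[(a, b)] * (d + 1), method="L-BFGS-B")
    support = np.sort(res.x)             # candidate D-optimal support points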
|