1 |
Measuring Student Growth with the Conditional Growth Chart Method / Shang, Yi / January 2009
Thesis advisor: Henry Braun / The measurement of student academic growth is one of the most important statistical tasks in an educational accountability system. The current methods of measuring student growth adopted in most states have various drawbacks in terms of sensitivity, accuracy, and interpretability. In this thesis, we apply the conditional growth chart method, a well-developed diagnostic tool in pediatrics, to student longitudinal test data to produce descriptive and diagnostic statistics about students' academic growth trajectory. We also introduce an innovative simulation-extrapolation (SIMEX) method which corrects for measurement error-induced bias in the estimation of the conditional growth model. Our simulation study shows that the proposed method has an advantage in terms of mean squared error of the estimators, when compared with the growth model that ignores measurement error. Our data analysis demonstrates that the conditional growth chart method, when combined with the SIMEX method, can be a powerful tool in the educational accountability system. It produces more sensitive and accurate measures of student growth than the other currently available methods; it provides diagnostic information that is easily understandable to teachers, parents and students themselves; the individual level growth measures can also be aggregated to school level as an indicator of school growth. / Thesis (PhD) — Boston College, 2009. / Submitted to: Boston College. Lynch School of Education. / Discipline: Educational Research, Measurement, and Evaluation.
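The SIMEX idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's conditional growth model: it assumes a simple linear regression, a predictor observed with additive Gaussian error of known standard deviation, and a quadratic extrapolant. The function name `simex_slope` and all parameter names are hypothetical.

```python
import numpy as np

def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), n_rep=100, seed=0):
    """Return (naive slope, SIMEX slope) for y regressed on w = x + u.

    Assumes u ~ N(0, sigma_u^2) with sigma_u known (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)

    def ols_slope(x):
        # Ordinary least squares slope of y on x.
        return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

    # Simulation step: add extra error with variance lambda * sigma_u^2,
    # refit, and average the slope over n_rep replicates for each lambda.
    lam_grid = [0.0]
    slopes = [ols_slope(w)]
    for lam in lambdas:
        reps = [
            ols_slope(w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.size))
            for _ in range(n_rep)
        ]
        lam_grid.append(lam)
        slopes.append(np.mean(reps))

    # Extrapolation step: fit a quadratic in lambda and evaluate it at
    # lambda = -1, which corresponds to no measurement error.
    coeffs = np.polyfit(lam_grid, slopes, deg=2)
    return slopes[0], float(np.polyval(coeffs, -1.0))
```

The quadratic extrapolant is only approximate, so the corrected slope typically reduces the attenuation bias without removing it entirely.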
|
2 |
A Bayesian approach to energy monitoring optimization / Carstens, Herman / January 2017
This thesis develops methods for reducing energy Measurement and Verification (M&V) costs through
the use of Bayesian statistics. M&V quantifies the savings of energy efficiency and demand side
projects by comparing the energy use in a given period to what that use would have been, had no
interventions taken place. The case of a large-scale lighting retrofit study, where incandescent lamps
are replaced by Compact Fluorescent Lamps (CFLs), is considered. These projects often need to be
monitored over a number of years with a predetermined level of statistical rigour, making M&V very
expensive.
M&V lighting retrofit projects have two interrelated uncertainty components that need to be addressed,
and which form the basis of this thesis. The first is the uncertainty in the annual energy use of the
average lamp, and the second the persistence of the savings over multiple years, determined by the
number of lamps that are still functioning in a given year. For longitudinal projects, the results from
these two aspects need to be obtained for multiple years.
This thesis addresses these problems by using the Bayesian statistical paradigm. Bayesian statistics is
still relatively unknown in M&V, and presents an opportunity for increasing the efficiency of statistical
analyses, especially for such projects.
After a thorough literature review, especially of measurement uncertainty in M&V, and an introduction
to Bayesian statistics for M&V, three methods are developed. These methods address the three types
of uncertainty in M&V: measurement, sampling, and modelling. The first method is a low-cost energy
meter calibration technique. The second method is a Dynamic Linear Model (DLM) with Bayesian
Forecasting for determining the size of the metering sample that needs to be taken in a given year.
The third method is a Dynamic Generalised Linear Model (DGLM) for determining the size of the
population survival survey sample.
It is often required by law that M&V energy meters be calibrated periodically by accredited laboratories.
This can be expensive and inconvenient, especially if the facility needs to be shut down for meter
installation or removal. Some jurisdictions also require meters to be calibrated in situ, that is, in their operating
environments. However, it is shown that metering uncertainty makes a relatively small contribution to
overall M&V uncertainty in the presence of sampling, and therefore the costs of such laboratory
calibration may outweigh the benefits. The proposed technique uses another commercial-grade meter
(which also measures with error) to achieve this calibration in-situ. This is done by accounting for the
mismeasurement effect through a mathematical technique called Simulation Extrapolation (SIMEX).
The SIMEX result is refined using Bayesian statistics, and achieves acceptably low error rates and
accurate parameter estimates.
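The claim above that metering uncertainty contributes little once sampling uncertainty is present can be illustrated with a small Python sketch. It assumes the two components are independent and combine in quadrature (root sum of squares); the 2% and 10% figures are hypothetical and are not taken from the thesis.

```python
import math

def combined_relative_uncertainty(*components):
    """Root-sum-of-squares of independent relative uncertainties."""
    return math.sqrt(sum(c * c for c in components))

metering = 0.02  # hypothetical 2% metering uncertainty
sampling = 0.10  # hypothetical 10% sampling uncertainty

total = combined_relative_uncertainty(metering, sampling)
print(f"sampling only: {sampling:.4f}, sampling plus metering: {total:.4f}")
```

Under these assumed figures the combined uncertainty is about 0.102, barely above the sampling component alone.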
The second technique uses a DLM with Bayesian forecasting to quantify the uncertainty in metering
only a sample of the total population of lighting circuits. A Genetic Algorithm (GA) is then applied
to determine an efficient sampling plan. Bayesian statistics is especially useful in this case because
it allows the results from previous years to inform the planning of future samples. It also allows for
exact uncertainty quantification, which current confidence interval techniques do not always provide.
Results show a cost reduction of up to 66%, but this depends on the costing scheme used. The study
then explores the robustness of the efficient sampling plans to forecast error, and finds a 50% chance
of undersampling for such plans, because the standard M&V sampling formula lacks statistical
power.
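The way a DLM lets earlier years inform later ones can be sketched in Python with the simplest case, a local-level (random walk plus noise) model with known variances. This is an illustrative assumption, not the model specified in the thesis; the function and parameter names are hypothetical.

```python
def local_level_filter(observations, obs_var, state_var, prior_mean, prior_var):
    """Kalman filter for a local-level DLM with known variances.

    Returns the one-step-ahead forecast (mean, variance) made before each
    observation, and the final posterior (mean, variance) of the level.
    """
    m, c = prior_mean, prior_var
    forecasts = []
    for y in observations:
        # Evolution: the level follows a random walk, so uncertainty grows.
        a, r = m, c + state_var
        # One-step forecast of the next observation.
        f, q = a, r + obs_var
        forecasts.append((f, q))
        # Update: weigh the forecast error by the adaptive gain k.
        k = r / q
        m = a + k * (y - f)
        c = (1.0 - k) * r
    return forecasts, (m, c)
```

Each year's posterior becomes the next year's prior, so the forecast variance available for planning the next sample shrinks as data accumulate.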
The third technique uses a DGLM in the same way as the DLM, but for population survival
survey samples and persistence studies rather than metering samples. Convolving the binomial survey result
distributions inside a GA is problematic, and instead of Monte Carlo simulation, a relatively new
technique called Mellin Transform Moment Calculation is applied to the problem. The technique is
then expanded to model stratified sampling designs for heterogeneous populations. Results show a
cost reduction of 17-40%, although this depends on the costing scheme used.
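The abstract does not describe the Mellin Transform Moment Calculation itself. The Python sketch below only illustrates the property such moment methods rely on: for independent non-negative random variables, the moments of a product are the products of the moments, so no Monte Carlo simulation is needed. The function name and the example quantities are hypothetical.

```python
def product_mean_var(mean_x, var_x, mean_y, var_y):
    """Mean and variance of X * Y for independent X and Y, from their moments."""
    # Second raw moments: E[X^2] = Var(X) + E[X]^2, and likewise for Y.
    ex2 = var_x + mean_x ** 2
    ey2 = var_y + mean_y ** 2
    # Independence gives E[(XY)^k] = E[X^k] * E[Y^k] for k = 1, 2.
    mean_xy = mean_x * mean_y
    var_xy = ex2 * ey2 - mean_xy ** 2
    return mean_xy, var_xy
```

For example, X could stand for the energy saved per surviving lamp and Y for the number of surviving lamps, with their product giving the total saving.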
Finally, the DLM and DGLM are combined into an efficient overall M&V plan where metering and
survey costs are traded off over multiple years, while still adhering to statistical precision constraints.
This is done for simple random sampling and stratified designs. Monitoring costs are reduced by
26-40% for the costing scheme assumed.
The results demonstrate the power and flexibility of Bayesian statistics for M&V applications, both in
terms of exact uncertainty quantification, and by increasing the efficiency of the study and reducing
monitoring costs. / This thesis develops methods by which the costs of energy monitoring and verification (M&V)
can be reduced through Bayesian statistics. M&V determines the amount of savings that can be achieved by
energy efficiency and demand side management projects. This is done by comparing the
energy use in a given period with what it would have been had no intervention taken
place. A large-scale lighting retrofit study, in which incandescent filament lamps are replaced with
energy-saving fluorescent lamps, serves as a case study. Such projects usually have to be monitored over many years with
a fixed statistical accuracy, which can make M&V expensive.
Two related uncertainty components must be addressed in M&V lighting projects, and they form
the basis of this thesis. The first is the uncertainty in the annual energy use
of the average lamp. The second is the persistence of the savings over multiple
years, which is determined by the number of lamps that survive to a given year. For longitudinal
projects, these two components must be determined over multiple years.
This thesis addresses the problem by means of a Bayesian paradigm. Bayesian
statistics is still relatively unknown in M&V, and offers an opportunity to increase the efficiency of
statistical analyses, especially for the projects mentioned above.
The thesis begins with a thorough literature review, especially with regard to measurement uncertainty
in M&V. An introduction to Bayesian statistics for M&V is then presented, and three methods
are developed. These methods address the three main sources of uncertainty in M&V: measurement,
sampling, and modelling. The first method is a low-cost energy meter calibration technique. The
second method is a Dynamic Linear Model (DLM) with Bayesian forecasting, with which metering
sample sizes can be determined. The third method is a Dynamic Generalised Linear Model
(DGLM), with which population survival sample sizes can be determined.
By law, M&V energy meters must be calibrated regularly by accredited laboratories. This can be
expensive and inconvenient, especially if the facility has to be shut down during meter removal and installation.
Some jurisdictions also require meters to be calibrated in situ, that is, in their operating environments.
Nevertheless, it is shown that measurement uncertainty makes up a small part of the overall M&V uncertainty,
especially when sampling is done. This calls into question the cost-benefit of laboratory calibration.
The proposed technique uses another commercial-grade meter
(which itself contains a non-negligible measurement error) to achieve the calibration in situ. This is done
by reducing the measurement error through Simulation Extrapolation (SIMEX). The SIMEX result
is then improved using Bayesian statistics, and achieves acceptable error ranges and accurate
parameter estimates.
The second technique uses a DLM with Bayesian forecasting to estimate the uncertainty in metering
a sample of the overall population. A Genetic Algorithm (GA) is
then applied to find efficient sample sizes. Bayesian statistics is especially useful in this
case because it can use the results of previous years to inform current estimates. It also allows
exact uncertainty estimation, whereas standard confidence interval techniques do
not. Results show a cost saving of up to 66%. The study then investigates the robustness of
cost-efficient sampling plans in the presence of forecast errors. It is found
that cost-efficient sample sizes are too small 50% of the time, owing to the lack of statistical
power in the standard M&V formulas.
The third technique uses a DGLM in the same way as the DLM, except that population survival sample sizes
are investigated. Convolving binomial survey results inside the GA creates a
problem, and instead of a Monte Carlo simulation the relatively new Mellin Transform
Moment Calculation is applied to the problem. The technique is then extended to find stratified
sampling designs for heterogeneous populations. The results show a 17-40% cost reduction,
although this depends on the costing scheme.
Finally, the DLM and DGLM are combined to design an efficient overall M&V plan in which metering
and survey costs are traded off against each other. This is done for simple and stratified
sampling designs. Monitoring costs are reduced by 26-40%, but this depends on the assumed
costing scheme.
The results demonstrate the power and flexibility of Bayesian statistics for M&V applications, both for
exact uncertainty quantification and by increasing the efficiency of data use and
thereby reducing monitoring costs. / Thesis (PhD)--University of Pretoria, 2017. / National Research Foundation / Department of Science and Technology / National Hub for the Postgraduate
Programme in Energy Efficiency and Demand Side Management / Electrical, Electronic and Computer Engineering / PhD / Unrestricted
|
3 |
Dependent Berkson errors in linear and nonlinear models / Althubaiti, Alaa Mohammed A. / January 2011
Predictor variables in regression models are often measured with error. This is known as an errors-in-variables (EIV) problem. A statistical analysis of the data that ignores the EIV is called a naive analysis. Under a naive analysis, the variance of the errors is underestimated. This affects any statistical inference that may subsequently be made about the model parameter estimates or the response prediction. In some cases (e.g. quadratic polynomial models) the parameter estimates and the model predictions are biased. The errors can occur in different ways. They are mainly classified as classical (i.e. occurring in observational studies) or Berkson type (i.e. occurring in designed experiments). This thesis addresses the problem of Berkson EIV and their effect on the statistical analysis of data fitted using linear and nonlinear models. In particular, the case when the errors are dependent and have heterogeneous variance is studied. Both analytical and empirical tools have been used to develop new approaches for dealing with this type of error. Two different scenarios are considered: mixture experiments where the model to be estimated is linear in the parameters and the EIV are correlated; and bioassay dose-response studies where the model to be estimated is nonlinear. EIV following a Gaussian distribution, as well as the much less investigated non-Gaussian distributions, are examined. When the errors occur in mixture experiments, both analytical and empirical results show that the naive analysis produces biased and inefficient estimators for the model parameters. The magnitude of the bias depends on the variances of the EIV for the mixture components, the model and its parameters. First- and second-order Scheffé polynomials are used to fit the response. To adjust for the EIV, four different correction approaches are proposed. The statistical properties of the estimators are investigated and compared with those of the naive analysis estimators.
Analytical and empirical weighted regression calibration methods are found to give the most accurate and efficient results. These approaches require the error variance to be known prior to the analysis. The robustness of the adjusted approaches to a misspecified variance is also examined. Different EIV scenarios for the concentration settings in bioassay dose-response studies are studied (i.e. dependent and independent errors). The scenarios are motivated by real-life examples. The effects of the errors are compared using the 4-parameter Hill model. The results show that when the errors are non-Gaussian, the nonlinear least squares approach produces biased and inefficient estimators. An extension of the well-known simulation-extrapolation (SIMEX) method, called Berkson simulation-extrapolation (BSIMEX), is developed for the case when the EIV lead to biased model parameter estimators. BSIMEX requires the error variance to be known. The robustness of the adjusted approach to a misspecified variance is examined. Moreover, it is shown that BSIMEX performs better than the regression calibration methods when the EIV are dependent, while the regression calibration methods are preferable when the EIV are independent.
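To make the bioassay setting concrete, the Python sketch below evaluates a common parameterisation of the 4-parameter Hill model and shows how Berkson error in the concentration can shift the average response away from the response at the nominal concentration. The parameterisation, the multiplicative log-normal error, and all names are assumptions of this sketch, not details taken from the thesis.

```python
import numpy as np

def hill(conc, bottom, top, ec50, slope):
    """4-parameter Hill curve: response as a function of concentration (> 0)."""
    conc = np.asarray(conc, dtype=float)
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)

def mean_response_under_berkson(nominal, sigma, params, n_draws=100_000, seed=0):
    """Average response when the true concentration scatters around the nominal one.

    Berkson structure assumed here: true = nominal * exp(e), e ~ N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)
    true_conc = nominal * np.exp(sigma * rng.standard_normal(n_draws))
    return float(hill(true_conc, *params).mean())

params = (0.0, 100.0, 10.0, 2.0)  # hypothetical bottom, top, EC50, slope
at_nominal = float(hill(2.0, *params))
averaged = mean_response_under_berkson(2.0, sigma=0.5, params=params)
print(f"response at nominal dose: {at_nominal:.2f}, mean under Berkson error: {averaged:.2f}")
```

Because the curve is nonlinear, the mean response under error differs from the response at the nominal dose, which is one way a naive nonlinear fit can end up biased.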
|