1 |
Optimal weight settings in locally weighted regression: A guidance through cross-validation approach
Puri, Roshan, January 2023
Locally weighted regression (LWR) is a powerful tool that allows the estimation of a different set of coefficients for each location in the underlying data, relaxing the assumption of stationary regression coefficients across a study region. The accuracy of LWR largely depends on how a researcher establishes the relationship across
locations, which is often constructed using a weight matrix or function. This paper explores the different
kernel functions used to assign weights to observations, including Gaussian, bi-square, and tri-cubic, and
how the choice of weight variables and window size affects the accuracy of the estimates. We guide this
choice through the cross-validation approach and show that the bi-square function outperforms the other kernel functions. Our findings demonstrate that the optimal window size for LWR models depends on the cross-validation (CV) approach employed. In our empirical application, full-sample CV favors a larger window size, while CV by proxy favors a smaller one. Since the CV by proxy approach focuses on the predictive ability of the model in the vicinity of one specific point (usually a policy point/site), guiding model choice through this approach makes more intuitive sense when the researcher's aim is to predict the outcome at one specific site (policy or target point). To
identify the optimal weight variables, we suggest exploring various combinations of weight variables, but argue that an efficient alternative is to merge all continuous variables in the dataset into a single weight variable. / M.A. / Locally weighted regression (LWR) is a statistical technique that establishes a relationship between dependent
and explanatory variables, focusing primarily on data points in proximity to a specific point of
interest/target point. This technique assigns varying degrees of importance to the observations that are in
proximity to the target point, thereby allowing for the modeling of relationships that may exhibit spatial
variability within the dataset.
The accuracy of LWR largely depends on how researchers define relationships across different locations/studies,
which is often done using a “weight setting”. We define weight setting as a combination of weight
functions (determines how the observations around a point of interest are weighted before they enter the
model), weight variables (determines proximity between the point of interest and all other observations), and
window sizes (determines the number of observations allowed into the local regression). To find the optimal weight setting, that is, the combination of weight function, weight variables, and window size that generates the lowest predictive error, researchers often employ a cross-validation (CV) approach.
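For illustration (not part of the abstract), a minimal sketch of the three kernel weight functions discussed, assuming a distance d from the point of interest and a window size h:

```python
import numpy as np

def gaussian(d, h):
    # Gaussian kernel: weight decays smoothly and never reaches zero
    return np.exp(-0.5 * (d / h) ** 2)

def bi_square(d, h):
    # Bi-square kernel: weight is exactly zero beyond the window h
    u = np.clip(d / h, 0.0, 1.0)
    return (1.0 - u ** 2) ** 2

def tri_cube(d, h):
    # Tri-cubic kernel: also compactly supported, flatter near the center
    u = np.clip(d / h, 0.0, 1.0)
    return (1.0 - u ** 3) ** 3

d = np.array([0.0, 0.5, 1.0, 2.0])
w = bi_square(d, 1.0)  # → array([1.0, 0.5625, 0.0, 0.0])
```

With compactly supported kernels such as bi-square and tri-cubic, observations beyond the window h receive zero weight, which is what makes the window size a binding modeling choice.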
Cross-validation is a statistical method used to assess and validate the performance of a predictive model. It
entails removing a host observation (a point of interest), predicting that point, and evaluating the accuracy of the prediction by comparing it with the actual value.
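As an illustrative sketch (with a hypothetical `fit_predict` routine standing in for the local regression), leave-one-out cross-validation can be written as:

```python
import numpy as np

def loo_cv_error(X, y, fit_predict):
    # Leave-one-out CV: hold out each host observation in turn,
    # predict it from the remaining sample, and average the squared errors.
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        y_hat = fit_predict(X[mask], y[mask], X[i])
        errs.append((y[i] - y_hat) ** 2)
    return float(np.mean(errs))

# usage with a trivial "predictor" that ignores X and returns the mean of y
X = np.zeros((3, 1))
y = np.array([1.0, 2.0, 3.0])
mse = loo_cv_error(X, y, lambda Xt, yt, x0: yt.mean())  # → 1.5
```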
In our study, we employ two CV approaches. The first one is a full-sample CV approach, where we remove
a host observation, and predict it using the full set of observations used in the given local regression. The
second one is the CV by proxy approach, which uses a similar mechanism to check prediction accuracy but focuses only on nearby points that share similar characteristics with the target point.
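A hedged sketch of this idea, assuming a hypothetical `fit_predict` routine and treating the k nearest observations to the target as the proxy set:

```python
import numpy as np

def cv_by_proxy_error(X, y, x_target, fit_predict, k=2):
    # CV by proxy: score held-out predictions only at the k observations
    # nearest the target (policy) point, not over the full sample.
    dist = np.linalg.norm(X - x_target, axis=1)
    proxies = np.argsort(dist)[:k]
    errs = []
    for i in proxies:
        mask = np.arange(len(y)) != i
        y_hat = fit_predict(X[mask], y[mask], X[i])
        errs.append((y[i] - y_hat) ** 2)
    return float(np.mean(errs))
```

Because the error is averaged only over points resembling the target, this criterion can prefer a different window size than full-sample CV does.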
We find that the bi-square function consistently outperforms the Gaussian and tri-cubic weight functions, regardless of the CV approach. However, the choice of an optimal window size in LWR models
depends on the CV approach that we employ. While the full-sample CV method guides us toward the
selection of a larger window size, the CV by proxy directs us toward a smaller window size. In the context of
identifying the optimal weight variables, we recommend exploring various combinations of weight variables.
However, we also propose an efficient alternative, which involves merging all continuous variables within the dataset into a single weight variable instead of striving to identify the best of thousands of different weight
variable settings.
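Putting these pieces together, a minimal sketch of a locally weighted prediction at a target point x0, assuming bi-square weights and a user-chosen window size h (names and data are illustrative, not from the thesis):

```python
import numpy as np

def lwr_predict(X, y, x0, h):
    # Locally weighted regression at x0: weighted least squares with
    # bi-square weights computed from each observation's distance to x0.
    d = np.linalg.norm(X - x0, axis=1)
    u = np.clip(d / h, 0.0, 1.0)
    w = (1.0 - u ** 2) ** 2
    A = np.column_stack([np.ones(len(X)), X])  # intercept + slopes
    W = np.diag(w)
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)  # local coefficients
    return float(np.concatenate(([1.0], x0)) @ beta)
```

Each target point gets its own coefficient vector beta, which is the sense in which LWR relaxes the assumption of stationary coefficients.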
|
2 |
Methodological advances in benefit transfer and hedonic analysis
Puri, Roshan, 19 September 2023
This dissertation introduces advanced statistical and econometric methods in two distinct areas of non-market valuation: benefit transfer (BT) and hedonic analysis. While the first and third chapters address the challenge of estimating the societal benefits of prospective environmental policy changes by adopting the locally weighted regression (LWR) technique in an environmental valuation context, the second chapter combines the output from traditional hedonic regression and matching estimators and provides guidance on choosing models with a low risk of bias in housing market studies.
The economic and societal benefits associated with various environmental conservation programs, such as improvements in water quality or increases in wetland acreage, can be estimated directly using primary studies. However, conducting primary studies can be highly resource-intensive and time-consuming, as they typically involve extensive data collection, sophisticated models, and a considerable investment of financial and human resources. BT offers a practical alternative: employing valuation estimates, functions, or models from prior primary studies to predict the societal benefit of conservation policies at a policy site. Existing studies typically fit a single regression model to all observations within the given metadata and generate a single set of coefficients to predict welfare (willingness to pay) at a prospective policy site. However, a single set of coefficients may not reflect the true relationship between dependent and independent variables, especially when multiple source studies/locations are involved in the data-generating process, which in turn degrades the predictive accuracy of the meta-regression model (MRM). To address this shortcoming, we employ the LWR technique in an environmental valuation context. LWR allows a different set of coefficients to be estimated for each location and used for BT prediction. However, the empirical exercise carried out in the existing literature is computationally demanding and cumbersome to adopt in practice.
In the first chapter, we simplify the experimental setup required for LWR-BT analysis by taking a closer look at the choice of weight variables for different window sizes and weight function settings. We propose a pragmatic solution by suggesting "universal weights" instead of striving to identify the best of thousands of different weight variable settings. We use the water quality metadata employed in the published literature and show that our universal weights generate more efficient and equally plausible BT estimates for policy sites than the best weight variable settings that emerge from a time-consuming cross-validation search over the entire universe of individual variable combinations.
The third chapter expands the scope of LWR to wetland metadata. We use a conceptually similar set of weight variables as in the first chapter and replicate its methodological approach. We show that LWR, under our proposed weight settings, generates substantial gains in both predictive accuracy and efficiency compared to those of a standard globally-linear MRM.
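As an illustrative sketch of the single-weight-variable idea (the variable names are hypothetical, not from the dissertation), all continuous meta-variables can be standardized and collapsed into one distance to the policy point:

```python
import numpy as np

def universal_weight_distance(Z, z0):
    # "Universal weight" sketch: standardize every continuous variable,
    # then collapse them into a single distance to the policy point z0,
    # instead of searching over thousands of weight-variable subsets.
    mu, sd = Z.mean(axis=0), Z.std(axis=0)
    return np.linalg.norm((Z - mu) / sd - (z0 - mu) / sd, axis=1)
```

The resulting distances can then feed any of the kernel weight functions, so the cross-validation search reduces to choosing a kernel and a window size.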
Our second chapter delves into a separate yet interrelated realm of non-market valuation, i.e., hedonic analysis. Here, we explore the combined inferential power of traditional hedonic regression and matching estimators to provide guidance on model choice for housing market studies where researchers aim to estimate an unbiased binary treatment effect in the presence of unobserved spatial and temporal effects. We examine the potential sources of bias within both hedonic regression and basic matching. We discuss the theoretical routes to mitigate these biases and assess their feasibility in practical contexts. We propose a novel route towards unbiasedness, i.e., the "cancellation effect," and illustrate its empirical feasibility while estimating the impact of flood hazards on housing prices. / Doctor of Philosophy / This dissertation introduces novel statistical and econometric methods to better understand the value of environmental resources that do not have an explicit market price, such as the benefits we derive from changes in water quality or wetland size, or the impact of flood-risk zoning on the sale prices of residential properties.
The first and third chapters tackle the challenge of estimating the value of environmental changes, such as cleaner water or more wetlands. To figure out how much people benefit from these changes, we can look at how much they would be willing to pay for improved water quality or increased wetland area. This typically requires conducting a primary survey, which is expensive and time-consuming. Instead, researchers can draw insights from prior studies to predict welfare at a new policy site. This approach is analogous to applying the methodology and/or findings from one research work to another. However, directly applying findings from one context to another assumes uniformity across the different studies, which is unlikely, especially when past studies are associated with different spatial locations. To address this, we propose a "locally weighting" technique, which places greater emphasis on the studies that closely align with the characteristics of the new (policy) context. Determining the weight variables/factors that dictate this alignment is a question that requires empirical investigation.
One recent study applies this locally weighting technique to estimate the benefits of improved water quality and suggests experimenting with different factors to measure the similarity between past and new studies. However, their approach is computationally intensive, making it impractical to adopt. In our first chapter, we propose a more pragmatic solution---using a "universal weight" that does not require assessing multiple factors. With our proposed weights in an otherwise similar context, we find more efficient and equally plausible estimates of the benefits as previous studies. We expand the scope of local weighting to the valuation of gains or losses in wetland areas in the third chapter. We use a conceptually similar set of weight variables and replicate the empirical exercise from the first chapter. We show that the local weighting technique, under our proposed settings, substantially improves the accuracy and efficiency of the estimated benefits associated with changes in wetland acreage. This highlights the broad potential of the local weighting technique in environmental valuation contexts.
The second chapter of this dissertation attempts to understand the impact of flood risk on housing prices. We can use "hedonic regression" to understand how different features of a house, like its size, location, sales year, amenities, and flood-zone location, affect its price. However, if we do not correctly specify this function, the estimates will be misleading. Alternatively, we can use a "matching" technique, where we pair houses inside and outside of the flood zone on all observable characteristics and compare their prices to estimate the flood-zone impact. However, finding houses identical in all household and neighborhood characteristics is practically impossible. We propose that any leftover differences in the features of matched houses can be balanced out by considering where the houses are located (school zone, for example) and when they were sold. We refer to this route as the "cancellation effect" and show that it can indeed be achieved in practice, especially when we pair a single house in a flood zone with many houses outside that zone. This not only allows us to accurately estimate the effect of flood zones on housing prices but also reduces the uncertainty around our findings.
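A minimal sketch of one-to-many matching under these assumptions (hypothetical house data; not the dissertation's actual estimator):

```python
import numpy as np

def one_to_many_effect(price_t, feat_t, price_c, feat_c, k=2):
    # One-to-many matching: pair each treated (flood-zone) house with its
    # k nearest control houses in feature space; the estimated effect is
    # the average price gap across pairs. With many controls per treated
    # house, leftover feature imbalances tend to average out ("cancel").
    gaps = []
    for p, f in zip(price_t, feat_t):
        dist = np.linalg.norm(feat_c - f, axis=1)
        nn = np.argsort(dist)[:k]
        gaps.append(p - price_c[nn].mean())
    return float(np.mean(gaps))
```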
|
3 |
Metody dynamické analýzy složení portfolia / Methods of dynamical analysis of portfolio composition
Meňhartová, Ivana, January 2012
Title: Methods of dynamical analysis of portfolio composition Author: Ivana Meňhartová Department: Department of Probability and Mathematical Statistics Supervisor: Mgr. Tomáš Hanzák, KPMS, MFF UK Abstract: In the presented thesis we study methods used for the dynamic analysis of a portfolio based on its returns. The thesis focuses on the Kalman filter and locally weighted regression as two basic methods for dynamic analysis. It describes the theory behind these methods in detail as well as their use, and discusses their proper settings. Practical applications of both methods to artificial data and to real data from the Prague Stock Exchange are presented. Using artificial data, we demonstrate the practical importance of the Kalman filter's assumptions. Afterwards, we introduce multicollinearity as a possible complication in applications to real data. At the end of the thesis we compare the results and usage of both methods, and we introduce the possibility of enhancing the Kalman filter by projection of estimates or by CUSUM tests (change detection tests). Keywords: Kalman filter, locally weighted regression, multicollinearity, CUSUM test
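For illustration (not from the thesis), a minimal random-walk Kalman filter for tracking a slowly varying level such as a portfolio weight, with assumed state-noise and observation-noise variances q and r:

```python
def kalman_level(ys, q=1e-4, r=1e-2):
    # Minimal scalar Kalman filter: the hidden state is a random-walk
    # level, observed with noise. q and r are assumed noise variances.
    x, p = ys[0], 1.0          # initial state estimate and its variance
    estimates = [x]
    for y in ys[1:]:
        p = p + q              # time update: state uncertainty grows
        k = p / (p + r)        # Kalman gain
        x = x + k * (y - x)    # measurement update
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

Smaller q relative to r yields smoother, slower-adapting estimates; the thesis's comparison with locally weighted regression turns on exactly this kind of smoothing trade-off.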
|
4 |
Active Learning with Statistical Models
Cohn, David A., Ghahramani, Zoubin, Jordan, Michael I., 21 March 1995
For many types of learners one can compute the statistically 'optimal' way to select data. We review how these techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
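As a hedged sketch of the idea (a simple data-support heuristic standing in for the paper's actual variance computation), one could query where a locally weighted learner has the least support:

```python
import numpy as np

def pick_query(candidates, X, h=1.0):
    # Active-learning sketch: among candidate inputs, query the point
    # where the locally weighted learner has the least data support
    # (smallest total kernel weight), a proxy for highest uncertainty.
    def support(x0):
        d = np.linalg.norm(X - x0, axis=1)
        u = np.clip(d / h, 0.0, 1.0)
        return float(((1.0 - u ** 2) ** 2).sum())
    return candidates[int(np.argmin([support(c) for c in candidates]))]
```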
|
5 |
Autonomous Motion Learning for Near Optimal Control
Jennings, Alan Lance, 21 August 2012
No description available.
|
6 |
Změny délek období s charakteristickými teplotami vzduchu / Changes of lengths of periods with characteristic air temperatures
Černochová, Eva, January 2006
Title: Changes of lengths of periods with characteristic air temperatures Author: Eva Černochová Department: Department of Meteorology and Environment Protection Supervisor: doc. RNDr. Jaroslava Kalvová, CSc. Supervisor's e-mail address: jaroslava.kalvova@mff.cuni.cz Abstract: Lengths of periods with characteristic air temperatures were derived using two different methods (linear interpolation, robust locally weighted regression) for 10 stations in the Czech Republic and for output data of the regional climate models HIRHAM and RCAO at 4 grid points. Averages for a forty-year period (1961-2000) and for a thirty-year period (1961-1990) were computed, as well as averages for every decade. Considerable attention was also paid to the analysis of the methods used in the research. Most stations showed a lengthening of the growing season and summer during the twentieth century. The decadal average lengths of the growing season and summer shortened in the years 1971-1980. The comparison of output data of the regional climate models HIRHAM and RCAO with measured station data showed that the thirty-year average lengths of the growing season and summer estimated by the two models were reasonably accurate in approximately half of the cases. The models' estimates of decadal averages were not accurate at all. Keywords: robust locally...
|
7 |
A Reinforcement Learning Controller for Functional Electrical Stimulation of a Human Arm
Thomas, Philip S., January 2009
No description available.
|