51.
L'approche Support Vector Machines (SVM) pour le traitement des données fonctionnelles / Support Vector Machines (SVM) for Functional Data Analysis — Henchiri, Yousri, 16 October 2013
Functional Data Analysis is an important and dynamic area of statistics. It offers effective new tools and proposes new methodological and theoretical developments in the presence of functional data (functions, curves, surfaces, ...). The work presented in this dissertation contributes to the themes of statistical learning and quantile regression when the data can be considered as functions. Special attention is devoted to the Support Vector Machines (SVM) technique, which involves the notion of a Reproducing Kernel Hilbert Space. In this context, the main goal is to extend this nonparametric estimation technique to conditional models with functional data. We investigated the theoretical aspects and practical behavior of the proposed technique, adapted to the following regression models. The first model is the functional quantile regression model in which the covariate takes its values in a bounded subspace of an infinite-dimensional functional space, the response variable takes its values in a compact subset of the real line, and the observations are i.i.d. The second model is the functional additive quantile regression model, in which the real-valued response depends on a vector of functional covariates. The last model is the functional quantile regression model for dependent functional data. We obtained weak consistency results and convergence rates for the estimators in these models. Simulation studies are performed to evaluate the performance of the inference procedures, and applications to chemometric, environmental, and climatic data are considered. The good behavior of the SVM estimator is thus highlighted.
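The thesis's estimators are not reproduced in this listing, but the core ingredients — the pinball loss and an RKHS penalty, with each functional covariate discretized to a vector — can be sketched in a few lines of Python. Everything below (RBF kernel, bandwidth, penalty weight, generic L-BFGS solver, toy curves) is an illustrative assumption, not the procedure studied in the thesis.

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(A, B, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||a_i - b_j||^2)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def pinball(u, tau):
    """Quantile ("pinball") loss rho_tau(u)."""
    return np.where(u >= 0, tau * u, (tau - 1.0) * u)

def kernel_quantile_fit(X, y, tau, gamma=1.0, lam=1e-2):
    """Minimize mean pinball loss + lam * alpha' K alpha for
    f(x) = sum_i alpha_i k(x_i, x) + b.  The loss is only piecewise
    smooth, so generic L-BFGS is a quick-and-dirty choice here; a
    linear-programming formulation would be the rigorous route."""
    K, n = rbf_kernel(X, X, gamma), len(y)

    def obj(theta):
        alpha, b = theta[:n], theta[n]
        return pinball(y - K @ alpha - b, tau).mean() + lam * alpha @ K @ alpha

    theta = minimize(obj, np.zeros(n + 1), method="L-BFGS-B").x
    return lambda Xnew: rbf_kernel(Xnew, X, gamma) @ theta[:n] + theta[n]

# toy "functional" covariates: 80 random curves sampled on a 30-point grid
rng = np.random.default_rng(0)
X = np.cumsum(rng.standard_normal((80, 30)), axis=1) / np.sqrt(30.0)
y = X.mean(axis=1) + 0.3 * rng.standard_normal(80)
f90 = kernel_quantile_fit(X, y, tau=0.9)
print(np.mean(y <= f90(X)))  # in-sample coverage, should sit near 0.9
```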
52.
Normative and quantitative analysis of educational inequalities, with reference to Brazil — Domingues Waltenberg, Fabio, 02 July 2007
The existence of substantial socio-economic inequalities is one of the most fundamental features of Brazilian society. Although educational inequality is not the only source of such socio-economic inequalities, it plays a major role, particularly regarding income inequality, both for current and for future generations. Acquiring a better understanding of the patterns of educational inequalities in Brazil is thus a relevant research topic, with implications for policy-making.
The first part of the thesis contains a conceptual discussion in which we try to determine an appropriate definition of educational justice. We advocate the use of “essential educational achievements” as the relevant “currency of educational justice” and we defend a version of “equality of educational opportunity” in which the responsibility that is assigned to individuals increases as they grow up.
While a remarkable quantitative improvement has taken place recently in Brazil, the situation concerning the quality of education is less clear. To explore qualitative aspects, in the second part, we turn to pupils' performance in standardized tests. Applying usual distributional assessment tools to such data, we map the intensity of educational inequalities in the country. Using recently-developed indices of inequality of opportunity, we assess the fairness of the Brazilian schooling system. Thus we identify both the areas where educational inequality is more intense, and those where educational unfairness is more severe.
In the third part, we use econometric methods to investigate how the reallocation of educational resources could help move the Brazilian educational system towards educational fairness. First, we evaluate the effect of teachers' wages on pupils' achievement; our analysis suggests there is scope for Brazilian public schools to improve their human resources policies, with potential benefits accruing to low-performing pupils. Then, we analyze the reallocations of educational resources required to equalize educational opportunities, and we find that the redistribution of non-monetary inputs could considerably reduce the magnitude of the financial redistribution needed.
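As a rough illustration of the kind of index involved, the sketch below computes the between-type share of inequality under the mean log deviation on simulated scores. The choice of index and the grouping variable are assumptions made here for concreteness, not necessarily the ones used in the thesis.

```python
import numpy as np

def mld(y):
    """Mean log deviation, GE(0): mean of log(mu / y_i), for y > 0."""
    return np.mean(np.log(y.mean() / y))

def between_type_share(scores, types):
    """Share of inequality explained by type membership: MLD of the
    "smoothed" distribution (each pupil replaced by the mean of their
    circumstance type) over total MLD."""
    smoothed = np.array([scores[types == t].mean() for t in types])
    return mld(smoothed) / mld(scores)

rng = np.random.default_rng(0)
types = rng.integers(0, 3, size=5000)                 # e.g. parental education groups
scores = rng.lognormal(mean=5.0 + 0.15 * types, sigma=0.3)
print(between_type_share(scores, types))              # 0 = fair, 1 = fully determined
```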
53.
Les effets de pairs à la lumière des interactions entre élèves et des dimensions subjectives du vécu scolaire / Peer effects in light of interactions between pupils and subjective dimensions of the school experience — Roco Fossa, Rodrigo, 27 June 2011
This thesis addresses the question of peer effects in the school context. Based on a detailed analysis of a large database from a national survey in Chile (SIMCE 2004), it examines the mechanisms through which influence flows between pupils with different endowments of cultural, human, and academic capital. These influences appear to affect various school outcomes, including academic ones. Drawing on the literature from several disciplines — sociology, economics, social psychology, and education science — the thesis discusses ways of identifying and measuring such peer effects. It also considers subjective dimensions that partly capture pupils' experience of school, dimensions that are in turn related to the presence of peers and to interactions between pupils. In addition, it reviews the literature on the Chilean school system, in particular its socio-academic segmentation and its relation to the voucher mechanism. Three main questions organize the work: first, whether study practices that involve classmates have a net impact on achievement; second, whether influences take the form of "capital transfers" between differently endowed pupils who report helping one another; and third, what relations are visible between these practices and dimensions such as well-being at school or academic self-concept, and between the latter and achievement. A sequence of analyses is undertaken to give robust grounding to the answers; among others, several series of hierarchical and quantile regression analyses were conducted for four school subjects. The main findings indicate, on the one hand, that interactions between pupils are fairly widespread in schools (between 22% and 41% on average), although their frequency varies across subjects and with the direction of the help. Moreover, these interactions are significantly related to achievement: other things being equal, academically weak pupils benefit from being helped by their classmates, whatever the subject, while pupils who help their classmates consistently display an academic profile associated with substantial score gains. On the other hand, pupils with more cultural capital are, all else equal, more likely to report helping their classmates. Finally, the analyses confirm that interactions between pupils are strongly and significantly related to feelings of well-being at school and to academic self-concept; the construction of relevant indices for the latter is also discussed. Several secondary results are produced and discussed, notably the first confirmation, in the Chilean case, of the hypotheses associated with the Big-Fish-Little-Pond Effect (BFLPE) paradigm (Marsh, 1987). The findings are discussed in terms of their likely consequences for education policy, particularly in school systems with strong social and academic segregation.
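A minimal sketch of the kind of quantile regression run here, using statsmodels on invented data (the variables, coefficients, and data-generating process below are hypothetical; the SIMCE microdata is not reproduced): it estimates the effect of a declared-help indicator at several quantiles of the score distribution.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
capital = rng.normal(size=n)               # cultural/academic capital proxy
helped = rng.binomial(1, 0.3, size=n)      # pupil declares being helped by classmates
# hypothetical DGP in which help matters most for weaker pupils
score = 250 + 20 * capital + 8 * helped * (1 - capital) + rng.normal(0, 30, n)
df = pd.DataFrame({"score": score, "capital": capital, "helped": helped})

for q in (0.10, 0.50, 0.90):
    res = smf.quantreg("score ~ helped + capital", df).fit(q=q)
    print(f"q={q:.2f}: helped = {res.params['helped']:+.2f} "
          f"(t = {res.tvalues['helped']:.2f})")
```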
54.
Estimateur bootstrap de la variance d'un estimateur de quantile en contexte de population finie / Bootstrap variance estimator for a quantile estimator in a finite-population context — McNealis, Vanessa, 12 1900
This thesis introduces smoothed pseudo-population bootstrap methods for the purposes of variance estimation and the construction of confidence intervals for finite-population quantiles. In an i.i.d. context, Hall et al. (1989) showed that resampling from a smoothed estimate of the distribution function, instead of the usual empirical distribution function, can improve the convergence rate of the bootstrap variance estimator of a sample quantile. We extend the smoothed bootstrap to the survey sampling framework by implementing it in pseudo-population bootstrap methods. Given a kernel function and a bandwidth, this consists of smoothing the pseudo-population from which bootstrap samples are drawn using the original sampling design. Two designs are discussed, namely simple random sampling without replacement and Poisson sampling. The implementation of the proposed algorithms requires the specification of the bandwidth. To do so, we develop a plug-in selection method along with grid-search selection methods based on bootstrap estimates of two performance metrics. We present the results of a simulation study which provide empirical evidence that the smoothed approach is more efficient than the standard approach for estimating the variance of a quantile estimator, together with mixed results regarding confidence intervals.
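In the i.i.d. setting of Hall et al. (1989) — the starting point of the thesis, without the pseudo-population machinery for survey designs — the contrast between the standard and the smoothed bootstrap can be sketched as follows. The Gaussian kernel and the fixed bandwidth are illustrative assumptions.

```python
import numpy as np

def boot_var_quantile(y, p=0.5, B=2000, h=0.0, rng=None):
    """Bootstrap variance of the sample p-quantile.
    h = 0: standard bootstrap (resampling from the ECDF);
    h > 0: smoothed bootstrap, i.e. resampling from a Gaussian-kernel
    estimate of the distribution function, which amounts to adding
    h * N(0, 1) noise to each resampled value."""
    rng = rng or np.random.default_rng()
    n, qs = len(y), np.empty(B)
    for b in range(B):
        star = rng.choice(y, size=n, replace=True)
        if h > 0:
            star = star + h * rng.standard_normal(n)
        qs[b] = np.quantile(star, p)
    return qs.var(ddof=1)

y = np.random.default_rng(0).lognormal(size=200)
print(boot_var_quantile(y, p=0.5, h=0.0))  # standard bootstrap
print(boot_var_quantile(y, p=0.5, h=0.2))  # smoothed; bandwidth is ad hoc here
```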
55.
Essays on Forecasting — Pacella, Claudia, 15 June 2020
In this thesis I apply modern econometric techniques to macroeconomic time series. Forecasting is developed along several dimensions in the three chapters, which are in principle self-contained; a common element is business cycle analysis. The first chapter, which primarily deals with forecasting euro area inflation in the short and medium run, also computes country-specific responses to a common business cycle shock. Chapters 2 and 3 deal predominantly with business cycle issues from two different perspectives: the former analyses the business cycle as a dichotomous unobservable variable and addresses the evaluation of the euro area business cycle dating formulated by the CEPR committee, while the latter studies the entire distribution of GDP growth.
56.
Market Surveillance Using Empirical Quantile Model and Machine Learning / Marknadsövervakning med hjälp av empirisk kvantilmodell och maskininlärning — Landberg, Daniel, January 2022
In recent years, financial trading has become more accessible. This has led to more market participants and more trades taking place each day. The increased activity also implies an increasing number of abusive trades. To detect abusive trades, market surveillance systems are developed and used. In this thesis, two different methods were tested for detecting abusive trades in high-dimensional data: one based on empirical quantiles, and one based on an unsupervised machine learning technique called isolation forest. The empirical quantile method applies empirical quantiles to dimensionally reduced data to determine whether a datapoint is an outlier; Principal Component Analysis (PCA) is used to reduce the dimensionality of the data and to handle correlation between features. Isolation forest is a machine learning method that detects outliers by sorting each datapoint into a tree structure: the closer a datapoint is to the root, the more likely it is to be an outlier. Isolation forest has been shown to detect outliers in high-dimensional datasets successfully, but had not previously been tested for market surveillance. The performance of both methods was measured using recall and run time. The conclusion was that the empirical quantile method did not detect outliers accurately when all dimensions of the data were used; it most likely suffered from the curse of dimensionality, and its performance improved when the dimensionality was reduced. Isolation forest performed better than the empirical quantile method and detected 99% of all outliers, classifying 226 datapoints as outliers in a dataset of 1882 datapoints containing 184 true outliers.
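A hedged sketch of the two detectors being compared, on synthetic data: the trade features, contamination rate, and quantile cut-offs below are assumptions, not the thesis's actual data or tuning.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
X_normal = rng.normal(0, 1, size=(1800, 20))      # ordinary trades (toy features)
X_abuse = rng.normal(0, 1, size=(80, 20)) + 4.0   # shifted cluster of abusive trades
X = np.vstack([X_normal, X_abuse])
y = np.r_[np.zeros(1800), np.ones(80)]

# Empirical-quantile baseline on PCA scores: flag points that fall outside
# the (1%, 99%) empirical quantiles on any retained principal component.
Z = PCA(n_components=3).fit_transform(X)
lo, hi = np.quantile(Z, [0.01, 0.99], axis=0)
flag_q = ((Z < lo) | (Z > hi)).any(axis=1).astype(int)

# Isolation forest: contamination set to an assumed outlier rate.
iso = IsolationForest(contamination=0.05, random_state=0).fit(X)
flag_if = (iso.predict(X) == -1).astype(int)      # predict() returns -1 for outliers

print("quantile recall:", recall_score(y, flag_q))
print("iforest  recall:", recall_score(y, flag_if))
```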
57.
Efficient Approaches to the Treatment of Uncertainty in Satisfying Regulatory Limits — Grabaskas, David, 30 August 2012
No description available.
58.
股市價量關係的分量迴歸分析 / A Quantile Regression Analysis of Return-Volume Relations in the Stock Markets — 莊家彰 (Chuang, Chia-Chang), date unknown
Chapter 1 — A Quantile Regression Analysis of Return-Volume Relations in the Taiwan and U.S. Stock Markets
Abstract: We examine the relationship between stock returns and trading volume in the Taiwan and U.S. stock exchanges using quantile regression. The empirical results show that the return-volume relations in the two exchanges are quite different. For Taiwan, there are significant positive return-volume relations across quantiles: a large positive return is usually accompanied by a large trading volume and a large negative return by a small trading volume, with the former effect generally stronger. These relations change when returns approach the price limits: near the upper limit the relation becomes insignificant, while near the lower limit the shrinkage of volume with falling prices is even stronger. For U.S. data, return-volume relations exhibit symmetric V-shapes across quantiles: a large return of either sign is usually accompanied by a large trading volume. Linear regressions estimated by ordinary least squares are unable to reveal these patterns. Further investigation shows that restrictions on short sales in the Taiwan Stock Exchange — margin requirements and the rule prohibiting short sales below the previous close — may explain the difference between the Taiwanese and U.S. return-volume relations.
Chapter 2 — A Quantile Regression Analysis of Return-Volume Relations in Stock Markets (II)
Abstract: This chapter uses quantile regression to examine return-volume relations in the newly industrialized Asian economies, including Taiwan, and in mature stock markets. The results show that the positive relation (prices and volume rising together) is stronger in the newly industrialized Asian markets and in Japan; Hong Kong, South Korea, and Singapore also display a weaker negative relation (prices rising on falling volume and vice versa), so that returns and volume exhibit an asymmetric V-shaped relation, while the Japanese market shows volume shrinking as prices fall, similar to the Taiwan market analyzed in Chapter 1. Among mature markets, the UK FT index, the Dow Jones Industrial Average, and the German stock index all display symmetric V-shaped relations, similar to that of the U.S. index. Asian economies experienced a financial crisis from the second half of 1997 to the first half of 1998; further analysis finds that during the crisis all of the Asian markets studied except Taiwan — Japan, Hong Kong, South Korea, and Singapore — exhibited stronger positive and negative return-volume relations, possibly because investors perceived returns as riskier during the crisis, making returns more responsive to volume. By contrast, the European and U.S. markets were less affected by the crisis, and their overall return-volume relations did not change substantially during that period. All results in this chapter are obtained through quantile regression.
Chapter 3 — A Quantile Regression Analysis of Granger Causality between Returns and Volume in Stock Markets
Abstract: This chapter designs a new test of Granger causality based on quantile regression and applies it to the causal relations between returns and volume in several stock markets: the world's three largest markets (Japan, the United Kingdom, and the United States) and the newly industrialized "Four Asian Tigers" (Taiwan, Hong Kong, South Korea, and Singapore). Empirically, all markets except Taiwan exhibit a V-shaped cross-period return-volume relation: in the UK, the US, Hong Kong, and Singapore the positive and negative branches are roughly symmetric, while in Japan and South Korea the positive branch dominates, yielding an asymmetric V. These results indicate Granger (1969) causality in distribution between returns and volume in these markets. When the cross-period relation is instead measured by mean regression, it is insignificant in every market — the absence of Granger causality reported in the traditional literature. The proposed quantile regression analysis of Granger causality examines causal relations at each quantile of the conditional distribution, providing a more complete test of causality in distribution.
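A minimal sketch of the idea behind the Chapter 3 test — no Granger causality in distribution means the lagged-volume coefficient is zero at every quantile — using statsmodels on a toy data-generating process. The DGP and quantile grid are illustrative, and the thesis's formal test statistic is not reproduced.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
T = 1500
vol = np.abs(rng.normal(1.0, 0.3, T))
lag_vol = np.r_[vol[0], vol[:-1]]
# toy DGP: yesterday's volume widens today's return distribution,
# which shows up as a V-shaped coefficient pattern across quantiles
ret = rng.normal(0.0, 1.0, T) * (0.5 + 0.8 * lag_vol)
lag_ret = np.r_[0.0, ret[:-1]]
df = pd.DataFrame({"ret": ret, "lag_ret": lag_ret, "lag_vol": lag_vol})

# No causality in distribution <=> lag_vol coefficient is 0 at every quantile.
for q in (0.05, 0.25, 0.50, 0.75, 0.95):
    res = smf.quantreg("ret ~ lag_ret + lag_vol", df).fit(q=q)
    print(f"q={q:.2f}: lag_vol {res.params['lag_vol']:+.3f} "
          f"(t = {res.tvalues['lag_vol']:.2f})")
```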
59.
Outliers detection in mixtures of dissymmetric distributions for data sets with spatial constraints / Détection de valeurs aberrantes dans des mélanges de distributions dissymétriques pour des ensembles de données avec contraintes spatiales — Planchon, Viviane, 29 May 2007
In the case of soil chemical analyses, the frequency distributions of some elements are markedly dissymmetric, with a pronounced spread to the right or to the left. A high frequency of extreme values is also observed, and a mixture of several distributions may be encountered within a single geographical unit, owing to the presence of various soil types. For outlier detection and the establishment of detection limits, an original procedure has therefore been developed; it estimates extreme quantiles above and below which observations are considered outliers. These detection limits are estimated separately from the right and left tails of the distribution. A first estimation is carried out for each elementary geographical unit to determine an appropriate truncation level. A spatial classification then groups contiguous homogeneous geographical units, so that robust limit values can be estimated from an optimal number of observations.
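The exact tail estimator is not specified in the abstract; as one standard stand-in, the sketch below estimates an extreme right-tail quantile by a peaks-over-threshold fit of the generalized Pareto distribution. The threshold level, the exceedance model, and the lognormal toy data are assumptions; a left-tail limit is obtained by applying the same function to the negated data.

```python
import numpy as np
from scipy.stats import genpareto

def upper_detection_limit(x, p=0.999, u_q=0.95):
    """Extreme right-tail quantile via a peaks-over-threshold fit:
    a generalized Pareto distribution is fitted to exceedances over
    the empirical u_q-quantile (assumes shape xi != 0)."""
    u = np.quantile(x, u_q)
    exc = x[x > u] - u
    xi, _, sigma = genpareto.fit(exc, floc=0)   # shape, loc (fixed at 0), scale
    zeta = exc.size / x.size                    # exceedance rate
    # invert the tail approximation P(X > u + y) = zeta * (1 + xi*y/sigma)^(-1/xi)
    return u + (sigma / xi) * (((1.0 - p) / zeta) ** (-xi) - 1.0)

x = np.random.default_rng(3).lognormal(mean=1.0, sigma=0.6, size=5000)
print(upper_detection_limit(x, p=0.999))     # right-tail limit
print(-upper_detection_limit(-x, p=0.999))   # corresponding left-tail limit
```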
60.
A New Mathematical Framework for Regional Frequency Analysis of Floods — Basu, Bidroha, January 2015
Reliable estimates of design flood quantiles are often needed at sparsely gauged or ungauged target locations in river basins for various applications in water resources engineering. Developing effective methods for this task has been a long-standing challenge in hydrology for over five decades. Hydrologists often consider regional flood frequency analysis (RFFA) approaches that involve (i) a regionalization step to delineate a homogeneous group of watersheds resembling the watershed of the target location, and (ii) a regional frequency analysis (RFA) step to transfer peak-flow information from gauged watersheds in the group to the target location, using that information as the basis for estimating flood quantile(s) at the target site. The work presented in this thesis addresses various shortcomings of widely used regionalization and RFA approaches.
Regionalization approaches often determine regions by grouping data points in a multidimensional space of attributes depicting watershed hydrology, climatology, topography, land use/land cover, and soils. There are no universally established procedures for identifying appropriate attributes, and modelers typically choose, by subjective judgment, a single set of attributes for the entire study area. This practice may not be meaningful, as different sets of attributes could influence the extreme-flow generation mechanism in watersheds located in different parts of the study area. A further issue is that practitioners usually give equal importance (weight) to all attributes in regionalization, though some attributes may be more important than others in influencing peak flows. To address these issues, a two-stage clustering approach is developed in the thesis that facilitates identification of appropriate attributes and their associated weights for use in regionalization of watersheds in the context of flood frequency analysis. Its effectiveness is demonstrated through a case study on Indiana watersheds.
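The two-stage procedure itself is not reproduced here, but the way attribute weights can enter a clustering step is easy to sketch. The weights, the k-means algorithm, and the toy attributes below are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def weighted_regions(attributes, weights, n_regions=5, seed=0):
    """Cluster watersheds in attribute space after standardizing and
    weighting each attribute (weight ~ its assumed influence on peak
    flows).  Scaling column j by sqrt(w_j) makes squared Euclidean
    distance equal to the w-weighted distance, so plain k-means applies."""
    Z = StandardScaler().fit_transform(attributes)
    Zw = Z * np.sqrt(np.asarray(weights))
    return KMeans(n_clusters=n_regions, n_init=10, random_state=seed).fit_predict(Zw)

# toy: 200 watersheds, 4 attributes (e.g. drainage area, slope, rainfall, soil index)
rng = np.random.default_rng(4)
A = rng.normal(size=(200, 4))
labels = weighted_regions(A, weights=[0.4, 0.1, 0.4, 0.1])
print(np.bincount(labels))   # region sizes
```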
Conventional regionalization approaches can prove effective for delineating regions when the data points (depicting watersheds) in watershed-related attribute space can be segregated into disjoint groups using straight lines or linear planes. They prove ineffective when (i) the data points are not linearly separable, (ii) the number of attributes and watersheds is large, (iii) there are outliers in the attribute space, and (iv) most watersheds resemble each other in terms of their attributes. In practice, most watersheds do resemble each other, regions cannot always be segregated by straight lines or linear planes, and dealing with outliers and high-dimensional data is inevitable in regionalization. To address this, a fuzzy support vector clustering approach is proposed in the thesis, and its effectiveness over the commonly used region-of-influence approach and various cluster-analysis-based regionalization methods is demonstrated through a case study on Indiana watersheds.

For regional frequency analysis (RFA), the index-flood approach has been widely used over the past five decades. The conventional index-flood (CIF) approach assumes that the scale and shape parameters of the frequency distribution are identical across all sites in a homogeneous region. In reality, this assumption may not hold even if a region is statistically homogeneous. The logarithmic index-flood (LIF) and population index-flood (PIF) methodologies were proposed to address this problem, but they too make unrealistic assumptions: the PIF method assumes that the ratio of the scale to the location parameter is constant across all sites in a region, while the LIF method assumes that an appropriate frequency distribution for peak flows can be found in log space, where in reality the distribution of log peak flows may not be close to any known theoretical distribution. To address this, a new mathematical approach to RFA is proposed in the L-moment and LH-moment frameworks that overcomes the shortcomings of the CIF approach and the related LIF and PIF methods. For use with the proposed approach, transformation mechanisms are developed for five commonly used three-parameter frequency distributions (GLO, GEV, GPA, GNO and PE3) that map the random variable being analyzed from the original space to a dimensionless space in which the distribution of the random variable does not change, and in which deviations of the regional estimates of all the distribution's parameters (location, scale, shape) from their population values, as well as from at-site estimates, are minimal. The proposed approach ensures the validity of all the CIF assumptions in the dimensionless space, which makes it perform better than the CIF approach and the related LIF and PIF methods. Monte Carlo simulation experiments show that the proposed approach is effective even when the form of the regional frequency distribution is mis-specified, and a case study on watersheds in the conterminous United States indicates that it outperforms index-flood-based methods in practice.
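Sample L-moments, the building blocks of the L-moment framework referred to here, can be computed directly from probability-weighted moments (Hosking, 1990). The sketch below shows only this computation, not the thesis's transformation mechanisms.

```python
import numpy as np

def sample_lmoments(x):
    """First four sample L-moments via probability-weighted moments b_r."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    j = np.arange(1, n + 1)
    b0 = x.mean()
    b1 = np.sum((j - 1) * x) / (n * (n - 1))
    b2 = np.sum((j - 1) * (j - 2) * x) / (n * (n - 1) * (n - 2))
    b3 = np.sum((j - 1) * (j - 2) * (j - 3) * x) / (n * (n - 1) * (n - 2) * (n - 3))
    l1 = b0                            # L-location (mean)
    l2 = 2 * b1 - b0                   # L-scale
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2    # mean, L-scale, L-skewness, L-kurtosis

# toy annual maximum flows at one gauged site
peaks = np.random.default_rng(5).gumbel(loc=100.0, scale=30.0, size=60)
print(sample_lmoments(peaks))
```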
In recent decades, the fuzzy clustering approach has gained recognition for regionalization of watersheds, as it can account for the partial resemblance of several watersheds in watershed-related attribute space. In working with this approach, the formation of regions and quantile estimation require discerning information from the fuzzy membership matrix, but there are currently no effective procedures for doing so. Practitioners often defuzzify the matrix to form disjoint clusters (regions) and use these as the basis for quantile estimation. This defuzzification approach (DFA) discards the information on partial resemblance of watersheds; the lost information cannot be used in quantile estimation, so the estimates can carry significant error. To avert the loss of information, a threshold strategy (TS) was considered in some prior studies, but it results in under-prediction of quantiles. To address this, a mathematical approach is proposed in the thesis that allows information to be discerned from the fuzzy membership matrix derived by fuzzy clustering for effective quantile estimation. Its effectiveness in estimating flood quantiles relative to the DFA and TS is demonstrated through Monte Carlo simulation experiments and a case study on the mid-Atlantic water resources region, USA.
Another issue with the index-flood approach and related RFA methodologies is that they assume a linear relationship between each statistical raw moment (SM) of peak flows and the watershed-related attributes in a region. These relationships form the basis for estimating the SMs at an ungauged or sparsely gauged target site, which are then used to estimate the parameters of the flood frequency distribution and the quantiles corresponding to target return periods. In reality, the relationships between SMs and watershed-related attributes could be non-linear. To address this, simple-scaling and multi-scaling methodologies have been proposed in the literature, which assume a scaling (power law) relationship between each SM of peak flows at the sites in a region and the drainage areas of the corresponding watersheds. Drainage area alone, however, may not completely describe a watershed's flood response, so flood quantile estimates based on such scaling relationships can have large errors. To address this, a recursive multi-scaling (RMS) approach is proposed that constructs a scaling (power law) relationship between each SM of peak flows and a set of region-specific watershed-related attributes chosen in a recursive manner. The approach is shown to outperform the index-flood-based region-of-influence approach, the simple- and multi-scaling approaches, and a multiple linear regression method through a leave-one-out cross-validation experiment on watersheds in and around the State of Indiana, USA.
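A hedged sketch of the recursive idea — greedily growing a power-law (log-log) model of a peak-flow moment, one attribute at a time — on toy data. The selection criterion and stopping rule below are illustrative assumptions, not the thesis's procedure.

```python
import numpy as np

def recursive_power_law(m, A, tol=0.01):
    """Greedy power-law model log m = c + sum_k beta_k * log a_k:
    at each step add the attribute that most reduces the SSE, and
    stop when the relative improvement drops below tol."""
    y = np.log(m)
    X = np.ones((y.size, 1))
    chosen = []
    sse = np.sum((y - y.mean()) ** 2)
    while len(chosen) < A.shape[1]:
        trials = []
        for k in range(A.shape[1]):
            if k in chosen:
                continue
            Xk = np.c_[X, np.log(A[:, k])]
            beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
            trials.append((np.sum((y - Xk @ beta) ** 2), k))
        new_sse, k_best = min(trials)
        if new_sse > (1.0 - tol) * sse:
            break
        X = np.c_[X, np.log(A[:, k_best])]
        chosen.append(k_best)
        sse = new_sse
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return chosen, beta

rng = np.random.default_rng(7)
A = rng.lognormal(size=(150, 5))   # positive attributes (area, slope, rainfall, ...)
m = 2.0 * A[:, 0] ** 0.8 * A[:, 2] ** 0.3 * rng.lognormal(0.0, 0.1, 150)
print(recursive_power_law(m, A))   # should pick attributes 0 and 2 first
```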
Conventional approaches to flood frequency analysis (FFA) rest on the assumption that peak flows at the target site are a sample of independent and identically distributed realizations drawn from a stationary homogeneous stochastic process. This assumption is not valid when flows are affected by changes in climate and/or land use/land cover, or by regulation of rivers through dams, reservoirs, and other artificial diversions or storages. Where the evidence of non-stationarity in peak flows is strong, quantile estimates obtained from conventional FFA approaches should not be used for hydrologic design and other applications. Downscaling is one option for arriving at future projections of flows at target sites in a river basin for use in FFA. Conventional downscaling methods attempt to downscale General Circulation Model (GCM) simulated climate variables to streamflow at target sites. In reality, a correlation structure exists between streamflow records at sites in a study area. An effective downscaling model must be parsimonious and should preserve this correlation structure in the downscaled flows to a reasonable extent, although exact reproduction of the structure may not be necessary in a climate change (non-stationary) scenario. A few recent studies have attempted to address this issue under an assumption of spatiotemporal covariance stationarity, but there is a dearth of meaningful efforts, especially for multisite downscaling of flows. To address this, a multivariate support vector regression (MSVR) based methodology is proposed to arrive at flood return levels (quantile estimates) for target locations in a river basin corresponding to different return periods in a climate change scenario. The approach involves (i) using MSVR relationships to downscale GCM-simulated large-scale atmospheric variables (LSAVs) to monthly streamflow time series at multiple locations in a river basin, (ii) disaggregating the downscaled streamflows at each site from the monthly to the daily time scale using a k-nearest-neighbor disaggregation methodology, and (iii) fitting a time-varying generalized extreme value (GEV) distribution to the annual maximum flows extracted from the daily streamflows and estimating flood return levels at the target locations for different return periods.
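As a sketch of step (iii) only, the code below fits a GEV with a linear trend in the location parameter by maximum likelihood — one simple form of time-varying GEV. The MSVR downscaling and disaggregation steps are not reproduced, and the trend form is an assumption.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

def fit_nonstationary_gev(z, t):
    """MLE for a GEV with mu(t) = mu0 + mu1 * t (scale and shape constant)."""
    def nll(theta):
        mu0, mu1, log_sig, xi = theta
        # scipy's genextreme uses shape parameter c = -xi
        return -genextreme.logpdf(z, -xi, loc=mu0 + mu1 * t,
                                  scale=np.exp(log_sig)).sum()
    theta0 = np.array([z.mean(), 0.0, np.log(z.std()), 0.1])
    return minimize(nll, theta0, method="Nelder-Mead").x

# toy annual maxima with an upward trend in the location parameter
t = np.arange(50.0)
z = genextreme.rvs(-0.1, loc=100.0 + 0.5 * t, scale=20.0, size=50, random_state=0)
mu0, mu1, log_sig, xi = fit_nonstationary_gev(z, t)
# T-year return level in year t*: mu(t*) + sigma/xi * ((-log(1 - 1/T))**(-xi) - 1)
sigma = np.exp(log_sig)
print(mu0 + mu1 * 49 + sigma / xi * ((-np.log(1 - 1 / 100)) ** (-xi) - 1))
```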