41 |
Risques professionnels dans l'asthme / Occupational risk factors in asthma
Dumas Milne Edwards, Orianne 05 December 2012
It is well recognized that workplace exposures contribute importantly to the burden of asthma, but the role of some agents needs to be clarified.
The aims of the thesis are to evaluate the relationships between occupational exposure to cleaning products and asthma, and to study the impact and control of healthy worker effect bias, in the Epidemiological study on the Genetics and Environment of Asthma (EGEA; 2047 subjects, including 1477 adults with occupational data). Exposure to cleaning products was estimated by an expert assessment and a job-exposure matrix. In women, current asthma was associated with exposure to decalcifiers (OR=2.4 (1.1-5.3)), and with exposure to sprays (2.9 (1.0-8.1)) and ammonia (3.1 (1.2-7.8)) in personal care workers. Decalcifiers and ammonia are irritants. Exposure to cleaning products was associated with severe asthma and with asthma without allergic sensitization. Two analyses underlined the importance of the healthy worker effect in asthma. A healthy worker hire effect was observed in subjects with severe asthma in childhood. Using a marginal structural model, we studied the effect of occupational exposure on the clinical expression of asthma over a lifetime while controlling for healthy worker effect bias. Elevated risks of asthma were observed not only for known asthmagens but also for other agents whose role in asthma is less established, including irritants (1.6 (1.0-2.4)). The results are consistent with a role of irritants in work-related asthma. They support a broader use of causal inference approaches to control healthy worker effect bias in studies of occupational risk factors.
|
42 |
Learning under differing training and test distributions
Bickel, Steffen January 2008
One of the main problems in machine learning is to train a predictive model from training data and to make predictions on test data. Most predictive models are constructed under the assumption that the training data is governed by exactly the same distribution to which the model will later be exposed. In practice, control over the data collection process is often imperfect. A typical scenario is when labels are collected by questionnaires and one does not have access to the test population. For example, parts of the test population are underrepresented in the survey, out of reach, or do not return the questionnaire. In many applications, training data from the test distribution are scarce because they are difficult or very expensive to obtain. Data from auxiliary sources drawn from similar distributions are often cheaply available.
This thesis centers on learning under differing training and test distributions and covers several problem settings with different assumptions about the relationship between training and test distributions, including multi-task learning and learning under covariate shift and sample selection bias. Several new models are derived that directly characterize the divergence between training and test distributions, without the intermediate step of estimating the training and test distributions separately. The integral part of these models is a set of rescaling weights that match the rescaled or resampled training distribution to the test distribution. Integrated models are studied in which only one optimization problem needs to be solved for learning under differing distributions. With a two-step approximation to the integrated models, almost any supervised learning algorithm can be adapted to biased training data.
In case studies on spam filtering, HIV therapy screening, targeted advertising, and other applications, the performance of the new models is compared to state-of-the-art reference methods.
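The idea of rescaling weights obtained without estimating the two densities separately can be illustrated with a generic discriminative density-ratio sketch. This is not the thesis's integrated model; it is a minimal illustration that assumes a one-dimensional feature and made-up data, and the names `fit_logistic` and `importance_weights` are hypothetical. A logistic classifier is trained to tell test points from training points, and its odds, corrected for the sample sizes, give the weight p_test(x) / p_train(x) for each training point.

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit p(y=1 | x) = sigmoid(a + b*x) by gradient ascent on the mean log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def importance_weights(train_x, test_x):
    """Weight each training point by an estimate of p_test(x) / p_train(x)."""
    # Label training points 0 and test points 1, then fit one classifier on the pool.
    xs = list(train_x) + list(test_x)
    ys = [0] * len(train_x) + [1] * len(test_x)
    a, b = fit_logistic(xs, ys)
    # Classifier odds equal the density ratio times n_test / n_train, so undo that factor.
    prior_ratio = len(train_x) / len(test_x)
    return [prior_ratio * math.exp(a + b * x) for x in train_x]

# Hypothetical data: the test sample sits at larger x than the training sample.
train_x = [0.0, 0.5, 1.0, 1.5, 2.0]
test_x = [1.5, 2.0, 2.5, 3.0]
weights = importance_weights(train_x, test_x)
```

Training points that resemble the test sample receive larger weights, and these weights can then be passed to any learner that accepts per-example weights, which corresponds to the two-step use described in the abstract.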
|
43 |
Analyse économétrique des décisions de production des propriétaires forestiers privés non industriels en France / Econometric analysis of the production decisions of non-industrial private forest owners in France
Kere, Eric Nazindigouba 21 March 2013
Timber production is related to economic, climate and energy issues. In France, according to data from the Institut National de l'Information Géographique et Forestière, the biological growth rate of the forest is greater than the timber harvest rate. The French government has therefore set a target of harvesting an additional 21 million cubic meters of timber by 2020 (Grenelle de l'environnement, 2007). However, most French forest belongs to private forest owners, who have preferences both for income from timber sales and for non-timber amenities. Policies to increase timber production must take these aspects into account. The objective of this thesis is to understand the determinants of the joint production of timber and non-timber amenities in France.
We first analyze private forest owners' timber supply, taking into account individual and regional determinants. We then investigate whether the drivers of forest owners' behavior differ within and between these levels. We show that similar timber supply behavior can be observed when regional characteristics or those of peers are similar. Next, we highlight mimicry in the joint production decisions for timber and amenities made by private forest owners. Finally, we analyze the inter-temporal trade-offs that owners make between non-timber amenities and income from the sale of wood, explicitly taking price and growth expectations into account. Our estimates show that the willingness to pay for non-timber amenities is €23 per year in our case study. This value is the difference between the revenue owners could have earned had they sought to maximize timber revenue and the revenue from their actual logging.
Part of the forest resource is not offered commercially, mainly because of a lack of involvement by private owners, whether through lack of knowledge of or interest in their forest, or because other aspects are given priority (non-timber amenities, for example). Providing ways to mobilize this resource is one of the challenges of this work. We show that mimetic effects and contextual effects can be used to encourage forest owners to produce more timber. An effective policy could combine these two effects. We also show that an increase in the price of timber or the adoption of a tax may be an incentive for timber harvesting.
|
44 |
Možnosti redukce výběrového zkreslení v ratingových modelech / Selection Bias Reduction in Credit Scoring Models
Ditrich, Josef January 2009
Nowadays, the use of credit scoring models in the financial sector is common practice. Credit scoring plays an important role in the profitability and transparency of the lending business. Given the high credit volumes, even a small improvement in the discriminatory and predictive power of a credit scoring model may yield substantial additional profit. Scoring models are applied to the through-the-door population; however, to build them or to adjust existing credit rules, it is usual to use only the data on accepted applicants, for whom payment discipline can be observed. This discrepancy can lead to reject bias (or selection bias in general). Methods that try to eliminate or reduce this phenomenon are known as reject inference. In general, these methods try to assess the behavior of rejected applicants or to obtain additional information about them. In this dissertation, I dealt with the enlargement method, which is based on randomly accepting applicants who would otherwise have been rejected. This method is not only time-consuming but also expensive. I therefore looked for ways to reduce the cost of acquiring additional information about rejected applicants. As a result, I proposed a modification that I called the enlargement method with sorting variable. It was validated on a real bank database with two possible sorting variables, and the results were compared with the original version of the method. Both tested approaches were shown to reduce its cost while retaining the accuracy of the scoring models.
|
45 |
The use of weights to account for non-response and drop-out
Höfler, Michael, Pfister, Hildegard, Lieb, Roselind, Wittchen, Hans-Ulrich January 2005
Background: Empirical studies in psychiatric research and other fields often show substantially high refusal and drop-out rates. Non-participation and drop-out may introduce a bias whose magnitude depends on how strongly its determinants are related to the respective parameter of interest.
Methods: When most information is missing, the standard approach is to estimate each respondent’s probability of participating and assign each respondent a weight that is inversely proportional to this probability. This paper contains a review of the major ideas and principles regarding the computation of statistical weights and the analysis of weighted data.
Results: A short software review for weighted data is provided and the use of statistical weights is illustrated through data from the EDSP (Early Developmental Stages of Psychopathology) Study. The results show that disregarding different sampling and response probabilities can have a major impact on estimated odds ratios.
Conclusions: The benefit of using statistical weights in reducing sampling bias should be balanced against increased variances in the weighted parameter estimates.
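The weighting idea summarized above can be sketched in a few lines. This is a minimal illustration and not the paper's procedure: it assumes response probabilities are estimated as simple response rates within strata, and the data, the stratum labels, and the function names are all hypothetical. It also computes Kish's effective sample size to show the variance cost that the conclusions mention.

```python
from collections import defaultdict

def response_weights(sample):
    """Return stratum -> 1 / (estimated response probability in that stratum)."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_sampled, n_responded]
    for stratum, responded, _ in sample:
        counts[stratum][0] += 1
        counts[stratum][1] += int(responded)
    return {s: n / r for s, (n, r) in counts.items() if r > 0}

def weighted_mean(values, weights):
    """Weighted average of the respondents' outcomes."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Hypothetical sample of (stratum, responded, outcome); outcome is None for non-respondents.
sample = [
    ("young", True, 1), ("young", True, 1), ("young", False, None), ("young", False, None),
    ("old", True, 0), ("old", True, 0), ("old", True, 0), ("old", True, 1),
]

stratum_weight = response_weights(sample)
respondents = [(s, y) for s, responded, y in sample if responded]
outcomes = [y for _, y in respondents]
weights = [stratum_weight[s] for s, _ in respondents]

unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_mean(outcomes, weights)
n_eff = effective_sample_size(weights)
```

In this toy sample the "young" stratum responds half as often, so each of its respondents counts twice. The weighted prevalence moves away from the unweighted one, while the effective sample size drops below the six actual respondents, which is the bias-variance trade-off noted in the conclusions.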
|
46 |
Estimating the Causal Effect of High School Mathematics Coursetaking on Placement out of Postsecondary Remedial Mathematics
Showalter, Daniel A. 12 June 2014
No description available.
|
47 |
Statistical Estimation of Software Reliability and Failure-causing Effect
Shu, Gang 02 September 2014
No description available.
|
48 |
Le biais de sélection par rapport au sexe en recherche sur le stress humain : une étude exploratoire / Sex selection bias in human stress research: an exploratory study
Alarie, Samuel 12 1900
Sex selection bias (or sex bias) refers to a systematic difference in the proportions of men and women between a sample of participants and their population, which may undermine the validity of a study.
Human stress research is vulnerable to sex bias, given the presence of factors typically associated with it, primarily invasive protocols, which contain elements that are painful, uncomfortable, or threatening for participants. The present study examined whether the proportions of men and women in stress studies differ by 1) the invasiveness of a stress study (invasive or non-invasive) and by 2) exploratory factors (e.g. country, recruitment method). Two non-stress domains with invasive (experimental pain) and non-invasive (memory) protocols were used as control domains. In this cross-sectional survey of the literature, the proportions of men and women were collected from 324 studies containing invasive or non-invasive protocols, representing a total of 23,611 participants, 42.18% of whom were men. Sex representativeness differed across invasiveness levels in both the stress and non-stress domains, with men more strongly represented in invasive than in non-invasive studies. Results indicate that the exploratory factors analyzed may all be associated with sex. This study identified the presence of factors that may cause sex bias in human stress research, opening the door to research that further investigates the generalizability of results.
|
49 |
Estimation of the mincerian wage model addressing its specification and different econometric issues
Bhatti, Sajjad Haider 03 December 2012
In this doctoral thesis, we estimate Mincer's (1974) semi-logarithmic wage function on French and Pakistani labour force data. This model is a standard tool for estimating the relationship between earnings or wages and their contributing factors. Despite its wide use, simple estimation of the Mincerian model is biased because of several econometric problems. The main sources of bias noted in the literature are endogeneity of schooling, measurement error, and sample selectivity. We tackle the endogeneity and measurement error biases with an instrumental variables two-stage least squares approach, for which we propose two new instrumental variables. The first is defined as "the average years of schooling in the family of the concerned individual" and the second as "the average years of schooling in the country, of particular age group, of particular gender, at the particular time when an individual had joined the labour force". Schooling is found to be endogenous for both countries. Comparing the two instruments, we select the second as more appropriate. We apply the Heckman (1979) two-step procedure to eliminate possible sample selection bias, which is found to be significantly positive for both countries. This means that, in both countries, people who decided not to participate in the labour force as wage workers would have earned less than participants had they decided to work as wage earners. We also estimate a specification that tackles the endogeneity and sample selectivity problems together, because such studies are relatively scarce in the literature worldwide and absent for France and Pakistan in particular. The differences in coefficients show the value of such a specification.
We also estimate the model semi-parametrically but, contrary to the usual practice for the Mincerian model, our semi-parametric estimation contains a non-parametric component from the first-stage schooling equation instead of from the selection equation. For both countries, we find the parametric model to be more appropriate. We find the errors to be heteroscedastic in the data from both countries and therefore apply adaptive estimation to control the adverse effects of heteroscedasticity. Comparing simple and adaptive estimation, we prefer the adaptive specification of the parametric model for both countries. Finally, we apply quantile regression to the model selected from the mean regression. Quantile regression reveals that the explanatory factors have different effects in different parts of the wage distribution in the two countries. For both Pakistan and France, this would be the first study to correct for both sample selectivity and endogeneity in a single specification within a quantile regression framework.
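For readers unfamiliar with the two-step selection correction mentioned in this abstract, its textbook form can be written as follows. The notation is an assumption made here for illustration and is not taken from the thesis: z_i are the covariates of the participation equation, x_i those of the wage equation, and the errors (u_i, ε_i) are assumed bivariate normal with Var(u_i) = 1 and correlation ρ.

```latex
\begin{align}
  s_i^{*} &= z_i'\gamma + u_i, \qquad s_i = \mathbf{1}\,[\,s_i^{*} > 0\,]
    && \text{(participation as a wage worker)} \\
  \ln w_i &= x_i'\beta + \varepsilon_i
    && \text{(wage equation, observed only if } s_i = 1\text{)} \\
  \mathbb{E}\,[\,\ln w_i \mid x_i, z_i, s_i = 1\,]
    &= x_i'\beta + \rho\,\sigma_{\varepsilon}\,\lambda(z_i'\gamma),
    \qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}
\end{align}
```

In the first step, γ is estimated by a probit of s_i on z_i and the inverse Mills ratio λ(z_i'γ̂) is computed for each participant. In the second step, ln w_i is regressed on x_i and that ratio, so that the coefficient on the ratio estimates ρσ_ε and a test of whether it is zero is a test for selection bias.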
|
50 |
Philosophical controversies in the evaluation of medical treatments : With a focus on the evidential roles of randomization and mechanisms in Evidence-Based Medicine
Mebius, Alexander January 2015
This thesis examines philosophical controversies surrounding the evaluation of medical treatments, with a focus on the evidential roles of randomised trials and mechanisms in Evidence-Based Medicine. Current 'best practice' usually involves excluding non-randomised trial evidence from systematic reviews in cases where randomised trials are available for inclusion in the reviews. The first paper challenges this practice and evaluates whether adding evidence from non-randomised trials might improve the quality and precision of some systematic reviews. The second paper compares the alleged methodological benefits of randomised trials over observational studies for investigating treatment benefits. It suggests that claims about the superiority of well-conducted randomised controlled trials over well-conducted observational studies are justified, especially when results from the two methods are contradictory. The third paper argues that postulating the unpredictability paradox in systematic reviews when no detectable empirical differences can be found requires further justification. The fourth paper examines the problem of absence causation in the context of explaining causal mechanisms and argues that a recent solution (Barros 2013) is incomplete and requires further justification. Solving the problem by describing absences as causes of 'mechanism failure' fails to take into account the effects of absences that lead to vacillating levels of mechanism functionality (i.e. differences in effectiveness or efficiency). The fifth paper criticises literature that has emphasised functioning versus 'broken' or 'non-functioning' mechanisms, emphasising that many diseases result from increased or decreased mechanism function rather than from complete loss of function. Mechanistic explanations must account for differences in the effectiveness of performed functions, yet current philosophical mechanistic explanations do not achieve this.
The last paper argues that the standard of evidence embodied in the ICE theory of technological function (i.e. testimonial evidence and evidence of mechanisms) is too permissive for evaluating whether the proposed functions of medical technologies have been adequately assessed and correctly ascribed. It argues that high-quality evidence from clinical studies is necessary to justify functional ascriptions to health care technologies. / QC 20150312
|