Global ETD Search

1	ESTUDO DA INTERAÇÃO GENÓTIPO X AMBIENTE SOBRE CARACTERÍSTICAS PRODUTIVAS NA RAÇA HOLANDESA EM DIFERENTES REGIÕES DO PARANÁ (Study of the genotype x environment interaction on production traits in the Holstein breed in different regions of Paraná)
Moreira, Raphael Patrick 17 July 2017
Made available in DSpace on 2017-09-13. Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior.
The wide use of artificial insemination has spread different genotypes across distinct regions of the world. Poor adaptation to some environmental conditions can cause an effect known as genotype x environment interaction, mainly in polygenic traits, which changes the genetic parameters. The objective of this study was to evaluate the genotype x environment interaction through genetic correlations for milk yield (MY), milk fat yield (FY) and milk protein yield (PY). A total of 57,967 primiparous cows with lactations between 1990 and 2015 and a relationship matrix of 106,848 animals (106,417 according to the Portuguese abstract) were used to estimate the genetic correlation between three regions with different climates. The database belongs to the Associação Paranaense de Criadores de Bovinos da Raça Holandesa (APCBRH) and was divided by region according to climatic classification: R1) humid and super-humid mesothermal climate; R2) mesothermal climate without a dry season; R3) mesothermal climate with a dry season.
The model included the contemporary group (herd and year of birth) as a fixed effect, age at calving as a covariate, and the additive genetic effect as a random effect. Genetic parameters were estimated by the REML method using the VCE 6.0 program. Breeding values were obtained with the PEST2 program; the bulls were then ranked and compared between regions using Pearson and Spearman correlations. The bulls common to the regions were also checked, taking the top 10% of bulls in each region for the statistical comparison. Heritability across regions ranged from 0.16 to 0.21 for MY, from 0.17 to 0.25 for FY and from 0.10 to 0.17 for PY. The genetic correlations, Pearson correlations, Spearman correlations and proportion of coincident animals between regions were all above 0.90 for the three traits, and therefore close to one.
The results show no genotype x environment interaction for the three traits evaluated across the distinct climatic classifications in Paraná. That is, the breeding value estimated for an animal in one region can also be used in another region without significant bias in the genetic progress of the trait. Moreover, since genetic prediction did not change significantly between regions of Paraná, a single selection program that includes all Holstein herds in the state could be adopted, reducing cost and time.
CNPQ::CIENCIAS AGRARIAS::ZOOTECNIA; subtropical climate; genetic correlation; selection efficiency; genetic parameters; reranking
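The comparison described in the abstract above (Pearson and Spearman correlations between regional breeding values, plus the share of top-ranked bulls that coincide) can be sketched as follows. This is a minimal illustration in plain Python, not the thesis pipeline, which used VCE 6.0 and PEST2. The function names, the dictionary layout and the sample breeding values are assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))

def top_share_overlap(ebv_a, ebv_b, share=0.10):
    """Fraction of the top `share` of bulls in region A that is also top in region B.

    ebv_a and ebv_b map a bull identifier to its breeding value (hypothetical layout).
    Only bulls evaluated in both regions are considered.
    """
    common = sorted(set(ebv_a) & set(ebv_b))
    k = max(1, round(len(common) * share))
    top_a = set(sorted(common, key=lambda b: ebv_a[b], reverse=True)[:k])
    top_b = set(sorted(common, key=lambda b: ebv_b[b], reverse=True)[:k])
    return len(top_a & top_b) / k

# Hypothetical breeding values for the same bulls in two regions.
ebv_r1 = {"bull_a": 512.0, "bull_b": 430.5, "bull_c": 120.0, "bull_d": -35.0}
ebv_r2 = {"bull_a": 498.0, "bull_b": 455.0, "bull_c": 90.0, "bull_d": -10.0}
bulls = sorted(ebv_r1)
r_pearson = pearson([ebv_r1[b] for b in bulls], [ebv_r2[b] for b in bulls])
r_spearman = spearman([ebv_r1[b] for b in bulls], [ebv_r2[b] for b in bulls])
```

Values close to one for all three measures would indicate little reranking of bulls between regions, which is how the abstract interprets its results.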
2	Exploitation d’informations riches pour guider la traduction automatique statistique / Complex Feature Guidance for Statistical Machine Translation
Marie, Benjamin 25 March 2016
Although Machine Translation (MT) has unquestionably made communication between languages easier, especially given the recent advances in statistical MT systems, the quality of MT output is still well below what human translators achieve. This gap is partly due to the way statistical MT systems operate: the types of models they can use are limited by the need to construct and evaluate a great number of partial hypotheses before reaching a complete translation hypothesis. More “complex” models learnt from richer information do exist, but integrating them during the initial construction of hypotheses is not always possible, because they may require a complete hypothesis or be too computationally expensive. Such features are therefore typically used in a reranking step applied to the list of the best complete hypotheses produced by the MT system.
Using these features in a reranking framework often does provide better modeling of certain aspects of the translation. However, this approach is inherently limited: reranked hypothesis lists represent only a tiny portion of the decoder's search space, tend to contain hypotheses that differ little from one another, and were obtained with features that may be very different from the complex features used during reranking. In this work, we put forward the hypothesis that such translation hypothesis lists are poorly suited to exploiting the full potential of complex features. The aim of this thesis is to establish new and better methods of exploiting such features to improve translations produced by statistical MT systems.
Our first contribution is a rewriting system guided by complex features. Sequences of rewriting operations, applied to the best hypotheses obtained by a reranking framework that uses the same features, yield a substantial improvement in translation quality.
The originality of our second contribution lies in constructing hypothesis lists with multi-pass decoding that exploits information derived from evaluating previously produced translation hypotheses with a set of complex features. Our system thus produces more diverse hypothesis lists of better overall quality, which are better suited to a reranking step with complex features. Moreover, our aforementioned rewriting system further improves the hypotheses produced with the multi-pass decoding approach.
Our third contribution is based on the simulation of an ideal information type, designed to perfectly identify the correct fragments of a translation hypothesis. This perfect information indicates the best performance attainable with the systems described in our first two contributions if the complex features were able to model the translation perfectly. Through this approach, we also introduce a novel form of interactive translation, coined "pre-post-editing", in a very simplified form: a statistical MT system produces its best translation hypothesis, a human then indicates which fragments of the hypothesis are correct, and this information is used during a new decoding pass to find a better translation.
Statistical machine translation; Complex feature; Hypotheses reranking; Greedy search; Multi-Pass decoding; Post-Editing
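The reranking step that the abstract above takes as its starting point can be illustrated with a small sketch: each complete hypothesis in an n-best list carries feature values, and the list is re-sorted by a weighted sum that includes a feature the decoder could not use. The feature names, weights and sentences below are hypothetical, and the sketch does not reproduce the rewriting or multi-pass systems of the thesis.

```python
def rerank(nbest, weights):
    """Sort hypotheses by a weighted sum of their feature values, best first.

    nbest   -- list of dicts with a "text" string and a "features" dict
    weights -- dict mapping feature name to weight (missing features weigh 0)
    """
    def score(hypothesis):
        return sum(weights.get(name, 0.0) * value
                   for name, value in hypothesis["features"].items())
    return sorted(nbest, key=score, reverse=True)

# Hypothetical 3-best list: "decoder" is the model score available during
# decoding, "complex_lm" stands in for a feature that needs a complete hypothesis.
nbest = [
    {"text": "the house is small", "features": {"decoder": -4.1, "complex_lm": -9.0}},
    {"text": "the home is small", "features": {"decoder": -4.3, "complex_lm": -7.5}},
    {"text": "house the small is", "features": {"decoder": -5.0, "complex_lm": -15.0}},
]

decoder_only_best = rerank(nbest, {"decoder": 1.0})[0]
reranked_best = rerank(nbest, {"decoder": 1.0, "complex_lm": 0.5})[0]
```

Because the reranker can only reorder what the list already contains, its gains are bounded by the diversity and quality of the n-best list, which is the limitation the thesis sets out to address.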
3	我國社會保險重分配效果於不同教育程度之影響 (The redistributive effect of Taiwan's social insurance across different education levels)
蕭茜文 Unknown Date
Social insurance is the core of the social security system and an important means by which the government improves people's welfare and promotes social security. Taiwan's social insurance system began with Labor Insurance in 1950 (ROC year 39) and has been in place for fifty-six years. Social insurance serves many functions, and an important one is income redistribution: besides taxation, the government can use social insurance to reduce inequality in the distribution of income and pursue the goal of equity. The most widely used measure of income inequality is the Gini coefficient, so changes in the Gini coefficient reveal changes in income redistribution. Most earlier studies measured redistribution through the vertical and horizontal effects only. This thesis instead adopts the AJL model proposed by Aronson, Johnson and Lambert in 1994, which decomposes the redistributive effect into three parts: the vertical effect, the horizontal effect and the reranking effect.
As access to education has become increasingly widespread, many scholars have used human capital theory and screening theory to explain the relationship between education and income. Both theories hold that wages are positively related to education level: the more educated generally earn more and the less educated earn less, which makes education one source of income differences. Social insurance can therefore redistribute income between high and low earners. In addition, as the economy develops, wage income accounts for a growing share of total income, so wider access to education and better education quality also lead to a more equal distribution of income. Taking education level and social insurance as its starting point, this thesis examines whether the redistributive effect of Taiwan's social insurance differs across education levels. The data come from the Report on the Survey of Family Income and Expenditure of the Directorate-General of Budget, Accounting and Statistics, Executive Yuan, and the empirical period is 1996 to 2002 (ROC years 85 to 91).
Drawing on the analysis in Chapters 2 to 5, the conclusions on the redistributive effect of social insurance over 1996 to 2002 are as follows. First, the full-year redistributive effect is positive in every year, which means that social insurance does redistribute income. When the effect is split into vertical, horizontal and reranking components, the vertical effect accounts for most of the total and the reranking effect exceeds the horizontal effect; because the horizontal and reranking effects carry little weight, Taiwan's social insurance still improves the income distribution. Second, when each year's redistribution is re-examined by four education levels (elementary school and below, junior high school, senior high and vocational school, college and above), the lower the education level, the more social insurance improves the income distribution. Third, across these four education categories, the vertical and reranking effects vary considerably, whereas the horizontal effect accounts for a very small share and changes little, so the horizontal inequity caused by social insurance is very low.
social insurance; income redistribution; vertical effect; horizontal inequity; reranking effect
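To make the measures named in the abstract above concrete, the sketch below computes the Gini coefficient, the total redistributive effect RE as the Gini before social insurance minus the Gini after it, and a reranking term R as the post-insurance Gini minus the concentration coefficient of post-insurance income ordered by pre-insurance income. In the AJL framework the total effect is usually written RE = V - H - R; separating V from H additionally requires groups of households with equal pre-insurance income and is not attempted here. The function names and sample incomes are assumptions for illustration, not data from the thesis.

```python
def _gini_from_ordered(values):
    """Gini-type index of a list taken in the order given (positive total assumed)."""
    n = len(values)
    total = sum(values)
    weighted = sum(position * value for position, value in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def gini(incomes):
    """Gini coefficient: incomes are ordered by their own size."""
    return _gini_from_ordered(sorted(incomes))

def concentration(post, pre):
    """Concentration coefficient of `post`, ordered by `pre` (ties broken by `post`)."""
    ordered_post = [after for _, after in sorted(zip(pre, post))]
    return _gini_from_ordered(ordered_post)

def redistributive_effect(pre, post):
    """RE = Gini before minus Gini after; positive means inequality fell."""
    return gini(pre) - gini(post)

def reranking_effect(pre, post):
    """R = Gini(post) - concentration(post by pre); zero when no household changes rank."""
    return gini(post) - concentration(post, pre)

# Hypothetical household incomes before and after social insurance.
income_before = [10.0, 20.0, 30.0, 40.0]
income_after = [12.0, 20.0, 28.0, 36.0]
re_total = redistributive_effect(income_before, income_after)
r_term = reranking_effect(income_before, income_after)
```

In this sample no household changes position, so the reranking term is zero and the whole effect comes from the narrowing of income gaps.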
4	Zero-shot, One Kill: BERT for Neural Information Retrieval
Efes, Stergios January 2021
[Background]: The advent of bidirectional encoder representations from transformers (BERT) language models (Devlin et al., 2018) and of MS Marco, a large-scale human-annotated dataset for machine reading comprehension (Bajaj et al., 2016) that was made publicly available, led the field of information retrieval (IR) to experience a revolution (Lin et al., 2020). The BERT-based retrieval model of Nogueira and Cho (2019) became, at the time of publication, the top entry on the MS Marco passage-reranking leaderboard, surpassing the previous state of the art by 27% in MRR@10. However, training such neural IR models for domains other than MS Marco is still hard, because neural approaches often require a vast amount of training data to perform effectively, and that data is not always available. To address the shortage of labelled data, a new line of research emerged: training neural models with weak supervision. In weak supervision, labels for an unlabelled dataset are generated automatically with an existing model, and a machine learning model is then trained on the artificial “weak” data. For IR, the weakly supervised training data takes the form of (query, passage) tuples. Dehghani et al. (2017) used the AOL query logs (Pass et al., 2006), a set of millions of real web queries, and BM25 to retrieve the relevant passages for each user query. A drawback of this approach is that query logs are hard to obtain for every domain.
[Objective]: This thesis proposes an intuitive approach to the shortage of data in domains with limited or no data at all, through transfer learning in the context of IR. We leverage Wikipedia’s structure to create a generic Wikipedia-based IR training dataset for zero-shot neural models.
[Method]: We create “pseudo-queries” by concatenating the title of each Wikipedia article with each of its section titles, and we treat the associated section’s passage as the relevant passage for the pseudo-query. All of our experiments are evaluated on a standard collection: MS Marco, a large-scale web collection. For our zero-shot experiments, our proposed model, called “Wiki”, is a BERT model trained on the artificial Wikipedia-based dataset, and the baseline is a default BERT model without any additional training. In our second line of experiments, we explore the benefits of pre-fine-tuning on the Wikipedia-based IR dataset and then fine-tuning further on in-domain data. Our proposed model, “Wiki+Ma”, is a BERT model pre-fine-tuned on the Wikipedia-based dataset and further fine-tuned on MS Marco, while the baseline is a BERT model fine-tuned only on MS Marco.
[Results]: In our first experiments, the “Wiki” model achieves 0.197 in MRR@10, about 10 points higher than a BERT model with default weights. In addition, results on the development set indicate that the “Wiki” model outperforms a BERT model trained on in-domain data when that data amounts to between 10k and 50k instances. In our second line of experiments, pre-fine-tuning on the Wikipedia-based IR dataset benefits later fine-tuning steps on in-domain data in terms of stability.
[Conclusion]: Our findings suggest that transfer learning for IR tasks by leveraging the generic knowledge incorporated in Wikipedia is possible, though more experimentation is needed to understand its limitations in comparison with traditional approaches such as BM25.
neural information retrieval; passage ranking; weak supervision; question answering; passage reranking; BERT; transfer-learning in IR; zero-shot IR; passage-retrieval; BERT for passage-retrieval; MS Marco; information retrieval; neural IR
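Two pieces of the abstract above lend themselves to a short sketch: building pseudo-queries from an article title and its section titles, and scoring a ranking with MRR@10. The code is a minimal illustration under stated assumptions. The function names, the input layout (a list of section-title and passage pairs) and the sample article are hypothetical, and the thesis may tokenize, filter or join titles differently.

```python
def build_pseudo_queries(article_title, sections):
    """Pair 'article title + section title' with the section text as relevant passage.

    sections -- list of (section_title, passage_text) pairs; empty passages are skipped.
    """
    return [
        (f"{article_title} {section_title}", passage)
        for section_title, passage in sections
        if passage.strip()
    ]

def mrr_at_k(rankings, relevant, k=10):
    """Mean reciprocal rank of the first relevant passage within the top k.

    rankings -- dict: query id -> list of passage ids, best first
    relevant -- dict: query id -> set of relevant passage ids
    A query with no relevant passage in its top k contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for query_id, ranked_ids in rankings.items():
        for rank, passage_id in enumerate(ranked_ids[:k], start=1):
            if passage_id in relevant.get(query_id, set()):
                total += 1.0 / rank
                break
    return total / len(rankings)

# Hypothetical article with two usable sections and one empty section.
pairs = build_pseudo_queries(
    "Alan Turing",
    [
        ("Early life", "Turing was born in London in 1912."),
        ("Legacy", "The Turing Award is named after him."),
        ("See also", "   "),
    ],
)

# Hypothetical rankings for three queries.
rankings = {"q1": ["a", "b"], "q2": ["c", "d", "e"], "q3": ["x"]}
relevant = {"q1": {"a"}, "q2": {"e"}, "q3": {"z"}}
score = mrr_at_k(rankings, relevant)
```

Each (pseudo-query, passage) pair would serve as a positive training example; negatives would have to be sampled separately, which the abstract does not describe.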
