Global ETD Search

1	Search Term Selection and Document Clustering for Query Suggestion Zhang, Xiaomin 06 1900 In order to improve a user's query and help the user quickly satisfy his/her information need, most search engines provide query suggestions that are meant to be relevant alternatives to the user's query. This thesis builds on the query suggestion system and evaluation methodology described in Shen Jiang's Masters thesis (2008). Jiang's system constructs query suggestions by searching for lexical aliases of web documents and then applying query search to the lexical aliases. A lexical alias for a web document is a list of terms that return the web document in a top-ranked position. Query search is a search process that finds useful combinations of search terms. The main focus of this thesis is to supply alternatives for the components of Jiang's system. We suggest three term scoring mechanisms and generalize Jiang's lexical alias search to a general search for terms that are useful for constructing good query suggestions. We also replace Jiang's top-down query search with a bottom-up beam search method. We experimentally show that our query suggestion method improves on Jiang's system by 30% for short queries and 90% for long queries using Jiang's evaluation method. We also add new evidence supporting Jiang's conclusion that terms in the user's initial query are important to include in the query suggestions. Finally, we explore the usefulness of document clustering in creating query suggestions. Our experimental results are the opposite of what we expected: query suggestion based on clustering does not perform nearly as well, in terms of the "coverage" scores we are using for evaluation, as our best method that is not based on document clustering.
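The bottom-up beam search named in entry 1 can be illustrated with a minimal sketch. This is not the thesis's implementation: the function `beam_search_queries`, the term weights, and the `toy_score` function are all hypothetical stand-ins for whatever term scoring and "coverage" evaluation the thesis actually uses. The sketch only shows the general technique of growing queries one term at a time and keeping the best few at each length.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

Query = FrozenSet[str]


def beam_search_queries(
    terms: Iterable[str],
    score: Callable[[Query], float],
    beam_width: int = 2,
    max_len: int = 3,
) -> List[Tuple[Query, float]]:
    """Grow queries bottom-up, keeping the `beam_width` best at each length."""
    vocabulary = sorted(set(terms))
    beam: List[Query] = [frozenset()]
    kept: List[Tuple[Query, float]] = []
    for _ in range(max_len):
        # Extend every query in the beam by one unused term.
        candidates = {q | {t} for q in beam for t in vocabulary if t not in q}
        if not candidates:
            break
        ranked = sorted(candidates, key=lambda q: (-score(q), sorted(q)))
        beam = ranked[:beam_width]
        kept.extend((q, score(q)) for q in beam)
    # Best-scoring queries first; ties broken alphabetically for determinism.
    kept.sort(key=lambda pair: (-pair[1], sorted(pair[0])))
    return kept


# Hypothetical per-term weights; a real system would score against a search engine.
WEIGHTS = {"jaguar": 3.0, "car": 2.0, "speed": 1.0, "cat": 0.5}


def toy_score(query: Query) -> float:
    """Sum of term weights, with a penalty for queries longer than two terms."""
    return sum(WEIGHTS[t] for t in query) - 1.5 * max(0, len(query) - 2)


suggestions = beam_search_queries(WEIGHTS, toy_score, beam_width=2, max_len=3)
best_query, best_score = suggestions[0]
print(sorted(best_query), best_score)
```

The beam width trades search cost against the risk of discarding a term combination that only becomes useful once more terms are added.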
2	Search Term Selection and Document Clustering for Query Suggestion Zhang, Xiaomin Unknown Date No description available.
3	Improving Feature Selection Techniques for Machine Learning Tan, Feng 27 November 2007 As a commonly used technique in data preprocessing for machine learning, feature selection identifies important features and removes irrelevant, redundant or noisy features to reduce the dimensionality of the feature space. It improves the efficiency, accuracy and comprehensibility of the models built by learning algorithms. Feature selection techniques have been widely employed in a variety of applications, such as genomic analysis, information retrieval, and text categorization. Researchers have introduced many feature selection algorithms with different selection criteria. However, it has been discovered that no single criterion is best for all applications. We proposed a hybrid feature selection framework based on genetic algorithms (GAs) that employs a target learning algorithm to evaluate features (a wrapper method). We call it the hybrid genetic feature selection (HGFS) framework. The advantages of this approach include the ability to accommodate multiple feature selection criteria and to find small subsets of features that perform well for the target algorithm. The experiments on genomic data demonstrate that ours is a robust and effective approach that can find subsets of features with higher classification accuracy and/or smaller size compared to each individual feature selection algorithm. A common characteristic of text categorization tasks is multi-label classification with a great number of features, which makes wrapper methods time-consuming and impractical. We proposed a simple filter (non-wrapper) approach called the Relation Strength and Frequency Variance (RSFV) measure. The basic idea is that informative features are those that are highly correlated with the class and distributed most differently among all classes. The approach is compared with two well-known feature selection methods in experiments on two standard text corpora. 
The experiments show that RSFV generates equal or better performance than the others in many cases. Feature selection Gene selection Text categorization Text classification Genetic algorithm Dimension Reduction Term selection Computer Sciences
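The filter idea described in entry 3 (prefer features that are distributed most differently among classes) can be sketched as follows. The abstract does not give the RSFV formula, so this is not RSFV: the function `class_frequency_variance` and the toy corpus are hypothetical, and the score is simply the variance of a term's per-class document rate, used here as a generic stand-in for a filter criterion.

```python
from collections import defaultdict
from statistics import pvariance
from typing import Dict, List, Sequence


def class_frequency_variance(
    docs: Sequence[Sequence[str]], labels: Sequence[str]
) -> Dict[str, float]:
    """Score each term by how unevenly it is spread across classes."""
    docs_per_class: Dict[str, int] = defaultdict(int)
    # term -> class -> number of documents of that class containing the term
    doc_freq: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for tokens, label in zip(docs, labels):
        docs_per_class[label] += 1
        for term in set(tokens):
            doc_freq[term][label] += 1
    classes = sorted(docs_per_class)
    scores: Dict[str, float] = {}
    for term, counts in doc_freq.items():
        # Fraction of each class's documents that contain the term.
        rates: List[float] = [counts.get(c, 0) / docs_per_class[c] for c in classes]
        scores[term] = pvariance(rates)
    return scores


# Hypothetical four-document corpus with two classes.
docs = [
    ["goal", "match", "the"],
    ["match", "team", "the"],
    ["stock", "market", "the"],
    ["market", "price", "the"],
]
labels = ["sport", "sport", "finance", "finance"]

scores = class_frequency_variance(docs, labels)
top_terms = sorted(scores, key=lambda t: (-scores[t], t))[:2]
print(top_terms)
```

Because a filter like this never trains the target classifier, its cost grows with corpus size rather than with the number of candidate feature subsets, which is the practical advantage over wrapper methods that the abstract points to.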
4	Effects of Long-Term Selection for Non-Destructive Deformation in White Leghorns / 採卵鶏（ホワイトレグホーン種）における卵の非破壊変形を指標とした長期選抜の効果 Gervais, Olivier 23 September 2016 Kyoto University / 0048 / New-system doctoral program / Doctor of Informatics / 甲第20025号 / 情博第620号 / 新制||情||108 (University Library) / 33121 / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. 守屋和幸, Prof. 松田哲也, Prof. 廣岡博之 / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DGAM Egg production Egg shape Eggshell strength Long-term selection Non-destructive deformation Poultry White Leghorn 007
5	Long-term selection of biparental crosses: a comparison among genomic methods and phenotypic selection / Seleção de cruzamentos biparentais em longo prazo: Uma comparação entre métodos genômicos e seleção fenotípica Nalin, Rafael Storto 11 July 2019 The selection of crosses is a fundamental part of a breeding program, and the use of an adequate strategy is crucial. A good strategy should balance the selection of the best individuals and maintenance of genetic diversity throughout cycles of breeding, aiming for long-term genetic gains. Among the methods proposed in the literature, we can highlight genomic prediction with simulated offspring, which can be used to estimate the mean and genetic variance of each combination of candidate parents, providing useful information to the breeder. However, as far as we know, there are no reports on how this method performs concerning long-term genetic gain. Thus, the goal of this study was to evaluate how genomic prediction with simulated offspring performs compared with traditional phenotypic selection across five cycles of breeding. In silico and data-based simulations were used to investigate these approaches in terms of genetic gain and several other parameters related to genetic diversity. We simulated a standard in silico wheat breeding program with the capacity to evaluate 1000 lines per cycle. We considered different scenarios for the heritability, the number of populations and the number of offspring per population. A real dataset of 1465 wheat inbred lines was also used to perform simulations. In this case, markers were randomly assigned to be genes. The results indicated that the best method depends on the heritability of the trait under consideration, the breeder's strategy regarding how many crosses will be made, and whether the breeding goal is short- or long-term genetic gain. 
In general, the genomic methods, especially genomic prediction with simulated progenies, presented the best results under scenarios of low heritability and a high number of populations, in both the short and the long term. However, even though the conversion of genetic variability into genetic gains is faster than with any other strategy, the losses of variability are also higher, so it is worth bringing in new sources of variability as the breeding cycles advance. Adopting a restriction on the number of times a genotype is a parent in crosses is also of fundamental importance for obtaining long-term genetic gains. / The selection of crosses is an important part of a breeding program, and the use of an adequate strategy is crucial. A good strategy should balance selection of the best individuals against maintenance of genetic diversity across selection cycles, aiming for long-term gains. Among the methods proposed in the literature, we highlight genomic prediction with simulated progenies, which can be used to estimate the mean and genetic variance of each combination of candidate parents, providing valuable information to the breeder. However, there are no reports on how this method behaves in a long-term selection process. Therefore, the objective of this work was to evaluate the performance of genomic prediction with simulated progenies relative to traditional phenotypic selection methods over ten breeding cycles. In silico simulations and simulations using a dataset were used to investigate these methodologies with respect to genetic gain and several other parameters related to genetic diversity. A wheat breeding program with the capacity to evaluate 1000 genotypes per cycle was simulated. Different scenarios for heritability and for the combination of number of populations and number of progenies per population were evaluated. 
A real dataset of 1465 wheat lines was also used to carry out a simulation based on real data. In this case, markers were randomly designated as genes. The results indicate that the best method depends on the heritability of the trait, on the breeder's strategy regarding the number of crosses made, and on whether the breeding goal is short- or long-term genetic gain. In general, the methods involving genomic selection, especially the one using simulated progenies, gave better results when heritability is low and the number of populations is high, in both the short and the long term. However, although the conversion of genetic variability into genetic gains is faster with this strategy, the loss of variability is more pronounced, so it is worth replenishing with new sources of diversity as the breeding cycles advance. Adopting a restriction on the number of times a genotype serves as a parent is also of fundamental importance for obtaining long-term gains. Triticum spp. L. Computer simulation Long-term selection Prediction of crosses
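Entry 5 describes simulating the offspring of each candidate cross to estimate their mean and genetic variance. A minimal sketch of that idea follows, under strong simplifying assumptions that are not from the abstract: parents are fully inbred lines coded 0/1 at each locus, loci are unlinked, marker effects are known exactly, and each offspring inherits each locus from either parent with equal probability. The function `simulate_cross` and all numbers are hypothetical.

```python
import random
from statistics import mean, pvariance
from typing import List, Sequence, Tuple


def simulate_cross(
    parent_a: Sequence[int],
    parent_b: Sequence[int],
    effects: Sequence[float],
    n_offspring: int = 2000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return the mean and variance of genetic values among simulated offspring."""
    rng = random.Random(seed)
    values: List[float] = []
    for _ in range(n_offspring):
        # Assumption: unlinked loci, so each locus is drawn independently.
        genotype = [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]
        values.append(sum(e * g for e, g in zip(effects, genotype)))
    return mean(values), pvariance(values)


# Hypothetical marker effects and two inbred parents that differ at loci 0, 1 and 3.
effects = [1.0, 0.5, -0.3, 0.8]
parent_a = [1, 0, 1, 1]
parent_b = [0, 1, 1, 0]

cross_mean, cross_var = simulate_cross(parent_a, parent_b, effects)
print(round(cross_mean, 2), round(cross_var, 2))
```

Under these assumptions the expected offspring mean is the mid-parent value (0.85 here) and the expected variance is a quarter of the summed squared effects at the segregating loci (0.4725 here), so the simulation mainly becomes informative once linkage and estimated effects are introduced.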
6	Chicken Genomics - Linkage and QTL mapping Wahlberg, Per January 2009 This thesis presents results from genetic studies conducted in the chicken (Gallus gallus). The domestication of chicken is believed to have been initiated approximately 7,000–9,000 years ago in Southeast Asia. Since that time, selective breeding has altered the appearance of the wild ancestor, creating highly specialized chicken lines developed for egg and meat production. The first part of this thesis describes a detailed genetic analysis conducted on an F2 intercross between two phenotypically diverse chicken lines. The two parental lines used in this experiment originated from the same base population and have been developed by divergent selection for juvenile body-weight. Selection over forty generations has resulted in an eight-fold difference in body-weight between the High-Weight Selected (HWS) and Low-Weight Selected (LWS) lines. In an attempt to identify the genetic factors differentiating the two lines, a large intercross population was bred to map Quantitative Trait Loci (QTL) affecting body-weight traits. A linkage map was constructed which included 434 genetic markers covering 31 of the 38 chicken autosomes. Although there is a dramatic phenotypic difference between the two founder lines, the QTL analysis for marginal effects could only identify seven QTL, each with small additive effects, influencing body-weight. We extended the genetic analysis to also include a model testing for pair-wise interactions between loci (epistasis). The analysis revealed 15 QTL pairs that affect body-weight, and several of those formed a network of interacting loci. These results suggest that the genetic basis for the large difference in body-weight is most likely a combined effect of multiple genetic factors, including QTL with small additive effects in combination with pair-wise interactions between QTL. The second part of this thesis presents two linkage maps. 
The first map constructed was of the chicken Z chromosome; the second used a genome-wide marker set, including 12,945 SNP markers, to build an updated consensus map of the chicken genome. The resulting consensus map includes 9,268 genetic markers and covers 33 chromosomes, still leaving five microchromosomes without marker coverage. The genome-average rate of recombination was estimated at 3.1 cM/Mb, but varied considerably between and within chromosomes. A general trend of elevated recombination rates towards telomeric ends and lower rates near centromeres was observed. This was in concordance with previous reports from mammalian species. Recombination rates in chicken were also found to be highly positively correlated with GC-rich sequences. chicken quantitative genetics linkage map QTL mapping recombination long-term selection Z chromosome selective sweep SNP Genetics Genetik
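The recombination rate quoted in entry 6 is expressed in cM/Mb, that is, genetic map length divided by physical length. The short sketch below shows that arithmetic on made-up numbers. The chromosome names and lengths are hypothetical, and the abstract does not say how the genome average was computed; the length-weighted average used here is one common choice, shown only as an assumption.

```python
from typing import Dict, Tuple


def recombination_rate(genetic_length_cm: float, physical_length_mb: float) -> float:
    """Recombination rate in cM/Mb for one chromosome or interval."""
    if physical_length_mb <= 0:
        raise ValueError("physical length must be positive")
    return genetic_length_cm / physical_length_mb


# Hypothetical (genetic length in cM, physical length in Mb) per chromosome.
chromosomes: Dict[str, Tuple[float, float]] = {
    "chrA": (300.0, 100.0),  # large chromosome
    "chrB": (60.0, 10.0),    # small chromosome with a higher rate
}

per_chromosome = {name: recombination_rate(cm, mb) for name, (cm, mb) in chromosomes.items()}

# Length-weighted genome average: total cM over total Mb.
total_cm = sum(cm for cm, _ in chromosomes.values())
total_mb = sum(mb for _, mb in chromosomes.values())
genome_average = recombination_rate(total_cm, total_mb)

print(per_chromosome, round(genome_average, 2))
```

A length-weighted average differs from the plain mean of per-chromosome rates whenever small chromosomes recombine at a different rate than large ones, which is why the averaging method matters when comparing figures across studies.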
