  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Design of instructions scheduling Mechanism in Hyper-Threading Architecture for Improving Performance

du, ling-yan 03 August 2004
In a microprocessor, exploiting instruction-level parallelism (ILP) is key to improving performance. However, the more elaborate the instruction scheduling mechanism designed to exploit ILP, the higher the hardware cost. Current processors therefore issue instructions from multiple scheduler queues to keep the hardware cost down. With this scheduling mechanism, however, dependent instructions may be issued in succession, which leaves the execution units underutilized. In a Hyper-Threading architecture, the instructions in a scheduler queue have a high degree of parallelism. If we can reduce the probability that dependent instructions are issued in succession, the utilization of the execution units will rise. In this paper, we propose a scheduling mechanism, called the priority-scheduling buffer, to replace the original scheduler queues. The mechanism divides an original scheduler queue into multiple virtual scheduler queues according to the dependences between instructions: instructions that depend on one another are dispatched into the same virtual scheduler queue, and instructions can be issued from the heads of different virtual scheduler queues. This reduces the probability that dependent instructions are issued in succession. Our simulation uses the Intel Pentium 4 as the base architecture and the SPEC CINT2000 benchmarks. With five threads executing simultaneously, performance improves by 7.14% on average compared with the original scheduler queue.
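The idea of splitting one scheduler queue into virtual queues by dependence can be illustrated with a small software sketch. This is not the thesis's hardware design: the instruction format `(name, dest, srcs)`, the round-robin placement of independent instructions, and the functions `dispatch` and `issue_order` are all assumptions made for illustration.

```python
from collections import deque

def dispatch(instructions, num_queues):
    """Place each instruction in the virtual queue that holds a producer it depends on.

    instructions: list of (name, dest_register, [source_registers]) tuples
    (a hypothetical format chosen for this sketch).
    """
    queues = [deque() for _ in range(num_queues)]
    home = {}        # destination register -> index of the queue holding its producer
    next_free = 0    # round-robin counter for independent instructions (assumption)
    for name, dest, srcs in instructions:
        # Follow the first source operand whose producer is already queued.
        q = next((home[s] for s in srcs if s in home), None)
        if q is None:
            q = next_free % num_queues
            next_free += 1
        queues[q].append(name)
        home[dest] = q
    return queues

def issue_order(queues):
    """Issue from the heads of the virtual queues in turn until all are empty."""
    order = []
    while any(queues):
        for q in queues:
            if q:
                order.append(q.popleft())
    return order

program = [
    ("i1", "r1", []),
    ("i2", "r2", ["r1"]),   # depends on i1
    ("i3", "r3", []),
    ("i4", "r4", ["r3"]),   # depends on i3
]
virtual_queues = dispatch(program, num_queues=2)
print(issue_order(virtual_queues))
```

In a single FIFO queue the dependent pairs (i1, i2) and (i3, i4) would sit next to each other; issuing from alternating virtual-queue heads interleaves them so that consecutive issues are independent.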
2

Annotation Concept Synthesis and Enrichment Analysis: a Logic-Based Approach to the Interpretation of High-Throughput Biological Experiments

Jiline, Mikhail 26 January 2011
Annotation Enrichment Analysis (AEA) is a widely used analytical methodology for processing data generated by high-throughput genomic and proteomic experiments such as gene expression microarrays. The analysis uncovers and summarizes discriminating background information for sets of genes identified by the previous processing stages (e.g., a set of differentially expressed genes, a cluster). Enrichment analysis algorithms attach annotations to the genes and then discover statistical fluctuations of individual annotation terms in a given gene subset. The annotation terms represent different aspects of biological knowledge and come from databases such as GO, BIND and KEGG. Typical statistical models used to detect enrichments or depletions of annotation terms are hypergeometric, binomial and χ². Finally, the discovered information is used by human experts to find biological interpretations of the experiments. The main drawback of AEA is that it tests for overrepresentation of isolated individual annotation terms or groups of similar terms. As a result, AEA is limited in its ability to uncover complex phenomena involving relationships between multiple annotation terms from various knowledge bases. AEA also assumes that annotations describe the whole object of interest, which makes it difficult to apply to sets of compound objects (e.g., sets of protein-protein interactions) and to sets of objects having an internal structure (e.g., protein complexes). To overcome these shortcomings, we propose a novel logic-based Annotation Concept Synthesis and Enrichment Analysis (ACSEA) approach. In this approach, the source annotation information, experimental data and uncovered enriched annotations are represented as First-Order Logic (FOL) statements. ACSEA fuses inductive logic reasoning with statistical inference to uncover more complex phenomena captured by the experiments.
The proposed paradigm allows a synthesis of enriched annotation concepts that better describe the observed biological processes. The methodological advantage of Annotation Concept Synthesis and Enrichment Analysis is six-fold. Firstly, it is easier to represent complex, structural annotation information. Information already captured and formalized in OWL and RDF knowledge bases can be directly utilized. Secondly, it is possible to synthesize and analyze complex annotation concepts. Thirdly, it is possible to perform the enrichment analysis for sets of aggregate objects (such as sets of genetic interactions, physical protein-protein interactions or sets of protein complexes). Fourthly, annotation concepts are straightforward for a human expert to interpret. Fifthly, the logic data model and logic induction are a common platform that can integrate specialized analytical tools (e.g., tools for numerical, structural and sequential analysis). Sixthly, the statistical inference methods used are robust to noisy and incomplete data, scalable and trusted by human experts in the field. In this thesis we develop and implement the ACSEA approach. We evaluate it on large-scale datasets from several microarray experiments and on a clustered genome-wide genetic interaction network using different biological knowledge bases. We also define a statistical model of experimental and annotation data and evaluate ACSEA on synthetic datasets. The discovered interpretations are more enriched in terms of P- and Q-values than the interpretations found by AEA, are highly integrative in nature, and include analysis of quantitative and structured information present in the knowledge bases. The results suggest that ACSEA can significantly boost the effectiveness of the processing of high-throughput experiment data.
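As a concrete illustration of the hypergeometric model that the abstract lists among the typical AEA tests, the sketch below computes a one-sided enrichment P-value for a single annotation term. It shows only the classical per-term test, not ACSEA itself; the function name `hypergeom_enrichment_p` and the example numbers are assumptions made for illustration.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """One-sided hypergeometric P-value P(X >= k).

    N: genes in the background set
    K: background genes annotated with the term
    n: genes in the subset of interest (e.g. a cluster)
    k: subset genes annotated with the term
    """
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probabilities of seeing k, k+1, ..., upper annotated genes.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Hypothetical numbers: 1000 background genes, 50 carry the term,
# and 5 of the 20 genes in the cluster carry it.
p = hypergeom_enrichment_p(N=1000, K=50, n=20, k=5)
print(f"enrichment P-value: {p:.3g}")
```

A small P-value indicates that the term appears in the subset more often than expected by chance; in practice many terms are tested at once, which is why the abstract also reports Q-values.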
5

Use of Fourier series for curve fitting tabular data in the Integrated Mechanisms Program

Linder, Susan, M. January 1983
Thesis (M.S.)--University of Wisconsin--Madison, 1983. / Typescript. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 127-128).
7

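Consórcio de duas espécies forrageiras com milho: caracteristicas fitotécnicas, produtividade e composição bromatológica / Intercropping of two forage species with maize: agronomic characteristics, yield and bromatological composition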

Lopes, Maycom Marinho 02 September 2017
Fundação de Amparo à Pesquisa do Estado do Amazonas (FAPEAM) / The objective of this work was to evaluate two experiments: a) maize intercropped with two forage species under three sowing modalities, plus sole maize, in which the agronomic characteristics, yield and bromatological composition of the maize were evaluated; b) the yield and bromatological composition of the forages after the maize harvest. The experiments were set up at the Professor Antonio Carlos dos Santos Pessoa Experimental Farm, Guará line, belonging to the Center of Agricultural Sciences of the State University of Western Paraná (Universidade Estadual do Oeste do Paraná), on a dystrophic Red Latosol. Both used randomized block designs: experiment a) had 7 treatments in a 2x3 + 1 factorial scheme with 4 replicates, and experiment b) had 6 treatments in a 2x3 scheme with 4 replicates. In the maize, agronomic characteristics, plant population, dry mass production and chemical composition were evaluated. In the forages, agronomic characteristics and bromatological composition were evaluated.
In the agronomic evaluations of the maize and forage plants, no significant differences between the intercropping modalities were observed for any of the variables analyzed. Regarding the maize plant population, intercropping with Brachiaria ruziziensis decreased plant density, because competition from the forage inhibited the maize from producing tillers. In the bromatological evaluations, significant differences were found for the DM and NDF variables, with the highest values in maize plants intercropped with Panicum maximum cv. Mombaça. In the evaluations of the forages after the maize harvest, no significant differences were found in either cutting period for the variables DM, MM, OM, NDF, ADF, LIG and CEL. The data show that intercropping maize with either of the forage species studied, Panicum maximum cv. Mombaça and Brachiaria ruziziensis, under the three sowing modalities (in-row, inter-row and broadcast), does not alter the agronomic characteristics, yield or nutritive aspects of maize for silage compared with sole maize. In terms of dry mass production and chemical composition, both post-harvest forages behaved similarly in the two periods studied.
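The 2x3 + 1 factorial scheme of experiment a) can be made concrete by enumerating its treatments. The sketch below only illustrates how two forage species, three sowing modalities and one sole-maize control give 7 treatments, and how 4 replicates give 28 plots; the English labels and the variable names are assumptions, and the randomization of treatments within each block is not shown.

```python
from itertools import product

# Factor levels named in the abstract (English labels are this sketch's own).
forages = ["Panicum maximum cv. Mombaça", "Brachiaria ruziziensis"]
sowing_modes = ["in-row", "inter-row", "broadcast"]

# 2 x 3 factorial combinations ...
treatments = [f"maize + {forage} ({mode})" for forage, mode in product(forages, sowing_modes)]
# ... plus the additional "+ 1" control treatment.
treatments.append("sole maize")

# Each of the 4 blocks (replicates) contains every treatment once.
blocks = range(1, 5)
plots = [(block, treatment) for block in blocks for treatment in treatments]

print(len(treatments), "treatments,", len(plots), "plots")
```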
8

Fusion: a Visualization Framework for Interactive ILP Rule Mining With Applications to Bioinformatics

Indukuri, Kiran Kumar 04 January 2005
Microarrays give biologists an opportunity to measure the expression profiles of thousands of genes simultaneously. Biologists try to understand the mechanisms underlying life processes by finding relationships between the expression of genes and their functional categories. Fusion is a software system that aids biologists in microarray data analysis by providing both visual data exploration and data mining capabilities. Its multiple-view visual framework allows the user to choose different views for different types of data. Fusion uses Proteus, an Inductive Logic Programming (ILP) rule-finding algorithm, to mine relationships in the microarray data. Fusion allows the user to explore the data interactively, choose biases, run the data mining algorithms and visualize the discovered rules. Fusion can switch smoothly between interactive data exploration and batch data mining modes. This optimizes the knowledge discovery process by combining the interactivity and usability of the visualization process with the pattern-finding abilities of ILP rule mining algorithms. Fusion was successful in helping biologists better understand the mechanisms underlying the acclimatization of certain varieties of Arabidopsis to ozone exposure. / Master of Science
9

Recherches sur l’opposition entre ser et estar en espagnol : historique de la question, et application à l’étude des variations dans leurs emplois en espagnol spontané contemporain au Mexique / Research on the opposition between "ser" and "estar" in Spanish: a review of its treatment and an empirical study of the variation in their use in spontaneous contemporary Spanish in Mexico

Garcia Markina, Yekaterina 04 December 2013
This empirical, synchronic study focuses on the actual uses of the Spanish copulas ser and estar in adjectival predicative constructions in Mexican Spanish. The constructions studied here are those that admit both copulas. These constructions have traditionally been distinguished by predicate type, following Carlson’s (1977) distinction between Individual-level Predicates and Stage-level Predicates. Nevertheless, we have observed occurrences of estar in Mexican Spanish that this distinction cannot explain. The copula seems to be undergoing an extension process, as various authors who have studied certain varieties of Latin American Spanish have already noted. Because the literature on the subject is extensive and the approaches and perspectives are diverse, we first establish a critical history of the question, organized by type of approach and by author. Then, following a discovery process based largely on our native speaker’s intuition, we carry out a first analysis using a contextualized comparative preference task with speakers of Mexican Spanish and European Spanish, which shows considerable internal variation among the Mexican speakers. Finally, we propose a multifactorial analysis of our “natural”, i.e. spontaneous, written and oral data from different sources, which allows us to compare different criteria and to identify those that favor the selection of estar, as well as the various potential semantic values entailed by the choice of one copula over the other.
10

Résumé automatique multi-document dynamique / Multi-document Update-summarization

Mnasri, Maali 20 September 2018
This thesis focuses on automatic text summarization, and particularly on update summarization. This research problem aims to produce a differential summary of a set of new documents with regard to a set of old documents assumed to be known. It thus adds two issues to the task of generic automatic summarization: the temporal dimension of the information and the history of the user. In this context, the work presented here is based on an extractive approach using integer linear programming (ILP) and is organized around two main axes: detecting redundancy between the selected information and the user history, and maximizing the saliency of the selected information. For the first axis, we were particularly interested in exploiting inter-sentence similarities to detect redundancies between the information in the new documents and that present in the already known ones, by defining a method of semantic clustering of sentences. For the second axis, we studied the impact of taking into account the discourse structure of documents, in the framework of Rhetorical Structure Theory (RST), to favor the selection of the information considered most important. The benefit of the methods thus defined has been demonstrated in evaluations carried out on the data of the TAC and DUC campaigns. Finally, integrating these semantic and discourse criteria through a late fusion mechanism has shown, in the same evaluation setting, that the two axes are complementary and that their combination is beneficial.
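To give a feel for the kind of 0/1 selection problem that an ILP-based extractive summarizer solves, the sketch below maximizes the total weight of covered concepts under a length budget, giving no credit for concepts already known from the old documents. This concept-coverage objective is an assumption chosen for illustration and is not necessarily the thesis's formulation; for clarity the tiny instance is solved by exhaustive enumeration rather than by an ILP solver, and `best_summary` and the toy data are hypothetical.

```python
from itertools import combinations

def best_summary(sentences, weights, budget, known=frozenset()):
    """Pick the subset of sentences with the highest total weight of new concepts.

    sentences: list of (length, set_of_concepts) pairs
    weights:   dict mapping each concept to its saliency weight
    budget:    maximum total summary length
    known:     concepts already covered by the old documents (no credit given)
    Returns (score, tuple_of_chosen_sentence_indices).
    """
    best = (0, ())
    for r in range(1, len(sentences) + 1):
        for subset in combinations(range(len(sentences)), r):
            # Length constraint of the selection problem.
            if sum(sentences[i][0] for i in subset) > budget:
                continue
            # Each concept counts once, however many chosen sentences contain it.
            covered = set().union(*(sentences[i][1] for i in subset)) - set(known)
            score = sum(weights[c] for c in covered)
            if score > best[0]:
                best = (score, subset)
    return best

sentences = [(10, {"a", "b"}), (10, {"b", "c"}), (10, {"a"})]
weights = {"a": 3, "b": 2, "c": 1}
print(best_summary(sentences, weights, budget=20))
print(best_summary(sentences, weights, budget=20, known={"a"}))
```

Counting each concept once penalizes redundant sentences, and removing the known concepts shifts the selection toward information that is new relative to the user's history. Enumeration is exponential in the number of sentences, which is why realistic systems hand the same objective and constraints to an ILP solver.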
