• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 128
  • 41
  • 13
  • 12
  • 6
  • 4
  • 3
  • 3
  • 3
  • 2
  • 2
  • 1
  • 1
  • 1
  • Tagged with
  • 242
  • 73
  • 68
  • 67
  • 64
  • 59
  • 51
  • 45
  • 38
  • 38
  • 35
  • 34
  • 32
  • 31
  • 28
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
151

Distributed Document Clustering and Cluster Summarization in Peer-to-Peer Environments

Hammouda, Khaled M. January 2007 (has links)
This thesis addresses difficult challenges in distributed document clustering and cluster summarization. Mining large document collections poses many challenges, one of which is the extraction of topics or summaries from documents for the purpose of interpretation of clustering results. Another important challenge, which is caused by new trends in distributed repositories and peer-to-peer computing, is that document data is becoming more distributed. We introduce a solution for interpreting document clusters using keyphrase extraction from multiple documents simultaneously. We also introduce two solutions for the problem of distributed document clustering in peer-to-peer environments, each satisfying a different goal: maximizing local clustering quality through collaboration, and maximizing global clustering quality through cooperation. The keyphrase extraction algorithm efficiently extracts and scores candidate keyphrases from a document cluster. The algorithm is called CorePhrase and is based on modeling document collections as a graph upon which we can leverage graph mining to extract frequent and significant phrases, which are used to label the clusters. Results show that CorePhrase can extract keyphrases relevant to documents in a cluster with very high accuracy. Although this algorithm can be used to summarize centralized clusters, it is specifically employed within distributed clustering to both boost distributed clustering accuracy, and to provide summaries for distributed clusters. The first method for distributed document clustering is called collaborative peer-to-peer document clustering, which models nodes in a peer-to-peer network as collaborative nodes with the goal of improving the quality of individual local clustering solutions. This is achieved through the exchange of local cluster summaries between peers, followed by recommendation of documents to be merged into remote clusters. Results on large sets of distributed document collections show that: (i) such collaboration technique achieves significant improvement in the final clustering of individual nodes; (ii) networks with larger number of nodes generally achieve greater improvements in clustering after collaboration relative to the initial clustering before collaboration, while on the other hand they tend to achieve lower absolute clustering quality than networks with fewer number of nodes; and (iii) as more overlap of the data is introduced across the nodes, collaboration tends to have little effect on improving clustering quality. The second method for distributed document clustering is called hierarchically-distributed document clustering. Unlike the collaborative model, this model aims at producing one clustering solution across the whole network. It specifically addresses scalability of network size, and consequently the distributed clustering complexity, by modeling the distributed clustering problem as a hierarchy of node neighborhoods. Summarization of the global distributed clusters is achieved through a distributed version of the CorePhrase algorithm. Results on large document sets show that: (i) distributed clustering accuracy is not affected by increasing the number of nodes for networks of single level; (ii) we can achieve decent speedup by making the hierarchy taller, but on the expense of clustering quality which degrades as we go up the hierarchy; (iii) in networks that grow arbitrarily, data gets more fragmented across neighborhoods causing poor centroid generation, thus suggesting we should not increase the number of nodes in the network beyond a certain level without increasing the data set size; and (iv) distributed cluster summarization can produce accurate summaries similar to those produced by centralized summarization. The proposed algorithms offer high degree of flexibility, scalability, and interpretability of large distributed document collections. Achieving the same results using current methodologies require centralization of the data first, which is sometimes not feasible.
152

Hacia un modelo lingüístico de resumen automático de artículos médicos en español

Cunha Fanego, Iria da 25 April 2008 (has links)
En esta tesis se presenta un modelo lingüístico de resumen automático de artículos médicos en español que aúna criterios basados en la estructura textual, en las unidades léxicas y la estructura discursiva y sintáctico-comunicativa de los textos. El modelo se crea partiendo de la hipótesis de que los especialistas de cada ámbito emplean estrategias específicas a la hora de resumir. La validación de esta hipótesis mediante experimentos estadísticos permite tomar los artículos médicos acompañados de sus respectivos resúmenes como material de referencia para analizar, de cara a detectar las estrategias empleadas por los profesionales médicos para resumir sus textos. Una vez detectadas, estas estrategias se formalizan en forma de reglas y se diseña un modo de integración de las mismas. Esto da lugar al modelo presentado en esta tesis, del cual se implementa una parte. Los resúmenes resultantes se evalúan obteniendo buenos resultados, lo cual confirma que el modelo simula correctamente las estrategias empleadas por los especialistas y que estas se refieren a diversos aspectos lingüísticos. / In this thesis a linguistic model of automatic summarization of Spanish medical articles that joins criteria based on the textual structure, on lexical units and on the discourse and syntactic-communicative structure of texts is presented. The model is developed under the hypothesis that specialists of a domain use specific strategies when they summarize. The validation of this hypothesis by means of statistical experiments allows us to draw upon medical articles and their respective abstracts as reference in order to determine the strategies used by medical professionals. Once these strategies have been determined, they are formalized in terms of an integrated rule-based system, of which a part is implemented. The resulting summaries have been evaluated. Good results were obtained, which confirms that the model simulates correctly the strategies used by specialists and that these strategies refer to different linguistic aspects.
153

PragmaSUM: novos m?todos na utiliza??o de palavras-chave na sumariza??o autom?tica

Rocha, Valdir J?nior Cordeiro 05 December 2017 (has links)
Submitted by Jos? Henrique Henrique (jose.neves@ufvjm.edu.br) on 2018-05-03T18:35:26Z No. of bitstreams: 2 license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) valdir_junior_cordeiro_rocha.pdf: 3757934 bytes, checksum: 00a2e6ee18188436daa1415ec6a05021 (MD5) / Approved for entry into archive by Rodrigo Martins Cruz (rodrigo.cruz@ufvjm.edu.br) on 2018-05-04T16:22:37Z (GMT) No. of bitstreams: 2 license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) valdir_junior_cordeiro_rocha.pdf: 3757934 bytes, checksum: 00a2e6ee18188436daa1415ec6a05021 (MD5) / Made available in DSpace on 2018-05-04T16:22:37Z (GMT). No. of bitstreams: 2 license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5) valdir_junior_cordeiro_rocha.pdf: 3757934 bytes, checksum: 00a2e6ee18188436daa1415ec6a05021 (MD5) Previous issue date: 2017 / Com a amplia??o do acesso ? internet e a cria??o de ferramentas que possibilitam pessoas a criarem conte?do, a informa??o dispon?vel cresce de forma acelerada. Textos sobre os mais diversos assuntos e autores s?o criados todos os dias. ? imposs?vel absorver a quantidade de informa??o dispon?vel, o que dificulta a escolha da mais adequada para determinado interesse ou p?blico. A sumariza??o autom?tica de textos, al?m de apresentar um texto de forma condensada, pode simplifica-lo, gerando uma alternativa para ganho de tempo e amplia??o do acesso a informa??o contida aos mais diferentes tipos de leitores. Os sumarizadores autom?ticos existentes atualmente na literatura n?o apresentam m?todos de personifica??o dos sum?rios para cada tipo de leitor, e consequentemente geram resultados pouco precisos. Este trabalho tem como objetivo utilizar o sumarizador autom?tico de textos PragmaSUM em textos educacionais com novas t?cnicas de sumariza??o utilizando palavras-chave. A utiliza??o de m?todos de personifica??o do sum?rio com palavras-chave visa aumentar a precis?o e melhorar o desempenho do PragmaSUM e seus sum?rios. Para isto, um corpus formado apenas por artigos cient?ficos da ?rea educacional foi criado para realiza??o de testes e compara??es entre diferentes sumarizadores e m?todos de sumariza??o. O desempenho dos sumarizadores foi medido pelas m?tricas Recall, Precision e F-Measure presentes na ferramenta ROUGE e validados com os testes estat?sticos ANOVA de Friedman e Coeficiente de Concord?ncia de Kendall. Os resultados obtidos apontam uma melhora no desempenho com a utiliza??o de palavras-chave na sumariza??o com o PragmaSUM, indicando a import?ncia na escolha adequada destas palavras-chave para classifica??o do conte?do do texto fonte. / Disserta??o (Mestrado Profissional) ? Programa de P?s-Gradua??o em Educa??o, Universidade Federal dos Vales do Jequitinhonha e Mucuri, 2017. / By expanding access to the internet and creating tools that enable people to create content, available information grows rapidly. Texts on the most diverse subjects and authors are created every day. It is impossible to absorb the amount of information available, which makes it difficult to choose the most appropriate for a particular interest or public. Automatic text summarization, as well as presenting a condensed text, can simplify it, generating an alternative to gain time and increase the access to information contained to the most different types of readers. The automatic summarizers that currently exist in the literature do not present methods of personification of the summaries for each type of reader, and consequently generate results inaccurate. This work aims to use the PragmaSUM automatic text summarizer in educational texts with new summarization techniques using keywords. Using summary keywords impersonation methods is intended to increase accuracy and improve the performance of PragmaSUM and its summaries. For this, a corpus formed only by scientific articles of the educational area was created to carry out tests and comparisons between different summarizers and summarization methods. The performance of the summarizers was measured by the Recall, Precision and F-Measure metrics present in the ROUGE tool and validated with the Friedman ANOVA statistical tests and Kendall's coefficient of agreement. The results obtained indicate an improvement in the performance with the use of keywords in the summarization with PragmaSUM, pointing out importance in the appropriate choice of these keywords for classification of the content of the source text.
154

Short text contextualization in information retrieval : application to tweet contextualization and automatic query expansion / Contextualisation de textes courts pour la recherche d'information : application à la contextualisation de tweets et à l'expansion automatique de requêtes.

Ermakova, Liana 31 March 2016 (has links)
La communication efficace a tendance à suivre la loi du moindre effort. Selon ce principe, en utilisant une langue donnée les interlocuteurs ne veulent pas travailler plus que nécessaire pour être compris. Ce fait mène à la compression extrême de textes surtout dans la communication électronique, comme dans les microblogues, SMS, ou les requêtes dans les moteurs de recherche. Cependant souvent ces textes ne sont pas auto-suffisants car pour les comprendre, il est nécessaire d’avoir des connaissances sur la terminologie, les entités nommées ou les faits liés. Ainsi, la tâche principale de la recherche présentée dans ce mémoire de thèse de doctorat est de fournir le contexte d’un texte court à l’utilisateur ou au système comme à un moteur de recherche par exemple.Le premier objectif de notre travail est d'aider l’utilisateur à mieux comprendre un message court par l’extraction du contexte d’une source externe comme le Web ou la Wikipédia au moyen de résumés construits automatiquement. Pour cela nous proposons une approche pour le résumé automatique de documents multiples et nous l’appliquons à la contextualisation de messages, notamment à la contextualisation de tweets. La méthode que nous proposons est basée sur la reconnaissance des entités nommées, la pondération des parties du discours et la mesure de la qualité des phrases. Contrairement aux travaux précédents, nous introduisons un algorithme de lissage en fonction du contexte local. Notre approche s’appuie sur la structure thème-rhème des textes. De plus, nous avons développé un algorithme basé sur les graphes pour le ré-ordonnancement des phrases. La méthode a été évaluée à la tâche INEX/CLEF Tweet Contextualization sur une période de 4 ans. La méthode a été également adaptée pour la génération de snippets. Les résultats des évaluations attestent une bonne performance de notre approche. / The efficient communication tends to follow the principle of the least effort. According to this principle, using a given language interlocutors do not want to work any harder than necessary to reach understanding. This fact leads to the extreme compression of texts especially in electronic communication, e.g. microblogs, SMS, search queries. However, sometimes these texts are not self-contained and need to be explained since understanding them requires knowledge of terminology, named entities or related facts. The main goal of this research is to provide a context to a user or a system from a textual resource.The first aim of this work is to help a user to better understand a short message by extracting a context from an external source like a text collection, the Web or the Wikipedia by means of text summarization. To this end we developed an approach for automatic multi-document summarization and we applied it to short message contextualization, in particular to tweet contextualization. The proposed method is based on named entity recognition, part-of-speech weighting and sentence quality measuring. In contrast to previous research, we introduced an algorithm for smoothing from the local context. Our approach exploits topic-comment structure of a text. Moreover, we developed a graph-based algorithm for sentence reordering. The method has been evaluated at INEX/CLEF tweet contextualization track. We provide the evaluation results over the 4 years of the track. The method was also adapted to snippet retrieval. The evaluation results indicate good performance of the approach.
155

SearchViz: An Interactive Visual Interface to Navigate Search-Results in Online Discussion Forums

January 2015 (has links)
abstract: Online programming communities are widely used by programmers for troubleshooting or various problem solving tasks. Large and ever increasing volume of posts on these communities demands more efforts to read and comprehend thus making it harder to find relevant information. In my thesis; I designed and studied an alternate approach by using interactive network visualization to represent relevant search results for online programming discussion forums. I conducted user study to evaluate the effectiveness of this approach. Results show that users were able to identify relevant information more precisely via visual interface as compared to traditional list based approach. Network visualization demonstrated effective search-result navigation support to facilitate user’s tasks and improved query quality for successive queries. Subjective evaluation also showed that visualizing search results conveys more semantic information in efficient manner and makes searching more effective. / Dissertation/Thesis / Masters Thesis Computer Science 2015
156

Desenvolvimento de técnicas baseadas em redes complexas para sumarização extrativa de textos / Development of techniques based on complex networks for extractive text summarization

Lucas Antiqueira 27 February 2007 (has links)
A Sumarização Automática de Textos tem considerável importância nas tarefas de localização e utilização de conteúdo relevante em meio à quantidade enorme de informação disponível atualmente em meio digital. Nessa área, procura-se desenvolver técnicas que possibilitem obter o conteúdo mais relevante de documentos, de maneira condensada, sem alterar seu significado original, e com mínima intervenção humana. O objetivo deste trabalho de mestrado foi investigar de que maneira conceitos desenvolvidos na área de Redes Complexas podem ser aplicados à Sumarização Automática de Textos, mais especificamente à sumarização extrativa. Embora grande parte das pesquisas em sumarização tenha se voltado para a utilização de técnicas extrativas, ainda é possível melhorar o nível de informatividade dos extratos gerados automaticamente. Neste trabalho, textos foram representados como redes, das quais foram extraídas medidas tradicionalmente utilizadas na caracterização de redes complexas (por exemplo, coeficiente de aglomeração, grau hierárquico e índice de localidade), com o intuito de fornecer subsídios à seleção das sentenças mais significativas de um texto. Essas redes são formadas pelas sentenças (representadas pelos vértices) de um determinado texto, juntamente com as repetições (representadas pelas arestas) de substantivos entre sentenças após lematização. Cada método de sumarização proposto foi aplicado no córpus TeMário, de textos jornalísticos em português, e em córpus das conferências DUC, de textos jornalísticos em inglês. A avaliação desse estudo foi feita por meio da realização de quatro experimentos, fazendo-se uso de métodos de avaliação automática (Rouge-1 e Precisão/Cobertura de sentenças) e comparando-se os resultados com os de outros sistemas de sumarização extrativa. Os melhores sumarizadores propostos referem-se aos seguintes conceitos: d-anel, grau, k-núcleo e caminho mínimo. Foram obtidos resultados comparáveis aos dos melhores métodos de sumarização já propostos para o português, enquanto que, para o inglês, os resultados são menos expressivos. / Automatic Text Summarization has considerably importance in tasks such as finding and using relevant content in the enormous amount of information available nowadays in digital media. The focus in this field is on the development of techniques that allow someone to obtain the most relevant content of documents, in a condensed way, preserving the original meaning and with little (or even none) human help. The purpose of this MSc project was to investigate a way of applying concepts borrowed from the studies of Complex Networks to the Automatic Text Summarization field, specifically to the task of extractive summarization. Although the majority of works in summarization have focused on extractive techniques, it is still possible to obtain better levels of informativity in extracts automatically generated. In this work, texts were represented as networks, from which the most significant sentences were selected through the use of ranking algorithms. Such networks are obtained from a text in the following manner: the sentences are represented as nodes, and an edge between two nodes is created if there is at least one repetition of a noun in both sentences, after the lemmatization step. Measurements typically employed in the characterization of complex networks, such as clustering coefficient, hierarchical degree and locality index, were used on the basis of the process of node (sentence) selection in order to build an extract. Each summarization technique proposed was applied to the TeMário corpus, which comprises newspaper articles in Portuguese, and to the DUC corpora, which comprises newspaper articles in English. Four evaluation experiments were carried out, by means of automatic evaluation measurements (Rouge-1 and sentence Precision/Recall) and comparison with the results obtained by other extractive summarization systems. The best summarizers are the ones based on the following concepts: d-ring, degree, k-core and shortest path. Performances comparable to the best summarization systems for Portuguese were achieved, whilst the results are less significant for English.
157

Extractive document summarization using complex networks / Sumarização extractiva de documentos usando redes complexas

Jorge Andoni Valverde Tohalino 15 June 2018 (has links)
Due to a large amount of textual information available on the Internet, the task of automatic document summarization has gained significant importance. Document summarization became important because its focus is the development of techniques aimed at finding relevant and concise content in large volumes of information without changing its original meaning. The purpose of this Masters work is to use network theory concepts for extractive document summarization for both Single Document Summarization (SDS) and Multi-Document Summarization (MDS). In this work, the documents are modeled as networks, where sentences are represented as nodes with the aim of extracting the most relevant sentences through the use of ranking algorithms. The edges between nodes are established in different ways. The first approach for edge calculation is based on the number of common nouns between two sentences (network nodes). Another approach to creating an edge is through the similarity between two sentences. In order to calculate the similarity of such sentences, we used the vector space model based on Tf-Idf weighting and word embeddings for the vector representation of the sentences. Also, we make a distinction between edges linking sentences from different documents (inter-layer) and those connecting sentences from the same document (intra-layer) by using multilayer network models for the Multi-Document Summarization task. In this approach, each network layer represents a document of the document set that will be summarized. In addition to the measurements typically used in complex networks such as node degree, clustering coefficient, shortest paths, etc., the network characterization also is guided by dynamical measurements of complex networks, including symmetry, accessibility and absorption time. The generated summaries were evaluated by using different corpus for both Portuguese and English language. The ROUGE-1 metric was used for the validation of generated summaries. The results suggest that simpler models like Noun and Tf-Idf based networks achieved a better performance in comparison to those models based on word embeddings. Also, excellent results were achieved by using the multilayered representation of documents for MDS. Finally, we concluded that several measurements could be used to improve the characterization of networks for the summarization task. / Devido à grande quantidade de informações textuais disponíveis na Internet, a tarefa de sumarização automática de documentos ganhou importância significativa. A sumarização de documentos tornou-se importante porque seu foco é o desenvolvimento de técnicas destinadas a encontrar conteúdo relevante e conciso em grandes volumes de informação sem alterar seu significado original. O objetivo deste trabalho de Mestrado é usar os conceitos da teoria de grafos para o resumo extrativo de documentos para Sumarização mono-documento (SDS) e Sumarização multi-documento (MDS). Neste trabalho, os documentos são modelados como redes, onde as sentenças são representadas como nós com o objetivo de extrair as sentenças mais relevantes através do uso de algoritmos de ranqueamento. As arestas entre nós são estabelecidas de maneiras diferentes. A primeira abordagem para o cálculo de arestas é baseada no número de substantivos comuns entre duas sentenças (nós da rede). Outra abordagem para criar uma aresta é através da similaridade entre duas sentenças. Para calcular a similaridade de tais sentenças, foi usado o modelo de espaço vetorial baseado na ponderação Tf-Idf e word embeddings para a representação vetorial das sentenças. Além disso, fazemos uma distinção entre as arestas que vinculam sentenças de diferentes documentos (inter-camada) e aquelas que conectam sentenças do mesmo documento (intra-camada) usando modelos de redes multicamada para a tarefa de Sumarização multi-documento. Nesta abordagem, cada camada da rede representa um documento do conjunto de documentos que será resumido. Além das medições tipicamente usadas em redes complexas como grau dos nós, coeficiente de agrupamento, caminhos mais curtos, etc., a caracterização da rede também é guiada por medições dinâmicas de redes complexas, incluindo simetria, acessibilidade e tempo de absorção. Os resumos gerados foram avaliados usando diferentes corpus para Português e Inglês. A métrica ROUGE-1 foi usada para a validação dos resumos gerados. Os resultados sugerem que os modelos mais simples, como redes baseadas em Noun e Tf-Idf, obtiveram um melhor desempenho em comparação com os modelos baseados em word embeddings. Além disso, excelentes resultados foram obtidos usando a representação de redes multicamada de documentos para MDS. Finalmente, concluímos que várias medidas podem ser usadas para melhorar a caracterização de redes para a tarefa de sumarização.
158

Indexace elektronických dokumentů a jejich částí / Indexing of text documents and their parts

Tomeš, Jiří January 2015 (has links)
The thesis describes the design and implementation of an application for processing electronic publications (collections of conference papers, comprehensive manuals, or even classical electronic books) in order to enrich their internal navigation by hyperlinks between their related parts, respectively producing as representative as possible summarizations of given length. Unlike similar applications summarizations can be based not only on the sentences, but on elements of other categories like paragraphs, sections and the like.The main emphasis was put on ease of use, platform independence, and multilingual support. The application provides a flexible environment that can be customized to user's needs.
159

Création automatique de résumés vidéo par programmation par contraintes / Automatic video summarization using constraint satisfaction programming

Boukadida, Haykel 04 December 2015 (has links)
Cette thèse s’intéresse à la création automatique de résumés de vidéos. L’idée est de créer de manière adaptative un résumé vidéo qui prenne en compte des règles définies sur le contenu audiovisuel d’une part, et qui s’adapte aux préférences de l’utilisateur d’autre part. Nous proposons une nouvelle approche qui considère le problème de création automatique de résumés sous forme d’un problème de satisfaction de contraintes. La solution est basée sur la programmation par contraintes comme paradigme de programmation. Un expert commence par définir un ensemble de règles générales de production du résumé, règles liées au contenu multimédia de la vidéo d’entrée. Ces règles de production sont exprimées sous forme de contraintes à satisfaire. L’utilisateur final peut alors définir des contraintes supplémentaires (comme la durée souhaitée du résumé) ou fixer des paramètres de haut niveau des contraintes définies par l’expert. Cette approche a plusieurs avantages. Elle permet de séparer clairement les règles de production des résumés (modélisation du problème) de l’algorithme de génération de résumés (la résolution du problème par le solveur de contraintes). Le résumé peut donc être adapté sans qu’il soit nécessaire de revoir tout le processus de génération des résumés. Cette approche permet par exemple aux utilisateurs d’adapter le résumé à l’application cible et à leurs préférences en ajoutant une contrainte ou en modifiant une contrainte existante, ceci sans avoir à modifier l’algorithme de production des résumés. Nous avons proposé trois modèles de représentation des vidéos qui se distinguent par leur flexibilité et leur efficacité. Outre les originalités liées à chacun des trois modèles, une contribution supplémentaire de cette thèse est une étude comparative de leurs performances et de la qualité des résumés résultants en utilisant des mesures objectives et subjectives. Enfin, et dans le but d’évaluer la qualité des résumés générés automatiquement, l’approche proposée a été évaluée par des utilisateurs à grande échelle. Cette évaluation a impliqué plus de 60 personnes. Ces expériences ont porté sur le résumé de matchs de tennis. / This thesis focuses on the issue of automatic video summarization. The idea is to create an adaptive video summary that takes into account a set of rules defined on the audiovisual content on the one hand, and that adapts to the users preferences on the other hand. We propose a novel approach that considers the problem of automatic video summarization as a constraint satisfaction problem. The solution is based on constraint satisfaction programming (CSP) as programming paradigm. A set of general rules for summary production are inherently defined by an expert. These production rules are related to the multimedia content of the input video. The rules are expressed as constraints to be satisfied. The final user can then define additional constraints (such as the desired duration of the summary) or enter a set of high-level parameters involving to the constraints already defined by the expert. This approach has several advantages. This will clearly separate the summary production rules (the problem modeling) from the summary generation algorithm (the problem solving by the CSP solver). The summary can hence be adapted without reviewing the whole summary generation process. For instance, our approach enables users to adapt the summary to the target application and to their preferences by adding a constraint or modifying an existing one, without changing the summaries generation algorithm. We have proposed three models of video representation that are distinguished by their flexibility and their efficiency. Besides the originality related to each of the three proposed models, an additional contribution of this thesis is an extensive comparative study of their performance and the quality of the resulting summaries using objective and subjective measures. Finally, and in order to assess the quality of automatically generated summaries, the proposed approach was evaluated by a large-scale user evaluation. This evaluation involved more than 60 people. All these experiments have been performed within the challenging application of tennis match automatic summarization.
160

Event Centric Approaches in Natural Language Processing / 自然言語処理におけるイベント中心のアプローチ

Huang, Yin Jou 26 July 2021 (has links)
京都大学 / 新制・課程博士 / 博士(情報学) / 甲第23438号 / 情博第768号 / 新制||情||131(附属図書館) / 京都大学大学院情報学研究科知能情報学専攻 / (主査)教授 黒橋 禎夫, 教授 河原 達也, 教授 伊藤 孝行 / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM

Page generated in 0.0479 seconds