Global ETD Search

1	Topic Retrospection with Storyline-based Summarization on News Reports Liang, Chia-Hao 18 July 2005 Electronic newspapers have become a main source for online news readers. Faced with numerous stories, readers need support to review a topic in a short time. Because previous research in Topic Detection and Tracking (TDT) considered only how to identify events and present the results as news titles and keywords, general news readers still need a summarized text that presents event evolution in order to review the events under a news topic. This thesis proposes a topic retrospection process and implements the SToRe system, which identifies the various events under a news topic and constructs the relationships among them to compose a summary that sketches the evolution of events in the topic. The system consists of three main functions: event identification, main storyline construction, and storyline-based summarization. The constructed main storyline removes irrelevant events and presents a main theme. The summarization extracts representative sentences and uses the main theme as the template for composing the summary. The summary not only provides enough information to comprehend the development of a topic but can also serve as an index that helps readers find more detailed information. A lab experiment was conducted to evaluate the SToRe system in a question-and-answer (Q&A) setting. The experimental results indicate that the SToRe system helps news readers capture the development of a topic more effectively and efficiently. Keywords: topic retrospection; event threading; summarization; Topic Detection and Tracking (TDT)
2	Extracción y recuperación de información temporal / Extraction and retrieval of temporal information Llidó Escrivá, Dolores Maria 20 September 2002 This thesis aims to demonstrate that Information Retrieval (IR) systems and Topic Detection and Tracking (TDT) systems improve when a temporal component automatically extracted from the text, which we call the event period, is added. This attribute represents the span of time over which the main event reported in each document takes place. To this end, the thesis has covered the following objectives: * Definition of a time model to represent and manipulate the temporal references that appear in a text. * Development of an application for extracting linguistic temporal expressions and recognizing the absolute interval they refer to in the Gregorian calendar. * Implementation of a system for automatically extracting the event period. * Modification of current IR and TDT systems to include the temporal information extracted with the preceding tools. Keywords: Topic Detection and Tracking (TDT); Information Extraction (IE); temporal expressions; Information Retrieval (IR). Llenguatges i Sistemes Informàtics 004
3	Using the organizational and narrative thread structures in an e-book to support comprehension Sun, Yixing January 2007 Stories, themes, concepts and references are organized structurally and purposefully in most books. A person reading a book needs to understand themes and concepts within the context. Schank’s Dynamic Memory theory suggested that building on existing memory structures is essential to cognition and learning. Pirolli and Card emphasized the need to provide people with an independent and improved ability to access and understand information in their information seeking activities. Through a review of users’ reading behaviours and of existing e-Book user interfaces, we found that current e-Book browsers provide minimal support for comprehending the content of large and complex books. Readers of an e-Book need user interfaces that present and relate the organizational and narrative structures and, moreover, reveal the thematic structures. This thesis addresses the problem of providing readers with effective scaffolding of multiple structures of an e-Book in the user interface to support reading for comprehension. Recognising a story or topic as the basic unit in a book, we developed novel story segmentation techniques for discovering narrative segments, and adapted story linking techniques for linking narrative threads in semi-structured linear texts of an e-Book. We then designed an e-Book user interface to present the complex structures of the e-Book, as well as to assist the reader in discovering these structures. We designed and developed evaluation methodologies to investigate reading and comprehension in e-Books, in order to assess the effectiveness of this user interface. We designed semi-directed reading tasks using a Story-Theme Map, and a set of corresponding measurements for the answers. We conducted user evaluations with book readers. Participants were asked to read stories, to browse and link related stories, and to identify major themes of stories in an e-Book. This thesis reports the experimental design and results in detail. The results confirmed that the e-Book interface helped readers perform reading tasks more effectively. The most important and interesting finding is that the interface proved to be more helpful to novice readers who had little background knowledge of the book. In addition, each component that supported the user interface was evaluated separately in a laboratory setting, and these results are also reported in the thesis. 070.573
4	巨量資料環境下之新聞主題暨輿情與股價關係之研究 / A Study of the Relevance between News Topics & Public Opinion and Stock Prices in Big Data 張良杰, Chang, Liang Chieh Unknown Date In recent years, advances in technology, networks, and storage media have caused the amount of generated data to grow explosively, heralding the era of big data. With big data, we no longer need to rely on traditional sampling to collect data, and analysis is no longer limited by data that is too sparse to represent the population. Once these limitations are removed, the essence of big data lies in finding the valuable information it contains. For example, social network sites (SNS) hold a large amount of public opinion and interpersonal interaction, and scholars have found that the emotions expressed on SNS are positively correlated with stock prices. This thesis therefore turns to online news, which shares the characteristics of big data. A web crawler collected 30,879 economic news articles published by the Central News Agency from July 2013 to May 2014, and Topic Detection and Tracking and sentiment analysis techniques were applied to them. Based on the concept of similarity between news events, the events were linked into a network, and the relationship between news sentiment and the stock price index was analyzed. The results show that news events can be linked into specific news topics, that different news topics can be identified within a large network, and that linking news topics together yields a news topic context. The thesis thus provides a new way to understand a huge volume of news quickly and to trace news topics and news events back effectively. Regarding news sentiment and the stock price index, the results show that news sentiment affects fluctuations in the stock price index, with a correlation coefficient of 0.733562. A comparison of news sentiment with the psychological line and the trading willingness indicator shows that news sentiment can serve, to a certain degree, as a reference for judging stock prices and performs better than these two indicators. Keywords: Big data; Text mining; News topic detection and tracking; Link analysis; Sentiment analysis
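The correlation coefficient reported in the abstract above can be illustrated with a minimal sketch. It assumes a Pearson correlation between a daily news-sentiment series and a stock index series; the abstract does not say which correlation measure was used, and the function name `pearson` and any input values are hypothetical, not taken from the thesis.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumption: the thesis does not name its correlation measure;
    Pearson is used here purely for illustration.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

Calling `pearson(daily_sentiment, daily_index)` on two aligned daily series would give a value between -1 and 1; a value near 0.73 would indicate a fairly strong positive linear relationship.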
5	Appariement de contenus textuels dans le domaine de la presse en ligne : développement et adaptation d'un système de recherche d'information / Pairing textual content in the field of on-line news : development and adaptation of an information retrieval system Désoyer, Adèle 27 November 2017 The goal of this thesis, conducted within an industrial framework, is to pair textual media content. Specifically, the aim is to pair on-line news articles with relevant videos for which we have a textual description. The problem is therefore purely one of textual analysis; it involves no image or spoken language analysis. The questions that arise are how to compare textual objects and which criteria to use to estimate their degree of similarity. We consider one of these criteria to be the topical similarity of their content: two documents must deal with the same topic to form a relevant pair. These questions fall within the field of information retrieval (IR), in which this work is mainly anchored. Furthermore, when dealing with news content, the time dimension is of prime importance, and the issues surrounding it relate to the field of topic detection and tracking (TDT), on which this work also draws. The pairing system developed in this thesis comprises several complementary steps. First, content indexing uses natural language processing (NLP) methods to go beyond the traditional bag-of-words representation of texts. Second, two scores are calculated for an article-video pair: the first reflects their topical similarity and is based on an IR vector space model; the second expresses their proximity in time and is based on an empirical function. Finally, a classification model learned from manually annotated document pairs described by these two scores is used to rank the results. Evaluating the system's performance also raised questions in this doctoral research. The constraints imposed both by the data and by the specific needs of the partner company led us to adopt an alternative to the evaluation protocol traditionally used in IR, namely the Cranfield paradigm. We therefore propose an alternative solution for evaluating the system that takes all our constraints into account. Keywords: Information retrieval system; Topic detection and tracking; Content-based recommendation; Supervised learning; Evaluation framework; Industrial context
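The two-score description in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the topical score here is a plain bag-of-words cosine similarity (the thesis says its indexing goes beyond bag of words), the temporal score is a hypothetical exponential decay standing in for the unspecified empirical function, and the names `cosine_similarity`, `temporal_proximity`, `pair_features`, and `half_life_days` are all invented for the example. The learned classifier that ranks pairs is omitted.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of raw term-count vectors (simplified vector space model)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def temporal_proximity(days_apart, half_life_days=7.0):
    """Hypothetical decay: 1.0 for same-day items, halving every half_life_days."""
    return 0.5 ** (abs(days_apart) / half_life_days)


def pair_features(article_text, video_description, days_apart):
    """Return the (topical, temporal) score pair that a ranking model could consume."""
    return (
        cosine_similarity(article_text, video_description),
        temporal_proximity(days_apart),
    )
```

Each article-video pair would then be described by these two numbers, which is the input shape the abstract attributes to its supervised ranking model.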
