101 |
Sumarização de vídeos de histerocopias diagnósticas / Content-based summarization of diagnostic hysteroscopy videos
Gavião Neto, Wilson Pires (January 2009)
This work considers a library containing thousands of diagnostic hysteroscopy videos, indexed only by patient ID and exam date. Users typically browse this library to answer queries such as "retrieve images of submucosal myomas" or "recover images whose diagnosis is endometrial polyp". Specialists use diagnostic hysteroscopy videos to inspect the appearance of the uterus; the images are important for diagnosis as well as for medical research in fields such as human reproduction and fertility studies. These videos contain a large amount of information, but only a small number of frames are actually useful for diagnosis or prognosis. This thesis proposes a technique to automatically identify clinically relevant information in diagnostic hysteroscopy videos, creating a rich video summary. We propose a hierarchical representation of the video content based on robust tracking of geometrically consistent image points through the frame sequence. We demonstrate that this representation is a helpful way to organize hysteroscopy video content, allowing specialists to browse quickly without introducing spurious information into the video summary. The experimental results indicate that the method produces compact video summaries (data-rate reduction of around 97.5%) without discarding clinically relevant information.
|
102 |
Ontology-based clustering in a Peer Data Management System
Pires, Carlos Eduardo Santos (31 January 2009)
Previous issue date: 2009 / Faculdade de Amparo à Ciência e Tecnologia do Estado de Pernambuco / Peer Data Management Systems (PDMS) are advanced P2P applications that let users transparently query several distributed, heterogeneous, and autonomous data sources. Each peer represents a data source and exports either its complete data schema or only part of it. This schema, called the exported schema, represents the data to be shared with other peers in the system and is commonly described by an ontology.
The two most studied aspects of data management in PDMS are schema mappings and query processing. Both can be improved if peers are efficiently arranged in the overlay network according to a semantics-based approach. In this context, the notion of a semantic community of peers is important because it logically brings together peers with common interests in a specific topic. However, because of the dynamic behavior of peers, creating and maintaining semantic communities is a challenging aspect at the current stage of PDMS development.
The main goal of this thesis is to propose a semantics-based process for incrementally clustering semantically similar peers that form communities in a PDMS. In this process, peers are clustered according to their exported schema (an ontology), and ontology management processes (for example, matching and summarization) are used to help connect the peers. A PDMS architecture is proposed to facilitate the semantic organization of peers in the overlay network. To obtain the semantic similarity between two peer ontologies, we propose a global similarity measure as the output of an ontology matching process. To optimize ontology matching, an automatic ontology summarization process is also proposed. A simulator was developed according to the PDMS architecture. The proposed ontology management processes were also implemented and included in the simulator. Experiments with each process in the PDMS context, together with the results obtained, are presented.
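The incremental clustering idea in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: each exported schema is reduced to a set of concept names, Jaccard overlap stands in for the thesis's global similarity measure (which is the output of an ontology matching process and is not specified here), and `assign_peer` and the 0.5 threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of ontology concept names."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def assign_peer(clusters, peer_id, concepts, threshold=0.5):
    """Place one arriving peer into the most similar cluster, or open a new one.

    clusters: list of dicts with keys "peers" (list) and "concepts" (set).
    The threshold is an illustrative assumption, not a value from the thesis.
    Returns the index of the cluster the peer joined.
    """
    best_index, best_score = None, 0.0
    for index, cluster in enumerate(clusters):
        score = jaccard(concepts, cluster["concepts"])
        if score > best_score:
            best_index, best_score = index, score
    if best_index is not None and best_score >= threshold:
        # Join the existing community and extend its concept summary.
        clusters[best_index]["peers"].append(peer_id)
        clusters[best_index]["concepts"] |= concepts
        return best_index
    # No sufficiently similar community exists: start a new one.
    clusters.append({"peers": [peer_id], "concepts": set(concepts)})
    return len(clusters) - 1


clusters = []
assign_peer(clusters, "p1", {"Course", "Student", "Professor"})
assign_peer(clusters, "p2", {"Course", "Student", "Lecture"})
assign_peer(clusters, "p3", {"Gene", "Protein"})
```

Peers are processed one at a time as they arrive, so the clusters never have to be rebuilt from scratch; this is the sense of "incremental" assumed here.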
|
103 |
Keeping an Eye on the Context : An Eye Tracking Study of Cohesion Errors in Automatic Text Summarization / Med ett öga på sammanhanget : En ögonrörelsestudie av kohesionsfel i automatiska textsammanfattningar
Rennes, Evelina (January 2013)
Automatic text summarization is a growing field in today's Internet-based society, but automatically creating perfect summaries is not easy, and cohesion errors are common. Using an eye-tracking camera, this thesis studies the nature of four different types of cohesion errors occurring in summaries. A total of 23 participants read and rated four different texts and marked the most difficult areas of each text. Statistical analysis of the data revealed that absent cohesion or context and broken anaphoric reference (pronouns) caused some disturbance in reading, but that the impact is restricted to the effort required to read rather than to comprehension of the text. Erroneous anaphoric reference (pronouns) was not detected by the participants, which poses a problem for automatic text summarizers, and other potentially disturbing factors were detected. Finally, the question was raised of whether it is meaningful to keep absent cohesion or context as a separate error type.
|
104 |
New data analytics and visualization methods in personal data mining, cancer data analysis and sports data visualization
Zhang, Lei (12 July 2017)
In this dissertation, we discuss a reading profiling system, a biological data visualization system, and a sports visualization system. Self-tracking is becoming increasingly popular in the field of personal informatics, and reading profiling can be used as a personal data collection method. We present UUAT, an unintrusive user attention tracking system. In UUAT, we used user interaction data to develop technologies that help pinpoint a user's reading region (RR). Based on the computed RR and user interaction data, UUAT can identify a reader's reading struggle or interest. A biomarker is a measurable substance that may be used as an indicator of a particular disease. We developed CancerVis for visual and interactive analysis of cancer data and demonstrate how to apply this platform in cancer biomarker research. CancerVis provides multiple interactive views of a dataset from different perspectives. The views are synchronized so that users can easily link them to the same data entry. Furthermore, CancerVis supports data mining practice in cancer biomarker research, such as visualization of optimal cutpoints and cutthrough exploration. Tennis match summarization helps sports consumers who watch after the live broadcast take in a match of interest. We developed TennisVis, a comprehensive match summarization and visualization platform. TennisVis offers charts and graphs that let a client quickly get match facts. It also offers various queries over tennis points (such as volley shots or many-shot rallies) to satisfy the diverse preferences of tennis fans. Furthermore, TennisVis offers video clips for every single tennis point, and a recommendation rating is computed for each tennis play. A case study shows that TennisVis identifies more than 75% of the tennis points in a full-length match.
|
105 |
Ranked Search on Data Graphs
Varadarajan, Ramakrishna R. (10 March 2009)
Graph-structured databases are widely prevalent, and the problem of effective search and retrieval from such graphs has received much attention recently. For example, the Web can be naturally viewed as a graph. Likewise, a relational database can be viewed as a graph in which tuples are modeled as vertices connected via foreign-key relationships. Keyword search has emerged as one of the most effective paradigms for information discovery, especially over HTML documents on the World Wide Web. One of its key advantages is simplicity: users do not have to learn a complex query language and can issue queries without any prior knowledge of the structure of the underlying data. The purpose of this dissertation was to develop techniques for user-friendly, high-quality, and efficient searching of graph-structured databases. Several ranked search methods on data graphs have been studied in recent years. Given a top-k keyword search query on a graph and some ranking criteria, a keyword proximity search finds the top-k answers, where each answer is a substructure of the graph that contains all query keywords and illustrates the relationships between the keywords present in the graph. We applied keyword proximity search to the web and to the page graph of web documents to find top-k answers that satisfy the user's information need and increase user satisfaction. Another effective ranking mechanism applied to data graphs is authority-flow-based ranking. Given a top-k keyword search query on a graph, an authority-flow-based search finds the top-k answers, where each answer is a node in the graph ranked according to its relevance and importance to the query. We developed techniques that improve authority-flow-based search on data graphs by creating a framework to explain and reformulate them, taking into consideration user preferences and feedback. We also applied the proposed graph search techniques to information discovery over biological databases. Our algorithms were experimentally evaluated for performance and quality. The quality of our method was compared to current approaches by means of user surveys.
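Authority-flow-based ranking as described above can be illustrated with a minimal sketch. It assumes a personalized-PageRank-style power iteration in which the random jump always returns to the nodes that contain the query keywords (the "base set"). This is a generic technique standing in for the dissertation's own method; the function name, the damping factor of 0.85, and the iteration count are illustrative assumptions.

```python
def authority_flow_top_k(edges, base_set, k, damping=0.85, iterations=50):
    """Rank nodes by the authority that flows out from the base set.

    edges: dict mapping each node to the list of nodes it points to.
    base_set: nodes that contain the query keywords (the jump targets).
    Returns the k highest-scoring nodes, ties broken by node name.
    """
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for source in nodes:
            targets = edges.get(source, [])
            if not targets:
                # Dangling node: hand its authority back to the base set.
                for n in nodes:
                    nxt[n] += damping * score[source] * jump[n]
                continue
            share = damping * score[source] / len(targets)
            for target in targets:
                nxt[target] += share
        score = nxt
    ranked = sorted(nodes, key=lambda n: (-score[n], n))
    return ranked[:k]


graph = {"a": ["b"], "b": ["c"], "c": []}
top_two = authority_flow_top_k(graph, base_set={"a"}, k=2)
```

Because the jump distribution is concentrated on the base set, nodes close to the keyword-bearing nodes accumulate more authority than distant ones, which is how relevance to the query enters the ranking in this sketch.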
|
106 |
Exploração de métodos de sumarização automática multidocumento com base em conhecimento semântico-discursivo / Exploration of automatic methods for multi-document summarization using discourse models
Paula Christina Figueira Cardoso (05 September 2014)
Multi-document summarization aims to produce a summary from a set of related texts, for use by a particular user and/or for a particular task. With the exponential growth of available information and people's need to obtain information quickly, automatic summarization has received wide attention. A set of related texts is known to contain redundant, contradictory, and complementary information, which together constitute the multi-document phenomena. In each source text, the main subject is described in a sequence of subtopics. Furthermore, some sentences in a text are more relevant than others. In this context, a multi-document summary is expected to consist of the relevant information that represents the whole set of texts. However, the strategies for automatic multi-document summarization adopted so far have used only the relationships between texts and have ignored the textual structure of each source text, resulting in summaries that are less representative of the subtopics and less informative than they could be. To properly treat the relevance of information, the multi-document phenomena, and the distribution of subtopics, this thesis investigated how to model the summarization process using semantic-discursive knowledge in content selection methods, and the impact of doing so on producing summaries that are more informative and more representative of the source texts. To formalize the semantic-discursive knowledge, we adopted the theories RST (Rhetorical Structure Theory) and CST (Cross-document Structure Theory). To support the work, a multi-document corpus was annotated with RST and subtopics, yielding a new resource available to other researchers. From the corpus analysis, 10 methods for subtopic segmentation and 13 original methods for automatic summarization were proposed. The evaluation of the subtopic segmentation methods showed a strong relationship between the subtopic structure and the rhetorical analysis of a text. As for the evaluation of the automatic summarization methods, the results indicate that using semantic-discursive knowledge in good content selection strategies positively affects the production of informative summaries.
|
107 |
History-related Knowledge Extraction from Temporal Text Collections / テキストコレクションからの歴史関連知識の抽出
Duan, Yijun (23 March 2020)
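Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Informatics / Kō No. 22574 / Jōhaku No. 711 / 新制||情||122 (University Library) / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Masatoshi Yoshikawa, Professor Hisashi Kashima, Professor Keishi Tajima, Program-Specific Associate Professor JATOWT Adam Wladyslaw / Qualified under Article 4, Paragraph 1 of the Degree Regulations / DGAM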
|
108 |
SDL model pro Source Specific Multicast / SDL model for Source Specific Multicast
Záň, Stanislav (January 2008)
This work deals with IP network communication using the Source-Specific Multicast method. It focuses on the registration, deregistration, and administration of clients in a multicast group and on the IGMP protocol, which is designed for this communication. The work also addresses signalling between the data source and the clients of a multicast group. The introduction analyses communication in multicast. It is followed by a chapter on one specific multicast method, Source-Specific Multicast (SSM). The next chapter covers the protocols used in SSM to distribute the data stream from the source to the clients and in the reverse direction. The following chapter deals with signalling in SSM, concentrating on the reflection and summarization methods used there. This chapter also presents basic mathematical formulas for sending signalling packets and proposes other solutions for simplifying communication and minimizing delay, which is very important for signalling. The work then discusses the differences between these two signalling methods. In the practical part, an application is built on the theory from the previous chapters; it simulates real communication between a data source and clients in a multicast group. This communication is explained in MSC diagrams. The application also simulates both signalling methods and a realistic number of clients in the multicast group. The results of the simulation are interpreted in the last part of the work.
|
109 |
Surmize: An Online NLP System for Close-Domain Question-Answering and Summarization
Bergkvist, Alexander; Hedberg, Nils; Rollino, Sebastian; Sagen, Markus (January 2020)
The amount of data available to and consumed by people globally is growing. To reduce mental fatigue and make it easier to gain insight into complex texts or documents, we have developed an application to aid in this task. The web application lets users upload documents and ask domain-specific questions about them. A summarized version of each document is presented to the user, which can further aid understanding of the document and suggest which questions might be relevant to ask. The application is flexible in the types of documents it can process, is publicly available, stores no user data, and uses state-of-the-art models for its summaries and answers. The result is an application that yields near human-level intuition for answering questions in certain isolated cases, such as Wikipedia and news articles, as well as some scientific texts. The reliability of the application and of its predictions decreases as the complexity of the subject, the number of words in the document, and the grammatical inconsistency of the questions increase. These are all aspects that can be improved further if the application is used in production.
|
110 |
Sumarizace genových expresních čipů z volně žijících druhů / Summarization of gene expression arrays from free living species
Tuma, Vojtěch (January 2016)
Gene expression arrays are used to assess the expression of exons and genes of organisms. The design of expression arrays is based on the genomes of laboratory strains of model organisms. The most frequent summarization algorithms used to process data from measurements are gcRMA, PLER and IterPLIER. When using expression arrays to research free-living species, the measured values are influenced by differences between the genomes of free-living and model organisms. We propose a method to improve the results by removing parts of genomes influenced by known differences between species from the summarization. Removing influenced parts can improve summarization, especially at the exon level.
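The filtering step described in this abstract can be illustrated with a minimal sketch. It makes several assumptions for illustration only: probes are given as genomic intervals with one intensity each, the known between-species differences are given as single positions, and a plain median of log2 intensities stands in for the summarization step (it is not gcRMA, PLER, or IterPLIER). The name `summarize_exon` is hypothetical.

```python
from math import log2
from statistics import median


def summarize_exon(probes, variant_positions):
    """Median log2 intensity over probes that overlap no known variant.

    probes: list of (start, end, intensity) tuples, with end exclusive.
    variant_positions: genomic positions known to differ between species.
    Returns None when every probe of the exon is affected.
    """
    kept = [
        intensity
        for start, end, intensity in probes
        if not any(start <= pos < end for pos in variant_positions)
    ]
    if not kept:
        return None
    return median(log2(value) for value in kept)


probes = [(100, 125, 256.0), (130, 155, 16.0), (160, 185, 1024.0)]
value = summarize_exon(probes, variant_positions={140})
```

In this toy input the middle probe overlaps the variant at position 140 and is dropped, so its low intensity, which would plausibly reflect poor hybridization rather than low expression, no longer pulls the exon-level value down.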
|