1

The practical value of classification summaries in information management and integration

Rozman, Darija 12 1900 (has links)
The author discusses the value and importance of using short extracts from classification tables to support subject access management. While detailed classification is time-consuming, complex, and costly, classifying documents into broader classes is a simpler and easier way of achieving meaningful and useful subject organization. The paper outlines the role of this type of classification in bibliographic listings, in the organization and representation of physical documents, in the presentation of web resources, in statistical reports on collection development and use, and, last but not least, in information integration in a networked environment. This approach to subject classification is illustrated by the Slovenian union catalogue COBISS/OPAC, in which a standardized set of UDC codes is used. The author emphasizes the importance of this outline for the homogeneity and continuity of the use of UDC in Slovenia and explains how it may be weakened by changes in the top level of UDC.
2

DescribeX: A Framework for Exploring and Querying XML Web Collections

Rizzolo, Flavio Carlos 26 February 2009 (has links)
The nature of semistructured data in web collections is evolving. Even when XML web documents are valid with regard to a schema, the actual structure of such documents exhibits significant variations across collections for several reasons: an XML schema may be very lax (e.g., to accommodate the flexibility needed to represent collections of documents in RSS feeds), a schema may be large, with different subsets used for different documents (e.g., this is common in industry standards like UBL), or open content models may allow arbitrary schemas to be mixed (e.g., RSS extensions like those used for podcasting). A schema alone may not provide sufficient information for many data management tasks that require knowledge of the actual structure of the collection. Web applications (such as processing RSS feeds or web service messages) rely on XPath-based data manipulation tools. Web developers need to use XPath queries effectively on increasingly large web collections containing hundreds of thousands of XML documents. Even when tasks only need to deal with a single document at a time, developers benefit from understanding the behaviour of XPath expressions across multiple documents (e.g., what will a query return when run over the thousands of hourly feeds collected during the last few months?). Dealing with the (highly variable) structure of such web collections poses additional challenges. This thesis introduces DescribeX, a powerful framework that is capable of describing arbitrarily complex XML summaries of web collections, providing support for more efficient evaluation of XPath workloads. DescribeX permits the declarative description of document structure using all axes and language constructs in XPath, and generalizes many of the XML indexing and summarization approaches in the literature.
DescribeX supports the construction of heterogeneous summaries in which different document elements sharing a common structure can be declaratively defined and refined by means of path regular expressions on axes, or axis path regular expressions (AxPREs). DescribeX can significantly help in understanding both the structure of complex, heterogeneous XML collections and the behaviour of XPath queries evaluated on them. Experimental results demonstrate the scalability of DescribeX summary refinements and stabilizations (the key enablers for tailoring summaries) on multi-gigabyte web collections. A comparative study suggests that using a DescribeX summary created from a given workload can produce query evaluation times orders of magnitude better than using existing summaries. DescribeX's lightweight approach of combining summaries with a file-at-a-time XPath processor can be a very competitive alternative, in terms of performance, to conventional fully-fledged XML query engines that provide DB-like functionality such as security, transaction processing, and native storage.
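As a rough illustration of the kind of structural summary that frameworks like DescribeX generalize, the sketch below (not part of the thesis; the function name and the RSS-like sample document are invented) partitions a document's elements by their rooted label path, the simplest form of summary that AxPRE refinements extend:

```python
# A minimal sketch: partition the elements of an XML document by their
# rooted label path -- a basic structural summary in the spirit of the
# summaries DescribeX generalizes with AxPRE refinements.
from xml.etree import ElementTree
from collections import defaultdict

def path_summary(xml_text):
    """Map each rooted label path to the number of elements reaching it."""
    root = ElementTree.fromstring(xml_text)
    counts = defaultdict(int)

    def walk(elem, path):
        path = path + "/" + elem.tag
        counts[path] += 1
        for child in elem:
            walk(child, path)

    walk(root, "")
    return dict(counts)

# Hypothetical RSS-like document: two <item> elements share a path but
# differ in their children -- the structural variation a refined summary
# would expose.
doc = ("<rss><channel><item><title/></item>"
       "<item><title/><enclosure/></item></channel></rss>")
summary = path_summary(doc)
print(summary["/rss/channel/item"])            # 2
print(summary["/rss/channel/item/enclosure"])  # 1
```

Grouping by rooted path alone cannot distinguish the two `<item>` elements; summaries refined by richer axis expressions (as AxPREs allow) can.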
4

Improved Document Summarization and Tag Clouds via Singular Value Decomposition

Provost, James 25 September 2008 (has links)
Automated summarization is a difficult task. World-class summarizers can provide only "best guesses" of which sentences encapsulate the important content from within a set of documents. As automated systems continue to improve, users are still not given the means to observe complex relationships between seemingly independent concepts. In this research we used singular value decompositions to organize concepts and determine the best candidate sentences for an automated summary. The results from this straightforward attempt were comparable to world-class summarizers. We then included a clustered tag cloud, using a singular value decomposition to measure term "interestingness" with respect to the set of documents. The combination of best candidate sentences and tag clouds provided a more inclusive summary than a traditionally developed summarizer alone. / Thesis (Master, Computing) -- Queen's University, 2008-09-24.
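The SVD-based sentence selection described above can be sketched as follows. This is a generic LSA-style summarizer in the spirit of the approach, not the author's implementation; the function name and toy corpus are invented:

```python
# A toy sketch of SVD-based sentence selection (LSA-style summarization):
# build a term-sentence matrix, decompose it, and pick the sentence with
# the largest weight on each of the top latent topics.
import numpy as np

def lsa_summary(sentences, k=2):
    """Pick up to one sentence per top-k latent topic via SVD."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    # Term-sentence matrix: A[i, j] = count of term i in sentence j.
    A = np.array(
        [[s.lower().split().count(w) for s in sentences] for w in vocab],
        dtype=float,
    )
    # Rows of Vt weight each sentence on one latent topic,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    chosen = []
    for topic in range(min(k, Vt.shape[0])):
        j = int(np.argmax(np.abs(Vt[topic])))  # strongest sentence for this topic
        if j not in chosen:
            chosen.append(j)
    return [sentences[j] for j in sorted(chosen)]

docs = ["apple apple pie", "car engine oil", "apple tart recipe"]
print(lsa_summary(docs, k=2))
```

The same `Vt` weights could also drive a tag cloud: a term's "interestingness" can be read off its projection onto the leading singular vectors, though the thesis's exact measure is not reproduced here.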
5

O gênero resumo na universidade: dialogismo e responsividade em resumos de alunos ingressantes / The summary genre at the university: dialogism and responsiveness in first-year students' summaries

Costa, Cristina Fontes de Paula, 1988- 23 August 2018 (has links)
Advisor: Raquel Salek Fiad / Master's thesis - Universidade Estadual de Campinas, Instituto de Estudos da Linguagem / Abstract: In this dissertation we discuss a very common genre in the school and academic spheres: the summary. Establishing a dialogue between Bakhtin's (1990, 1992) dialogic orientation and Street's (2003) concept of literacy practices, our goal is to observe, in summaries of an opinion article produced by first-year students who took part in a university program, traces of reading and writing practices prior to the academic sphere, and to detect, through the linguistic materiality of the texts, dialogues with other texts, genres, and practices. With a Sherlock Holmes eye, we adopt as methodology the evidential paradigm proposed by Ginzburg (1989), which guided us toward the particular, the singularity of each text. In all analyses we highlight the dialogic nature of language; accordingly, we also analyze the base text itself, which, being an opinion article written by a scientist, already represents a dialogue between the journalistic and scientific spheres. In the summaries we detect dialogue mainly with three school writing practices: the summary genre itself, the essay, and science popularization genres. We found that writing a summary is a complex activity involving knowledge of and familiarity with the genre of the base text; dialogues with what has already been said (and what is yet to be said), with other spheres, and with reading and writing practices; and the assessment to be made by the teacher, always present in any writing practice in the school sphere.
Taking into account that any manifestation of singularity is always a response to the social context, these dialogues can be read as a response to the school institution and its practices. The dialogue with school practices reflects the students' historical moment: as university entrants, they do not shed earlier literacy practices overnight. We also observe that most of the analyzed summaries dialogue with authoritarian practices of reading and writing, in which the student must merely reproduce, revoice what has already been said. This says much about text production at school, which may be privileging reproduction rather than reflection. / Master's in Applied Linguistics (Língua Materna)
6

Fouille de données par extraction de motifs graduels : contextualisation et enrichissement / Data mining based on gradual itemset extraction: contextualization and enrichment

Oudni, Amal 09 July 2014 (has links)
This thesis belongs to the framework of knowledge extraction and data mining applied to numerical or fuzzy databases, with the aim of extracting linguistic summaries in the form of gradual itemsets, which express correlated co-variations of attribute values, as in "the more the temperature increases, the more the pressure increases". Our goal is to contextualize and enrich these itemsets with different types of additional information so as to increase their quality and improve their interpretation. We propose four types of new itemsets. First, reinforced gradual itemsets perform, in the case of fuzzy data, a contextualization by integrating complementary attributes, adding clauses linguistically introduced by the expression "all the more", as in "the more the temperature decreases, the more the volume of air decreases, all the more so as its density increases". Reinforcement is interpreted as increased validity of the gradual itemset.
We also study the transposition of the notion of reinforcement to classical association rules, discuss its possible interpretations, and show its limited contribution. We then address the problem of contradictory gradual itemsets, encountered for example when "the more the temperature increases, the more the humidity increases" and "the more the temperature increases, the more the humidity decreases" are extracted simultaneously. To manage these contradictions, we propose a constrained definition of the support of a gradual itemset which, in particular, depends not only on the considered itemset but also on its potential contradictors. We propose two extraction methods: the first filters itemsets a posteriori, after generation; the second integrates the new support constraint into the generation process itself. We also introduce characterized gradual itemsets, defined by adding a clause linguistically introduced by the expression "especially if", as in "the more the temperature decreases, the more the humidity decreases, especially if the temperature varies in [0, 10] °C": the additional clause specifies value ranges on which the validity of the itemset is increased. We formalize the quality of this enrichment as a trade-off between two constraints imposed on the identified interval, its size and its validity, together with an extension taking data density into account. Finally, we propose an automatic extraction method based on mathematical morphology tools and the definition of an appropriate filter and its transcription.
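The notion of gradual itemset support underlying these definitions can be illustrated with a minimal sketch. This shows only the classical, unconstrained support (the fraction of object pairs ordered concordantly on both attributes), not the constrained variant the thesis proposes; the function name and data are invented:

```python
# A minimal sketch of classical gradual itemset support for
# "the more A increases, the more B increases": the fraction of object
# pairs ordered the same way on both attributes. The thesis's constrained
# support additionally accounts for contradictory itemsets.
from itertools import combinations

def gradual_support(values_a, values_b):
    """Fraction of pairs (i, j) ordered concordantly on A and B."""
    pairs = list(combinations(range(len(values_a)), 2))
    concordant = sum(
        1 for i, j in pairs
        if (values_a[i] - values_a[j]) * (values_b[i] - values_b[j]) > 0
    )
    return concordant / len(pairs)

# Hypothetical readings: temperature rises while pressure mostly rises,
# so the itemset "the more temperature, the more pressure" is well supported.
temperature = [10, 12, 15, 18, 21]
pressure = [1001, 1003, 1002, 1008, 1010]
print(round(gradual_support(temperature, pressure), 2))  # 0.9
```

A contradictory itemset ("the more A increases, the more B decreases") would count the discordant pairs instead; the constrained support proposed in the thesis ties the two quantities together rather than evaluating each itemset in isolation.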
7

Semantics-driven Abstractive Document Summarization

Alambo, Amanuel 02 August 2022 (has links)
No description available.
8

Système symbolique de création de résumés de mise à jour / A symbolic system for producing update summaries

Genest, Pierre-Étienne January 2009 (has links)
Thesis digitized by the Division de la gestion de documents et des archives de l'Université de Montréal.
9

Resumo de artigo de opinião na perspectiva dos estudos linguísticos da microestrutura e da macroestrutura textual / Summaries of opinion articles from the perspective of linguistic studies of textual microstructure and macrostructure

Moraes, Otávio Brasil de 07 August 2017 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / Fundação de Amparo à Pesquisa do Estado do Amazonas - FAPEAM / In this dissertation, our general objective is to propose the use of the notions of microstructure and macrostructure in the production of summaries of opinion articles. Throughout the research, carried out with students of a secondary school in the city of Manaus, we found significant differences between summaries produced in the "traditional" manner and those produced according to our proposal. Our theoretical framework is the Textual Linguistics of text grammars, especially the proposal of van Dijk (1996). We also consider the concept of text developed in the 1970s and 1980s by van Dijk and Kintsch, as well as more recent studies on summary production in this perspective, such as Marquesi (2004), Leite (2006), Delphino (1991), and Machado (2004). Methodologically, the research proceeded as follows: in the first class, we asked students to produce a summary of an opinion article following the traditional approach to teaching summarization, which generally emphasizes only the selection of the main ideas of a text; in the second class, we discussed the concepts of microstructure and macrostructure for the production of summaries; and in the third class, we asked students to produce a second summary using those notions.
We then analyzed twenty-eight summaries produced by 14 students, which allowed us to verify the positive contributions of the proposal of working with textual macrostructures. The data analysis chapter presents, as examples, 6 of the 28 summaries analyzed: 3 (three) written under the traditional approach and 3 (three) written within the microstructure/macrostructure framework. We then discuss the results, which clearly point out the contributions of the study of textual microstructure and macrostructure to the production of summaries.
10

Plágio na constituição de autoria: análise da produção acadêmica de resenhas e resumos publicados na Internet / Plagiarism in the constitution of authorship: an analysis of academic reviews and summaries published on the Internet

Oliveira, Marta Melo de 14 March 2007 (has links)
This work analyzes the occurrence of plagiarism or partial plagiarism in the constitution of authorship in a corpus of summaries and reviews written and published on the internet by young college students as extra-class activities. We sought to control the conditions of text production according to the criteria suggested by Geraldi (2004) and Garcez (1998). We worked within the line of research "discursive processes and textual production" in order to secure a theoretical anchorage supporting our objectives: to describe the constitution of authorship in academic texts; to verify the influence of the interlocutor on the textual-discursive performance of the writing subject; and to show the occurrence of plagiarism and partial plagiarism in the corpus as a discursive phenomenon tied to discursive heterogeneity and interdiscourse. The textual analysis of the corpus followed Bronckart's sociodiscursive interactionist (ISD) perspective, and the effectiveness of summarization in producing the academic summaries and reviews was assessed through the rules/strategies of van Dijk and Sprenger-Charolles.
