1

The Cluster Hypothesis: A Visual/Statistical Analysis

Sullivan, Terry 05 1900
By allowing judgments based on a small number of exemplar documents to be applied to a larger number of unexamined documents, clustered presentation of search results represents an intuitively attractive possibility for reducing the cognitive resource demands on human users of information retrieval systems. However, clustered presentation of search results is sensible only to the extent that naturally occurring similarity relationships among documents correspond to topically coherent clusters. The Cluster Hypothesis posits just such a systematic relationship between document similarity and topical relevance. To date, experimental validation of the Cluster Hypothesis has proved problematic, with collection-specific results both supporting and failing to support this fundamental theoretical postulate. The present study consists of two computational information visualization experiments, representing a two-tiered test of the Cluster Hypothesis under adverse conditions. Both experiments rely on multidimensionally scaled representations of interdocument similarity matrices. Experiment 1 is a term-reduction condition, in which descriptive titles are extracted from Associated Press news stories drawn from the TREC information retrieval test collection. The clustering behavior of these titles is compared to the behavior of the corresponding full text via statistical analysis of the visual characteristics of a two-dimensional similarity map. Experiment 2 is a dimensionality reduction condition, in which inter-item similarity coefficients for full text documents are scaled into a single dimension and then rendered as a two-dimensional visualization; the clustering behavior of relevant documents within these unidimensionally scaled representations is examined via visual and statistical methods. Taken as a whole, results of both experiments lend strong though not unqualified support to the Cluster Hypothesis. 
In Experiment 1, semantically meaningful 6.6-word document surrogates systematically conform to the predictions of the Cluster Hypothesis. In Experiment 2, the majority of the unidimensionally scaled datasets exhibit a marked nonuniformity of distribution of relevant documents, further supporting the Cluster Hypothesis. Results of the two experiments are profoundly question-specific. Post hoc analyses suggest that it may be possible to predict the success of clustered searching based on the lexical characteristics of users' natural-language expression of their information need.
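The relationship the Cluster Hypothesis posits can be illustrated with a small sketch. This is not the method of the thesis, which relies on multidimensional scaling and visual/statistical analysis; it only compares the mean similarity of relevant-relevant document pairs with that of relevant-nonrelevant pairs. The toy titles, the relevance set, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_pair_similarities(docs, relevant):
    """Return (mean relevant-relevant, mean relevant-nonrelevant) similarity."""
    vecs = [Counter(d.lower().split()) for d in docs]
    rel_rel, rel_non = [], []
    for i, j in combinations(range(len(docs)), 2):
        in_i, in_j = i in relevant, j in relevant
        if in_i and in_j:
            rel_rel.append(cosine(vecs[i], vecs[j]))
        elif in_i != in_j:
            rel_non.append(cosine(vecs[i], vecs[j]))

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(rel_rel), mean(rel_non)


# Hypothetical short titles standing in for document surrogates.
docs = [
    "oil prices rise sharply",
    "oil prices fall again",
    "crude oil prices steady",
    "senate passes farm bill",
    "city council votes budget",
]
relevant = {0, 1, 2}  # assumed relevance judgments for one query

rr, rn = mean_pair_similarities(docs, relevant)
print(f"relevant-relevant: {rr:.2f}, relevant-nonrelevant: {rn:.2f}")
```

If the hypothesis holds for a query, the first mean should exceed the second; how large the gap is would depend on the collection and on the question asked.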
2

On the effect of INQUERY term-weighting scheme on query-sensitive similarity measures

Kini, Ananth Ullal 12 April 2006
Cluster-based information retrieval systems often use a similarity measure to compute the association among text documents. In this thesis, we focus on a class of similarity measures named Query-Sensitive Similarity (QSS) measures. Recent studies have shown QSS measures to positively influence the outcome of a clustering procedure. These studies have used QSS measures in conjunction with the ltc term-weighting scheme. Several term-weighting schemes have superseded the ltc term-weighting scheme and demonstrated better retrieval performance relative to it. We test whether introducing one of these schemes, INQUERY, offers any benefit over the ltc scheme when used in the context of QSS measures. The testing procedure uses the Nearest Neighbor (NN) test to quantify the clustering effectiveness of QSS measures and the corresponding term-weighting scheme. The NN tests are applied to certain standard test document collections, and the results are tested for statistical significance. On analyzing the results of the NN test relative to those obtained for the ltc scheme, we find several instances where the INQUERY scheme improves the clustering effectiveness of QSS measures. To be able to apply the NN test, we designed a software test framework, Ferret, by complementing the features provided by dtSearch, a search engine. The test framework automates the generation of NN coefficients by processing standard test document collection data. We describe the construction and operation of the Ferret test framework.
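A minimal sketch of the two ingredients named above, under stated assumptions: ltc weighting in the SMART notation (logarithmic term frequency, idf, cosine normalization) and a nearest-neighbour count over relevant documents. Plain cosine similarity stands in for a QSS measure here, the INQUERY scheme is not shown, and the function names, toy documents, and the choice of k are hypothetical; this is not the Ferret framework.

```python
import math
from collections import Counter


def ltc_vectors(docs):
    """Weight tokenized docs as (1 + ln tf) * ln(N / df), then cosine-normalize."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vecs = []
    for d in docs:
        tf = Counter(d)
        w = {t: (1 + math.log(f)) * math.log(n / df[t]) for t, f in tf.items()}
        norm = math.sqrt(sum(x * x for x in w.values()))
        vecs.append({t: x / norm for t, x in w.items()} if norm else w)
    return vecs


def dot(a, b):
    """Dot product of two sparse vectors; equals cosine for unit-length inputs."""
    return sum(x * b.get(t, 0.0) for t, x in a.items())


def nn_test(vecs, relevant, k=5):
    """For each relevant doc, count the relevant docs among its k nearest neighbours."""
    counts = {}
    for i in relevant:
        others = [j for j in range(len(vecs)) if j != i]
        others.sort(key=lambda j: dot(vecs[i], vecs[j]), reverse=True)
        counts[i] = sum(1 for j in others[:k] if j in relevant)
    return counts


# Hypothetical tokenized documents and relevance judgments for one query.
docs = [
    ["oil", "prices", "rise"],
    ["oil", "prices", "fall"],
    ["oil", "exports", "rise"],
    ["farm", "bill", "vote"],
    ["farm", "subsidy", "vote"],
    ["city", "budget", "vote"],
]
relevant = {0, 1, 2}

vecs = ltc_vectors(docs)
counts = nn_test(vecs, relevant, k=2)
print(counts)
```

Swapping in a different weighting function while keeping `nn_test` fixed is one way such a comparison between term-weighting schemes could be set up.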
3

Livsviktig läsundervisning i f–1 : Mötet mellan styrdokument, läromedelsförfattare och lärare / Essential reading instruction in year f–1 : The interception of policy documents, educational author and teachers

Sandquist, Lisa, Siebing, Lina January 2019
The purpose of the study is to examine what the introduction of a compulsory preschool class means for reading instruction in years f–1. The theoretical framework of the study is based on curriculum theory and frame factor theory. The data comprise various government proposals and the new curriculum's wording on the preschool class and reading in years f–1, followed by eight semi-structured interviews with four authors of teaching materials and four practising f–1 teachers. The collected data were analysed using Linde's (2012) interpretation of the three arenas of curriculum theory: the formulation arena, the transformation arena and the realisation arena. The three arenas, together with frame factors such as curriculum, time and teaching qualifications, are central to the analysis of the results. The results show that the introduction of a compulsory preschool class has the potential to bring many benefits for essential reading instruction and for the individual pupil's reading development, since the focus can now be placed on the progression of reading instruction and teachers can identify struggling pupils at an earlier stage. The results also show that the curriculum's wording on the preschool class may contribute to increased teaching time in the preschool class, since the preschool class timetable still varies from school to school despite the compulsory introduction. Opinions are divided, however, on what the introduction of a compulsory preschool class means for reading instruction in years f–1 and on whether reading instruction will change or be conducted in the same way as before. Finally, the results indicate some uncertainty about who should work in the preschool class, teachers or preschool teachers, and that this decision may have consequences for achieving an equivalent school for all.
4

Algoritmy pro shlukování textových dat / Text data clustering algorithms

Sedláček, Josef January 2011
The thesis deals with text mining. It describes the theory of text document clustering as well as the algorithms used for clustering. This theory serves as a basis for developing an application for clustering text data. The application is developed in the Java programming language and contains three clustering methods, from which the user can choose the one to apply to a collection of documents. The implemented methods are K medoids, BiSec K medoids, and SOM (self-organizing maps). The application also includes a validation set, which was created specifically for the diploma thesis and is used for testing the algorithms. Finally, the algorithms are compared according to the results obtained.
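As a rough illustration of the first of these methods, the following is a minimal sketch of an alternating K medoids procedure on a precomputed distance matrix. It is written in Python rather than Java and is not the implementation from the thesis; the function names, the toy one-dimensional data, and the initial medoids are assumptions. For text documents, the distances could be, for example, one minus a cosine similarity.

```python
def assign(dist, medoids):
    """Label each point with the index of its closest medoid."""
    return [min(medoids, key=lambda m: dist[i][m]) for i in range(len(dist))]


def k_medoids(dist, medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing.

    dist: square matrix (list of lists) of pairwise distances.
    medoids: initial medoid indices (chosen by the caller).
    """
    medoids = sorted(medoids)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for m in medoids:
            members = [i for i, lab in enumerate(labels) if lab == m]
            if members:
                # New medoid: the member with the smallest total distance to its cluster.
                best = min(members, key=lambda c: sum(dist[c][j] for j in members))
            else:
                best = m  # keep the old medoid if its cluster is empty
            new_medoids.append(best)
        new_medoids = sorted(new_medoids)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Toy data: six points on a line, forming two obvious groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]

medoids, labels = k_medoids(dist, medoids=[0, 1])
print(medoids, labels)
```

Because medoids are always actual data items, the procedure needs only pairwise distances, which suits documents compared through a similarity measure rather than through coordinates.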
