Global ETD Search

1	The Cluster Hypothesis: A Visual/Statistical Analysis Sullivan, Terry 05 1900 By allowing judgments based on a small number of exemplar documents to be applied to a larger number of unexamined documents, clustered presentation of search results represents an intuitively attractive possibility for reducing the cognitive resource demands on human users of information retrieval systems. However, clustered presentation of search results is sensible only to the extent that naturally occurring similarity relationships among documents correspond to topically coherent clusters. The Cluster Hypothesis posits just such a systematic relationship between document similarity and topical relevance. To date, experimental validation of the Cluster Hypothesis has proved problematic, with collection-specific results both supporting and failing to support this fundamental theoretical postulate. The present study consists of two computational information visualization experiments, representing a two-tiered test of the Cluster Hypothesis under adverse conditions. Both experiments rely on multidimensionally scaled representations of interdocument similarity matrices. Experiment 1 is a term-reduction condition, in which descriptive titles are extracted from Associated Press news stories drawn from the TREC information retrieval test collection. The clustering behavior of these titles is compared to the behavior of the corresponding full text via statistical analysis of the visual characteristics of a two-dimensional similarity map. Experiment 2 is a dimensionality reduction condition, in which inter-item similarity coefficients for full text documents are scaled into a single dimension and then rendered as a two-dimensional visualization; the clustering behavior of relevant documents within these unidimensionally scaled representations is examined via visual and statistical methods.
Taken as a whole, results of both experiments lend strong though not unqualified support to the Cluster Hypothesis. In Experiment 1, semantically meaningful 6.6-word document surrogates systematically conform to the predictions of the Cluster Hypothesis. In Experiment 2, the majority of the unidimensionally scaled datasets exhibit a marked nonuniformity of distribution of relevant documents, further supporting the Cluster Hypothesis. Results of the two experiments are profoundly question-specific. Post hoc analyses suggest that it may be possible to predict the success of clustered searching based on the lexical characteristics of users' natural-language expression of their information need.
Information retrieval. Database searching. Document search Document collection Information theory
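The abstract above tests the Cluster Hypothesis through multidimensional scaling and visual analysis. As a much simpler illustration of the hypothesis itself, and not the method used in the thesis, the sketch below compares the mean similarity between pairs of relevant documents with the mean similarity between relevant and non-relevant documents. The function names (`cosine`, `overlap_test`), the raw term-frequency vectors, and the four toy titles are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def overlap_test(docs, relevant):
    """Return (mean relevant-relevant similarity,
    mean relevant-nonrelevant similarity) for one query."""
    # Hypothetical representation: raw term frequencies on whitespace tokens.
    vecs = [Counter(d.lower().split()) for d in docs]
    rel_rel, rel_non = [], []
    for i, j in combinations(range(len(docs)), 2):
        i_rel, j_rel = i in relevant, j in relevant
        if i_rel and j_rel:
            rel_rel.append(cosine(vecs[i], vecs[j]))
        elif i_rel != j_rel:
            rel_non.append(cosine(vecs[i], vecs[j]))

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(rel_rel), mean(rel_non)


# Toy data (invented): documents 0 and 1 are judged relevant to one query.
docs = [
    "court rules on bank merger",
    "bank merger approved by court",
    "storm floods coastal town",
    "town rebuilds after storm",
]
rr, rn = overlap_test(docs, relevant={0, 1})
print(rr, rn)
```

If the hypothesis holds for a query, the first value should be clearly larger than the second; the abstract's finding that results are question-specific suggests this gap would vary from query to query.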
2	On the effect of INQUERY term-weighting scheme on query-sensitive similarity measures Kini, Ananth Ullal 12 April 2006 Cluster-based information retrieval systems often use a similarity measure to compute the association among text documents. In this thesis, we focus on a class of similarity measures named Query-Sensitive Similarity (QSS) measures. Recent studies have shown QSS measures to positively influence the outcome of a clustering procedure. These studies have used QSS measures in conjunction with the ltc term-weighting scheme. Several term-weighting schemes have superseded the ltc term-weighting scheme and demonstrated better retrieval performance relative to it. We test whether introducing one of these schemes, INQUERY, offers any benefit over the ltc scheme when used in the context of QSS measures. The testing procedure uses the Nearest Neighbor (NN) test to quantify the clustering effectiveness of QSS measures and the corresponding term-weighting scheme. The NN tests are applied to certain standard test document collections and the results are tested for statistical significance. On analyzing results of the NN test relative to those obtained for the ltc scheme, we find several instances where the INQUERY scheme improves the clustering effectiveness of QSS measures. To be able to apply the NN test, we designed a software test framework, Ferret, by complementing the features provided by dtSearch, a search engine. The test framework automates the generation of NN coefficients by processing standard test document collection data. We describe the construction and operation of the Ferret test framework.
Information Retrieval Term-Weighting Similarity Measure INQUERY ltc Query-Sensitive Similarity Clustering Nearest Neighbors Document Collection Text Search
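The Nearest Neighbor test mentioned in this abstract can be sketched roughly as follows: for each relevant document, count how many of its k most similar documents are also relevant, then average the counts. This is a minimal sketch and not the Ferret framework; `nn_test`, the pluggable `sim` parameter, and the toy data are assumptions. It uses plain cosine similarity over raw term frequencies rather than the ltc or INQUERY weights and the QSS measures studied in the thesis; k = 5 is the value usually quoted for this test, but it is left as a parameter.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nn_test(vectors, relevant, k=5, sim=cosine):
    """Mean number of relevant documents among the k nearest
    neighbours of each relevant document (higher is better)."""
    counts = []
    for i in sorted(relevant):
        others = [j for j in range(len(vectors)) if j != i]
        # Rank every other document by similarity to document i.
        others.sort(key=lambda j: sim(vectors[i], vectors[j]), reverse=True)
        counts.append(sum(1 for j in others[:k] if j in relevant))
    return sum(counts) / len(counts) if counts else 0.0


# Toy data (invented): documents 0 and 1 are relevant to the query.
texts = [
    "court rules on bank merger",
    "bank merger approved by court",
    "storm floods coastal town",
    "town rebuilds after storm",
]
vectors = [Counter(t.split()) for t in texts]
score = nn_test(vectors, relevant={0, 1}, k=1)
print(score)
```

A query-sensitive measure could be swapped in through the `sim` argument, for example by closing over the query vector, which is how a comparison between weighting schemes might be set up under these assumptions.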
3	Livsviktig läsundervisning i f–1 : Mötet mellan styrdokument, läromedelsförfattare och lärare / Essential reading instruction in year f–1 : The interception of policy documents, educational author and teachers Sandquist, Lisa, Siebing, Lina January 2019 The aim of the study is to examine what the introduction of a compulsory preschool class means for reading instruction in years f–1. The study's theoretical framework is based on curriculum theory and frame factor theory. The data comprise various government proposals, the new curriculum's wording on the preschool class and on reading in years f–1, and eight semi-structured interviews with four teaching-material authors and four practising f–1 teachers. The collected data were analysed using Linde's (2012) interpretation of the three arenas of curriculum theory: the formulation arena, the transformation arena, and the realisation arena. The three arenas, together with frame factors such as curriculum, time, and teacher qualifications, are central to the analysis of the results. The results show that the introduction of a compulsory preschool class has the potential to bring many benefits for essential reading instruction and for the individual pupil's reading development, since the focus can now be placed on the progression of reading instruction and teachers can identify weaker pupils at an earlier stage. The results also show that the curriculum's wording for the preschool class may lead to increased teaching time in the preschool class, since the preschool class timetable still varies from school to school despite the compulsory introduction. Opinions are divided, however, on what the introduction of a compulsory preschool class means for reading instruction in years f–1 and on whether reading instruction will change or be carried out as before.
Finally, the results indicate that there is some uncertainty about who should work in the preschool class, teachers or preschool teachers, and that this decision may have consequences for achieving an equivalent school for all.
Reading instruction compulsory preschool class curriculum theory framework factor theory document collection interview Humanities and the Arts
4	Algoritmy pro shlukování textových dat / Text data clustering algorithms Sedláček, Josef January 2011 The thesis deals with text mining. It describes the theory of text document clustering and the algorithms used for clustering. This theory serves as the basis for developing an application that clusters text data. The application is written in the Java programming language and implements three clustering methods: K-medoids, BiSec K-medoids, and SOM (self-organizing maps). The user can choose which method is used to cluster the collection of documents. The application also includes a validation set, created specifically for this diploma thesis and used to test the algorithms. Finally, the algorithms are compared on the basis of the results obtained.
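Of the three methods named in this abstract, K-medoids is the simplest to sketch. The version below alternates between assigning each item to its nearest medoid and re-choosing each cluster's medoid as the member with the smallest total distance to the others. It is a simple alternating variant written in Python, not the thesis's Java implementation and not the full PAM swap procedure. The names `assign` and `k_medoids`, the precomputed distance matrix, and the caller-supplied initial medoids are assumptions for illustration.

```python
def assign(dist, medoids):
    """Map each medoid index to the list of item indices nearest to it."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist, medoids, max_iter=100):
    """Alternating k-medoids on a symmetric distance matrix.

    dist    -- n x n list of lists of pairwise distances
    medoids -- initial medoid indices (chosen by the caller)
    """
    medoids = list(medoids)
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # New medoid = member minimising total distance within its cluster;
        # an empty cluster keeps its current medoid.
        updated = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


# Toy data (invented): six points on a line forming two obvious groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
medoids, clusters = k_medoids(dist, medoids=[0, 1])
print(medoids, clusters)
```

For text documents, the distance matrix could be built as one minus the cosine similarity of term vectors, which is a common choice but is not stated in the abstract.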
