  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Aspects of the content description of fiction: the consistency of abstracts and subject indexing of novels produced by library professionals and library users

Saarti, J. (Jarmo) 10 December 1999 (has links)
Abstract The subject of this study is the content description of fictional works, especially novels. The study is divided into two sections. The first part investigates the communicative process of fiction, previous studies on the content description of fiction, and the construction of thesauri for fiction. The second part aims to create means for analysing fictional content description (based on the theoretical discussion of the first part) and to build a general model for a fiction search and retrieval system. The material for the empirical part of the study was gathered in Finnish public libraries. The empirical part examines how the clients and the library professionals of public libraries describe novels by indexing and abstracting them: how their descriptions differ and how consistent they are with one another. The basic theoretical approach of the study is qualitative, specifically grounded theory. The corpus was analysed with both qualitative and quantitative methods: qualitative methods were mainly used for the abstracts, while the indexings were analysed with basic statistical methods as well as with calculated consistency values for the indexers. The main finding was that the abstracts and indexings were very inconsistent. The abstracts could be grouped into four categories: plot/thematic abstracts, cultural/historical abstracts, abstracts that describe the reading experience, and critical abstracts. With the aid of statistics it was also possible to construct, for each novel, a typical indexing string of about 10-15 indexing terms describing its basic contents. At the end of the study a model for a fiction search and retrieval system is presented. Methodologically, the triangulative approach combining different kinds of methods turned out to be fruitful. Especially when studying human behaviour, both quantitative and qualitative methods are needed; used together, they cross-check the findings and thus give more validity to the final results.
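The consistency values for indexers mentioned above can be computed with a standard inter-indexer consistency measure such as Hooper's (shared terms divided by the union of assigned terms). The abstract does not state the exact formula used, so the following is an illustrative sketch with invented example term sets:

```python
def indexing_consistency(terms_a, terms_b):
    """Hooper's inter-indexer consistency: the number of shared terms
    divided by the number of distinct terms either indexer assigned."""
    a, b = set(terms_a), set(terms_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Invented example: terms a librarian and a library client might
# assign to the same novel.
librarian = {"war", "love", "Finland", "1940s", "family"}
client = {"war", "family", "loss", "countryside"}
print(round(indexing_consistency(librarian, client), 3))  # 0.286
```

Low values like this, computed over many indexer pairs, are what the study summarises as "very inconsistent" indexing.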
62

Automatic detection of shot boundaries in digital video

Yusoff, Yusseri January 2002 (has links)
This thesis describes the implementation of automatic shot boundary detection algorithms for the detection of cuts and gradual transitions in digital video sequences. The objective was to develop a fully automatic video segmentation system as a pre-processing step for video database retrieval management systems, as well as for other applications which have large video sequences as part of their systems. For the detection of cuts, we begin by looking at a set of baseline algorithms that measure specific features of video images and calculate the dissimilarity of those measures between frames in the video sequence. We then propose two different approaches and compare them against the set of baseline algorithms. These approaches are themselves built upon the base set of algorithms. Observing that the baseline algorithms initially use hard thresholds to determine shot boundaries, we build Receiver Operating Characteristic (ROC) curves to plot the characteristics of the algorithms as the thresholds vary. In the first approach, we look into combining the multiple algorithms in such a way that, as a collective, the detection of cuts is improved. The results of the fusion are then compared against the baseline algorithms on the ROC curve. For the second approach, we look into having adaptive thresholds for the baseline algorithms. A selection of adaptive thresholding methods was applied to the data set and compared with the baseline algorithms that use hard thresholds. For gradual transition detection, a filtering technique used to detect ramp edges in images is adapted for use in video sequences. This approach starts from the observation that shot boundaries represent edges in time, with cuts being sharp edges and gradual transitions closely approximating ramp edges. The methods that we propose reflect our concentration on producing a reliable and efficient shot boundary detection mechanism. In each instance, be it for cuts or gradual transitions, we tested our algorithms on a comprehensive set of video sequences containing a variety of content, and obtained highly competitive results.
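A minimal example of the baseline hard-threshold approach described above: compute a feature (here a grey-level histogram) per frame, measure frame-to-frame dissimilarity, and declare a cut where it exceeds a fixed threshold. The feature, normalisation, and threshold value are illustrative assumptions, not the thesis's exact algorithms:

```python
def grey_histogram(frame, bins=8, levels=256):
    """Bin the grey-level pixel values of one frame."""
    hist = [0] * bins
    for px in frame:
        hist[px * bins // levels] += 1
    return hist

def histogram_distance(f1, f2):
    """Normalised absolute histogram difference in [0, 1]."""
    h1, h2 = grey_histogram(f1), grey_histogram(f2)
    return sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * len(f1))

def detect_cuts(frames, threshold=0.5):
    """Hard-threshold cut detection: a cut starts at frame i when the
    dissimilarity to frame i-1 exceeds the fixed threshold."""
    return [i for i in range(1, len(frames))
            if histogram_distance(frames[i - 1], frames[i]) > threshold]

# Synthetic 64-pixel frames: two dark frames, then two bright ones.
dark, bright = [20] * 64, [230] * 64
print(detect_cuts([dark, dark, bright, bright]))  # [2]
```

The thesis's adaptive-thresholding approach replaces the fixed `threshold` with one derived from the local statistics of the dissimilarity signal.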
63

A Framework of Automatic Subject Term Assignment: An Indexing Conception-Based Approach

Chung, EunKyung 12 1900 (has links)
The purpose of this dissertation is to examine whether an understanding of the subject indexing processes conducted by human indexers has a positive impact on the effectiveness of automatic subject term assignment through text categorization (TC). More specifically, human indexers' subject indexing approaches, or conceptions, in conjunction with semantic sources were explored in the context of a typical scientific journal article data set. Based on the premise that indexing approaches or conceptions together with semantic sources are important for automatic subject term assignment through TC, this study proposed an indexing conception-based framework. Three hypotheses were tested: 1) the effectiveness of semantic sources, 2) the effectiveness of an indexing conception-based framework, and 3) the effectiveness of each of three indexing conception-based approaches (the content-oriented, the document-oriented, and the domain-oriented approaches). The experiments were conducted using a support vector machine implementation in WEKA (Witten & Frank, 2000). The results showed that cited works, source title, and title were as effective as the full text, while keywords were found to be more effective than the full text. In addition, the findings showed that the indexing conception-based framework was more effective than the full text; in particular, the content-oriented and document-oriented indexing approaches were more effective than the full text. Among the three indexing conception-based approaches, the content-oriented and document-oriented approaches were more effective than the domain-oriented approach. In other words, in the context of a typical scientific journal article data set, the objective contents and the authors' intentions were given more focus than the possible users' needs. The research findings support the view that incorporating human indexers' indexing approaches or conceptions, in conjunction with semantic sources, has a positive impact on the effectiveness of automatic subject term assignment.
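The framework above trains a text categorizer on different semantic sources (title, keywords, cited works, full text). The dissertation uses an SVM in WEKA; the sketch below substitutes a much simpler nearest-centroid classifier, with invented toy documents, purely to illustrate how the choice of semantic source changes the training input:

```python
from collections import Counter
import math

def cosine(c1, c2):
    """Cosine similarity between two term-frequency bags."""
    dot = sum(c1[t] * c2[t] for t in c1 if t in c2)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def train(docs, source):
    """Build one term-frequency centroid per category, using only the
    chosen semantic source (e.g. 'title' or 'keywords') of each doc."""
    centroids = {}
    for doc in docs:
        centroids.setdefault(doc["label"], Counter()).update(
            doc[source].lower().split())
    return centroids

def classify(text, centroids):
    bag = Counter(text.lower().split())
    return max(centroids, key=lambda lab: cosine(bag, centroids[lab]))

# Invented toy corpus with two semantic sources per document.
docs = [
    {"label": "bio", "title": "protein folding dynamics",
     "keywords": "protein enzyme cell"},
    {"label": "cs", "title": "parallel sorting algorithms",
     "keywords": "algorithm complexity sorting"},
]
model = train(docs, "keywords")  # swap in "title" to compare sources
print(classify("a new cell enzyme assay", model))  # bio
```

Comparing accuracy across `train(docs, "title")`, `train(docs, "keywords")`, and so on mirrors the dissertation's comparison of semantic sources, though its actual learner is an SVM.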
64

Improvement of automatic indexing through recognition of semantically equivalent syntactically different phrases

Aladesulu, Olorunfemi Stephen January 1985 (has links)
No description available.
65

WebDoc: an Automated Web Document Indexing System

Tang, Bo 13 December 2002 (has links)
This thesis describes WebDoc, an automated system that classifies Web documents according to the Library of Congress classification system. This work is an extension of an earlier version of the system that successfully generated indexes for journal articles. The unique features of Web documents, as well as how they affect the design of a classification system, are discussed. We argue that full-text analysis of Web documents is inevitable, and that contextual information must be used to assist the classification. The architecture of the WebDoc system is presented. We performed experiments on it with and without the assistance of contextual information. The results show that contextual information improved the system's performance significantly.
66

Video Indexing and Retrieval in Compressed Domain Using Fuzzy-Categorization.

Fang, H., Qahwaji, Rami S.R., Jiang, Jianmin January 2006 (has links)
There has been an increased interest in video indexing and retrieval in recent years. In this work, the indexing and retrieval of visual contents is based on features extracted in the compressed domain. Direct processing of the compressed domain spares the decoding time, which is extremely important when indexing a large number of multimedia archives. A fuzzy-categorizing structure is designed in this paper to improve retrieval performance. For our experiments, a database consisting of basketball videos was constructed. This database includes three categories: full-court match, penalty and close-up. First, spatial and temporal feature extraction is applied to train the fuzzy membership functions using a minimum-entropy optimization algorithm. Then, the max composition operation is used to generate a new fuzzy feature to represent the content of the shots. Finally, the fuzzy-based representation becomes the indexing feature for the content-based video retrieval system. The experimental results show that the proposed algorithm is quite promising for semantic-based video retrieval.
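A toy illustration of the fuzzy categorization described above: each shot category gets a fuzzy membership function over an extracted feature, and a shot is indexed by its membership degrees rather than a single hard label. The single "motion" feature, the triangular functions, and their parameters are invented for illustration; the paper trains its membership functions from spatial and temporal features with a minimum-entropy method:

```python
def triangular(x, a, b, c):
    """Triangular fuzzy membership function, peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical membership functions over one normalised motion feature:
# full-court shots have high motion, close-ups low motion.
categories = {
    "full-court": lambda m: triangular(m, 0.5, 0.8, 1.1),
    "penalty":    lambda m: triangular(m, 0.2, 0.4, 0.6),
    "close-up":   lambda m: triangular(m, -0.1, 0.1, 0.3),
}

def fuzzy_index(motion):
    """Membership degree of a shot in each category; over several
    features, the max composition would take the maximum degree."""
    return {cat: round(mu(motion), 2) for cat, mu in categories.items()}

print(fuzzy_index(0.45))
# {'full-court': 0.0, 'penalty': 0.75, 'close-up': 0.0}
```

The resulting degree vector is the kind of soft index that the retrieval system matches queries against.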
67

Role of semantic indexing for text classification

Sani, Sadiq January 2014 (has links)
The Vector Space Model (VSM) of text representation suffers from a number of limitations for text classification. Firstly, the VSM is based on the Bag-Of-Words (BOW) assumption, where terms from the indexing vocabulary are treated independently of one another. However, the expressiveness of natural language means that lexically different terms often have related or even identical meanings. Thus, failure to take into account the semantic relatedness between terms means that document similarity is not properly captured in the VSM. To address this problem, semantic indexing approaches have been proposed for modelling the semantic relatedness between terms in document representations. Accordingly, in this thesis, we empirically review the impact of semantic indexing on text classification. This empirical review allows us to answer one important question: how beneficial semantic indexing is to text classification performance. We also carry out a detailed analysis of the semantic indexing process, which allows us to identify reasons why semantic indexing may lead to poor text classification performance. Based on our findings, we propose a semantic indexing framework called Relevance Weighted Semantic Indexing (RWSI) that addresses the limitations identified in our analysis. RWSI uses relevance weights of terms to improve the semantic indexing of documents. A second problem with the VSM is the lack of supervision in the process of creating document representations. This arises from the fact that the VSM was originally designed for unsupervised document retrieval. An important feature of effective document representations is the ability to discriminate between relevant and non-relevant documents. For text classification, relevance information is explicitly available in the form of document class labels. Thus, more effective document vectors can be derived in a supervised manner by taking advantage of available class knowledge. Accordingly, we investigate approaches for utilising class knowledge for supervised indexing of documents. Firstly, we demonstrate how the RWSI framework can be utilised for assigning supervised weights to terms for supervised document indexing. Secondly, we present an approach called Supervised Sub-Spacing (S3) for supervised semantic indexing of documents. A further limitation of the standard VSM is that an indexing vocabulary consisting only of terms from the document collection is used for document representation. This is based on the assumption that terms alone are sufficient to model the meaning of text documents. However, for certain classification tasks, terms are insufficient to adequately model the semantics needed for accurate document classification. A solution is to index documents using semantically rich concepts. Accordingly, we present an event extraction framework called Rule-Based Event Extractor (RUBEE) for identifying and utilising event information for concept-based indexing of incident reports. We also demonstrate how certain attributes of these events, e.g. negation, can be taken into consideration to distinguish between documents that describe the occurrence of an event and those that mention the non-occurrence of that event.
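The basic move of semantic indexing, adding semantically related terms to a document vector, optionally down-weighted by a term's relevance as in RWSI, can be sketched as follows. The relatedness table and relevance weights here are invented; a real system would derive them from a thesaurus or corpus statistics, and the relevance weights from class labels:

```python
from collections import Counter

# Hypothetical term-relatedness table (term -> {related term: similarity}).
related = {
    "car": {"automobile": 0.9, "vehicle": 0.7},
    "doctor": {"physician": 0.9},
}

def semantic_index(tokens, relevance=None):
    """Expand a bag-of-words with semantically related terms.
    `relevance` optionally scales the expansion for each source term,
    so weakly class-discriminative terms contribute less (the idea
    behind relevance weighting)."""
    relevance = relevance or {}
    vec = Counter(tokens)
    for term in tokens:
        for rel, sim in related.get(term, {}).items():
            vec[rel] += sim * relevance.get(term, 1.0)
    return dict(vec)

print(semantic_index(["car", "crash"]))
# {'car': 1, 'crash': 1, 'automobile': 0.9, 'vehicle': 0.7}
```

After expansion, a document mentioning only "car" and one mentioning only "automobile" share vector mass, which is exactly the similarity the plain BOW representation misses.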
68

Building an Intelligent Filtering System Using Idea Indexing

Yang, Li 08 1900 (has links)
The widely used vector model maintains its popularity because of its simplicity, its speed, and the appeal of using spatial proximity for semantic proximity. However, this model suffers from the vagueness that arises when keywords overlap. Efforts have been made to improve the vector model. Research on improving document representation has focused on four areas: statistical co-occurrence of related items, forming term phrases, grouping of related words, and representing the content of documents. In this thesis, we propose the idea-indexing model to improve document representation for the filtering task in IR. The idea-indexing model matches document terms with the ideas they express and indexes the document with these ideas. This indexing scheme represents the document with its semantics instead of with sets of independent terms. We show in this thesis that indexing with ideas leads to better performance.
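A minimal sketch of the idea-indexing step described above: map each term to the idea it expresses, so two lexically different documents about the same topic receive the same index. The term-to-idea table is an invented stand-in for the thesis's actual knowledge source:

```python
# Hypothetical term-to-idea mapping; unknown terms index as themselves.
idea_of = {
    "automobile": "VEHICLE", "car": "VEHICLE", "truck": "VEHICLE",
    "collision": "ACCIDENT", "crash": "ACCIDENT",
}

def idea_index(tokens):
    """Index a document by the ideas its terms express, collapsing
    lexical variation that defeats plain keyword matching."""
    return {idea_of.get(t, t) for t in tokens}

d1 = idea_index("car crash on highway".split())
d2 = idea_index("truck collision on highway".split())
print(d1 == d2)  # True
```

Under plain keyword indexing these two documents share only "on" and "highway"; under idea indexing they are identical, which is the gain the filtering system exploits.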
69

The automation of indexing: a theoretical-methodological proposal, with an application in library and information science

Gil Leiva, Isidoro 17 November 1997 (has links)
A conceptual framework for the automation of indexing is presented, covering its delimitation, the positions taken by researchers in Library and Information Science with regard to this line of inquiry, the diachronic development of this automation, and the interdisciplinarity inherent in the process. A theoretical-methodological proposal is then presented for designing a semi-automatic procedure for indexing documents on Library and Information Science, composed of four modules. In the first three, the sources used are prepared, candidate descriptor terms are selected, and those terms are evaluated and weighted; in the fourth module, the user performs an interactive validation and editing of the proposed results. The system is based on a controlled vocabulary on Library and Information Science built for this purpose. The average consistency between the indexing of fifty articles analysed by indexers of the ISOC database and by our proposal lies within the range reported for other automatic indexing systems.
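The selection and weighting modules can be sketched as follows: match tokens against a controlled vocabulary and score candidate descriptors, boosting matches in privileged sources such as the title. The vocabulary, weights, and boost factor are invented for illustration and are not the thesis's actual parameters:

```python
from collections import Counter

# Hypothetical fragment of a controlled LIS vocabulary with
# per-descriptor base weights.
vocabulary = {"indexing": 1.0, "thesaurus": 0.8, "cataloguing": 0.8}

def candidate_descriptors(text, boost_title=2.0, title=""):
    """Select controlled-vocabulary terms occurring in the text and
    weight them by frequency, boosting occurrences in the title."""
    scores = Counter()
    for tok in text.lower().split():
        if tok in vocabulary:
            scores[tok] += vocabulary[tok]
    for tok in title.lower().split():
        if tok in vocabulary:
            scores[tok] += vocabulary[tok] * boost_title
    return scores.most_common()

print(candidate_descriptors(
    "automatic indexing with a thesaurus improves indexing",
    title="Automatic indexing"))
# [('indexing', 4.0), ('thesaurus', 0.8)]
```

In the proposal, a ranked list like this is what the fourth module hands to the user for interactive validation and editing.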
70

Implementation Of X-tree With 3d Spatial Index And Fuzzy Secondary Index

Keskin, Sinan 01 December 2010 (has links) (PDF)
Multidimensional datasets have become more extensively used in Geographic Information Systems (GIS) applications in recent years. Due to the large volume of these datasets, efficient querying becomes a significant problem, so choosing an efficient index structure is a pressing necessity. The aim of this thesis is to develop an efficient, flexible and extendible index structure which comprises 3D spatial data in a primary index and fuzzy attributes in a secondary index. These primary and secondary indexes are handled in a coupled structure. First, a 3D spatial primary index is built using the X-tree structure, and then a fuzzy secondary index is overlaid over the X-tree structure. The coupled structure is shown to be more efficient on a certain class of queries than an uncoupled index structure that keeps the 3D spatial data in the primary index and the fuzzy attributes in the secondary index separately. In the uncoupled index structure, we provided the 3D spatial primary index using the X-tree index structure and the fuzzy secondary index using a B+-tree index structure.
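The kind of query that motivates the coupled structure combines a 3D spatial predicate with a fuzzy attribute predicate in a single pass. The sketch below replaces the X-tree with a linear scan and invents a fuzzy membership function, so it shows only the query semantics, not the index structure itself:

```python
def in_box(p, lo, hi):
    """3D bounding-box predicate over a point's (x, y, z) coordinates."""
    return all(lo[i] <= p["xyz"][i] <= hi[i] for i in range(3))

def fuzzy_hot(temp):
    """Hypothetical fuzzy membership for the attribute 'hot',
    rising linearly from 20 to 35 degrees."""
    return max(0.0, min(1.0, (temp - 20) / 15))

def coupled_query(points, lo, hi, alpha=0.5):
    """Answer 'objects inside this 3D region that are hot to at least
    degree alpha' in one pass - the query class where a coupled index
    beats separate spatial and fuzzy indexes."""
    return [p["id"] for p in points
            if in_box(p, lo, hi) and fuzzy_hot(p["temp"]) >= alpha]

# Invented sample data.
points = [
    {"id": "a", "xyz": (1, 2, 3), "temp": 34},
    {"id": "b", "xyz": (9, 9, 9), "temp": 40},
    {"id": "c", "xyz": (2, 2, 2), "temp": 22},
]
print(coupled_query(points, (0, 0, 0), (5, 5, 5)))  # ['a']
```

With separate indexes, the spatial result set and the fuzzy result set must each be materialised and then intersected; the coupled structure evaluates both predicates while descending one tree.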
