An experiment in automatic indexing with Korean texts: a comparison of syntactico-statistical and manual methods / Seo, Eun-Gyoung. January 1993
Thesis (Ph. D.)--University of Illinois at Urbana-Champaign, 1993. / Includes bibliography (leaves 204-216). Also issued in print.
An innovative fully-distributed automatic object classification algorithm with a new content-based video indexing research platform / Hempel, Michael. January 1900
Thesis (Ph.D.)--University of Nebraska-Lincoln, 2007. / Title from title screen (site viewed Dec. 4, 2007). PDF text: ca. 240 p. : ill. UMI publication number: AAT 3271930. Includes bibliographical references. Also available in microfilm and microfiche formats.
Chen, Hsinchun, Schatz, Bruce R., Yim, Tak, Fye, David
Artificial Intelligence Lab, Department of MIS, University of Arizona / This research reports an algorithmic approach to the automatic generation of thesauri for electronic community systems. The techniques used included term filtering, automatic indexing, and cluster analysis. The testbed for our research was the Worm Community System, which contains a comprehensive library of specialized community data and literature, currently in use by molecular biologists who study the nematode worm C. elegans. The resulting worm thesaurus included 2709 researchers’ names, 798 gene names, 20 experimental methods, and 4302 subject descriptors. On average, each term had about 90 weighted neighboring terms indicating relevant concepts. The thesaurus was developed as an online search aid. We tested the worm thesaurus in an experiment with six worm researchers of varying degrees of expertise and background. The experiment showed that the thesaurus was an excellent “memory-jogging” device and that it supported learning and serendipitous browsing. Despite some occurrences of obvious noise, the system was useful in suggesting relevant concepts for the researchers’ queries and it helped improve concept recall. With a simple browsing interface, an automatic thesaurus can become a useful tool for online search and can assist researchers in exploring and traversing a dynamic and complex electronic community system.
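The pipeline named in this abstract (term filtering, automatic indexing, cluster analysis producing weighted neighboring terms) can be illustrated with a minimal sketch. The weighting below is a plain asymmetric co-occurrence ratio chosen for illustration and is not claimed to be the authors' cluster function; the function name `build_thesaurus` and the parameters `min_df` and `top_k` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_thesaurus(docs, min_df=2, top_k=3):
    """Map each kept term to its top_k weighted neighboring terms.

    docs is a list of term lists, one per already-indexed document.
    Illustrative assumption: weight(a -> b) is the share of a's documents
    that also contain b, which is not necessarily the paper's weighting.
    """
    # Term filtering: keep terms that occur in at least min_df documents.
    df = Counter(t for doc in docs for t in set(doc))
    kept = {t for t, n in df.items() if n >= min_df}

    # Count the documents in which each ordered pair of kept terms co-occurs.
    co = defaultdict(Counter)
    for doc in docs:
        for a, b in permutations(sorted(set(doc) & kept), 2):
            co[a][b] += 1

    # Rank neighbors by weight, breaking ties alphabetically.
    thesaurus = {}
    for a in kept:
        ranked = sorted(
            ((b, n / df[a]) for b, n in co[a].items()),
            key=lambda pair: (-pair[1], pair[0]),
        )
        thesaurus[a] = ranked[:top_k]
    return thesaurus
```

Because the weight is asymmetric, a rare term can point strongly to a common one without the reverse holding, which is one way a term list can suggest broader concepts to a searcher.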
Herb Simon, the pioneer cognitive scientist, computer scientist, economist, and Nobel prize winner, wrote that design is the core of all professional activity (Simon, 1996). The natural sciences are concerned with how things are; the science of design is concerned with how things ought to be, “with devising artifacts to attain goals” (Schon, 1990, p. 110). In other words, according to Simon, what professionals do is to “transform an existing state of affairs, a problem, into a preferred state, a solution” (Schon, p. 111). A key area of professional design in library and information science is the creation of systems for the organization of knowledge. The purpose of this research project is to examine the design process in knowledge organization using design theory which originated in other fields. There is a rich literature based on research in the fields of architecture, engineering, software design, clinical psychology, city planning, and other professions. I used the themes originating in this literature to explore design in LIS. In LIS, design work related to knowledge organization is carried out simultaneously at multiple levels: in the devising of national standards for design such as the NISO Guidelines for the Construction, Format, and Management of Monolingual Thesauri, in the maintenance of major vocabularies such as the Library of Congress’s Thesaurus for Graphic Materials, in the design of vocabularies intended to be diffused widely such as the Art & Architecture Thesaurus, and at the local level in the creation of descriptors and classification systems for individual collections of materials. The specific focus of this research project is design of vocabularies -- in which I include subject headings, descriptors, keywords, captions, and classification systems -- for local collections of images.
Chen, Hsinchun, Zhang, Yin, Houston, Andrea L.
Artificial Intelligence Lab, Department of MIS, University of Arizona / This paper presents a neural network approach to document semantic indexing. A Hopfield net algorithm was used to simulate human associative memory for concept exploration in the domain of computer science and engineering. INSPEC, a collection of more than 320,000 document abstracts from leading journals, was used as the document testbed. Benchmark tests confirmed that three parameters (maximum number of activated nodes, E - maximum allowable error, and maximum number of iterations) were useful in positively influencing network convergence behavior without negatively impacting central processing unit performance. Another series of benchmark tests was performed to determine the effectiveness of various filtering techniques in reducing the negative impact of noisy input terms. Preliminary user tests confirmed our expectation that the Hopfield net algorithm is potentially useful as an associative memory technique to improve document recall and precision by solving discrepancies between indexer vocabularies and end-user vocabularies.
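The three parameters named in this abstract (maximum number of activated nodes, maximum allowable error, maximum number of iterations) can be seen at work in a small spreading-activation sketch in the style of a Hopfield net. It is an illustration under stated assumptions, not the authors' implementation: the function name `hopfield_activate`, the sigmoid parameters `theta` and `slope`, and the choice to clamp the input terms at full activation are all hypothetical.

```python
import math


def hopfield_activate(weights, seeds, max_nodes=5, epsilon=0.01,
                      max_iter=20, theta=0.5, slope=0.1):
    """Spread activation from seed terms over a weighted term network.

    weights maps term -> {neighbor: weight}. theta and slope shape the
    sigmoid; both are illustrative values, not taken from the paper.
    """
    terms = set(weights) | set(seeds)
    for neighbors in weights.values():
        terms |= set(neighbors)
    act = {t: (1.0 if t in seeds else 0.0) for t in terms}

    for _ in range(max_iter):  # maximum number of iterations
        new = {}
        for j in terms:
            if j in seeds:
                new[j] = 1.0  # assumption: input terms stay fully active
                continue
            net = sum(weights.get(i, {}).get(j, 0.0) * act[i] for i in terms)
            new[j] = 1.0 / (1.0 + math.exp(-(net - theta) / slope))

        # Maximum number of activated nodes: keep only the strongest ones.
        top = set(sorted(new, key=lambda t: (-new[t], t))[:max_nodes])
        new = {t: (new[t] if t in top else 0.0) for t in terms}

        # Converged when the total change falls below the allowable error.
        delta = sum(abs(new[t] - act[t]) for t in terms)
        act = new
        if delta < epsilon:
            break

    return sorted(((t, a) for t, a in act.items() if a > 0.0),
                  key=lambda pair: (-pair[1], pair[0]))
```

With a query term as the seed, terms two links away can become active after a second iteration, which is how an associative network of this kind could bridge indexer and end-user vocabularies.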
Jose, Sanjo, Jayakanth, Francis
Publishing research papers is an integral part of a researcher's professional life. Every research article invariably provides a large number of citations (bibliographic references) to the papers cited in that article. All such citations must be rendered accurately in the citation style specified by the publisher. Over time, researchers accumulate a large number of bibliographic references relevant to their research and cite them in their own publications. Efficient management of bibliographic references is therefore an important task for every researcher: it saves a considerable amount of time both in locating the required citations and in rendering citation details correctly. In this paper, we report the features of Aigaion, a web-based, open-source software package for reference management.
Kipp, Margaret E. I.
This paper examines the tagging practices evident on CiteULike, a research oriented social bookmarking site for journal articles. Articles selected for this study were health information and medicine related. Tagging practices were examined using standard informetric measures for analysis of bibliographic information and analysis of term use. Additionally, tags were compared to descriptors assigned to the same article.
The article describes the research done over a bibliographic database in order to show the impact the specificity of the knowledge organising tools may have on information retrieval. For this purpose two multilingual UDC-based thesauri having different degrees of specificity are considered. Issues of harmonising a classificatory structure with a thesaurus structure are introduced and significant aspects of information retrieval in a multilingual environment are argued in an extensive manner. Aspects of complementarity are discussed with particular emphasis on the real impact produced by alternative search facilities on IR. Finally a number of conclusions are formulated as they arise from the study.
The Internet constitutes a vast universe of knowledge and human culture, allowing the dissemination of ideas and information without borders. The Web has also become an important medium for the diffusion of multilingual resources. However, linguistic differences still form a major obstacle to scientific, cultural, and educational exchange. With the ever increasing size of the Web and the availability of more and more documents in various languages, this problem becomes all the more pervasive. Besides this linguistic diversity, a multitude of databases and collections now contain documents in various formats, which may also adversely affect the retrieval process. This paper presents the context, the problem statement, and the experiment of a research project that aims to examine the relations between two different indexing approaches, (1) traditional image indexing, which recommends the use of controlled vocabularies, and (2) free image indexing, which uses uncontrolled vocabulary, and their respective performance for image retrieval in a multilingual context. The use of controlled or uncontrolled vocabularies raises a number of difficulties for the indexing process. These difficulties necessarily have consequences at the time of image retrieval. Indexing with controlled or uncontrolled vocabularies is a question extensively discussed in the literature. However, it is clear that many searchers recognize the advantages of either form of vocabulary according to circumstances (Arsenault, 2006). It appears that the many difficulties associated with free indexing using uncontrolled vocabularies can only be understood via a comparative analysis with controlled vocabulary indexing (Macgregor & McCulloch, 2006).
This research compares image retrieval within two contexts: a monolingual context where the language of the query is the same as the indexing language; and a multilingual context where the language of the query is different from the indexing language. This research will indicate if one of these indexing approaches surpasses the other, in terms of effectiveness, efficiency, and satisfaction of the image searchers. For this research, three data collection methods are used: (1) the analysis of the vocabularies used for image indexing in order to examine the multiplicity of term types applied to images (generic description, identification, and interpretation) and the degree of indexing difficulty due to the subject and the nature of the image; (2) the simulation of the retrieval process with a subset of images indexed according to each indexing approach studied, and finally, (3) the administration of a questionnaire to gather information on searcher satisfaction during and after the retrieval process. The quantification of the retrieval performance of each indexing approach is based on the usability measures recommended by the standard ISO 9241-11, i.e. effectiveness, efficiency, and satisfaction of the user (AFNOR, 1998). The need to retrieve a particular image from a collection is shared by several user communities including teachers, artists, journalists, scientists, historians, filmmakers and librarians, all over the world. Image collections also have many areas of application: commercial, scientific, educational, and cultural. Until recently, image collections were difficult to access due to limitations in dissemination and duplication procedures. This research underlines the pressing necessity to optimize the methods used for image processing, in order to facilitate the images’ retrieval and their dissemination in multilingual environments.
The results of this study will offer preliminary information to deepen our understanding of the influence of the vocabulary used in image indexing. In turn, these results can be used to enhance access to digital collections of visual material in multilingual environments.
Yuan, Ming-Shu, Lin, Chih-Feng
Many digital news archive systems in Taiwan are based on format description, not subject indexing. This requires users to know their background or the terminologies used, in order to retrieve information from these archives. This paper discusses how the original elements were indexed from various perspectives in Chinese digitized news archives. It also makes recommendations to improve the industry, including strengthening the process, connection, and description of news contents, organization, and management. This will enable cross-system retrieval and in-depth resource integration among systems.