
Kamusi ya Kiswahili sanifu in test: A computer system for analyzing dictionaries and for retrieving lexical data.

Horskainen, Arvi, January 1994
The paper describes a computer system for testing the coherence and adequacy of dictionaries. The system is also well suited to retrieving lexical material in context from computerized text archives. Results are presented from a series of tests made with Kamusi ya Kiswahili Sanifu (KKS), a monolingual Swahili dictionary. The test of the internal coherence of KKS shows that the text of the dictionary itself contains several hundred words for which there is no entry in the dictionary. Examples and frequency counts of the most frequently occurring words are given. The adequacy of KKS was also tested with a corpus of nearly one million words: 1.32% of the words in book texts were not recognized by KKS, and in newspaper texts the figure was 2.24%. The higher figure for newspaper texts is partly due to the numerous names occurring in news articles. Some statistical results are given on the frequencies of wordforms not recognized by KKS. The tests show that although KKS covers the modern vocabulary quite well, there are several areas where the dictionary should be improved. The internal coherence is far from satisfactory, and more than a thousand rather common words in prose text are not included in KKS. The system described in this article is an effective tool for detecting problems and for retrieving lexical data in context for missing words.
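
The two tests summarized above (internal coherence and corpus adequacy) can be illustrated with a minimal sketch. This is not the system described in the paper: it assumes plain exact matching of lowercased wordforms against a set of headwords, with no morphological analysis, which a real Swahili system would need in order to handle inflected forms. The function names and the toy data are hypothetical and invented for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters (wordforms)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def unrecognized_wordforms(text, headwords):
    """Count the wordforms in `text` that have no entry in `headwords`."""
    return Counter(t for t in tokenize(text) if t not in headwords)


def unrecognized_rate(text, headwords):
    """Percentage of running words in `text` that are not in `headwords`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    missing = sum(1 for t in tokens if t not in headwords)
    return 100.0 * missing / len(tokens)


# Toy data (assumption): a four-entry "dictionary", a stand-in for the
# dictionary's own definition text, and a tiny stand-in corpus.
headwords = {"kitabu", "soma", "mtoto", "na"}
definitions = "kitabu na gazeti"
corpus = "mtoto na kitabu na redio na gazeti"

# Internal coherence: words used inside the dictionary text without an entry.
missing_internal = unrecognized_wordforms(definitions, headwords)

# Adequacy: wordforms in the corpus not recognized, with their frequencies,
# and the share of running words they account for.
missing_corpus = unrecognized_wordforms(corpus, headwords)
rate = unrecognized_rate(corpus, headwords)
```

Under these assumptions, `missing_corpus.most_common()` gives a frequency-ranked list of candidate missing entries, and `rate` plays the role of the percentages quoted in the abstract. Exact matching would greatly overstate the gap on real Swahili text, because inflected forms would have to be reduced to their dictionary forms first.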
