Return to search

Kamusi ya Kiswahili sanifu in test:: A computer system for analyzing dictionaries and for retrieving lexical data.

The paper describes a computer system for testing the coherence and adequacy of dictionaries. The system suits also well for retiieving lexical material in context from computerized text archives Results are presented from a series of tests made with Kamusi ya Kiswahlli Sanifu (KKS), a monolingual Swahili dictionary.. The test of the intemal coherence of KKS shows that the text itself contains several hundreds of such words, for which there is no entry in the dictionary. Examples and frequency numbers of the most often occurring words are given The adequacy of KKS was also tested with a corpus of nearly one million words, and it was found out that 1.32% of words in book texts were not recognized by KKS, and with newspaper texts the amount was 2.24% The higher number in newspaper texts is partly due to numerous names occurring in news articles Some statistical results are given on frequencies of wordforms not recognized by KKS The tests shows that although KKS covers the modern vocabulary quite well, there are several ru·eas where the dictionary should be improved The internal coherence is far from satisfactory, and there are more than a thousand such rather common words in prose text which rue not included into KKS The system described in this article is au effective tool for `detecting problems and for retrieving lexical data in context for missing words.

Identiferoai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:10567
Date January 1994
CreatorsHorskainen, Arvi
ContributorsUniversity of Helsinki, Universität zu Köln
Source SetsHochschulschriftenserver (HSSS) der SLUB Dresden
LanguageEnglish
Detected LanguageEnglish
Typedoc-type:article, info:eu-repo/semantics/article, doc-type:Text
SourceSwahili Forum; 1 (1994), S. 169-179
Rightsinfo:eu-repo/semantics/openAccess
Relationurn:nbn:de:bsz:15-qucosa-94963, qucosa:11611

Page generated in 0.002 seconds